Golang Runtime and Concurrency

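  • Go links a user-space component (the runtime) into every executable.
  • The runtime was originally written in C; since Go 1.5 it is written in Go, with some assembly.
  • It implements the scheduler, goroutine management and OS-thread management.
  • Each Go process has an upper limit on the number of OS threads (10,000 by default, adjustable with runtime/debug.SetMaxThreads).
  • The Go runtime schedules N goroutines onto M OS threads.
  • At any given moment a goroutine runs on exactly one thread, although it may move to another thread between runs.
  • A goroutine can block (e.g. on a syscall), and that blocks its OS thread too; the runtime then runs the remaining goroutines on other threads.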

Linux Device Driver Development: Block Device Driver

This is my first interaction with the Linux kernel at the device driver level. My objective is to develop a very simple block device driver that just forwards I/O requests to a virtual device. This post covers only the observations I made while approaching the problem.

Block vs. Character Device

Linux supports block and character device drivers. Only block devices can host a filesystem. Block devices support random read/write operations. The smallest addressable unit of a block device is the sector, usually 512 bytes long; each sector is uniquely addressable. A block is a logical entity composed of sectors. Filesystems usually use 4096-byte blocks, i.e. 8 sectors (8 * 512). In the Linux kernel, a block device is represented as a logical entity (actually just a C structure). So we can export anything as a block device, as long as we can support read/write operations at the sector level.
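The sector and block arithmetic above can be sketched in a few lines of C. This is only an illustration using the typical sizes mentioned in the text (512-byte sectors, 4096-byte blocks); the macro and function names are my own and are not kernel APIs.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u    /* bytes per sector (typical value) */
#define BLOCK_SIZE  4096u   /* bytes per filesystem block (typical value) */
#define SECTORS_PER_BLOCK (BLOCK_SIZE / SECTOR_SIZE)

/* First sector covered by a given filesystem block. */
static uint64_t block_to_sector(uint64_t block)
{
    return block * SECTORS_PER_BLOCK;
}

/* Sector that contains a given byte offset on the device. */
static uint64_t offset_to_sector(uint64_t byte_offset)
{
    return byte_offset / SECTOR_SIZE;
}

/* Filesystem block that contains a given sector. */
static uint64_t sector_to_block(uint64_t sector)
{
    return sector / SECTORS_PER_BLOCK;
}
```

With these sizes, filesystem block 3 starts at sector 24, and byte offset 4096 falls in sector 8.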

The device driver is the layer that glues the Linux kernel to the device. The kernel receives device-targeted I/O requests from an application. Ordinary (buffered) I/O requests pass through the buffer cache and the I/O scheduler. The latter reorders I/O requests to reduce seek time, on the assumption that they will be served by a rotating disk. In fact, the Linux kernel offers several I/O schedulers, so requests may reach the driver in a different order depending on which one is active.

A block device driver usually implements a request queue. The Linux I/O scheduler enqueues requests in the driver’s queue. How to serve these requests is the device driver’s headache. The request queue is represented by the request_queue structure, which is defined in “blkdev.h”. The driver dequeues requests from this queue and sends them to the device. It then acknowledges each request with an error status.
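The enqueue, dequeue and acknowledge cycle can be modelled in user space. The sketch below is a toy, not kernel code: `toy_request`, `toy_queue`, `toy_enqueue` and `toy_serve` are hypothetical names, and an in-memory array stands in for the device. It only shows the shape of the loop, assuming one FIFO and sector-aligned transfers.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512
#define DEV_SECTORS 16          /* size of the toy backing device */
#define QUEUE_DEPTH 8

/* Hypothetical request: one sector-aligned transfer. */
struct toy_request {
    int write;                  /* 1 = write, 0 = read */
    size_t sector;              /* first sector */
    size_t nsectors;            /* number of sectors */
    unsigned char *buf;         /* caller's buffer */
    int status;                 /* set on completion: 0 = ok, -1 = error */
};

/* Hypothetical FIFO standing in for the driver's request queue. */
struct toy_queue {
    struct toy_request *slots[QUEUE_DEPTH];
    size_t head, tail, count;
};

static unsigned char toy_device[DEV_SECTORS * SECTOR_SIZE];

/* "Scheduler" side: returns 0 on success, -1 if the queue is full. */
static int toy_enqueue(struct toy_queue *q, struct toy_request *r)
{
    if (q->count == QUEUE_DEPTH)
        return -1;
    q->slots[q->tail] = r;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return 0;
}

/* Driver side: drain the queue, forward each request, record its status.
 * Returns the number of requests completed. */
static size_t toy_serve(struct toy_queue *q)
{
    size_t served = 0;

    while (q->count > 0) {
        struct toy_request *r = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;

        if (r->sector >= DEV_SECTORS ||
            r->nsectors > DEV_SECTORS - r->sector) {
            r->status = -1;     /* out of range: complete with an error */
        } else {
            unsigned char *dev = toy_device + r->sector * SECTOR_SIZE;
            size_t len = r->nsectors * SECTOR_SIZE;

            if (r->write)
                memcpy(dev, r->buf, len);
            else
                memcpy(r->buf, dev, len);
            r->status = 0;
        }
        served++;
    }
    return served;
}
```

A real driver works with the kernel's own request structures and completion calls; the point here is only that every dequeued request is completed with a status, whether or not the transfer succeeded.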

If a device does not need an optimized I/O order, its driver may opt to handle I/O requests directly. An excellent example of such a driver is the loopback driver (loop.c, loop.h). It handles struct bio, which stands for block I/O. A bio structure is a scatter-gather list of page-aligned buffers (usually 4K each). Handling a bio structure is almost the same as handling a struct request.

Requirements for my driver

  • Runs on flash storage drives
  • Performs plain I/O forwarding
  • Minimal overhead, minimal code size

In my next post, I will discuss the design of my driver.

MongoDB: Good-to-know things

  • MongoDB is a free, open-source NoSQL database that is highly scalable, highly available and high-performance.
  • The engine is coded in C++
  • Works in a client-server model
  • Major components:
    • mongod: The storage server
    • mongos: The query router for sharded clusters
    • config server(s):
      • Store the metadata that makes sharding work
      • Are actually mongod processes
  • MongoDB provides durability of write operations through journaling (write-ahead logging)
  • User data is organized as databases, each holding collections of records
    • A collection is roughly similar to a table in an RDBMS
    • A record can be mapped to a row in a table (not strictly accurate, but it helps understanding)
  • MongoDB stores data in BSON format (on the wire and on disk)

Rules for abort handling

An “Abort” is a special type of error in a system, usually injected by an external actor. In a multi-threaded application, managing abort requests becomes painful. I am sharing a few observations that could help minimize mistakes.

  • Implement one single handler for abort requests
  • Outside the handler, if a thread is about to wait and an abort may arrive in the meantime, the thread should check for an abort as its first task
  • Use locks if the abort is signalled through a flag
  • Never mix the abort path with the regular path in the application. It is not wise to scatter abort-related functionality across threads
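One way to apply these rules is sketched below with POSIX threads. It assumes the abort is signalled through a flag; the structure `job_ctl` and the functions `request_abort`, `post_work` and `wait_for_work` are hypothetical names chosen for illustration. The flag is only touched under the lock, it is recorded in one place, and a waiting thread checks it first, both before sleeping and after every wake-up.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state: one lock guards both flags. */
struct job_ctl {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool work_ready;
    bool aborted;
};

#define JOB_CTL_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, false }

/* The single place where an abort request is recorded. */
static void request_abort(struct job_ctl *c)
{
    pthread_mutex_lock(&c->lock);
    c->aborted = true;
    pthread_cond_broadcast(&c->cond);   /* wake waiters so they see the abort */
    pthread_mutex_unlock(&c->lock);
}

/* Regular path: announce that a work item is available. */
static void post_work(struct job_ctl *c)
{
    pthread_mutex_lock(&c->lock);
    c->work_ready = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Wait for work. Returns true if work arrived, false if an abort was seen.
 * The abort flag is tested first, before sleeping and after each wake-up. */
static bool wait_for_work(struct job_ctl *c)
{
    bool ok;

    pthread_mutex_lock(&c->lock);
    while (!c->aborted && !c->work_ready)
        pthread_cond_wait(&c->cond, &c->lock);
    ok = !c->aborted;                   /* an abort wins over pending work */
    if (ok)
        c->work_ready = false;          /* consume the work item */
    pthread_mutex_unlock(&c->lock);
    return ok;
}
```

A job thread would call `wait_for_work` in its loop and leave the loop as soon as it returns false, so the abort path stays in one branch instead of being mixed into the regular processing.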

A design of abort handling module

Handling the abort of an operation is essential for any software. An abort represents:

  • An error condition
    • Internal errors
    • Subsystem errors
  • A user requested abort

The requirements of abort handling are:

  • Quickness: The ability to respond to an abort promptly
  • Reliability: The assurance that an abort is accepted at any phase of software execution
  • Robustness: Resources are cleaned up and the system is left in a usable, stable state (no panic, exception or error)

Assume that there are two threads/processes: one performs the job and the other accepts requests from the user. An abort can be requested by the user, or the job thread can abort on its own (because of its own error or an error in another subsystem/layer).

There are at least two approaches for abort handling that I have found in my experience:

  1. Using a check at multiple points in the execution path

if (true == is_aborted) {
    goto exit_fun1;
}

  2. Treating abort as an event and creating an event handler

void abort_me(void)
{
  enqueue(abort_event);   /* only posts the event; no cleanup here */
}

void abort_handler(event e)
{
  switch (e) {
  case USER_ABORT:
    ....
    break;
  case INTERNAL_ABORT:
    ....
    break;
  default:
    break;
  }
}

The first approach litters the code with checks. We also need a variable acting as a flag to record that an abort has occurred, and reading that flag needs a lock to get a consistent value. In the long run this makes the code messy and hard to manage and understand. Sample code could look as follows:

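void a(void)
{
  if (is_aborted()) return;   /* a check before and after every step */
  b();
  if (is_aborted()) return;
  c();
  if (is_aborted()) return;
}

void b(void)
{
  if (is_aborted()) return;   /* and again inside every callee */
}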

The latter approach is based on event handling. It needs more work, but it is cleaner and more intuitive. One possible implementation represents the software as a state machine. In a state machine, all events are managed by at least one queue. The state machine handles an abort event by switching to a dedicated state (named "abort" or "error"). Cleanup and any follow-up action take place in that state's handler.

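void state_machine(void)
{
   event e = dequeue_event();

   /* An abort event switches the machine to the abort state. */
   if (e == abort_event)
        cur_state = state_Abort;

   switch (cur_state) {
        case state_A:
            A_handler();
            break;
        case state_B:
            B_handler();
            break;
        case state_Abort:
            Abort_handler();
            break;
        default:
            assert(0);
   }
}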

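This model is discrete, predictable and manageable. We do not need an abort flag, a lock around it, or scattered flag checks; only the event queue itself has to be safe for concurrent use. Thus we have better control over, and visibility into, where an abort happened and where it was handled.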
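To make the idea concrete, here is a minimal, single-threaded sketch of such a state machine in C. Everything in it is an assumption made for illustration: the event and state names, the fixed-size queue, and the `cleanups` counter that stands in for real resource cleanup. A queue shared between threads would need its own synchronization, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

enum event { EV_NEXT, EV_USER_ABORT, EV_INTERNAL_ABORT };
enum state { ST_A, ST_B, ST_DONE, ST_ABORTED };

#define QUEUE_CAP 16

/* Hypothetical FIFO of pending events (single-threaded in this sketch). */
struct event_queue {
    enum event items[QUEUE_CAP];
    size_t head, count;
};

/* Returns 0 on success, -1 if the queue is full. */
static int enqueue_event(struct event_queue *q, enum event e)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAP] = e;
    q->count++;
    return 0;
}

/* Returns 0 and stores the event in *e, or -1 if the queue is empty. */
static int dequeue_event(struct event_queue *q, enum event *e)
{
    if (q->count == 0)
        return -1;
    *e = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}

struct machine {
    enum state cur;
    int cleanups;               /* how many times cleanup ran */
};

/* Process one event and return the resulting state. */
static enum state step(struct machine *m, enum event e)
{
    /* Terminal states ignore further events. */
    if (m->cur == ST_DONE || m->cur == ST_ABORTED)
        return m->cur;

    /* An abort is handled the same way from any non-terminal state. */
    if (e == EV_USER_ABORT || e == EV_INTERNAL_ABORT) {
        m->cleanups++;          /* stand-in for releasing resources */
        m->cur = ST_ABORTED;
        return m->cur;
    }

    switch (m->cur) {
    case ST_A:
        m->cur = ST_B;
        break;
    case ST_B:
        m->cur = ST_DONE;
        break;
    default:
        assert(0);              /* unreachable: terminal states return early */
    }
    return m->cur;
}

/* Drain the queue, feeding each event to the machine. */
static enum state run(struct machine *m, struct event_queue *q)
{
    enum event e;

    while (dequeue_event(q, &e) == 0)
        step(m, e);
    return m->cur;
}
```

Because the abort is just another event, the cleanup runs once, in one place, no matter which state the machine was in when the abort arrived. Events queued after the abort are drained but ignored.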