
Design Internals of UNIX Non Blocking Sockets

Fundamentals of Asynchronous Programming: Non-Blocking Sockets

Why async socket programming?

  • An application can create thousands of sockets, but creating a thread per socket becomes infeasible at large counts (e.g. 100k sockets).
  • An async framework consists of non-blocking sockets and an event loop:
    * The kernel allows applications to wait for events on a socket
    * The application registers a callback to be called when the event occurs
    * The callback runs on the same application thread after the kernel
      signals that an event happened on a socket
  • This model uses a single thread to poll many sockets for various events and invoke the callbacks.
  • The model is concurrent (it interleaves many tasks on one thread) but not parallel (tasks never execute simultaneously).
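The loop described above can be sketched with poll(2). Here is a minimal, hypothetical sketch in Python (whose select module wraps the same POSIX call); a socketpair stands in for a client connection, and in practice the poller would watch thousands of descriptors:

```python
import select
import socket

# A connected socketpair stands in for a client connection.
a, b = socket.socketpair()
a.setblocking(False)

received = []

def on_readable(sock):
    # The registered "callback": runs on the same application thread
    # once the kernel reports the socket readable.
    received.append(sock.recv(64))

# The event loop's interest set: one thread, many descriptors.
poller = select.poll()
poller.register(a.fileno(), select.POLLIN)

b.sendall(b"hello")                   # the peer sends data

for fd, events in poller.poll(1000):  # wait (up to 1s) for events
    if fd == a.fileno() and events & select.POLLIN:
        on_readable(a)                # dispatch to the callback

print(received)    # [b'hello']
a.close()
b.close()
```

A production loop would keep polling forever and maintain a map from descriptor to callback; the sketch shows one iteration of that cycle.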

How does a socket become non-blocking?

  • A socket is a file descriptor, read and written with standard POSIX calls.
  • Non-blocking mode is set per descriptor, either with the FIONBIO ioctl(2) request described below or with fcntl(2):

    FIONBIO int
    Set non-blocking I/O mode if the argument is non-zero. In non-
    blocking mode, read(2) or write(2) calls return -1 and set errno
    to EAGAIN immediately when no data is available (This is
    equivalent to fcntl() F_SETFL O_NONBLOCK and the fcntl() form
    should be preferred)
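The excerpt above can be exercised from user space. A small sketch in Python, whose fcntl and os modules are thin wrappers over the same POSIX calls (a socketpair stands in for a real socket):

```python
import errno
import fcntl
import os
import socket

a, b = socket.socketpair()

# The fcntl() form the man page prefers: OR O_NONBLOCK into the
# descriptor's file status flags (equivalent to ioctl FIONBIO).
flags = fcntl.fcntl(a.fileno(), fcntl.F_GETFL)
fcntl.fcntl(a.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)

# No data has arrived yet, so read(2) returns immediately with
# errno set to EAGAIN instead of blocking the calling thread.
try:
    os.read(a.fileno(), 64)
    err = None
except OSError as e:
    err = e.errno

print(err == errno.EAGAIN)    # True
a.close()
b.close()
```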

Written with StackEdit.


Golang Essentials for C & C++ Developers: Part I

I have been a C/C++ developer for more than a decade and am now getting a chance to work in Go. Many concepts carry over from C, C++, and Python, which helps in learning the language quickly, but some syntax appears unfamiliar. This post shares such language-specific nuances, widely used in production-quality code, to help in browsing Golang code.


  • A method is similar to a member function of a class (user defined type).
package main

import "fmt"

type Node struct {
    val int64
}

// Returns a pointer of Node type
func (n *Node) addNode() *Node {
    newNode := new(Node)
    newNode.val = 10

    return newNode
}

func main() {
    n := &Node{10}
    m := n.addNode()
    fmt.Println(n.val, m.val)
}

A sample program of a linked list in Go with comments explaining the code.

package main

import "fmt"

type Node struct {
        prev *Node
        next *Node

        // key can assume any standard type
        key interface{}
}

type List struct {
        head *Node
        tail *Node
}

// A method defined for type List, returns nothing
// and accepts a key
func (L *List) Insert(key interface{}) {
        // Create a new node with 'next' pointing
        // to Head. The 'prev' is by default nil.
        list := &Node{
                next: L.head,
                prev: nil,
                key:  key,
        }

        // Attaching the new node at the head by
        // updating prev of current head to the
        // newly created node.
        if L.head != nil {
                L.head.prev = list
        }

        // List is the first node so prev is nil
        list.prev = nil

        // Update the head
        L.head = list

        l := L.head

        // Now walk to the last node and update the
        // list's tail link to point to it.
        for l.next != nil {
                l = l.next
        }

        L.tail = l

        fmt.Printf("head->%d\n", L.head.key)
        fmt.Printf("tail->%d\n", L.tail.key)
}

// Another method defined for type List
func (l *List) Display() {
        list := l.head

        for list != nil {
                fmt.Printf("%+v->", list.key)
                list = list.next
        }
        fmt.Println("nil")
}

// A function defined for type List
func Display(l *List) {
        list := l.head

        for list != nil {
                fmt.Printf("%+v->", list.key)
                list = list.next
        }
        fmt.Println("nil")
}

func main() {
        link := List{}
        link.Insert(10)
        link.Insert(20)

        // Calling a method
        link.Display()

        // Calling a function
        Display(&link)
}


Linux Device Driver Development: Block Device Driver

This is my very first interaction with the Linux kernel at the device-driver level. My objective is to develop a very simple block device driver that just forwards I/O requests to a virtual device. This post records my initial observations while attacking the problem.

Block v/s Character Device

Linux supports block and character device drivers. Only block devices can host a filesystem. Block devices support random read/write operations. Each block is composed of sectors, usually 512 bytes long and uniquely addressable. A block is a logical entity; filesystems usually use 4096-byte blocks (8 * 512 bytes, i.e. 8 sectors). In the Linux kernel, a block device is represented as a logical entity (actually just a C structure). So, we can export anything as a device as long as we can facilitate read/write operations at the sector level.

The device driver is the layer that glues the Linux kernel and the device. The kernel receives device-targeted I/O requests from an application. All I/O requests pass through the buffer cache and the I/O scheduler. The latter arranges I/O requests optimally to improve seek time, assuming the requests would run on a disk. In fact, the Linux kernel has various I/O schedulers, and hence multiple orderings of the same I/O requests are possible.

A device driver typically implements a request queue. The Linux I/O scheduler enqueues requests in the driver’s queue; how to serve them is the device driver’s headache. The request queue is represented by the request_queue structure, defined in “blkdev.h”. The driver dequeues requests from this queue, sends them to the device, and then acknowledges each request with a completion (error) status.

If a device does not need an optimal I/O order, it may opt for direct handling of I/O requests. An excellent example of such a driver is the loopback driver (loop.c, loop.h). It handles struct bio, which stands for block I/O. A bio structure is a scatter-gather list of page-aligned buffers (usually 4K). Handling a bio structure is almost the same as handling a struct request.

What are the requirements for my driver?


  • Runs on flash storage drives
  • Perform plain I/O forwarding
  • Minimal overhead, minimal code size

In my next post, I will discuss design of my driver.

Dissecting Python List allocator (CPython)

List is one of the most frequently used data structures in Python applications. The code for the Python list object allocator is in Python-2.7.5/Objects/listobject.c.

Python allocates list objects on the heap using underlying malloc-based APIs. A list is a container of Python object references; in plain C terminology, it is a dynamic array of pointers.

Let’s understand basic creation and insertion APIs.

  • list_resize(): A list is of type PyListObject, which is a PyObject. Every resize request allocates more than the requested size.

Allocation unit size is: size_requested + (size_requested >> 3) + (size_requested < 9 ? 3 : 6).

It gives a growth pattern of (0, 4, 8, 16, 25, 35, 46, 58, 72, 88).
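Assuming elements are appended one at a time (CPython's append path), the formula above reproduces that pattern; a small Python check:

```python
def new_allocated(newsize):
    # CPython 2.7 list_resize() over-allocation:
    # newsize + (newsize >> 3) + (3 if newsize < 9 else 6)
    if newsize == 0:
        return 0
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

# Simulate repeated appends: resize only when capacity is exceeded.
allocated = 0
pattern = [0]
for size in range(1, 74):
    if size > allocated:
        allocated = new_allocated(size)
        pattern.append(allocated)

print(pattern)    # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```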

  • free_list: Python caches up to PyList_MAXFREELIST (defined as 80) freed list objects in a free_list array, to save on list allocation and deallocation.
  • PyList_New(): A new list allocation request is served either from the free_list or by creating a new GC-tracked object of type PyListObject. Next, a malloc request is made for the requested list size.

Size calculated: nbytes = size * sizeof(PyObject *);

Object allocation: PyListObject *op = free_list[numfree] OR PyObject_GC_New(PyListObject, &PyList_Type);

Memory allocation for references: op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);
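The free_list mechanism amounts to a bounded object cache. A simplified sketch in Python (the names free_list and the cap of 80 mirror listobject.c; everything else is illustrative):

```python
PyList_MAXFREELIST = 80   # cap from listobject.c
free_list = []            # retired list objects, reused before malloc

class PyListObject:
    def __init__(self):
        self.ob_item = []  # stands in for the malloc'd pointer array

def pylist_new():
    # Serve from the free list when possible, else allocate fresh.
    if free_list:
        return free_list.pop()
    return PyListObject()

def pylist_dealloc(op):
    # Retire the object into the cache instead of freeing it,
    # unless the cache is already full.
    if len(free_list) < PyList_MAXFREELIST:
        free_list.append(op)

first = pylist_new()       # cache empty: freshly allocated
pylist_dealloc(first)      # retired into free_list
second = pylist_new()      # reuses the very same object

print(second is first)     # True
```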

Openstack swift: EADDRNOTAVAIL error


swift-bench keeps hitting EADDRNOTAVAIL errors with a high concurrency setting and multiple swift-bench clients.


Ubuntu 13, Swift single-machine installation (refer to SAIO); the swift client runs locally in no-proxy mode.


  • EADDRNOTAVAIL indicates either unavailability of ephemeral ports or a known kernel bug.
  • Check your range of ephemeral ports: $ cat /proc/sys/net/ipv4/ip_local_port_range
  • swift-bench in no-proxy mode uses a direct client class based on Python’s httplib. I saw that the code for object writes and reads did not have a connection close call, so I added one. Please refer swift/common/
  • The kernel bug involves usage of the bind() call during connection setup from the client. swift-bench does not use bind(), so this possibility is ruled out.
  • The Swift administration guide advises the following settings in /etc/sysctl.conf:

# disable TIME_WAIT.. wait..
net.ipv4.tcp_tw_reuse=1
# disable syn cookies
net.ipv4.tcp_syncookies = 0

To load the updated sysctl settings, run $ sudo sysctl -p

The above mentioned solutions reduced the problem significantly. If there is a better solution, let me know.

Best Python memcache client: Python-memcached v/s Pylibmc

Recently, I tried to access a memcached server in local and remote setups from a Python application. I am sharing my experience with two of the most popular clients.


Python-memcached

  • The first choice was a pure Python solution to access memcached, that is, python-memcached. It is simple to install, understand, and use.
  • It does not offer many customization settings, at least per its README 🙂
  • Useful documentation is sparse or non-existent.
  • It fails consistently under high-frequency requests to the memcached server. The failures are simultaneous reads on a file xxx. I could not find an easy fix for this problem.
  • I do not suggest using it for a highly scalable application.


Pylibmc

  • A bare-minimum Python wrapper over the C/C++ API to memcached
  • Installation was simple, similar to python-memcached
  • Offers many useful options during connection setup (such as non-blocking requests and TCP no-delay)
  • Showed no problems in a demanding, highly scalable application
  • Showed better performance than python-memcached

My verdict: Pylibmc is a clear winner 🙂