Netlink Sockets: Adding a new socket family

  • Kernel version 3.13, Ubuntu 14.04
  • $ uname -a
    Linux ubuntu 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
  • We need two changes:
  • User space
    • Locate the file "netlink.h"
    • $ sudo locate "netlink.h"
    • The header file is at "/usr/include/linux/netlink.h"
    • Add the new family
    • #define NETLINK_MY     17
    • Keep the number less than 32
    • In your user application, add "#define NETLINK_MY 17"
  • Kernel space
    • Locate the netlink header file for kernel space
    • /usr/src/linux-headers-3.13.0-24/include/uapi/linux/netlink.h
    • Add the exact same definition here as well
  • No recompilation of kernel is required.

Linux Device Driver Development: Block Device Driver

This is my very first interaction with the Linux kernel at the device driver level. My objective is to develop a very simple block device driver that just forwards I/O requests to a virtual device. This post records my observations, limited to what I needed to attack the problem.

Block vs. Character Device

Linux supports both block and character device drivers. Only block devices can host a filesystem. Block devices support random read/write operations. A block device is divided into sectors, usually 512 bytes long, each uniquely addressable. A block is a logical entity made up of sectors: filesystems usually use 4096-byte blocks, that is, 8 sectors of 512 bytes. In the Linux kernel, a block device is itself represented as a logical entity (actually just a C structure). So, we can export anything as a block device as long as we can service read and write operations at the sector level.

The device driver is the layer that glues the Linux kernel to the device. The kernel receives device-targeted I/O requests from an application. All I/O requests pass through the buffer cache and the I/O scheduler. The latter reorders I/O requests to reduce seek time, on the assumption that the requests will run on a disk. In fact, the Linux kernel has several I/O schedulers, so the order in which requests reach the driver depends on which scheduler is active.

A block device driver usually implements a request queue. The Linux I/O scheduler enqueues requests in the driver’s queue. How to serve these requests? That is the device driver’s headache. The request queue is represented by the request_queue structure, defined in “blkdev.h”. The driver dequeues requests from this queue and sends them to the device. It then acknowledges each request with an error status.

If a device does not need an optimal I/O order, its driver may opt for direct handling of I/O requests. An excellent example of such a driver is the loopback driver (loop.c, loop.h). It handles struct bio, which stands for block I/O. A bio structure is a scatter-gather list of page-aligned buffers (usually 4K each). Handling a bio structure is almost the same as handling a struct request.

What are the requirements for my driver?

  • Run on flash storage drives
  • Perform plain I/O forwarding
  • Keep overhead and code size minimal

In my next post, I will discuss the design of my driver.

Extents in “ext3” file system

My Ubuntu Linux ships with the ext3 file system. This FS is very similar to the classical model described for the UNIX OS. A file is logically arranged as a set of blocks, managed through an array of block pointers. In ext3, each inode has an array of fifteen elements. The first twelve elements each point directly to a disk block. Usually, a disk block is configured to be 4KB. These twelve entries therefore address twelve 4KB blocks (L0, or level 0), so a file of up to 48KB is always contained in L0 blocks.

As soon as a thirteenth block is allocated for the file, the thirteenth element of the array comes into play. It points to an L1 (level 1) block, which holds pointers to 1024 L0 blocks. The thirteenth allocation creates the first entry in the L1 block. The fourteenth and fifteenth elements serve L2 and L3 blocks, which add a second and a third level of indirection.

Instead of allocating one block at a time, Linux has optimized the way disk blocks are allocated for a file with “extents”. An extent is a contiguous set of disk blocks, and the kernel allocates a whole extent of blocks at once.

Let’s create a file and see how Linux handles block allocation:

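#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  int i;
  char buf[4096];

  memset(buf, 'a', sizeof(buf));
  /* O_CREAT requires a third argument giving the permission bits;
   * leaving it out creates the file with unpredictable permissions. */
  int fd = open("foo.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
  if (fd < 0)
    return 1;

  /* Two full 4096-byte buffers ... */
  for (i = 0; i < 2; i++) {
    write(fd, buf, sizeof(buf));
  }
  /* ... plus three more bytes. */
  write(fd, "end", 3);
  close(fd);
  return 0;
}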

We have created a file of 8192 + 3 = 8195 bytes.

kanaujia@ubuntu:~/Desktop/ToKeep/cprogs$ ls -l foo.txt
---------- 1 kanaujia kanaujia 8195 2011-10-16 10:42 foo.txt

How much space does this file occupy on disk?

kanaujia@ubuntu:~/Desktop/ToKeep/cprogs$ du -h !$
du -h foo.txt
12K    foo.txt

That means I need three 4KB disk blocks to store this file. Now let’s see how Linux allocates these blocks in a single extent. The ioctl() system call accepts the FS_IOC_FIEMAP request, which makes this information available from user space. A file extent map structure is defined as follows:

[include/linux/fiemap.h]

struct fiemap {
        __u64 fm_start;          /* logical offset (inclusive) at
                                  * which to start mapping (in) */
        __u64 fm_length;         /* logical length of mapping which
                                  * userspace wants (in) */
        __u32 fm_flags;          /* FIEMAP_FLAG_* flags for request (in/out) */
        __u32 fm_mapped_extents; /* number of extents that were mapped (out) */
        __u32 fm_extent_count;   /* size of fm_extents array (in) */
        __u32 fm_reserved;
        struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
};

A simple C program that fills in this structure and passes it to ioctl() fetches the extent information.

For my file, I got the following data:

kanaujia@ubuntu:~/Desktop/ToKeep/cprogs$ ./fiemap ./foo.txt
File ./foo.txt has 1 extents:
#    Logical                        Physical                           Length           Flags
0:    0000000000000000 0000000000000000 0000000000003000 0007

A single extent accommodates the whole file. It spans three disk blocks, since its length is 0x3000 bytes, or 12KB.
