Find a String in Files Excluding Some Files


tags: ‘shell, linux, command, development, grep,find’
categories: development

I use the following command to find a string in a Go repo. The vendor directory is not needed, so I skip it.

$ find . -type f |grep -v vendor|cut -d":" -f2|xargs grep -n my_search_string

The command finds all files in the current directory. Next, it removes all files that have vendor in their path. With line numbers enabled (grep -n -v vendor), the intermediate output looks like this:

401:./.git/HEAD
402:./.git/info/exclude
403:./.git/logs/HEAD

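The cut command picks the filename (the second colon-separated field) and passes it to xargs. Lines without a colon pass through cut unchanged, so the command also works without line numbers. Each file is then searched by grep for the string.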


Fixing Terminal Row & Columns Display on Docker and Linux

Ever faced a garbled terminal screen, with text wrapping over itself and the command line messed up?
It happens because the terminal is using the default row and column values (e.g. columns = 80).

The following command fixes it (tested on a Docker container’s terminal):

docker exec -e COLUMNS="`tput cols`" -e LINES="`tput lines`" -ti 

How it Works?

  • tput initializes or resets a terminal, or makes terminal info available to the shell.
  • From man page of tput:
    The tput utility uses the terminfo database to make the
    values of terminal-dependent capabilities and information available to the shell
    
  • Let’s find the type of terminal
    # echo $TERM
      xterm
    
  • Let’s see the number of rows and columns reported for this terminal.
    # tput cols
    167
    # tput lines
    49
    
  • While accessing the container terminal, we passed the number of columns and rows in the COLUMNS and LINES variables.
  • The terminal database is present at /usr/share/terminfo.
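To see why passing COLUMNS and LINES helps, here is a minimal C sketch of how a program might pick up the width from the environment. The helper name columns_from_env and the fallback of 80 are my own assumptions for illustration; real programs and libraries may also ask the terminal directly.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>

#define DEFAULT_COLUMNS 80   /* assumed fallback when COLUMNS is not usable */

/* Hypothetical helper: returns the terminal width advertised through the
 * COLUMNS environment variable, or DEFAULT_COLUMNS if it is unset or invalid. */
int columns_from_env(void)
{
    const char *value = getenv("COLUMNS");
    if (value == NULL || *value == '\0')
        return DEFAULT_COLUMNS;

    char *end = NULL;
    long n = strtol(value, &end, 10);
    assert(end != NULL);             /* strtol always sets end when asked */
    if (*end != '\0' || n <= 0 || n > 10000)
        return DEFAULT_COLUMNS;      /* trailing junk or out of range */
    return (int)n;
}
```

With COLUMNS=167 exported by docker exec, such a helper would report 167; without it, the program falls back to 80 and wraps text too early.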


Written with StackEdit.

xargs: ls: terminated by signal 13

The following command fails with signal 13 (SIGPIPE).

$ find . -type f|xargs ls -altr|head
-rw-r--r-- 1 root root  254111 Mar 17  2018 ./60/62
-rw-r--r-- 1 root root  135111 Mar 17  2018 ./60/66
xargs: ls: terminated by signal 13

Why did it fail?

$ strace find . -type f|xargs ls -altr|head
newfstatat(AT_FDCWD, "xyxyx", {st_mode=S_IFREG|0644, st_size=80241, ...}, AT_SYMLINK_NOFOLLOW) = 0
write(1, "40ee30e7c76a7541d61acb6ec4\n./60/"..., 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=376666, si_uid=1190} ---
+++ killed by SIGPIPE +++

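The trace above is of find, the first process in the pipeline: its write() fails with EPIPE and it is killed by SIGPIPE. The same thing happens to ls, which xargs runs: ls writes to a closed pipe, gets a SIGPIPE, and is killed. xargs then reports that ls was terminated by signal 13.

The head command finishes before ls has written all of its output. When head exits, the read end of the pipe is closed, so the next write by ls raises SIGPIPE because the pipe is no longer being read.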
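The same condition can be reproduced in a few lines of C. This is a minimal sketch, not what ls or xargs actually do: the helper name write_to_closed_pipe is made up, and SIGPIPE is ignored here so that write() returns EPIPE instead of killing the process.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: writes to a pipe whose read end is already closed
 * and returns the errno of the failed write (0 if the write succeeded). */
int write_to_closed_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return errno;

    /* Ignore SIGPIPE so write() fails with EPIPE instead of killing us. */
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);
    assert(old != SIG_ERR);

    close(fds[0]);                       /* the reader goes away, like head exiting */
    ssize_t n = write(fds[1], "x", 1);   /* nobody can ever read this byte */
    int err = (n < 0) ? errno : 0;

    close(fds[1]);
    signal(SIGPIPE, old);                /* restore the previous disposition */
    return err;
}
```

With the default disposition, the write would deliver SIGPIPE and terminate the process, which is the "terminated by signal 13" that xargs prints.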


Linux Device Driver Development: Block Device Driver

This is my very first interaction with the Linux kernel at the device driver level. My objective is to develop a very simple block device driver that just forwards I/O requests to a virtual device. This post explains my observations, limited to how I approached the problem.

Block v/s Character Device

Linux supports block and character device drivers. Only block devices can host a filesystem. Block devices support random read/write operations. A block device is divided into sectors, usually 512 bytes long and uniquely addressable. A block is a logical entity made up of sectors; filesystems usually use 4096-byte blocks (8 * 512), that is, 8 sectors. In the Linux kernel, a block device is represented as a logical entity (actually just a C structure). So we can export anything as a device as long as we can support read/write operations at the sector level.
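The block and sector arithmetic can be written down as a small C sketch. The macro and function names are my own, and the sizes are the usual values quoted above, not something queried from a real device.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE       512u    /* bytes per sector (usual value) */
#define FS_BLOCK_SIZE     4096u   /* bytes per filesystem block (usual value) */
#define SECTORS_PER_BLOCK (FS_BLOCK_SIZE / SECTOR_SIZE)   /* 8 */

/* A block must cover a whole number of sectors. */
_Static_assert(FS_BLOCK_SIZE % SECTOR_SIZE == 0, "block size must be a multiple of sector size");

/* First sector covered by a given filesystem block. */
uint64_t block_to_first_sector(uint64_t block)
{
    return block * SECTORS_PER_BLOCK;
}

/* Filesystem block that contains a given sector. */
uint64_t sector_to_block(uint64_t sector)
{
    return sector / SECTORS_PER_BLOCK;
}
```

For example, block 3 starts at sector 24, and sector 25 falls inside block 3.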

A device driver is the layer that glues the Linux kernel and the device together. The kernel receives device-targeted I/O requests from an application. All I/O requests pass through the buffer cache and the I/O scheduler. The latter arranges I/O requests optimally to improve seek time, assuming the requests will run on a disk. In fact, the Linux kernel has various I/O schedulers, and hence multiple orderings of I/O requests can exist.

A device driver typically implements a request queue. The Linux I/O scheduler enqueues requests in the driver’s queue. How to serve these requests is the device driver’s headache. The request queue is represented by the request_queue structure, defined in “blkdev.h”. The driver dequeues requests from this queue and sends them to the device. It then acknowledges each request with an error status.

If a device does not need optimal I/O ordering, the driver may opt for direct handling of I/O requests. An excellent example of such a driver is the loopback driver (loop.c, loop.h). It handles struct bio, which stands for block I/O. A bio structure is a scatter-gather list of page-aligned buffers (usually 4K). Handling a bio structure is almost the same as handling a struct request.

What are the requirements for my driver?

 

  • Runs on flash storage drives
  • Performs plain I/O forwarding
  • Minimal overhead, minimal code size

In my next post, I will discuss the design of my driver.

Linux FUSE Internals for developers

In this post, I will cover FUSE internals for FUSE 2.9.3.

  • Install the packages fuse and fuse-devel on CentOS.
  • getattr() is a must in a FUSE file system. Even a bare-bones implementation is okay.
    • Just be careful with the file size in the stat structure. Compile the user file system with the 64-bit flags on (-D_FILE_OFFSET_BITS=64); otherwise st_size in struct stat is a signed 32-bit int (31 usable bits).
    • Without those flags, your file size must not exceed 2GB. Otherwise, it will overflow to zero.
  • In the user application, be careful with file I/O operations. A read() immediately after a write() would fetch you nothing. You should first lseek() to the beginning of the file in your application.
  • FUSE has two modes of operation:
    • Single thread (very low performance, easy to debug)
    • Multi-thread (default operation)
  • Multi-thread mode spawns multiple threads during read operations. I observed almost single-thread-like behavior for writes.
  • The code for the multi-threaded I/O implementation is in lib/fuse_loop_mt.c.
  • FUSE uses worker threads to handle I/O requests, using struct fuse_worker. A worker thread is created in fuse_start_thread().
  • Each worker runs the fuse_do_work() function. This is an infinite loop that terminates only on session exit or if the number of active threads exceeds what is required.
  • The user implementations of the file system APIs are registered in a const struct fuse_operations. It holds the addresses of all implemented APIs. FUSE ultimately calls these APIs for file system operations.
  • FUSE 2.7 reads 8K of data by default, in two 4K chunks
    • The reads cover the last 4K block and the first 4K block
  • An example:

    I had set the file size to 4MB in the getattr() implementation. If you forget to compile with the 64-bit flags, you will get zero-length files.

    int bb_getattr(const char *path, struct stat *statbuf)
    {
        int retstat = 0;
        memset(statbuf, 0, sizeof(struct stat));
        if (strcmp(path, "/") == 0) {
            statbuf->st_mode = S_IFDIR | 0755;
            statbuf->st_nlink = 2;
        } else {
            statbuf->st_mode = S_IFREG | 0444;
            statbuf->st_nlink = 1;
            statbuf->st_size = 4 * 1024* 1024;
        }   
        return retstat;
    }

    The sequence of calls and their arguments is as follows:

    bb_getattr(path="/abcd.txt", statbuf=0xc5387960)
        rootdir = "/tmp", path = "/abcd.txt"
    bb_open(path"/abcd.txt", fi=0xc5daaa50)
        rootdir = "/tmp", path = "/abcd.txt"
        fi:
        flags = 0x00008002
        fh_old = 0x00000000
        writepage = 0
        direct_io = 0
        keep_cache = 0
        fh = 0x0000000000000001
        lock_owner = 0x0000000000000000
    bb_write(path="/abcd.txt", buf=0xc4966050, size=10, offset=0, fi=0xc5387a50)
        fi:
        flags = 0x00000000
        fh_old = 0x00000001
        writepage = 0
        direct_io = 0
        keep_cache = 0
        fh = 0x0000000000000001
        lock_owner = 0x0000000000000000
    bb_read(path="/abcd.txt", buf=0x06ccbd90, size=12288, offset=4096, fi=0xc5daaa50)  <- Here
        fi:
        flags = 0x00000000
        fh_old = 0x00000001
        writepage = 0
        direct_io = 0
        keep_cache = 0
        fh = 0x0000000000000001
        lock_owner = 0x0000000000000000
    
    bb_read(path="/abcd.txt", buf=0x06ccbd90, size=4096, offset=0, fi=0xc5daaa50)
        fi:
        flags = 0x00000000
        fh_old = 0x00000001
        writepage = 0
        direct_io = 0
        keep_cache = 0
        fh = 0x0000000000000001
        lock_owner = 0x0000000000000000
WRITE stack trace
================
(gdb) bt
#0  bb_write (path=0x7ffc68000990 "/test_file.0", buf=0x7ffff6f42060 "", size=4096, offset=4096, fi=0x7ffff6f40550) at bbfs.c:136
#1  0x00007ffff7dc885f in fuse_fs_write_buf (fs=0x280f090, path=0x7ffc68000990 "/test_file.0", buf=0x7ffff6f40580, off=4096, fi=0x7ffff6f40550)
    at fuse.c:1878
#2  0x00007ffff7dccb37 in fuse_lib_write_buf (req=0x7ffc680008c0, ino=2, buf=0x7ffff6f40580, off=4096, fi=0x7ffff6f40550) at fuse.c:3278
#3  0x00007ffff7dd461b in do_write_buf (req=0x7ffc680008c0, nodeid=2, inarg=0x7ffff6f42038, ibuf=0x7ffff6f40800) at fuse_lowlevel.c:1300
#4  0x00007ffff7dd7369 in fuse_ll_process_buf (data=0x280f220, buf=0x7ffff6f40800, ch=0x280ece0) at fuse_lowlevel.c:2437
#5  0x00007ffff7dd9aa5 in fuse_session_process_buf (se=0x280ed30, buf=0x7ffff6f40800, ch=0x280ece0) at fuse_session.c:87
#6  0x00007ffff7dd0f6a in fuse_do_work (data=0x7ffff00008c0) at fuse_loop_mt.c:117
#7  0x00000037bc2079d1 in start_thread () from /lib64/libpthread.so.0
#8  0x00000037bbee8b6d in clone () from /lib64/libc.so.6

READ stack trace
=================
(gdb) bt
#0  bb_read (path=0x7ffff38c55b0 "/test_file.0", buf=0x7ffff38c56c0 "", size=4096, offset=8192, fi=0x7ffff79635d0) at bbfs.c:111
#1  0x00007ffff7dc841e in fuse_fs_read_buf (fs=0x280f090, path=0x7ffff38c55b0 "/test_file.0", bufp=0x7ffff7963578, size=4096, off=8192, 
    fi=0x7ffff79635d0) at fuse.c:1794
#2  0x00007ffff7dcca1d in fuse_lib_read (req=0x7ffff002a1e0, ino=2, size=4096, off=8192, fi=0x7ffff79635d0) at fuse.c:3252
#3  0x00007ffff7dd42c7 in do_read (req=0x7ffff002a1e0, nodeid=2, inarg=0x7ffff7965038) at fuse_lowlevel.c:1232
#4  0x00007ffff7dd73ce in fuse_ll_process_buf (data=0x280f220, buf=0x7ffff7963800, ch=0x280ece0) at fuse_lowlevel.c:2441
#5  0x00007ffff7dd9aa5 in fuse_session_process_buf (se=0x280ed30, buf=0x7ffff7963800, ch=0x280ece0) at fuse_session.c:87
#6  0x00007ffff7dd0f6a in fuse_do_work (data=0x280ee30) at fuse_loop_mt.c:117
#7  0x00000037bc2079d1 in start_thread () from /lib64/libpthread.so.0
#8  0x00000037bbee8b6d in clone () from /lib64/libc.so.6

Compiling FUSE based file system with your FUSE build

Suppose hello.c has the implementation of the file system APIs and your FUSE installation resides in /home/k/Desktop/my_fuse_2.9.3.

$ gcc -g hello.c -o hi -D_FILE_OFFSET_BITS=64 -I/home/k/Desktop/my_fuse_2.9.3/include -lpthread \
-L/home/k/Desktop/my_fuse_2.9.3/lib -lfuse -LLIBDIR=/home/k/my_fuse_2.9.3/lib \
-Wl,-rpath -Wl,/home/k/my_fuse_2.9.3/lib
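To check that the 64-bit flag really took effect, a compile-time assertion on off_t is handy. The following is a minimal sketch under my own assumptions: fill_file_stat is a hypothetical helper, not part of FUSE or BBFS, and the define at the top stands in for -D_FILE_OFFSET_BITS=64 on the command line.

```c
#define _FILE_OFFSET_BITS 64   /* same effect as -D_FILE_OFFSET_BITS=64 */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* st_size has type off_t; fail the build if it cannot hold sizes above 2GB. */
_Static_assert(sizeof(off_t) == 8, "off_t is not 64-bit; large sizes would overflow");

/* Hypothetical helper: fills statbuf for a read-only regular file of 'size'
 * bytes. Returns 0 on success, -1 if the size does not survive the
 * conversion to off_t. */
int fill_file_stat(struct stat *statbuf, long long size)
{
    assert(statbuf != NULL);
    memset(statbuf, 0, sizeof(*statbuf));
    statbuf->st_mode = S_IFREG | 0444;
    statbuf->st_nlink = 1;
    statbuf->st_size = (off_t)size;
    return ((long long)statbuf->st_size == size) ? 0 : -1;
}
```

If the flag is missing on a 32-bit build, the static assertion stops the compile instead of letting a large file silently show up with a wrong size.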

File system in userspace (FUSE) on Linux : C implementation

  • BBFS is a good starting point to develop a file system in C
  • The application should keep the file offset in mind before issuing a request. A write should be followed by a seek to offset zero before issuing a read.
  • FUSE 2.7 reads 8K of data by default, in two 4K chunks
  • The reads cover the last 4K block and the first 4K block
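The write, seek, read rule above can be sketched from the application side as follows. This is plain POSIX file I/O on a temporary file, not FUSE-specific code; the helper name write_then_read_back and the /tmp path template are assumptions for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to a fresh temporary file, rewinds, and
 * reads it back into out. Returns the number of bytes read, or -1 on error. */
ssize_t write_then_read_back(const char *msg, char *out, size_t outlen)
{
    char path[] = "/tmp/rw_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file goes away once fd is closed */

    size_t len = strlen(msg);
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* The offset now sits at end of file, so reading here fetches nothing. */
    ssize_t at_eof = read(fd, out, outlen);
    assert(at_eof == 0);

    /* Seek back to offset zero before the real read. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    ssize_t n = read(fd, out, outlen);
    close(fd);
    return n;
}
```

The first read() returns 0 bytes because the file offset is already at the end; only after lseek() does the data come back.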

Example:

I had set the file size to 4MB in the getattr() implementation.

int bb_getattr(const char *path, struct stat *statbuf)
{
    int retstat = 0;
    char fpath[PATH_MAX];
    bb_fullpath(fpath, path);

    memset(statbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        statbuf->st_mode = S_IFDIR | 0755;
        statbuf->st_nlink = 2;
    } else {
        statbuf->st_mode = S_IFREG | 0444;
        statbuf->st_nlink = 1;
        statbuf->st_size = 4 * 1024* 1024;
    }   
    return retstat;
}

The sequence of calls and their arguments is as follows:

bb_getattr(path="/abcd.txt", statbuf=0xc5387960)
    bb_fullpath:  rootdir = "/tmp", path = "/abcd.txt", fpath = "/tmp/abcd.txt"
bb_open(path"/abcd.txt", fi=0xc5daaa50)
    bb_fullpath:  rootdir = "/tmp", path = "/abcd.txt", fpath = "/tmp/abcd.txt"
    fi:
    flags = 0x00008002
    fh_old = 0x00000000
    writepage = 0
    direct_io = 0
    keep_cache = 0
    fh = 0x0000000000000001
    lock_owner = 0x0000000000000000
bb_write(path="/abcd.txt", buf=0xc4966050, size=10, offset=0, fi=0xc5387a50)
    fi:
    flags = 0x00000000
    fh_old = 0x00000001
    writepage = 0
    direct_io = 0
    keep_cache = 0
    fh = 0x0000000000000001
    lock_owner = 0x0000000000000000
bb_read(path="/abcd.txt", buf=0x06ccbd90, size=12288, offset=4096, fi=0xc5daaa50)  <- Here
    fi:
    flags = 0x00000000
    fh_old = 0x00000001
    writepage = 0
    direct_io = 0
    keep_cache = 0
    fh = 0x0000000000000001
    lock_owner = 0x0000000000000000

bb_read(path="/abcd.txt", buf=0x06ccbd90, size=4096, offset=0, fi=0xc5daaa50)
    fi:
    flags = 0x00000000
    fh_old = 0x00000001
    writepage = 0
    direct_io = 0
    keep_cache = 0
    fh = 0x0000000000000001
    lock_owner = 0x0000000000000000

Forward declaration of a structure in C

What do you think of the following code?

/*
 * decl.h
 */
struct junk {
    int a;
};

----------------------------------------------
/*
 * fwd.c
 * We have not included the header file decl.h.
 */
#include <stdio.h>
struct junk;

int main()
{
    struct junk *ptr;
    printf("%d", ptr->a);
}

You will get a compilation error saying that ptr is dereferenced as a pointer to an incomplete type. This is misleading if you do not go through all the included header files. cscope will make you believe that the definition exists. You just wonder until you find out that the structure was never defined in fwd.c!
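For contrast, here is a minimal sketch of what is allowed with a forward declaration: pointers to the incomplete type can be declared and passed around, and member access works only once the full definition is visible. The functions junk_new and junk_get are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct junk;   /* forward declaration: the type is incomplete from here on */

/* Declaring and passing pointers to an incomplete type is fine. */
struct junk *junk_new(int a);
int junk_get(const struct junk *p);

/* The definition that decl.h would provide. Everything below this point
 * may access members and use sizeof. */
struct junk {
    int a;
};

struct junk *junk_new(int a)
{
    struct junk *p = malloc(sizeof(*p));   /* needs the complete type */
    if (p != NULL)
        p->a = a;
    return p;
}

int junk_get(const struct junk *p)
{
    assert(p != NULL);
    return p->a;                           /* needs the complete type */
}
```

Moving the struct definition below junk_get would bring back the incomplete-type error, which is exactly the situation in fwd.c.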

XMPP, Openfire and Pidgin: A weekend buffet

My friend Pritam and I have embarked on a small walk into the chat world. The subjects are the XMPP-based Openfire and Pidgin. Our objective is to create an XMPP-based private chat server that we will later host on a private IP. We plan to develop a plugin to customize the server’s behavior towards a client.

Why XMPP?

Because it is a widely used standard for message-oriented communication. XMPP wiki is here.

Why Openfire? Because it is popular, free and open.

Our first day

– Installing Openfire is simple, and so is configuring it, adding users, and going online. On your local network, it is advisable to turn off the firewall. Read more here about cryptic Openfire parameters.

– Installing Pidgin is simple. You should be able to connect to your local chat server within an hour.

– We wanted our server to interact with all users, and we found the “Message of the Day” plugin. This plugin is part of the Openfire sources, and its code is in Java. We played with the code of this plugin and enjoyed the different behaviors.

– Building the Openfire sources requires Apache Ant and JRE 1.5 or higher. You can find more details here.

After downloading the Openfire sources, you will find a “plugin” directory that has the sources for “Message of the Day”. We have to build it. The following command builds all plugins:

xyz@localhost:~/Downloads/openfire/src/openfire_src/build> ~/apps/netbeans-6.9.1/java/ant/bin/ant plugins

– This run builds all plugins. Our plugin is created as “motd.jar”. Check that the JAR was created.

xyz@localhost:~/Downloads/openfire/src/openfire_src/build> ll ../target/openfire/plugins/motd.jar

– Now we have our plugin ready and can deploy it in our Openfire instance. Deployment is trivial: copy the new JAR file to the plugins directory of Openfire. Openfire will automatically create the necessary directory structure for the plugin.

xyz@localhost:~/Downloads/openfire/src/openfire_src/build> cp ../target/openfire/plugins/motd.jar ~/Downloads/openfire/plugins/
xyz@localhost:~/Downloads/openfire/src/openfire_src/build> ll !$
ll ~/Downloads/openfire/plugins/
total 60
drwxr-xr-x 3 xyz users 4096 2011-10-02 03:22 admin
drwxr-xr-x 5 xyz users 4096 2012-05-06 20:16 motd
-rw-r--r-- 1 xyz users 12621 2012-05-06 20:28 motd.jar
xyz@localhost:~/Downloads/openfire/src/openfire_src/build> ll ~/Downloads/openfire/plugins/
total 60
drwxr-xr-x 3 xyz users 4096 2011-10-02 03:22 admin
drwxr-xr-x 5 xyz users 4096 2012-05-06 20:28 motd
-rw-r--r-- 1 xyz users 12621 2012-05-06 20:28 motd.jar

– We played with the different types of messages that we can send to a newly joined client. Then we dissected the Session type, the hostname, and many other useful attributes, including Packets and Messages. One example is that the message of the day now tells the user their name 😛

(8:30:27 PM) localhost: vishal@localhost/59b71259

This is the message from the Openfire server to a client as soon as the client creates a session.

– You should keep the Openfire API documentation handy.

– We plan to do a few more experiments in the coming weeks to understand:

  • How does an XMPP server process client connections?
  • How do we receive packets from and send packets to a client?
  • Is it possible to get a slimmer Openfire? 😉

Synchronize two threads to print ordered even and odd numbers in C

Problem: You have two threads that independently print even and odd values. The goal is to synchronize these threads so that the output follows the natural ordering of the numbers: 1, 2, 3, 4, 5, 6, 7, 8, ...

It is an interesting problem and can be solved with a condition variable.

Following is a C implementation of the solution:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
void *functionCount1(void *arg);
void *functionCount2(void *arg);
int count = 0;
#define COUNT_DONE 200

int main(void)
{
  pthread_t thread1, thread2;
  pthread_create( &thread1, NULL, &functionCount1, NULL);
  pthread_create( &thread2, NULL, &functionCount2, NULL);
  pthread_join( thread1, NULL);
  pthread_join( thread2, NULL);
  return 0;
}

// Print odd numbers
void *functionCount1(void *arg)
{
  (void)arg;
  for(;;) {
    pthread_mutex_lock( &count_mutex );
    // Wait while count is odd, i.e. while it is the other thread's turn.
    // The while loop re-checks the condition after spurious wakeups.
    while ( count % 2 != 0 ) {
      pthread_cond_wait( &condition_var, &count_mutex );
    }
    // count is even here; once it reaches COUNT_DONE there is nothing left to print
    if ( count >= COUNT_DONE ) {
      pthread_mutex_unlock( &count_mutex );
      return(NULL);
    }
    count++;
    printf("Counter value functionCount1: %d\n",count);
    pthread_cond_signal( &condition_var );
    pthread_mutex_unlock( &count_mutex );
  }
}

// Print even numbers
void *functionCount2(void *arg)
{
  (void)arg;
  for(;;) {
    pthread_mutex_lock( &count_mutex );
    // Wait while count is even, i.e. while it is the other thread's turn.
    while ( count % 2 == 0 ) {
      pthread_cond_wait( &condition_var, &count_mutex );
    }
    count++;
    printf("Counter value functionCount2: %d\n",count);
    pthread_cond_signal( &condition_var );
    // This thread prints the last value (COUNT_DONE) and then exits
    if ( count >= COUNT_DONE ) {
      pthread_mutex_unlock( &count_mutex );
      return(NULL);
    }
    pthread_mutex_unlock( &count_mutex );
  }
}

GNU cflow: A tool to analyze C code flow

I was looking at a piece of code authored by someone else, and I wondered if there was a tool that would at least map out the code flow. I googled for such a tool and found “GNU cflow”.

“cflow” is an open source tool that facilitates static analysis of your program in the form of its code flow. cflow reports are hierarchical, hence very intuitive. If you want a graphical form of this data, there is another open source utility called Graphviz.

I found it very useful for:

o) Reviewing new code
o) Understanding legacy C code

Remember, it can’t handle function pointers 😦

A typical cflow invocation looks like this:

cflow --format=posix --omit-arguments --level-indent='0=\t' \
--level-indent='1=\t' --level-indent=start='\t' "source_file" > "out.cf"

Output looks as follows:

$ ./cflow ./my2.c
main() :
    open()
    printf()
    read()

By default, “cflow” shows all symbols present in your program. We (Srikanth and I) didn’t like that, so we changed “cflow” to support ignoring symbols.

So, you just list all the symbols you’d like to ignore in the file “/tmp/skip_list.txt”, one per line, like this:

Symbol_1
Symbol_2

The source code and binary are available here.

After collecting the profile, a browser-based GUI can be generated with the cflow2dot utility.