Category: Uncategorized

Notes on YouTube Architecture

Web servers are usually not the bottleneck. Caching happens at several levels: the database, serialized Python objects, and HTML pages. Videos are sharded across the cluster to share load. lighttpd was used instead of Apache to serve video because Apache had too much overhead, and lighttpd was run in a multi-process rather than a single-process configuration. Showing thumbnails is challenging; a thumbnail is a 5 KB image. DB sharding is the key.

Notes on UNIX Pipes

A pipe() call creates two file descriptors: one for reading and one for writing. In $ ls -l | more, if ls writes more than more can handle, the write blocks until the reader drains the pipe. If there is nothing to read, the read on the pipe blocks. A pipe is a kernel resource and is basically just a buffer. A…
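The read end and write end described above can be sketched in C as follows. This is a minimal illustration, not code from the original post: the function name pipe_roundtrip and its parameters are assumptions made for the example. A child process writes a message into the pipe and the parent reads it back until end-of-file.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends msg from a child process to the parent
 * through a pipe. Returns the number of bytes the parent read, or -1. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fds[2];                     /* fds[0] = read end, fds[1] = write end */
    if (outsz == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: the writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    /* parent: the reader */
    close(fds[1]);                  /* needed so read() can see end-of-file */
    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;         /* read() blocks while the pipe is empty */
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Calling pipe_roundtrip("hello", buf, sizeof buf) with a 32-byte buf should leave "hello" in buf. Closing the unused ends matters: if the parent kept its copy of the write end open, its read loop would never see end-of-file.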

Notes on Signals in UNIX

Pressing a key generates an interrupt, and the kernel has a keyboard interrupt handler module. For a key such as Ctrl-C, the kernel then sends a signal to the processes associated with the terminal (its foreground process group). With traditional signal() semantics, once a handler has been called the signal is reset to its default action, so the handler re-installs itself every time it runs: void handler(int sig) { signal(SIGINT, handler); } Common Signals to know:…
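A common alternative to re-installing the handler inside itself is sigaction(), which keeps the handler installed unless SA_RESETHAND is requested. The sketch below is an illustration under assumptions, not the post's code: it uses SIGUSR1 and raise() instead of a real keyboard SIGINT so it can run unattended, and the names on_usr1 and count_signals are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t hits = 0;

/* Handler: only touches a sig_atomic_t flag, which is async-signal-safe. */
static void on_usr1(int sig)
{
    (void)sig;
    hits++;
}

/* Hypothetical helper: installs the handler once, raises SIGUSR1
 * `times` times, and returns how many times the handler ran. */
int count_signals(int times)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* no SA_RESETHAND: handler stays installed */

    if (sigaction(SIGUSR1, &sa, &old) == -1)
        return -1;

    hits = 0;
    for (int i = 0; i < times; i++)
        raise(SIGUSR1);             /* handler runs before raise() returns */

    sigaction(SIGUSR1, &old, NULL); /* restore the previous disposition */
    return (int)hits;
}
```

Because the handler is never reset, count_signals(3) should report three deliveries without any call to signal() inside the handler.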

Notes on Linux Process Management

All processes have a PID and a process group ID. The group leader is your shell (terminal). So for a process to qualify as background, we change its process group using setpgrp(). Example: nohup ./a.out &. In chmod 4750 file1.txt, the leading 4 means SUID permission. An ordinary user can run this file with privileges of the actual owner of the…
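To make the leading 4 in 4750 concrete, here is a small sketch that decodes a mode value using the constants from <sys/stat.h>. The helper names has_suid and mode_to_string are assumptions for this example, and the sketch deliberately ignores the setgid and sticky bits.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if the set-user-ID bit (octal 4000) is set in mode. */
int has_suid(mode_t mode)
{
    return (mode & S_ISUID) != 0;
}

/* Hypothetical helper: writes an ls-style permission string such as
 * "rwsr-x---" into out. Setgid and sticky bits are not handled here. */
void mode_to_string(mode_t mode, char out[10])
{
    out[0] = (mode & S_IRUSR) ? 'r' : '-';
    out[1] = (mode & S_IWUSR) ? 'w' : '-';
    if (mode & S_ISUID)
        out[2] = (mode & S_IXUSR) ? 's' : 'S';  /* SUID replaces the x slot */
    else
        out[2] = (mode & S_IXUSR) ? 'x' : '-';
    out[3] = (mode & S_IRGRP) ? 'r' : '-';
    out[4] = (mode & S_IWGRP) ? 'w' : '-';
    out[5] = (mode & S_IXGRP) ? 'x' : '-';
    out[6] = (mode & S_IROTH) ? 'r' : '-';
    out[7] = (mode & S_IWOTH) ? 'w' : '-';
    out[8] = (mode & S_IXOTH) ? 'x' : '-';
    out[9] = '\0';
}
```

For mode 04750 this yields "rwsr-x---": the owner has rwx with SUID, the group has r-x, and others have nothing.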

Notes on fork()

We can avoid orphans by having the parent process call wait(). In wait(int *p), *p receives the child's exit status, so we can find out whether the process terminated normally or not. The child process has its own copy of globals too. The child also gets a copy of the parent's file descriptor table; both copies refer to the same open files. It…
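The points about wait() and separate copies of globals can be sketched as follows. This is a minimal example with made-up names (counter, fork_and_wait, parent_counter); it uses waitpid() rather than wait() so that it waits for exactly the child it created.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 10;            /* global: each process has its own copy */

/* Hypothetical helper: forks a child that changes the global and exits
 * with child_code. Returns the child's exit code as seen by the parent,
 * or -1 on error or abnormal termination. */
int fork_and_wait(int child_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                 /* child */
        counter += 1;               /* changes only the child's copy */
        _exit(child_code);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFEXITED(status))          /* did the child terminate normally? */
        return WEXITSTATUS(status);
    return -1;                      /* e.g. killed by a signal */
}

/* Lets a caller observe the parent's copy of the global. */
int parent_counter(void)
{
    return counter;
}
```

After fork_and_wait(7) returns 7, parent_counter() still returns 10, because the increment happened in the child's address space.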

Notes on exec( ) system call

The exec'ed program keeps the calling process's file descriptor table (apart from descriptors marked close-on-exec). Output from a printf() issued before exec might never appear because its buffer was not flushed; use fflush() first. The exec'ed program also gets access to environ.
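The buffering pitfall can be reproduced with a short sketch. The helper name run_then_exec and the choice of the echo utility are assumptions for illustration. A child redirects its stdout into a pipe, calls printf(), optionally flushes, and then execs echo; the parent collects whatever reached the pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes the child's stdout
 * produced (before and after exec), stored NUL-terminated in out. */
ssize_t run_then_exec(int flush_first, char *out, size_t outsz)
{
    int fds[2];
    if (outsz == 0 || pipe(fds) == -1)
        return -1;

    fflush(stdout);                 /* keep pending parent output out of the child */
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);    /* stdout now feeds the pipe */
        close(fds[1]);
        printf("before ");              /* held in the stdio buffer */
        if (flush_first)
            fflush(stdout);             /* push it out before the image is replaced */
        execlp("echo", "echo", "after", (char *)NULL);
        _exit(127);                     /* reached only if exec failed */
    }

    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

With flush_first set, the parent should receive both the text printed before exec and the output of echo. Without the flush, the buffered text is normally discarded along with the old process image, so only the output of echo arrives. The example also relies on the exec'ed program inheriting descriptor 1 from the caller.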

Notes on Ceph RADOS paper

Why PGs? PGs (placement groups) enable a balanced distribution of objects. Without PGs, there are two ways to distribute replicas: a. Mirroring, where an OSD shares all of its replicas with one or more other OSDs. b. Complete declustering, where each object's replicas can land on different OSDs, so an OSD ends up sharing replicas with many other nodes in the cluster. PGs provide a way to replicate a set of objects in a fault-tolerant manner.
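The grouping step can be illustrated with a toy mapping from object name to PG id. This is only a sketch of the idea that many objects fall into one PG, which is then replicated as a unit. The FNV-1a hash and the plain modulo are stand-ins chosen for brevity, not what Ceph actually uses, and the function names are hypothetical.

```c
#include <stdint.h>

/* FNV-1a: a stand-in hash for this sketch, not Ceph's hash function. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;       /* FNV offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: maps an object name to a PG id in [0, pg_num).
 * pg_num must be greater than zero. */
uint32_t pg_for_object(const char *name, uint32_t pg_num)
{
    return fnv1a(name) % pg_num;
}
```

The mapping is deterministic, so any client can compute an object's PG from its name alone; placing each PG onto OSDs is a separate step that this sketch does not cover.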

Lifecycle of a URL access on a Browser

Step 1: DNS lookup, checked in order: browser cache, OS cache, router cache, ISP DNS lookup.
Step 2: Connection setup using the TCP three-way handshake.
Step 3: Browser sends an HTTP request: GET, or POST (for auth and form submission).
Step 4: Server handles the request and prepares a response. Server could be a web server (Apache, IIS) Handler parses the…

What is a Udev rule?

Linux treats everything in the system as a file, even devices. A device attached to a system is listed under /dev/. Managing these devices (renaming, persistent naming, permission modifications) requires a manager. Udev is one such manager. A device in Linux is recognised by the system under /sys. e.g. :/sys/block# ls -l total 0 lrwxrwxrwx…

Updating systemctl limits on Debian

Become root and edit the unit file:
vi /lib/systemd/system/ceph-osd@.service
Change the values of proc and files to the following (extracted from ulimit -a):
[Service]
LimitNOFILE=78452
LimitNPROC=80248
$ sudo systemctl daemon-reload
Restart OSDs
$ cat update-systemctl.sh
for ip in $(cat ip.list)
do
    scp ceph-osd@.service $ip:/tmp
    ssh $ip sudo cp /tmp/ceph-osd@.service /lib/systemd/system/ceph-osd@.service
    ssh $ip sudo systemctl daemon-reload
    ssh $ip sudo systemctl…