Reactive Extensions: Asynchronous Reactive Systems

Reactive Extensions (Rx) is a paradigm for developing systems using a reactive model. In essence, it is an event-driven state machine for functions. Reactive Extensions can represent either synchronous or asynchronous communication.

All we need is a message broker, a producer of events (the Observable), and consumers (the Observers).

Typically, the consumer processes an event with a set of transformations. Each transformation is a step in the reactive pipeline, and each stage processes the data produced by the previous stage.

In a simple sense, a typical function flow in a listener component is as follows:

1. Get message
2. Unpack message
3. Process the payload
4. Log the result

With Reactive Extensions, each step can become a thread with a pipe for reading/writing data. Each step can be synchronous or asynchronous, and we can define many pipelines with different sets of operators (filter, map).
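
As a rough illustration, here is a minimal, synchronous sketch of the Observable/Observer idea. The names (Observable, map, filter, subscribe) mirror Rx conventions, but this is only an illustration, not the RxCpp API:

// Minimal, synchronous Observable/Observer sketch (illustrative, not RxCpp).
#include <functional>
#include <iostream>
#include <utility>

template <typename T>
struct Observable {
  // Subscribing runs the producer, which pushes events to the observer.
  std::function<void(std::function<void(T)>)> producer;

  void subscribe(std::function<void(T)> observer) { producer(observer); }

  // map: transform each event (one pipeline stage).
  template <typename F>
  auto map(F f) -> Observable<decltype(f(std::declval<T>()))> {
    using U = decltype(f(std::declval<T>()));
    auto src = producer;
    return Observable<U>{[src, f](std::function<void(U)> obs) {
      src([f, obs](T v) { obs(f(v)); });
    }};
  }

  // filter: drop events that fail the predicate.
  Observable<T> filter(std::function<bool(T)> pred) {
    auto src = producer;
    return Observable<T>{[src, pred](std::function<void(T)> obs) {
      src([pred, obs](T v) { if (pred(v)) obs(v); });
    }};
  }
};

int main() {
  // Producer: emits raw "messages".
  Observable<int> messages{[](std::function<void(int)> obs) {
    for (int i = 1; i <= 5; ++i) obs(i);
  }};

  // Pipeline mirroring the listener flow above: filter, process, log.
  messages
      .filter([](int m) { return m % 2 == 1; })  // keep relevant events
      .map([](int m) { return m * 10; })         // process the payload
      .subscribe([](int r) {                     // log the result
        std::cout << "result: " << r << "\n";
      });
}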


Discussion on Microservices Integration

Microservices form a distributed system pattern, so components need to communicate over the network.

If the communication data includes internal technical details of the participating components, the system loses the property of loose coupling.

If a change in one service forces changes in many other services, we lose the property of high cohesion.

An unreliable network and the added latency of data communication dictate the choices for service integration.

There are two major types:

  • Direct Communication
  • Event-based, asynchronous communication

Direct communication is request/response based. It is useful for low-latency, immediate-consumption scenarios. It is prone to failures (an unresponsive server, network latency), and hence callers/clients need to retry. This style of communication is not easily extensible and soon becomes brittle.

Asynchronous mode is a “fire and forget” approach. An event is generated, and it is up to the consumers to handle it. This model scales very well: the publisher and consumers have no coupling, and both are independently deployable. However, it is hard to monitor the status of each event handler.
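
A toy in-process event bus can illustrate the decoupling; in a real system the bus would be a network broker such as Kafka or RabbitMQ, and all names here are illustrative:

// Toy in-process event bus showing publisher/consumer decoupling.
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

class EventBus {
  std::map<std::string,
           std::vector<std::function<void(const std::string&)>>> subscribers_;
public:
  void subscribe(const std::string& topic,
                 std::function<void(const std::string&)> handler) {
    subscribers_[topic].push_back(std::move(handler));
  }
  // Fire and forget: the publisher does not know who consumes the event.
  void publish(const std::string& topic, const std::string& payload) {
    for (auto& handler : subscribers_[topic]) handler(payload);
  }
};

int main() {
  EventBus bus;
  // Two independent consumers; either can be changed or deployed alone.
  bus.subscribe("order.created", [](const std::string& p) {
    std::cout << "billing service saw: " << p << "\n";
  });
  bus.subscribe("order.created", [](const std::string& p) {
    std::cout << "shipping service saw: " << p << "\n";
  });
  bus.publish("order.created", "{\"order_id\": 42}");
}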



Domain Driven Design: Bounded Context

Microservice architecture borrows many classical concepts. Bounded Context is one of the most important ideas of domain-driven design.

A context means a specific responsibility. A Bounded Context implies a responsibility with explicit boundaries.

Reading a few blogs, a bounded context is more of a real-world identification of responsibilities. It is a matter of seeing things from different altitudes: an application as a whole is a bounded context; one level below, the UI and the backend are each a BC.

Overall, seeing a BC as a cell (like an organic cell) comes naturally. A cell has a specific set of responsibilities, and it has an explicit interface to accept or reject model (communication data) requests.
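
To make the cell analogy concrete, here is a minimal sketch with hypothetical names (OrderingContext, PlaceOrderRequest): the context exposes one explicit interface and accepts or rejects the incoming model at that boundary:

// Sketch of a bounded context as a "cell": one explicit interface,
// internal model hidden, requests accepted or rejected at the boundary.
#include <iostream>
#include <optional>
#include <string>

// The published model this context accepts (its communication data).
struct PlaceOrderRequest {
  std::string customer_id;
  int quantity;
};

class OrderingContext {   // the bounded context / cell
public:
  // Explicit boundary: validate the model before letting it in.
  std::optional<std::string> placeOrder(const PlaceOrderRequest& req) {
    if (req.customer_id.empty() || req.quantity <= 0)
      return std::nullopt;              // rejected at the boundary
    return nextOrderId();               // accepted into the cell
  }
private:
  int counter_ = 0;
  std::string nextOrderId() { return "order-" + std::to_string(++counter_); }
};

int main() {
  OrderingContext ordering;
  auto ok  = ordering.placeOrder({"cust-1", 3});
  auto bad = ordering.placeOrder({"", 0});
  std::cout << (ok ? *ok : "rejected") << "\n";    // order-1
  std::cout << (bad ? *bad : "rejected") << "\n";  // rejected
}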

Why Bother with Bounded Contexts?

  • BC promotes loose coupling and high cohesion.



Why Does Bitcoin Use Bloom Filters?

Bitcoin keeps a blockchain record of transactions among the participating nodes. Since the size of the record is huge, thin participating nodes are interested only in a subset of the records.

But how do we find that subset in the complete list of records? Bloom filters help here. A BF answers a “does it exist?” query on a set.

Does A exist in a set S?

  • Yes (may not be true, so double-check with the DB or another mechanism)
  • No (definitely true)

Remember, a Bloom filter never returns a false negative: a “no” answer is definitely true, while a “yes” may be a false positive.

How Does it Work?

The thin nodes in a blockchain share a Bloom filter of transactions with full nodes (the ones that have a full copy of the blockchain records). A full node then looks up its blockchain transaction records against the passed Bloom filter.

Each transaction is checked against the Bloom filter. All positives are sent to the thin node, and the thin node eliminates the false positives.
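
A minimal Bloom filter sketch follows; the size and hashing scheme here are illustrative, not Bitcoin’s BIP 37 wire format:

// Minimal Bloom filter: k hash probes into a bit array.
#include <bitset>
#include <functional>
#include <iostream>
#include <string>

class BloomFilter {
  static constexpr size_t kBits = 1024;   // filter size (illustrative)
  static constexpr int kHashes = 3;       // number of probes
  std::bitset<kBits> bits_;

  // Derive k indices from two hash values (double hashing).
  size_t probe(const std::string& item, int i) const {
    size_t h1 = std::hash<std::string>{}(item);
    size_t h2 = std::hash<std::string>{}(item + "#salt");
    return (h1 + i * h2) % kBits;
  }
public:
  void add(const std::string& item) {
    for (int i = 0; i < kHashes; ++i) bits_.set(probe(item, i));
  }
  // "Maybe" on yes (false positives possible), definite on no.
  bool mayContain(const std::string& item) const {
    for (int i = 0; i < kHashes; ++i)
      if (!bits_.test(probe(item, i))) return false;  // definitely absent
    return true;                                      // possibly present
  }
};

int main() {
  // Thin node: register the transactions it cares about.
  BloomFilter filter;
  filter.add("txid-aaa");
  filter.add("txid-bbb");

  // Full node: test every transaction against the filter and forward
  // the positives; the thin node discards any false positives.
  for (const std::string& tx : {"txid-aaa", "txid-ccc", "txid-bbb"})
    std::cout << tx << " -> "
              << (filter.mayContain(tx) ? "send" : "skip") << "\n";
}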



How Do Push Notifications Arrive on Your Phone?

The commonplace notifications on our iPhone/Android devices follow an interesting journey.

What is a Push Notification?

A push notification is a message sent to a user. The message is sent even if the app is not in use; if the device is offline, it is delivered once the device reconnects.

Who Sends the Message?

The app is controlled from a provider service; e.g., 500px.com sends you a notification when your pictures are liked. But how exactly does the 500px server know about your device?
It doesn’t. I am explaining the flow with Apple and the iPhone; Android has a similar flow.

Workflow

Assume the app supports push notifications and the user has enabled them for the app. After installing the app and logging in, the app requests a globally unique device token from the Apple Push Notification Service (APNS). The token is the unique key for all notifications for the given app and device.

The token is sent to the app’s server (500px.com), and the server stores it. Now, when the server wants to send a push notification to a user’s device, it creates the message payload and sends the request, along with the device token, to APNS.

The APNS then sends the notification to the device.
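
A toy simulation of this flow, with illustrative names only (the real APNS interaction is an authenticated HTTP/2 API, not in-process calls):

// Toy simulation of the push-notification token flow described above.
#include <iostream>
#include <map>
#include <string>

struct Apns {                       // stands in for Apple's push service
  std::map<std::string, std::string> token_to_device;
  std::string registerDevice(const std::string& device) {
    std::string token = "tok-" + device;    // globally unique in reality
    token_to_device[token] = device;
    return token;
  }
  void push(const std::string& token, const std::string& payload) {
    std::cout << "APNS -> " << token_to_device[token] << ": " << payload << "\n";
  }
};

struct AppServer {                  // e.g. the 500px backend
  Apns& apns;
  std::map<std::string, std::string> user_tokens;   // user -> device token
  void storeToken(const std::string& user, const std::string& token) {
    user_tokens[user] = token;
  }
  void notify(const std::string& user, const std::string& msg) {
    apns.push(user_tokens[user], msg);      // payload + token to APNS
  }
};

int main() {
  Apns apns;
  AppServer server{apns};

  // 1. The app (on the device) obtains a device token from APNS.
  std::string token = apns.registerDevice("alice-iphone");
  // 2. The app forwards the token to its provider server, which stores it.
  server.storeToken("alice", token);
  // 3. The server later pushes via APNS using the stored token.
  server.notify("alice", "Your picture was liked!");
}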

Push Notifications with APNS

A Software Design Document Template

The design document is a key piece of a project and is used throughout the lifecycle of the product. The following is a sample template for a software design document:

  1. Introduction – A paragraph about project/product
  2. Objective – A paragraph on the problem being solved
  3. Requirements – Specifications, expectations
  4. High-Level Design – How the new product/feature fits/interacts with existing systems
  5. Low-Level Design – Discusses proposed components and their interactions
  6. Implementation Plan – Discusses incremental, phased or any strategy followed for development
  7. Deployment Plan – How the new product/feature is deployed and rolled out
  8. Contradictions – Known behaviors, exceptions, and special situations; also limitations and boundary conditions

More ideas to enhance the template are welcome.


MVC Explained

MVC is an architecture that separates an application into three cohesive, loosely coupled verticals:

  1. Model: The data of your application and methods to access it.
  2. View: The final output/expected result.
  3. Controller: The interface that handles requests and manipulates the model.

I’m trying to map it to a Linux Filesystem (e.g. ext2).

Model: The file system block manager and allocator for the storage device.
View: read()/write() methods.
Controller: Maps a file descriptor to file system blocks.
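
A minimal sketch of the three verticals and their interaction, with illustrative names (not actual filesystem code):

// Minimal MVC sketch: model holds data, view renders, controller mediates.
#include <iostream>
#include <map>
#include <string>

class Model {                          // data + methods to access it
  std::map<int, std::string> rows_;
public:
  void set(int id, std::string v) { rows_[id] = std::move(v); }
  std::string get(int id) const {
    auto it = rows_.find(id);
    return it == rows_.end() ? "" : it->second;
  }
};

struct View {                          // the final output
  void render(const std::string& data) const {
    std::cout << "view: " << data << "\n";
  }
};

class Controller {                     // handles requests, updates the model
  Model& model_;
  View& view_;
public:
  Controller(Model& m, View& v) : model_(m), view_(v) {}
  void handleWrite(int id, std::string v) { model_.set(id, std::move(v)); }
  void handleRead(int id) { view_.render(model_.get(id)); }
};

int main() {
  Model m; View v;
  Controller c(m, v);
  c.handleWrite(1, "hello");   // like write() through a descriptor
  c.handleRead(1);             // like read(); the view shows the result
}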


Ceph RGW Internals: Cache Coherence & Bucket Life Cycles

RGW Cache Coherence

Why does RGW have a control pool? We will try to understand its use case and purpose in RGW cache synchronization.

RGWRados::init_watch()

Creates watcher objects in the RGW control pool:

$ sudo rados ls -p .in-abc-1.rgw.control
notify.1
notify.6
notify.3
notify.7
notify.2
notify.4
notify.5
notify.0

The common assumption is that these objects are watched for any change, and the threads sync their caches when one of these objects changes.

RGWObjTags

class RGWPutLC: public RGWOp
- RGW_OP_PUT_LC
- RGW_OP_TYPE_WRITE
class RGWPutLC_ObjStore_S3: public RGWPutLC_ObjStore

RGWOp* RGWHandler_REST_Bucket_S3::op_put()
{
  // ... (excerpt)
  if (is_lc_op()) {
    return new RGWPutLC_ObjStore_S3;
  }
  // ...
}

class RGWHandler_REST_Bucket_S3 : public RGWHandler_REST_S3 {
protected:
  bool is_acl_op() {
    return s->info.args.exists("acl");
  }
  bool is_cors_op() {
    return s->info.args.exists("cors");
  }
  bool is_lc_op() {
    return s->info.args.exists("lifecycle");
  }
  bool is_obj_update_op() override {
    return is_acl_op() || is_cors_op();
  }
  bool is_request_payment_op() {
    return s->info.args.exists("requestPayment");
  }
  bool is_policy_op() {
    return s->info.args.exists("policy");
  }
  // ...
};

RGW OP Handler

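// Dispatch the REST verb in s->op to the matching factory method;
// handler subclasses (e.g. RGWHandler_REST_Bucket_S3 above) override
// op_get(), op_put(), etc. to return the concrete RGWOp.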
RGWOp* RGWHandler_REST::get_op(RGWRados* store)
{
  RGWOp *op;
  switch (s->op) {
   case OP_GET:
     op = op_get();
     break;
   case OP_PUT:
     op = op_put();
     break;
   case OP_DELETE:
     op = op_delete();
     break;
   case OP_HEAD:
     op = op_head();
     break;
   case OP_POST:
     op = op_post();
     break;
   case OP_COPY:
     op = op_copy();
     break;
   case OP_OPTIONS:
     op = op_options();
     break;
   default:
     return NULL;
  }

  if (op) {
    op->init(store, s, this);
  }
  return op;
} /* get_op */

Takes an exclusive lock on the LC object for a given bucket shard. Next, it sets the OID in OMAP using rgw_cls_lc_set_entry().

RGWLC Invocation

Entry point for LC: RGWRados::init_complete()

This function reads the zone and zonegroup config and creates a connection to the zone endpoint. It also creates IO contexts for the root, GC, LC, objexp (log), and reshard pools. GC uses the RGWObjectExpirer object, which uses the objexp (log) pool.

lc = new RGWLC();
lc->initialize(cct, this);

if (use_lc_thread)
  lc->start_processor();

The RGWLC class has an LCWorker class.

void RGWLC::initialize(CephContext *_cct, RGWRados *_store)

  • Creates LC object names lc.0, lc.1, ..., lc.31
  • Creates a cookie buffer

void RGWLC::start_processor()

  • Spawns LCWorker threads
  • Each thread calls lc->process()

The file src/cls/rgw/cls_rgw_const.h has the CLS functions of RGW.

The operation metadata is stored in struct cls_rgw_lc_obj_head. This structure has two fields: a time and a marker string.

The list of objects is retrieved from OMAP in rgw_cls_lc_list_entries(). The input is an op (cls_rgw_lc_list_entries_op): a marker, a filter prefix, and the max entries. MAX_LC_LIST_ENTRIES in one read is 100.

The list entries carry the bucket ID, and for each entry the bucket state is set to uninitial in OMAP.

RGWLC::bucket_lc_process(string& shard_id)

The shard ID has the tenant, bucket name, and bucket ID. A bucket must have RGW_ATTR_LC set for LC processing.


Design: Brave Device Sync

The Brave Browser offers a sync facility that keeps bookmarks and browsing history consistent across Brave browser installations, so your data from your phone, laptop, or iPad can all become one, all while respecting your privacy.

The current design of sync uses a device ID to identify a client. A client calls the Brave server to store data. The first client creates the seed phrase of the sync chain.

The Brave server listens to these requests using a serverless component.

Brave Sync Design


Why is Redis Pipelining a Good Idea?

What is Pipelining?

Pipelining is a form of asynchronous task execution. A pipeline is a task composed of many subtasks, and each subtask may depend on the subtasks before it.

t1 => t2 => t3
            ^
        t4 =|

t2 depends on t1; t3 depends on t2 and t4.

While we are executing subtask t3 for primary task S1, we can also execute subtask t1 for another primary task S2. This is a pipelined execution.

How Does Redis Pipelining Help in Client-Server Communication?

Each network request between the client and the server incurs a latency known as the round-trip time (RTT). The Redis server reads a request from the client, and the client waits until the server writes the response.

We can instead ask the client to send many requests at once and collectively wait for the responses, as the sketch after the list below shows.

This helps us achieve:

  • The ability to counter the RTT of the network.
  • Better throughput for the client.
  • Fewer reads, since many requests arrive together.
  • A single write from the server with all the responses.
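
A minimal sketch with the hiredis C client, assuming a Redis server on localhost:6379: commands are queued into an output buffer, flushed together, and the replies are read back collectively.

// Pipelining with hiredis (assumes a Redis server on localhost:6379).
// redisAppendCommand() only queues the command in the output buffer;
// redisGetReply() flushes the buffer once and then reads each reply.
#include <stdio.h>
#include <hiredis/hiredis.h>

int main(void) {
  redisContext* c = redisConnect("127.0.0.1", 6379);
  if (c == NULL || c->err) {
    fprintf(stderr, "connection failed\n");
    return 1;
  }

  // Queue three commands; nothing is sent on the wire yet.
  redisAppendCommand(c, "SET counter 0");
  redisAppendCommand(c, "INCR counter");
  redisAppendCommand(c, "GET counter");

  // One write, one collective wait: pay roughly one RTT, not three.
  for (int i = 0; i < 3; ++i) {
    redisReply* reply;
    if (redisGetReply(c, (void**)&reply) != REDIS_OK) break;
    printf("reply %d: type=%d\n", i, reply->type);
    freeReplyObject(reply);
  }

  redisFree(c);
  return 0;
}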

Classic Usage

  • POP3 uses pipelining
