Design Problems of PostGres- Part II

The problems in PostGres DB are solved in MySQL to an extent. The design merits of MySQL are the following:

  • The primary index has a mapping of key to an offset in the disk. But all secondary index tables have a mapping of key to the primary index’s key.
Primary index  
--------------------  
| key | Disk offset|  
--------------------  
Secondary index  
---------------------  
| key | primary key |  
---------------------  
  • The drawback is that a lookup of the secondary table needs two hops.
  • On the other hand, a change in a row only needs modification of the primary index. It avoids changing the secondary index for every change in row’s data
  • MySQL replication uses logical information instead of data with physical information used in PostGres.
  • Logs are compact compared to PG.
  • MySQL manages its own cache, unlike PG that uses buffer cache.
  • PG buffer cache use is inefficient since a read operation would have to land in the buffer cache and hence a context switch.
  • PG uses a process per connection. It is expensive. MySQL uses threads for a connection and scales without too much resource consumption.
Advertisements

Design Problems of PostGres- Part I

This post is a quick summary of Why Uber moved from PostGres to MySQL.

PostGres Rows and CTID

  • PostGres provides transactions. Transactions need multiple versions of data. So PG is Multi Versioned DB.
  • PG considers each row immutable. Any change to a row creates a new row version.
  • A row is represented as an offset in disk, called ctid
  • Every row will have a unique ctid because they occupy a different space. However, multiple rows (ctid) can share same disk offset (e.g. multiple verisons of a row)
------------------
| Row 0 (ctid 1) |
------------------
                       ----------> Disk block offset x
------------------
| Row 1 (ctid 2) |
------------------

PostGres Index Management

  • Each index has key and values as CTID
  • Any change in a row’s data creates a new CTID. That would need changing all indexes.
  • This is expensive because
    • PG uses WAL technique so each write is at least twice
    • PG replicates WAL to secondary nodes.
    • The WAL has CTID and disk offset level information.
    • Replication across geographies is expensive since data volume is high.

PostGres Replication

  • At a secondary, while a transaction is in progress on a row, the WAL copy will wait.
  • If transaction runs for a long time, PG will terminate the transaction after a timeout.
  • So there are two problems:
    • Transactions can terminate unexpectedly
    • Replicas may be lagging the master more than exptected

Written with StackEdit.