MemSQL's recipe for a SQL database that's as fast as memcached

This post summarises the points I found interesting in the longer High Scalability post entitled "MemSQL Architecture - The Fast (MVCC, InMem, LockFree, CodeGen) And Familiar (SQL)" that gives an overview of the architecture of MemSQL, which is a SQL database reinvented for 21st-century hardware.  MemSQL's developers claim it to be the fastest database in the world, or at least so fast that you won't benefit from memcached, thus simplifying your app architecture.

Below are what I find to be the most interesting points about its architecture:
  • All queries are served from memory: db size limited by RAM. The maximum RAM is malloc'd on startup and never paged out.
  • Uses MySQL protocol: works with existing MySQL client libraries.  That's a stroke of genius right there.
  • Generates C++ from SQL templates and retains the generated code between restarts.  MemSQL employs an ex-Facebook dev who knows Facebook's HipHop PHP-to-C++ compiler.
  • Tables and indexes built on lock-free lists, skip-lists and hash tables, permitting full concurrency on 64-core systems (but does take a row lock to handle write-write conflicts)
  • By default transactions queue the writes and return immediately, but you can force full durability by setting transaction buffer size to 0.
  • Background thread groups the writes and appends to logfiles as fast as the disk will allow. Logfiles on reaching 2GB are consolidated into structured snapshots for faster startup.  Sequential writes only.
  • Shards are shared-nothing, so each write transaction can only write to one shard. Aggregation queries across shards work much like MapReduce.
  • Uses multi-version concurrency (MVCC): write transactions add new versions of the row to a linked list, so that concurrent transactions still see the old values.
There are some limitations though:
  • Restricted SQL support for now: no support for outer joins, views, stored procs, user functions, triggers, foreign keys or non-utf8 charsets. 
  • READ COMMITTED isolation only.
  • Not free, currently licensed by cpu-hours
And commenters point out some more:
  • Skip-lists are not friendly to CPU caches - memory access jumps around a lot during traversal. Counter-comment: benefit of caching is in the hit ratio.  Randomly querying a 64GB in-memory database when the CPU has a few MB cache, the hit ratio will be low no matter what data structure you use.
  • C++ compilation will make the first query for each pattern really slow
  • Will take many years to catch up to the features of existing SQL databases, if they can even be supported efficiently.

Popular posts from this blog

A comparison of file synchronisation software

Growing large crystals of Copper (II) Sulphate