Yes, I DO wanna be your monkey wrench

Google publishes details of internal database ‘Spanner’

Elliot Bentley

New paper reveals details of next-generation database system, which is underpinned by excellent timekeeping.

For a company dedicated to making the world’s information available,
Google are surprisingly secretive about their own backend. However,
this week they shed a small amount of light on some of the
technology that powers their infrastructure: a database system
called Spanner.

Detailed in a technical paper, which is to be officially presented
at the OSDI (Operating Systems Design and Implementation) conference
in Hollywood next month, Spanner is described as “a scalable,
globally-distributed database […] designed to scale up to millions
of machines across hundreds of datacenters and trillions of
database rows”.

Managing huge amounts of data is obviously crucial to Google’s
business. With 425 million Gmail users – not to mention YouTube,
Google+, and their myriad other services – existing options
generally aren’t good enough. In the paper, the authors report
issues that have arisen with the use of Bigtable and Megastore in
existing projects, and describe how they have spent five years
building their own database system to power F1, their advertising
backend.

Spanner is able to provide high availability “even in the face of
wide-area natural disasters” by replicating data “within or even
across continents.” Its TrueTime API, which relies on GPS and atomic
clocks to keep time within a known, bounded uncertainty, is
described as the “linchpin” for two unique features of Spanner:
applications can dynamically control replication configurations to
tune read and write latency, and reads and writes are externally
consistent.
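
To make the idea of a clock that admits its own uncertainty more concrete, here is a minimal Python sketch. It is loosely modelled on the interval-returning time API described in the paper and is not Google's implementation: the class names, the fixed `epsilon_s` error bound, and the injectable `clock` and `sleep` parameters are all assumptions made for illustration. The `commit` function shows one way a writer could wait until its chosen timestamp is definitely in the past before making a write visible.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """An interval assumed to contain the true absolute time."""
    earliest: float
    latest: float


class TrueTimeSketch:
    """Toy stand-in for a TrueTime-like API (hypothetical, for illustration)."""

    def __init__(self, epsilon_s, clock=time.time):
        # Assumed worst-case clock error in seconds. A real system would
        # derive this from its time sources rather than use a constant.
        self.epsilon_s = epsilon_s
        self._clock = clock

    def now(self):
        """Return an interval instead of a single, falsely precise instant."""
        t = self._clock()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, t):
        """True only if time t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True only if time t has definitely not arrived yet."""
        return self.now().latest < t


def commit(tt, sleep=time.sleep):
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    commit_ts = tt.now().latest
    while not tt.after(commit_ts):
        # Poll in small steps; the total wait is roughly 2 * epsilon_s.
        sleep(max(tt.epsilon_s / 10, 1e-4))
    return commit_ts
```

Calling `commit(TrueTimeSketch(epsilon_s=0.004))` would block for roughly twice the assumed error bound before returning, which illustrates why a tighter bound on clock uncertainty makes stronger time semantics cheaper.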

The paper’s 26 authors conclude:

We have shown that reifying clock uncertainty in the time API
makes it possible to build distributed systems with much stronger
time semantics. In addition, as the underlying system enforces
tighter bounds on clock uncertainty, the overhead of the stronger
semantics decreases. As a community, we should no longer depend on
loosely synchronized clocks and weak time APIs in designing
distributed algorithms.

There are some interesting lessons to learn here, particularly with
regard to data replication, which could easily be applied to other
systems. While Spanner’s source is still very much closed, the
paper provides some interesting solutions to problems faced by
engineers everywhere.

For more info, ZDNet have a fairly good breakdown, or take a gander
at the paper itself.

Photo by Les Chatfield.
