King of Silicon Alley

MongoDB’s Dwight Merriman: “We think all the NoSQL products are competitors”

Lucy Carey

We meet the king of the Silicon Alley startup scene to discuss NoSQL, breaking records, and why he believes the success of the humongous database is down to more than just good marketing.

 

NoSQL database MongoDB had even non-tech
journalists biting for a scoop this fall when it emerged that it
had received an eye-watering cash injection to the tune of $150
million following its latest round of funding. JAX Magazine caught
up with the co-founder of what’s been dubbed ‘the King of New York
startups’, Dwight Merriman, at October’s JAX London 2013 conference
to hear firsthand his take on the NoSQL scene. (This article was
originally published in JAX Magazine).

JAXenter: Your JAX London talk was very
focused on agile development. Why do you feel that NoSQL databases
go hand in hand with agile development?

Well…I think they should go hand in hand. In the
past they haven’t, because the traditional databases we use are
relational, and the theory was invented 40 years ago – and the
products started to emerge 30 years ago. Agile development is much
newer. It’s perhaps not fair to relational, but you can’t design for
something that doesn’t exist yet very easily. I think it is
challenging to agile development when your backend database for
your online system is relational. It just wasn’t designed for that,
and it couldn’t have been, as mentioned. So, I think for a lot of
reasons today, we’re due for some new technologies on the datastore
and database side.

Just the computer architectures we use today, the
cloud computing – the software development that we were just
talking about, the programming languages we use today – none of
these things existed when relational was invented, so it doesn’t
automatically fit elegantly with these things we use today for
everything else which is newer. It’s a compliment to relational
database theory that it’s the longest lived technology on the
software side that we really use – which is amazing. But, on the
other hand, everything else we use changes more often, and there’s
new technologies. We’re not using the programming languages we were
using 25 years ago typically. So I think it’s time for some new
tools that just fit better with the rest of the stack that we use
today. In both the literal stack, and the kind of metastack –
methodologies such as iteration and agile development.

With MongoDB making record-breaking news
this year (the company was recently crowned ‘King of New York
startups’ by Bloomberg after a bar-setting $1.2 billion valuation),
do you feel this will change you as a company? Do you think
perceptions of you will change?

I think perceptions are changing…independent of the
financing. The amount of enterprise usage of the product is getting
to be pretty high – quite high. So that’s great. I think
internally, nothing’s changed. I think our goal is
consistent…

So the financing won’t affect your
strategy?

No, I mean, we’ll use some of the proceeds from the
financing to accelerate R&D, invest more in engineering of the
core product. You know, databases are, I believe, large in scope
projects, and require a lot of work – you know, how long did it
take some of the relational products to get to full maturity? To
that point where the next release of the product feels kind of like
the last release. It took a while – I think it might have taken 15
years. So, there’s a lot of work to do just in that sense, to make
the product optimal.

What more do you think you need to do to
optimise MongoDB?

There’s so much – so many things. One is just maturing
the technology as it stands. And the second is just adding
additional capabilities or features. For maturing the technology,
it’s things like more granular concurrency, or we want to do a
revision of the storage engine to improve things like
fragmentation, improve aspects there, changes there to improve
performance. So, there’s a lot of core work on the kernel of the
product that we want to do.

In addition, there are features or new capabilities. It
can mean little things like new operators, or it could be big
things. An example from the past was when we added the
aggregation framework, or pipeline – aggregation pipeline is, I
think, what we call it now. It provides you a way to
declaratively run aggregations, or aggregate querying, or reporting
on MongoDB, much like you would do with the GROUP BY operator in
SQL. So that was a whole new subsystem, if you will.
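
To make the GROUP BY comparison concrete, here is a minimal sketch in Python. The pipeline is written in MongoDB's stage syntax (`$match`, `$group`, `$sum`), but the sample documents, the field names and the `run_pipeline` helper are assumptions invented for this illustration. The helper is a toy evaluator that handles only these two stages, not how MongoDB itself executes a pipeline.

```python
# Hypothetical sample documents; the field names are invented for illustration.
orders = [
    {"customer": "alice", "status": "shipped", "amount": 30},
    {"customer": "bob", "status": "shipped", "amount": 20},
    {"customer": "alice", "status": "pending", "amount": 50},
    {"customer": "alice", "status": "shipped", "amount": 10},
]

# A declarative pipeline in MongoDB's stage syntax. Rough SQL equivalent:
#   SELECT customer, SUM(amount) AS total
#   FROM orders WHERE status = 'shipped' GROUP BY customer
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
]


def run_pipeline(docs, stages):
    """Toy evaluator (an assumption, not MongoDB code).

    Supports equality-only $match and $group with $sum accumulators.
    """
    for stage in stages:
        (name, spec), = stage.items()  # each stage is a one-key dict
        if name == "$match":
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in spec.items())]
        elif name == "$group":
            key_field = spec["_id"][1:]  # "$customer" -> "customer"
            groups = {}
            for d in docs:
                out = groups.setdefault(d[key_field], {"_id": d[key_field]})
                for field, acc in spec.items():
                    if field == "_id":
                        continue
                    source = acc["$sum"][1:]  # "$amount" -> "amount"
                    out[field] = out.get(field, 0) + d[source]
            docs = list(groups.values())
        else:
            raise ValueError(f"unsupported stage: {name}")
    return docs


result = run_pipeline(orders, pipeline)
print(result)
```

Against a real deployment, the same `pipeline` list would be handed to the driver rather than to a helper like this one; with PyMongo, for instance, that is `collection.aggregate(pipeline)`.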

I think there’s a whole lot of work that we want to do
regarding integration with other products. There’s a Hadoop
connector for MongoDB today. I think we will continue to improve
that, and do more and more integrations with other products like
we’re doing today, with things like WebSphere and Informatica.

And then, a very important thing we want to do a lot
of work on, and are working on, is operational management of large
MongoDB clusters. And there’s a lot of facets to that, like one
thing we’ve released recently was the backup service. So it’s
MongoDB backed up to the cloud. It’s continuous backups with
point-in-time recovery. It’s a system with a lot of functionality, and
we’ll also be offering an on-premise version of that.

When you’re looking at MongoDB, one thing you
can’t escape is how hugely successful it’s been – and a part of
that is how astute it’s been at marketing and branding itself. What
do you say to people who criticise the company for winning over
people with advertising over capabilities?

I think we’re not marketing driven at all. I mean,
maybe we do a good job at it, but I think that the organisation is
pretty product driven, and driven by needs and requests of users
and the community and customer users. That would be the biggest
driver of what we work on.

On virtually all the metrics I look at, MongoDB is the
most popular of all the NoSQL databases. So it’s not surprising
then, that a lot of people come to our booths at big events. A big
reason that’s true is that developers like using the product.

When we started the project, we
really felt like there was a need on the database side. And there were a
couple of needs. One was for scale – the ability to scale
horizontally. The other part was, there was just a need for
something, we felt, that worked better with the way that we write
code today. So to me, it’s not all about scale – we also want to
create something that has a certain ‘elegance’ to it (we often use
that word internally). So we think a lot about how you interact
with the database. It’s not all about, “Can I scale to 2000-server
clusters?” Yes, you can, but that is, in our minds, necessary but
not sufficient. We also want it to be faster and easier to write
production applications and systems than it was before – in
addition to being able to scale horizontally.

So, because of that, we’ve really focused on that, and
I think developers like using the product when writing apps,
independent of the scale. We have a lot of users who have very
large MongoDB clusters – but we also have developers who have
written applications many times using MongoDB where they only need
one server, and they’ll never need more than that for that
particular application. One server is big enough on the scale
side. Why did they do that? Well, the reason that they did it is
because it was the best and easiest way to write the app, and the
fastest way to write the app. It wasn’t about scale in that
case.

So I think that’s what’s unique about the product. It
gives you this mix of two capabilities. One is the scale out
property, and the other is making development easy and productive.
And not in a prototyping sense, but in a production systems
sense.

You recently changed from 10Gen to MongoDB. Do
you think that’s changed the overall approach to the MongoDB
database by your company? Or do you think it’s served to strengthen
the community around it?

I think the change was, in hindsight, a bit of a
no-op. It was very important to us that the project has an identity
separate from the company. We definitely think of them as
separate entities.

For a long time, the company name and the product were
different for legacy reasons. It was sort of an accident. But we
kind of liked that – it was clear, this was the product, and this
was the company. But at some point it was starting to become
confusing. And the only thing we do is work on MongoDB related
products and services. It’s all we do. So we thought we should just
change the name and it would be less confusing.

Before MongoDB, your best-known work
was on DoubleClick. What would you say the big differences are
working at each organisation?

I enjoyed working on both a lot – and still work on
MongoDB, of course. I was CTO and co-founder of DoubleClick. I was
CTO for the first ten years there, designing and working on the
ad-serving systems with the team there. And I think a lot of the desire
to create MongoDB came from those experiences there.

We were working with massive scale, serving 30 billion
ads a day – it could never go down. We didn’t really have the tools
to deal with that scale, so we ended up writing them ourselves. So
after that, there was a feeling from myself and Eliot – our CTO and
co-founder – that it was time for some new things. And that was a lot
of the catalyst for the project. It’s really what I wish I had when
building DoubleClick. But I do really enjoy working on MongoDB,
basically because I’ve been a developer for my whole adult life. I
really like technology – it’s kind of my favourite thing to work
on.

Is it true you’re actually involved in coding
MongoDB still?

Yes, absolutely. The time I spend has varied over
time. As it’s grown, there’s a lot of demand for my time on the
business side, so sometimes it gets squeezed to a very small
percentage. There’s been times in the past when it was one third or
one half of my time – like in the early days, I’d spend half of my
time coding on MongoDB, and it’s dropped as we’ve gotten bigger.
I’d like to actually get it back up to be a little higher than it
is right now, which is some, but small amounts, because I do enjoy
doing that.

You must have so many teams of engineers that
can code at light speed for the database – is it more something you
do for love these days than expediency?

Well, I like doing it – but hopefully it’s also
useful! There’s also things like product definition, on the
technical side. What is it, and what should it do, and those kind
of things too which are not coding, but somewhat technical.

Going back to Google a little bit – Google
arguably pioneered the use of NoSQL databases, but recently, it’s
been returning to relational databases. What’s your opinion on
this?

I don’t have good visibility on what Google’s doing,
but I think that they use a lot of products on the data layer
internally. So they kind of invented MapReduce, and they did
BigTable, and they also use relational things, so, I think they
still use a lot of non-relational stuff internally, and they always
had some relational too.

I’m not sure that they’re returning to it – I think
that they’ve always had it in the mix – but you bring up a good
[point], which is, Google, and a lot of these internet companies –
Amazon, LinkedIn, Yahoo!, DoubleClick – had all written, basically
internally, a NoSQL database of some sort, to deal with the scale
that they needed to deal with. But these were not, at the time we
started MongoDB, publicly available open source
projects.

Part of what we wanted to do when we started MongoDB
was start something similar to what these folks were using
internally, that was generally available to everyone. And then at
the same time, we had some ideas that were new on how to make it
developer friendly.

You talked a little bit about how NoSQL has
evolved – what do you think the future has in store for it? Will we
see a reduction in diversity?

I think ‘one size fits all’ is over. It’s not going to
consolidate down to one thing – but there will be some
consolidation, or reduction there. Even before
NoSQL, we were already using more than one tool. In an enterprise,
for example, you would have your relational database management
system for OLTP and online things, and you would have data
warehouse technology for business intelligence, reporting and
analytics. So you already had two tools.

They were both relational, and they were typically
different code bases. I think what people are doing now
is that they are adding one more thing into the mix – a NoSQL
database. A given company would probably evaluate several and then
pick one that they want to standardise on internally, and they
would add that to their tool box in addition to those other
tools.

What we’re seeing, though, is that for
writing applications, they are often, for new projects, using the
NoSQL tool more than any of the others. What we’re seeing, and what
our goal is, is that MongoDB is, whether it’s a startup or a large
company, their default for building new applications. It won’t be
the optimal tool for every application, but we
think it could be, for the majority of these cases, the right
tool.

You’re going to default to something, right? We’ve
seen organisations doing this. So for example, the Guardian [UK
newspaper], when they write applications, they have a ‘Mongo first’
policy, which means that, by default, when they are writing a new
app, the backing data store is MongoDB.

You can use something else if you have a good reason,
but we’re the default. It’s not going to be optimal for 100% of use
cases, but, if we put 100 post-its on the wall, and we write a use
case on each one, the best choice for more of those post-its than
any other will be MongoDB.

Cassandra’s creators have said that, in the
future, they’re going to be encroaching more into your territory.
Do you perceive any of the other databases out there as a threat to
MongoDB?

We think all the NoSQL products are competitors. So, I
think there’s always been competition there, and that’s always been
the case – it’s not new. I think that most of the products are
doing very well, so the whole space is growing.

One good thing is that you’ve got a lot of products, a
lot of vendors, and they’re all growing. So from a betting on
technologies point of view, that’s a really good thing. It means
there’s a real space or sector there. When you have a technology and
there’s only one vendor in the whole space, you have to ask
yourself well, why is that? Why is there no diversity there at
all?

You’ve worked your magic with DoubleClick, and
now MongoDB. Could you ever imagine moving into yet another
field?

I think – I hope – to keep working on MongoDB for the
next ten years and beyond. That’s my plan for the moment. I
wouldn’t be surprised if I invest in some startups and things like
that, in addition, in parallel. But that’s really what I want
to focus on. It takes a long time for software technologies to
reach maturity.

Because of legacy technologies, it can take a decade
or two for something to replace them completely. Look at
COBOL: how long did it take for that to go away? A long time, right?
And in addition, how long does it take to make these products
completely mature? It doesn’t mean that they’re not useful today
though. We want to keep working on it, and just stay focused – and
it will take time.

Image by Eric Auchard
