Scaling for Big Data: An introduction
Database tech can make or break an app, but grasping the fundamentals can be daunting. In this new series, Cory Isaacson breaks it all down in easily digestible chunks.
Welcome to the first article in my new column Scaling for
Big Data. This column will be an exciting project, covering a
variety of topics and techniques on scaling your database to meet
the ever-challenging requirements of the rapid growth in
transaction and data volumes.
You could say that we are currently experiencing
a data boom, with databases growing faster and larger than anyone
imagined even just a few years ago. Every application depends on
its data. If you are one of those very clever developers building
the next great social media app, working on the hottest new game
technology, or concentrating on core business functions like
e-commerce or traditional enterprise functionality, then database
technology is critical to everything you do.
I know from hard work and experience that
understanding database technology is often the key to success in an
application, or the cause of untold frustration, long hours and
outright failures. Doing it right can make you a huge success, and
missing the mark can spell disaster. It can seem a daunting task to
learn everything you need to know about database technology so that
you can make the right decisions, yet the truth is all database
management systems (DBMS) work on the same principles and share the
same concepts. Once you know the fundamentals, you can understand
and utilize any database technology in an effective manner,
delivering on the promise of your application.
By way of introduction, I have been in the
software industry for over 20 years, and have run many companies,
from start-ups to established businesses. My focus has been on
database technology, either in professional services firms or
managing product companies. I have had the good fortune of working
with and learning from some of the smartest technologists in the
world, from data architects to application developers in a wide
variety of fields. From the start I have always had a passion for
database technology – the most critical element of any successful
application, and often the one that presents the most technical
challenges.
In my career I’ve worked with just about every
database you can imagine, starting with Sybase, Microsoft SQL
Server, Oracle, and in recent years focusing on the open source
offerings which of course include MySQL and PostgreSQL. And now
with the global move to Big Data, I find it important to understand
newer database technology options, including products such as Hadoop,
MongoDB, Cassandra, and column databases like MonetDB and
Infobright… the list really goes on and on.
Why is this important to you as a developer?
Because at the very heart of your application is your database
tier. It can make or break your application, and I hope that this
column will help make your database a winner.
Today we are incredibly fortunate given the wide
range of strong choices for database technology. Now there are so
many options available, giving application developers the ability
to scale data like never before. This array of options presents an
incredible number of opportunities, but also many questions:
- How do you know which option is best for your application?
- Should you stick with traditional Relational Database
Management System (RDBMS) options, or will newer offerings in the NoSQL or
data analytics space provide a better fit?
- How and when should you use indexing engines?
- What about keeping your database reliable and operational
for a 24x7 application?
- Where does caching fit into the mix?
The truth is that no single database technology
can meet all requirements, and indeed I find that most applications
need to use more than one database technology. The reason is simple
enough – different aspects of your application have different
needs, and each DBMS engine is good at a particular type of job.
Thus I find myself in the incredibly fortunate position of having
an almost limitless number of topics available when covering
Scaling for Big Data.
My objectives for the column are simple: to
provide as much useful information as possible, information that
you can directly apply to your database requirements. More
specifically, I’ll be covering topics such as:
- Why databases slow down.
- Scaling with traditional relational DBMS engines.
- Database performance optimization techniques.
- Database design for maximum performance and scalability.
- Using non-relational DBMS engines.
- Big Data analytics.
- Indexing engines, why they are important and how to use them.
- Database caching opportunities and
- Keeping your database highly available.
- Database disaster recovery strategies.
The focus will be on practical articles that you
can use to conquer your database challenges. I would also like to
hear from you about which topics you would find most helpful. Further, if
you have a great idea for an entry in the column, I will review and
consider it. Just email me at firstname.lastname@example.org;
I’d really like to hear from you.
I hope you enjoy Scaling for Big Data, and that
you find it helpful to your application development efforts. With a
scalable database tier you can accomplish almost anything in the
application development world, and together we can make that a
reality.