Wild data

NoSQL vs. Postgres

Pierre Fricke

Who’s the sheriff in today’s data centre wild west? Postgres advocate Pierre Fricke looks at the risks that NoSQL will pose in years to come, while doing his best to deflate the Hadoop hype.

Today’s agile software development model and web-based applications could be compared to the Wild West given the data stores that spring up all over the enterprise full of unstructured and structured data. Once upon a time, database administrators were the sheriffs in the data centre, ensuring data protection guidelines and business rules remained intact.

Not so anymore. These professionals are losing popularity as developers create a new generation of data silos and overly complex systems of applications using disparate NoSQL solutions. A smart DBA dares not wave his sheriff’s gold star and challenge developers to a shoot-out by suggesting caution or trying to add processes that may slow down the application development timeline.


This is potentially damaging long-term. Gartner estimates that by 2017, 50 percent of data stored in NoSQL DBMSs will be damaging to the business due to a lack of applied information governance policies and programs. Database administrators have always been the ones responsible for maintaining data flow, stability and integrity within an organisation. The hype surrounding Hadoop and NoSQL continues to climb, yet there emerges a sense that IT departments have sacrificed important workflow data protection processes simply to prove to the business they can be responsive and agile after many years of being anything but.

Yet this new reality creates a variety of challenges and someone, most likely the DBA, will have to address them. The most important of these is the choice of database and how that affects the data environment. NoSQL-only solutions are not ACID compliant. While some solutions claim ACID compliance within a single document, using them to achieve the robust data integrity that enterprises require across their entire data store means developing complex applications to do the heavy lifting that a relational database already handles.

Storing vs. processing

NoSQL-only solutions also don’t process data; they only store it. Data has to be brought to the application for analysis. The application (and hence each individual application developer) is responsible for accessing data, implementing business rules and maintaining data consistency. This is complicated because each NoSQL database product uses a different representation for its data and a different data access and manipulation language. As a result, organisations may find themselves with multiple, incompatible NoSQL solutions.

NoSQL-only solutions also cannot support stored procedures, and NoSQL solutions do not represent any one single technology. As a result, it’s very difficult for organisations to reuse code or establish standards as well as find talented resources. Ultimately, enterprises using NoSQL-only solutions can end up battling a proliferation of data silos because each application requires its own data store and enterprise data becomes fragmented and loosely governed. That is what is behind Gartner’s dire prediction of future losses in the value of company data.

This is the time bomb that could have serious consequences. There have been a great many examples of what happens when regulators respond to significant IT systems failures. Luckily, Postgres offers a technical solution that goes a long way toward addressing the problem. For instance, you can combine unstructured data with relational tables, all the while maintaining ACID compliance and centralised business processing rules and logic.

A flexible solution

Postgres provides the flexible data models that developers need to build applications capable of evolving with changing business needs. The tools are the JSON and JSONB data types for storing and processing document data, and the hstore data type for key/value pairs. Postgres also allows developers to use JSON-oriented document syntax directly inside SQL statements, complete with a large supply of functions for manipulating JSON data and converting it back and forth with relational data. The ability to create unstructured data stores and combine components with relational tables on the fly is a powerful capability, one unheard of in a relational database.
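To make the idea concrete, here is a minimal sketch in Python of a JSONB document column living alongside ordinary relational tables. The table names, column names and helper functions are hypothetical, invented for illustration. The code assumes a DB-API connection to Postgres, such as one opened with psycopg, which uses `%s` placeholders. The JSONB operators `->>` (extract a field as text) and `@>` (containment) are standard Postgres.

```python
import json

# Hypothetical schema: a relational "customers" table plus an "orders" table
# whose "details" column holds a schemaless JSONB document.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS customers (
    id   serial PRIMARY KEY,
    name text NOT NULL
);
CREATE TABLE IF NOT EXISTS orders (
    id          serial PRIMARY KEY,
    customer_id integer NOT NULL REFERENCES customers (id),
    details     jsonb NOT NULL
);
"""

INSERT_ORDER_SQL = (
    "INSERT INTO orders (customer_id, details) VALUES (%s, %s::jsonb)"
)

# Join document data with relational data in one SQL statement.
# ->> extracts a JSON field as text; @> tests JSONB containment.
REPORT_SQL = """
SELECT c.name, o.details ->> 'sku' AS sku
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.details @> %s::jsonb
"""


def add_order(conn, customer_id, details):
    """Insert one order document; commit on success, roll back on error."""
    cur = conn.cursor()
    try:
        cur.execute(INSERT_ORDER_SQL, (customer_id, json.dumps(details)))
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()


def orders_matching(conn, pattern):
    """Return (customer name, sku) rows whose document contains `pattern`."""
    cur = conn.cursor()
    try:
        cur.execute(REPORT_SQL, (json.dumps(pattern),))
        return cur.fetchall()
    finally:
        cur.close()
```

With a live connection, `add_order(conn, 1, {"sku": "A-1", "qty": 2})` followed by `orders_matching(conn, {"sku": "A-1"})` would exercise both paths. The foreign key on `customer_id` is the point of the sketch: the document is free-form, yet it still sits under the same transactional and referential rules as the rest of the data.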

Postgres supports unstructured data stores but then enables developers to apply schema rules to selected data according to business needs. With Foreign Data Wrappers (FDWs), developers can integrate external structured and unstructured data within Postgres and use SQL queries to read from and write to foreign data sources. There are FDWs for MongoDB, Hadoop, CouchDB, MySQL, Redis, Neo4j and even Twitter, among others.
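As a rough illustration of how a foreign data source gets attached, the Python sketch below runs the usual four setup steps: install the wrapper, define the server, map a user and declare the foreign table. It uses `postgres_fdw`, the wrapper for remote Postgres servers that ships with Postgres, rather than any of the wrappers listed above, because their option names vary by project. The server name, host, credentials and table definition are all hypothetical, and `conn` is assumed to be a DB-API connection to Postgres with sufficient privileges.

```python
# Hypothetical names throughout: legacy_crm, crm.example.internal,
# report_reader and crm_contacts are placeholders for illustration only.
FDW_SETUP_STATEMENTS = [
    # 1. Install the wrapper extension.
    "CREATE EXTENSION IF NOT EXISTS postgres_fdw",
    # 2. Describe the remote server.
    """CREATE SERVER legacy_crm
       FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (host 'crm.example.internal', port '5432', dbname 'crm')""",
    # 3. Map the local user to remote credentials.
    """CREATE USER MAPPING FOR CURRENT_USER
       SERVER legacy_crm
       OPTIONS (user 'report_reader', password 'change-me')""",
    # 4. Declare a local table definition backed by the remote table.
    """CREATE FOREIGN TABLE crm_contacts (
           id    integer,
           email text
       )
       SERVER legacy_crm
       OPTIONS (schema_name 'public', table_name 'contacts')""",
]


def attach_foreign_table(conn):
    """Run the setup statements in one transaction; roll back on any error."""
    cur = conn.cursor()
    try:
        for statement in FDW_SETUP_STATEMENTS:
            cur.execute(statement)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Once the foreign table exists, `crm_contacts` can be queried and joined like any local table. Because DDL in Postgres is transactional, a failure part-way through leaves no half-configured server behind.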

Corralling this Wild West is not just a technical challenge. The innovations in Postgres go a long way toward addressing some of the challenges by providing a solution that eliminates the complexity and data silos that come with adding disparate solutions. This is also an issue of culture. Postgres can help prevent this latest stage in the evolution of the developer-DBA relationship from becoming the last shoot-out. Both teams must work together to ensure IT is seen as a key source of competitive advantage and supports business strategy, as businesses battle for greater agility as a competitive edge.

Author
Pierre Fricke
Pierre Fricke is Vice President of Product Marketing at EnterpriseDB Corporation and was previously a Red Hat JBoss executive.
