Perspective on the release of PostgreSQL 11
If software is eating the world, then PostgreSQL has the potential to eat a large chunk of the database management market. In this article, Dave Page, Chief Architect Tools and Vice President at EnterpriseDB summarises the theme for PostgreSQL 11 and explains why it is increasingly attractive to a wide range of customers.
Looking back at recent releases of the PostgreSQL database it is interesting to speculate whether there is an overall theme and if you can tie it to the changing way people are using information technology. In past releases, we have seen a greater focus on enterprise functionality, as more users in large organisations have turned to PostgreSQL as an alternative to their existing commercial relational databases. At other times, the big theme has been analytics, which ties in nicely with the old trend of Big Data and a desire of users to find a more cost-efficient, flexible database to support their data warehousing requirements.
In reality, the reason the Postgres community is so vibrant is because there are many independent contributors. They provide new code for each release because it is a personal area of interest for them or they are addressing a real need among users. That approach has helped greatly to enrich PostgreSQL to the point, where it is the only real alternative to the established players as a comprehensive data management system for both relational and unstructured requirements. Indeed, as my colleague Bruce Momjian has speculated, we are now at the point where PostgreSQL is driving innovation in the database market at a faster pace than the traditional commercial vendors. That is due in large part to the combined efforts of everyone in the community and that work continues in Postgres 11; and it is also perhaps why PostgreSQL maintains such a strong position in the DB Engines rankings.
When Bruce asks if “Postgres will live forever” it highlights the stark difference between open and closed source software life cycles, especially the motives behind both approaches. Clearly, commercial software is precisely that – it ultimately must turn a profit for its producer and the level of investment that company is prepared to make is predicated on its profitability. Indeed, in time commercial vendors will look to increase the returns they get from their investment by dialling down the resources they commit to the software. On the contrary, if open source software is to remain vibrant and sustainable it must be run by an active, diverse and independent community. This creates an inverse correlation of innovation between closed and open source software. As the former dials down innovation to extract more profit the latter dials up innovation to attract more users. Indeed, effective open source projects will never run out of willingness to innovate and that is precisely what I see in PostgreSQL 11.
PostgreSQL 11: Highlights
If I were trying to summarise the theme for PostgreSQL 11 then I would suggest the focus is around performance, which makes sense if you consider Cisco’s estimate that humanity will generate 850 zettabytes of data by 2021. That is a huge amount of data and if databases are going to avoid being overwhelmed then performance is crucial, just as much as scalability, reliability and security.
The three areas that stand out to me are partitioning, parallelism and just-in-time (JIT) compilation, where Postgres 11 has added significant features. Last year, the community introduced native partitioning and this year sees further functionality being added to increase its flexibility and performance. For example, partitioning by hash keys and the ability to have indexes on partition sub-tables gives users opportunities to improve performance. Foreign keys allow users to maintain referential integrity between partitioned tables to ensure there are no orphaned rows in hierarchical data structures. Partition elimination makes it possible to ignore partitions when doing a search at runtime, not just when the query is planned. This improves performance, because if a user has a database full of orders the database can examine the order headers to better understand which partitions containing order lines to exclude before beginning the search. In some senses this is just smarter query execution, but it all adds up to improved performance.
Parallelism has been important to PostgreSQL for some time and this release makes it easier to run processes in parallel with a query, such as joins or scanning tables for data, so that results for a query are delivered more quickly. BTree indexes can be added utilising parallelism, which means PostgreSQL can operate faster and use multiple CPU cores for improved efficiency. It will also be possible to push down LIMIT clauses to parallel workers so a worker only needs to send back 10 rows rather than 100 for example, thus reducing inter-process communication overhead.
New JIT compilation functionality addresses various types of SQL queries, particularly those with complex expressions. When you run a query it is not always about selecting data from a table – for instance, you could be doing a complex calculation which utilises data from multiple locations. JIT compilation compiles the expression to machine code at runtime to make the calculation faster.
We have also been adding enhancements to other areas, such as more fine-grained control to role permissions. It is now possible to set permissions for specific tasks like allowing individuals to read files from a file system or execute programmes. Previously, the only option was to create a super user with access to all of the system, so this update will be helpful for enterprises that need more sophisticated controls for their strategy around roles and authorisation. We are also adding stored procedure support, which is important as it enables users to control transactions within application logic, unlike functions. Functions occur within a transaction, whereas a procedure happens outside, so a procedure can be used to do a complex validation of a query before deciding whether to commit or roll it back; ultimately, this gives users a lot more flexibility for server-side programming.
If software is eating the world, then PostgreSQL has the potential to eat a large chunk of the database management market. As such, I agree with Bruce that PostgreSQL will live forever as long as we continue to maintain the pace of innovation that we see today; we are already going far beyond simply matching the established commercial database vendors to provide new innovations in scalability, security, analytics and performance. It is increasingly attractive to a wide range of customers, who need far greater flexibility and cost efficiency as they deploy database platforms fit for the digital era. Increasing the performance capabilities in PostgreSQL 11 only serves to underline the vibrancy of this database platform!