What’s new in Apache Hive 1.0.0?
Apache Hive has taken a major step with its newest release, now proudly numbered 1.0.0. After nine years of work, Hive can embrace the new naming structure and all that comes with it.
Apache Hive, the data warehouse software for Hadoop, has reached its first major release. Contrary to what one might expect of a major release, the new features in Apache Hive 1.0.0 are modest in scope.
The update to the code base would otherwise have been released as version 0.14.1. However, after nine years of development, the view of the broader community was that the time had come to make subsequent releases part of the 1.x naming structure.
Small in scope
The main changes in Apache Hive 1.0.0 are the first steps toward defining a public API and the removal of HiveServer1. The API documentation has only just begun and will continue in HIVE-9363. Removing HiveServer1 is an important step, putting Hive on the path to greater enterprise adoption and use.
Donated some time ago by Cloudera, HiveServer2 enables support for JDBC and ODBC, with authorization for Apache Sentry currently in progress. CLI users who want to use HiveServer2 must migrate to Beeline. As Brock Noland states, Cloudera and Intel have continued to invest heavily in Beeline and HiveServer2 to make the transition easier:
"One feature I am particularly excited about is retrieval of query logs via the JDBC API and Beeline query status, which is implemented using that API. This will make it easier for Hive developers to use Beeline to develop their future Hive jobs."
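For readers moving from the old CLI to Beeline, the following is a minimal sketch of how a HiveServer2 JDBC URL and a matching Beeline invocation might be assembled. The helper names, the host `localhost`, the user `hive_user` and the sample query are illustrative assumptions, not part of the release. Port 10000 and the `default` database are the usual HiveServer2 defaults, but a given cluster may differ.

```python
import shlex


def hive2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL (hypothetical helper).

    Port 10000 and the "default" database are common defaults,
    assumed here for illustration.
    """
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, user, query):
    """Return the argument list for a one-off Beeline query (hypothetical helper).

    -u sets the JDBC URL, -n the user name, -e the query to execute.
    """
    return ["beeline", "-u", url, "-n", user, "-e", query]


if __name__ == "__main__":
    # Host, user and query are placeholders for this sketch.
    url = hive2_jdbc_url("localhost")
    argv = beeline_command(url, "hive_user", "SHOW TABLES;")
    # Print a shell-safe command line that could be pasted into a terminal.
    print(shlex.join(argv))
```

The sketch only prints the command. Actually running it requires a Beeline installation and a reachable HiveServer2 instance.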
With the new naming convention in place, the next major release of Hive will appear as version 1.1.0. The release process has apparently already begun; main features slated for the next version include the long-awaited Hive-on-Apache Spark as a third execution back-end (next to MapReduce and Tez).
The official release announcement from Cloudera can be read in full here.