Hadoop gets a new guardsman in Cloudera's Sentry
The enterprise wishlist for Hadoop is lengthy, but undoubtedly towards the top for any company thinking of giving the big data technology a whirl is security. With more and more reams of sensitive data being spewed out of the big data processing framework, keeping the infrastructure secure is a top priority for those in financial services, healthcare and government.
Hadoop vendor Cloudera has recognised the importance of filling this gap with the release of Sentry, an open source role-based authorisation framework that grants precise access levels to the right users and applications. The module integrates with the SQL query engines Apache Hive and Impala, Cloudera’s recent real-time effort, giving access control at the server, database, table and view levels. Though only these two components are officially supported at the moment, users could extend the pluggable architecture to secure projects like Pig, according to the Cloudera blog post accompanying the release.
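To make the idea of role-based access at the server, database, table and view levels more concrete, here is a minimal Python sketch. It is not Sentry's actual API or policy format: the `Privilege` and `Policy` classes, the group-to-role mapping, the action names and every server, database and table name are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Privilege:
    """A hypothetical privilege; leaving a level as None widens the scope."""
    server: str
    database: Optional[str] = None  # None: every database on the server
    table: Optional[str] = None     # None: every table or view in the database
    action: str = "select"          # "select", "insert" or "all" (assumed names)

    def covers(self, server, database, table, action):
        # A privilege covers a request when each level it names matches
        # and its action is either "all" or the requested action.
        if self.server != server:
            return False
        if self.database is not None and self.database != database:
            return False
        if self.table is not None and self.table != table:
            return False
        return self.action in ("all", action)


class Policy:
    """Maps groups to roles and roles to privileges (an assumed design)."""

    def __init__(self):
        self.role_privileges = {}  # role name -> set of Privilege
        self.group_roles = {}      # group name -> set of role names

    def grant(self, role, privilege):
        self.role_privileges.setdefault(role, set()).add(privilege)

    def assign(self, group, role):
        self.group_roles.setdefault(group, set()).add(role)

    def is_allowed(self, groups, server, database, table, action):
        # Allow the request if any role held by any of the user's groups
        # carries a privilege that covers it.
        for group in groups:
            for role in self.group_roles.get(group, ()):
                for privilege in self.role_privileges.get(role, ()):
                    if privilege.covers(server, database, table, action):
                        return True
        return False


policy = Policy()
# Analysts may only read a single view.
policy.grant("analyst", Privilege("server1", "sales", "orders_view", "select"))
# A delegated administrator gets every action on one database only.
policy.grant("sales_admin", Privilege("server1", "sales", action="all"))
policy.assign("analysts", "analyst")
policy.assign("sales_ops", "sales_admin")

print(policy.is_allowed(["analysts"], "server1", "sales", "orders_view", "select"))
print(policy.is_allowed(["analysts"], "server1", "sales", "orders", "select"))
```

In this sketch a privilege with no table named applies to the whole database, which is one simple way a single role could administer one dataset without touching any other.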
Sentry could open the door for companies that need granular control and previously couldn’t build data systems on Hadoop because it lacked the functionality that stringent security regulations demand. With Sentry meeting Role Based Access Control requirements, there should theoretically be a number of new use cases and customers lining up to use Hadoop. Multi-tenant administration is also possible, as Sentry allows permissions on different datasets to be delegated to different administrators. The platform relies on Hadoop’s Kerberos authentication to secure the data.
Cloudera CEO Tom Reilly says security is “a top priority” for large enterprises using Hadoop.
“With Sentry and future releases in our product roadmap, we are continuing to address the complete security picture around Hadoop, delivering on our vision to make the platform safe and compliant for enterprise use, in even the most highly regulated industries,” he added.
Although currently only shipping as a Cloudera Hadoop Distribution 4.3 add-on, Sentry is available under an Apache 2 license, meaning you can fork to your heart’s content. The company also intends to bring this crucial piece of security kit to the Apache Incubator in the near future, where the majority of Hadoop projects are housed.
Should the first-of-its-kind project move to the open source foundation, it could quite feasibly become an important piece of Hadoop’s security puzzle and help the technology overcome a severe enterprise obstacle along the way.
Image courtesy of Pot Noodle, Cloudera