New Importing Tool

Cloudera’s Distribution for Hadoop Version 3

Jessica Thornsby

A beta of Cloudera’s Distribution for Hadoop Version 3 (CDH3) is now available, with a new tool for importing SQL database information into Apache Hadoop.

CDH is based on Apache Hadoop, with additional patches backported from future releases and improvements implemented by Cloudera.

This new release adds a ZooKeeper service to the cloud scripts, the option to prohibit jars from unpacking, and support for EBS storage on EC2. Another big update for CDH3 is ‘Sqoop,’ a database import tool for Hadoop. Sqoop uses Java Database Connectivity (JDBC) to connect to a relational database, examines each table’s schema, and then automatically generates the classes needed to import that data into the Hadoop Distributed File System (HDFS).
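To make the schema-to-class step more concrete, here is a minimal Java sketch of the general idea. It is not Sqoop's actual code: the class name SchemaToClassSketch, its methods, and the type mapping are all hypothetical and chosen for illustration. The column list is hard-coded so the example runs without a database. In a real tool it would presumably be read over JDBC, for example through java.sql.DatabaseMetaData.

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not how Sqoop itself is implemented.
public class SchemaToClassSketch {

    // Map a java.sql.Types constant to a Java field type (small illustrative subset).
    static String javaTypeFor(int sqlType) {
        switch (sqlType) {
            case Types.INTEGER:   return "Integer";
            case Types.BIGINT:    return "Long";
            case Types.DOUBLE:    return "Double";
            case Types.BOOLEAN:   return "Boolean";
            case Types.DATE:      return "java.sql.Date";
            case Types.TIMESTAMP: return "java.sql.Timestamp";
            default:              return "String"; // fallback for anything unmapped
        }
    }

    // Turn a non-empty table name such as "employees" into a class name such as "Employees".
    static String classNameFor(String table) {
        return Character.toUpperCase(table.charAt(0)) + table.substring(1);
    }

    // Build Java source text for a record class with one public field per column.
    static String generate(String table, Map<String, Integer> columns) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(classNameFor(table)).append(" {\n");
        for (Map.Entry<String, Integer> col : columns.entrySet()) {
            src.append("    public ")
               .append(javaTypeFor(col.getValue()))
               .append(" ")
               .append(col.getKey())
               .append(";\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // Assumed example schema; a real importer would discover this from the database.
        Map<String, Integer> columns = new LinkedHashMap<>();
        columns.put("id", Types.INTEGER);
        columns.put("name", Types.VARCHAR);
        columns.put("hired", Types.DATE);
        System.out.print(generate("employees", columns));
    }
}
```

Running main prints the source of a small Employees class with one field per column. The sketch stops at generating source text and leaves out compiling the class and writing records to HDFS.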

There are also numerous bug fixes. Please see the CDH3 Release Notes for more information.