Apache Drill - interactive analytics for large-scale datasets - Part II (Michael Hausenblas - MapR)
Apache Drill is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is designed to scale to thousands of servers and to process petabytes of data in seconds, enabling SQL-on-Hadoop and supporting a variety of data sources. Since its inception a year ago, Apache Drill has gained widespread interest in the community, attracting hundreds of people.
We will discuss how Apache Drill enables ad hoc interactive queries at scale, review the system architecture, and walk through a use case. We then focus on Apache Drill's unique support for a variety of back-ends (HDFS, HBase, MySQL, MongoDB, CouchDB, etc.) and its extensibility points, and close with a demo of the system.