Pig is a Hadoop sub-project that provides a higher-level programming language for describing parallel computation atop MapReduce. It implements common relational operators such as join and filter, and lets users incorporate custom code through user-defined functions and streaming. The result is much simpler and more compact code, increased user productivity, and reduced maintenance time compared to writing Java for MapReduce. At the same time, Pig stays faithful to the spirit of MapReduce, whereby a user program specifies a simple sequence of steps for the system to obey. The talk will introduce Pig and its programming model, contrast it with Hadoop's model, and provide motivation to use Pig as the preferred programming paradigm for most applications. The performance tradeoffs will also be discussed.
Mahalo to ThinkTech Hawaii (http://www.thinktechhawaii.com) and Panopto (http://www.panopto.com) for their video recording services.