Web and online social graphs have grown rapidly in size over the past decade. Processing such graphs, which consist of millions of vertices and hundreds of millions to billions of edges, is a huge technical challenge. Experience shows that the Map/Reduce paradigm is not a good fit for working with large graph structures. This talk gives an introduction to Apache Giraph, a system that implements the bulk-synchronous parallel (BSP) model in a way that is extremely well suited for implementing graph algorithms. Giraph is a very young project that is currently developed by people working at leading internet platforms such as Facebook, Twitter, and LinkedIn, and it can be run on a standard Hadoop infrastructure.
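
To give a rough feel for the BSP style of graph processing, here is a minimal single-process sketch in Python. It is not Giraph's API (Giraph itself is written in Java and runs distributed); the function name `bsp_max_value` and the toy graph are assumptions made purely for illustration. Each vertex repeatedly adopts the largest value it receives from its neighbors, and messages sent in one superstep only become visible in the next.

```python
from collections import defaultdict


def bsp_max_value(graph, values, max_supersteps=50):
    """Propagate the maximum vertex value along edges, superstep by superstep.

    graph:  dict mapping vertex id -> list of neighbor ids (hypothetical toy input)
    values: dict mapping vertex id -> initial numeric value
    """
    values = dict(values)  # work on a copy, leave the caller's dict untouched

    # Superstep 0: every vertex sends its own value to all of its neighbors.
    inbox = defaultdict(list)
    for vertex, neighbors in graph.items():
        for neighbor in neighbors:
            inbox[neighbor].append(values[vertex])

    for _ in range(max_supersteps):
        if not inbox:
            break  # no messages in flight, so every vertex is idle

        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # The vertex learned a larger value: update and notify neighbors.
                values[vertex] = best
                for neighbor in graph[vertex]:
                    outbox[neighbor].append(best)
            # Otherwise the vertex stays silent (it "votes to halt").

        # Barrier: messages produced in this superstep are delivered in the next.
        inbox = outbox

    return values


if __name__ == "__main__":
    toy_graph = {1: [2], 2: [1, 3], 3: [2], 4: []}
    toy_values = {1: 3, 2: 6, 3: 2, 4: 1}
    print(bsp_max_value(toy_graph, toy_values))
```

In this sketch the computation stops on its own once no vertex sends a message, which mirrors the idea that a BSP job ends when all vertices are inactive. A real system partitions the vertices across workers and performs the barrier over the network.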

Sebastian Schelter is a PhD student in the Database Systems and Information Management Group (DIMA) at TU Berlin. He is also a committer and PMC member of Apache Mahout and Apache Giraph.
