
Cloudera Hadoop Training: MapReduce and HDFS
2 years ago
These tools provide the core functionality to allow you to store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.
This video is part of a larger series of free online training from Cloudera. Check out our website for more videos, a virtual machine pre-loaded with exercises and more. cloudera.com/hadoop-training-basic
*Note: an updated video can be found at cloudera.com/videos/introduction-to-apache-mapreduce-and-hdfs
Anyone listening at Cloudera?
Mmm accessibility.
>>Anyone listening at Cloudera?
Apparently not.
Thank you.
Nicely edited also.
Early on he mentions that the keys are sorted before being sent to the reducer; the reasoning why is explained around 21:00.
I also find it interesting that so much goes on to collect the results from the mappers and feed them to the reducers. That process is treated mostly as a black box here, but I can imagine it has some interesting performance characteristics.
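To make the "black box" between the mappers and the reducers a little more concrete, here is a toy, single-process sketch of the map, shuffle/sort, and reduce steps for a word count. It is not Hadoop code: the function names (`map_words`, `shuffle_and_sort`, `reduce_counts`, `word_count`) are made up for illustration, and a real cluster does the grouping across machines with partitioning, spilling to disk, and merging, none of which is modeled here.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle_and_sort(pairs):
    """Shuffle step: group values by key, then return the groups sorted by key.

    Sorting means each reducer sees its keys in order, with all values
    for a given key delivered together.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_words(line)]
    return [reduce_counts(key, values) for key, values in shuffle_and_sort(mapped)]


if __name__ == "__main__":
    for word, count in word_count(["the quick fox", "the lazy dog"]):
        print(word, count)
```

Even in this toy version, the shuffle is the only step that needs to see every mapper's output before it can finish, which hints at why it matters for performance.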
It would also be interesting to know what the memory requirements of the namenode are, given a desired chunk size, replication level, and dataset size.
I also would have liked to see something about how Hadoop tries to leverage data locality (i.e. run reducers on those systems that have many of the required chunks on their disks).
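A rough back-of-the-envelope sketch of that estimate follows. It rests on an assumption that is not from the video: the commonly quoted rule of thumb that each file and each block costs the namenode on the order of 150 bytes of heap. The function name `estimate_namenode_bytes` is hypothetical, and the model ignores directories and per-replica bookkeeping, so the replication level does not appear in it at all. Treat the result as an order-of-magnitude guess, not a sizing guide.

```python
import math

# Assumption: rule-of-thumb heap cost per namespace object (file or block).
BYTES_PER_OBJECT = 150


def estimate_namenode_bytes(dataset_bytes, block_bytes, num_files,
                            bytes_per_object=BYTES_PER_OBJECT):
    """Estimate namenode heap use from dataset size, chunk size, and file count.

    Every file needs at least one block, so the block count is never
    smaller than the file count.
    """
    blocks = max(num_files, math.ceil(dataset_bytes / block_bytes))
    return (num_files + blocks) * bytes_per_object


if __name__ == "__main__":
    one_tib = 1024 ** 4
    block = 64 * 1024 ** 2  # 64 MiB chunks
    print(estimate_namenode_bytes(one_tib, block, num_files=1000))
```

Under these assumptions the estimate is driven by the number of files and blocks rather than by raw dataset size, so the same data stored as many small files costs far more namenode memory than a few large ones.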