1. Cloudera Hadoop Training: MapReduce and HDFS
2 years ago
MapReduce and HDFS provide the core functionality that lets you store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.

This video is part of a larger series of free online training from Cloudera. Check out our website for more videos, a virtual machine pre-loaded with exercises and more. cloudera.com/hadoop-training-basic

*Note: an updated video can be found at cloudera.com/videos/introduction-to-apache-mapreduce-and-hdfs

  • Peter Goodall 2 years ago
    Do these videos really have to be 1/2 gigabyte? Can you repost a more highly compressed version? This really slams the monthly home bandwidth. Thanks
  • Kavan Patil 1 year ago
    Really agree. Even at the office you have to keep tabs on your IT guy's movements as you hijack the bandwidth for these videos... you never know when he will throw you out!

    Anyone listening at Cloudera?
  • Jamie Picken 1 year ago
    I have a daytime limit of 20 GB a month. wget can't log in (with sane effort), so I can't set a script to do it after 12.

    Mmm accessibility.

    >>Anyone listening at Cloudera?
    Apparently not.
  • Kavan Patil 2 years ago
    Thank you for a great starter on MapReduce and HDFS.
  • Peter McArthur staff 2 years ago
    HDFS starts @ 28:39
  • Kevin Ortman 1 year ago
    Fantastic training, succinct and on-topic!
    Thank you.
  • Dieter P. 11 months ago
    Well executed talk, clear and interesting.
    Nicely edited also.

    Early on he mentions that the keys are sorted before being sent to the reducer; the reasoning is explained at around 21:00.
    I also find it interesting that so much goes on to collect the results from the mappers and feed them to the reducers. That process is treated mostly as a black box here, but I can imagine it has some interesting performance characteristics.
    It would also be interesting to know the memory requirements of the NameNode for a given chunk size, replication level, and dataset size.
    I would also have liked to see something about how Hadoop tries to leverage data locality (i.e. run reducers on those systems that have many of the required chunks on their disks).
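
The sort-before-reduce step raised in the comment above can be illustrated with a small single-process sketch. This is not Hadoop's API and not code from the lecture: the function names (`map_words`, `reduce_counts`, `run_job`) are hypothetical, and the in-memory sort is only a stand-in for the distributed shuffle. It shows one reason sorting helps: once pairs are ordered by key, all values for a key are adjacent, so the reducer can be called once per key in a single pass.

```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """Map step (hypothetical): emit a (word, 1) pair for each word in a line."""
    for word in line.lower().split():
        yield (word, 1)


def reduce_counts(word, counts):
    """Reduce step (hypothetical): sum every count seen for one key."""
    return (word, sum(counts))


def run_job(lines):
    """Run map, sort, and reduce in one process as a toy word count."""
    # Map: each input record yields zero or more (key, value) pairs.
    pairs = [pair for line in lines for pair in map_words(line)]
    # Sort: order by key so that equal keys end up next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce: each run of equal keys is passed to the reducer exactly once.
    return [
        reduce_counts(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


if __name__ == "__main__":
    print(run_job(["the cat sat", "the dog sat"]))
    # prints [('cat', 1), ('dog', 1), ('sat', 2), ('the', 2)]
```

In a real cluster the pairs would also be partitioned across machines before sorting; this sketch leaves that out.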

About this video

MOV
00:50:26
  • 640x480, 592.74MB
  • Uploaded Wed March 11, 2009

Statistics

Date Plays Comments
Totals 60.1K 50 7
Feb 15th 13 0 0
Feb 14th 112 0 0
Feb 13th 132 0 0
Feb 12th 79 0 0
Feb 11th 90 0 0
Feb 10th 121 0 0
Feb 9th 126 0 0