1. Cloudera Hadoop Training: MapReduce and HDFS


    from Cloudera / Added

    80.1K Plays / 11 Comments

    MapReduce and HDFS provide the core functionality that lets you store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system. This video is part of a larger series of free online training from Cloudera. Check out our website for more videos, a virtual machine pre-loaded with exercises, and more: http://www.cloudera.com/hadoop-training-basic Note: an updated video can be found at http://www.cloudera.com/videos/introduction-to-apache-mapreduce-and-hdfs

    • NOSQL - Introduction


      from martind / Added

      1,791 Plays / 0 Comments

      Todd Lipcon, Cloudera http://cloudera.com/ http://blog.oskarsson.nu/2009/06/nosql-debrief.html

      • A NOSQL Evening in Palo Alto


        from InfiniteGraph / Added

        2,494 Plays / 0 Comments

        On October 26, 2010, InfiniteGraph conducted a live event. "A NOSQL Evening in Palo Alto" was one of the larger gatherings of speakers and attendees, and is part of a series of conferences on "NOSQL" (or "Not Only SQL") technologies sponsored by InfiniteGraph and Scality. Tim Anglade, founder of A NOSQL Summer, hosted the evening discussion and Q&A with representatives from some of the most prominent and newest NOSQL vendors and projects. The round-table conversation looked back on the origins of the NOSQL movement and its growing pains, discussed the current state of the technology and the business, and looked ahead to the opportunities for expansion and adoption of these new classes of alternative technologies.

        • Digg Technical Talks - Jeff Hammerbacher


          from Digg Development / Added

          803 Plays / 0 Comments

          Jeff Hammerbacher talks about analytical data platforms. Jeff describes what these platforms are and why you should care. He updates us on what's happening in that space and illustrates how Cloudera is building such a platform around Hadoop and HDFS.

          • An Introduction to Impala – Low Latency Queries for Apache Hadoop


            from Chicago Hadoop User Group / Added

            630 Plays / 0 Comments

            Want to hear about Impala? Marcel Kornacker will start out with an overview of Impala from the user's perspective, followed by a presentation of Impala's architecture and implementation, and will conclude with a comparison of Impala with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure. This presentation was given on December 13th, 2012 at the offices of 1871 in the Merchandise Mart in Chicago, IL. To view this presentation on slideshare: http://www.slideshare.net/ChicagoHUG/an-introduction-to-impala-low-latency-queries-for-apache-hadoop

            • Jeff Hammerbacher on Evolving a New Analytical Platform - Orbitz IDEAS


              from Orbitz IDEAS / Added

              780 Plays / 0 Comments

              Note: this is a large video and takes a minute or so to load. Handling massive amounts of data, as is done at places like Facebook, Cloudera, and our very own Orbitz, is no simple task. Jeff has been focused on this for years. This talk dives into what works and what is left to be done. In the speaker's own words: “At Cloudera, we augment existing analytical platforms with some new tools for data management and analysis. In this talk, we'll share some experiences of what has worked across industries and workloads, and what new software components might help complete a new analytical platform.” Speaker: Jeff Hammerbacher was an Entrepreneur in Residence at Accel Partners immediately prior to Cloudera. Before Accel, he conceived, built, and led the Data team at Facebook. The Data team was responsible for driving many of the applications of statistics and machine learning at Facebook, as well as building out the infrastructure to support these tasks for massive data sets. The Data team produced open source projects such as Hive and Cassandra, and their work was recognized at conferences such as CHI, ICWSM, SIGMOD, and VLDB. Before joining Facebook, Jeff was a quantitative analyst on Wall Street. Jeff earned his Bachelor's Degree in Mathematics from Harvard University and recently served as a Managing Editor for O'Reilly's "Beautiful Data".

              • Data Driven DC: Apache Flume by Ted Malaska, Sr Solution Architect at Cloudera


                from Nation Conferences / Added

                383 Plays / 0 Comments

                Video sponsored by BigConf - Data Driven DC Conference, March 28th, 2014, Washington, DC. See http://bigconf.io/ for details. Presentation hosted by DCJUG/Data Driven DC, Cloud DC, Nova Hadoop. Description: As organizations move more applications to the cloud, there is an increased need for logging and monitoring of a heterogeneous software and infrastructure stack. In this meetup we want to explore some of the tools and technologies that can be used to perform "BigOps": collecting and processing large amounts of data using robust open source tools such as Apache Flume, which integrates with Hadoop/HDFS. Overview of Apache Flume: what Flume is (going through the parts); common Flume architectures, including the use of Hadoop/HDFS as a sink; performance tuning tips for Flume; architecting for different levels of guarantees; and working through different types of sinks and what they can offer. Speaker: Ted Malaska, Sr Solution Architect at Cloudera. Ted has worked on close to 60 clusters for 2-3 dozen clients, covering hundreds of use cases. He has 18 years of professional experience working for start-ups, the US government, a number of the world's largest banks, commercial firms, bio firms, retail firms, hardware appliance firms, and the US's largest non-profit financial regulator. He has architecture experience across topics such as Hadoop, Web 2.0, Mobile, SOA (ESB, BPM), and Big Data. Ted is a regular committer to Flume, Avro, Pig and YARN.

                • Experian Marketing Services leaps forward in operational efficiency with Cloudera


                  from Cloudera / Added

                  1,585 Plays / 0 Comments

                  Experian Marketing Services represents a global suite of products and platforms that help marketers connect to customers. In this video, several members of the Experian team share their perspectives on the Experian Marketing Services use case for Hadoop, their reasons for partnering with Cloudera, and the business impact of the gains in operational efficiency that Cloudera enabled.

                  • Hadoop Tutorial: Deploy the Spark server and calculate Pi in Hue


                    from The Hue Team / Added

                    2,820 Plays / 0 Comments

                    Hue ships with a Spark application that lets you submit Scala and Java Spark jobs directly from your Web browser. Hue relies on the open source Spark Job Server for communicating with Spark (e.g. for listing and submitting Spark jobs, retrieving the results, creating contexts...). Read more at http://gethue.com

                    • What is a data scientist?


                      from Juku / Added

                      255 Plays / 0 Comments

                      Josh Wills (director of data science at Cloudera) talks about his job and what it means to be a data scientist.
