1. Aaron Beppu - Profiling and performance-tuning your Hadoop pipelines

    22:47

    from newthinking / Added

    201 Plays / / 0 Comments

    In the Hadoop ecosystem, there are now several tools that let developers quickly produce pipelines of MapReduce jobs without descending to the verbose level of the Java MapReduce APIs. Unfortunately, these concise, higher-level tools often produce pipelines that are initially slow and difficult to optimize. This talk will describe Etsy's pipeline of hundreds of Cascading flows (and thousands of daily Hadoop jobs), and our approach to profiling and performance-tuning them. More info: http://berlinbuzzwords.de/sessions/profiling-and-performance-tuning-your-hadoop-pipelines

    • A Billion Records a Month Isn't Exactly a

      50:32

      from Wes Hunt / Added

      19 Plays / / 0 Comments

      Colt and Leif will present a high-level overview of a production big data system that uses Hadoop MapReduce, HDFS, and HBase. They will also describe how to use those technologies to create a horizontally scalable system that continually processes large volumes of data and makes it available for nearly instant access.

      • A billion records a month isn't exactly a lot of data these days

        01:18:40

        from Wes Hunt / Added

        30 Plays / / 0 Comments

        For our main February meetup, Colt and Leif presented a high-level overview of a production big data system that uses Hadoop MapReduce, HDFS, and HBase. They also described how to use those technologies to create a horizontally scalable system that continually processes large volumes of data and makes it available for nearly instant access. www.montanaprogrammers.org

        • Actian SQL on Hadoop

          03:01

          from Actian Corporation / Added

          4 Plays / / 0 Comments

          Transform Hadoop from a data lake into a high-performance, fully functional analytics platform with the Actian Analytics Platform -- Hadoop SQL Edition. The platform is the first end-to-end analytics platform to run 100% natively in Hadoop. Run sophisticated data science analytics natively in Hadoop up to 30 times faster. Give business users interactive SQL access to Hadoop data with the highest-performing SQL-in-Hadoop capability.

          • Alex Baranau - Real-time Analytics with HBase

            19:44

            from newthinking / Added

            244 Plays / / 0 Comments

            HBase can store massive amounts of data and allow random access to it - great. MapReduce jobs can be used to perform data analytics on a large scale - great. MapReduce jobs are batch jobs - not so great if you are after real-time analytics. Meet the append-only writes approach, which allows going real-time where it wasn't possible before. More info: http://berlinbuzzwords.de/sessions/real-time-analytics-hbase

            • Amazon Elastic MapReduce Workshop Talk at BigDataCamp

              21:28

              from Dave Nielsen / Added

              67 Plays / / 0 Comments

              Richard Cole gives his Amazon Elastic MapReduce Workshop Talk at BigDataCamp

              • Analyzing Social Media Data with Transactional Data

                05:52

                from Karmasphere / Added

                1,218 Plays / / 0 Comments

                A coffee shop chain use case demo.

                • An Introduction to Apache Hadoop MapReduce

                  01:41

                  from Mike Frampton / Added

                  43 Plays / / 0 Comments

                  An introduction to Apache Hadoop MapReduce: what is it, and how does it work? What is the MapReduce cycle, and how are jobs managed? Why should it be used, and who are the big users and providers?

                  • An Introduction to Apache Hadoop Yarn

                    02:03

                    from Mike Frampton / Added

                    56 Plays / / 0 Comments

                    An introduction to Apache Hadoop YARN: what is it, and why is it important? What does it improve in Apache Hadoop?

                    • An Introduction to Apache Pig

                      01:31

                      from Mike Frampton / Added

                      13 Plays / / 0 Comments

                      An introduction to Apache Pig: what is it used for? How does it work, and why use it instead of native MapReduce code?
