1. At AppNexus, we've experienced explosive growth over the last three years. Our data pipeline, horizontally scaled on Hadoop and HBase, now processes more than 15 terabytes every day. That growth has required us to rapidly scale and iterate on the optimization tools we use for big data exploration and aggregation. Unlike more complicated programming languages, Python is versatile enough for us to use in both offline analytical tasks and production system development. Doing so bridges the gap between prototype and production, because both rely on the same code libraries and frameworks, and it tightens our innovation loop.

    We'd like to share our best practices and lessons learned from iterating and scaling with Python. We'll discuss rapid prototyping and the importance of tightly integrating research with production. We'll explore specific tools, including pandas, NumPy, and IPython, and show how they have enabled us to quickly mine data across disparate sources, explore new algorithms, and rapidly bring new processes into production.

    This talk was presented at PyData NYC 2012: nyc2012.pydata.org/. If you are interested in this topic, be sure to check out PyData Silicon Valley in March of 2013: sv2013.pydata.org/

    # vimeo.com/53053331 Uploaded 3,232 Plays 1 Comment
  2. Van is a lawyer at Haynes and Boone, where he spends most of his time helping clients with patent defense and open source questions. For a lawyer, though, he spends an inordinate amount of time working at a Python prompt, trying to automate all the tedious parts of his job and advancing his hobby of computational linguistics.

    The rest of the time, Van serves as chairman of the Python Software Foundation, where he speaks and writes on open source issues. His first book, on open source software and intellectual property law, was published by O'Reilly, and he is working on a second book about the economics of open source.

    # vimeo.com/53058803 Uploaded 552 Plays 0 Comments
  3. This talk was presented at PyData NYC 2012: nyc2012.pydata.org/. If you are interested in this topic, be sure to check out PyData Silicon Valley in March of 2013: sv2013.pydata.org/

    # vimeo.com/53043236 Uploaded 3,701 Plays 0 Comments
  4. This talk was presented at PyData NYC 2012: nyc2012.pydata.org/. If you are interested in this topic, be sure to check out PyData Silicon Valley in March of 2013: sv2013.pydata.org/

    # vimeo.com/53046117 Uploaded 2,213 Plays 0 Comments
  5. Since v0.8, the pandas library has greatly expanded its time series functionality. This tutorial gives an introduction to working with time series data in pandas. We'll cover how to create date ranges, convert between point (Timestamp) and interval (Period) representations, index and shift series by time, change frequencies, resample, filter, and work with time zones. Attendees should be familiar with the basics of Python, NumPy, and pandas.

    # vimeo.com/53065093 Uploaded 2,851 Plays 0 Comments
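
The operations listed in the time series tutorial description (item 5) can be sketched roughly as follows. This is a minimal illustration, not code from the talk: it assumes a recent pandas release rather than the 0.8-era API the tutorial covered, and the dates, values, and variable names are invented for the example.

```python
import pandas as pd

# Create a date range: 14 consecutive days (illustrative dates),
# and a series of made-up values indexed by it.
days = pd.date_range("2012-10-01", periods=14, freq="D")
ts = pd.Series(range(14), index=days)

# Convert between a point (Timestamp) and an interval (Period).
period = pd.Timestamp("2012-10-15").to_period("M")  # the month containing the timestamp
stamp = period.to_timestamp()                       # start of that month

# Index by date labels; label slices include both endpoints.
first_week = ts["2012-10-01":"2012-10-07"]

# Shift values forward by one step; the first value becomes NaN.
shifted = ts.shift(1)

# Change frequency by resampling daily data into weekly sums
# ("W" bins end on Sundays).
weekly = ts.resample("W").sum()

# Filter with a boolean mask: keep Monday through Friday only.
weekdays = ts[ts.index.dayofweek < 5]

# Attach a time zone to naive timestamps, then convert to another zone.
localized = ts.tz_localize("UTC")
eastern = localized.tz_convert("America/New_York")
```

Each step works on the same small series, so the results can be inspected one at a time in an interactive session.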

PyData

Videos from PyData Conferences and related to PyData tools and topics
