  1. Python has long been used as a language for crawling the web -- perhaps the most successful example being the early web crawlers built for the Google search engine. Recently, open source libraries for large-scale web crawling have improved dramatically. The web has also matured: many HTML pages now offer metadata that well-equipped spiders can extract, beyond basics such as the text content or document title. This talk will cover Parse.ly's use of the open source Scrapy project and its own work on standardizing metadata extraction techniques for news stories.

    This talk was presented at PyData NYC 2012: nyc2012.pydata.org/. If you are interested in this topic, be sure to check out PyData Silicon Valley in March of 2013: sv2013.pydata.org/

    # vimeo.com/53109189
  2. Web-based, data-intensive applications have historically been limited in the types of interactive visualizations they can present to end-users, relying either on server-side applications or plugins for rendering charts and plots. The addition of HTML5, Canvas, and SVG to most modern browsers, along with the large performance improvements in JavaScript interpreters, has made it possible to create highly interactive visualizations directly in the browser. In this talk, Chris will show how to create a fully interactive visualization system for exploring large data sets using Python and JavaScript. Chris will also introduce some of the common JavaScript libraries that streamline client-side development and help developers create well-architected user interfaces in the browser.

    This talk was presented at PyData NYC 2012: nyc2012.pydata.org/.

    # vimeo.com/53063185
  3. Within the past decade, the amount of DNA sequencing data generated by next-generation sequencing platforms has exploded. As a result, biology has become a field in need of algorithms and data structures that scale well enough to analyze this data efficiently. Jason Pell will present features of khmer, a software package developed in the GED Lab at Michigan State University, for efficiently filtering and analyzing data generated by next-generation sequencing platforms. More specifically, he will present the use of the Bloom filter data structure for assembly graph traversal and the Counting Bloom filter data structure for k-mer counting. The khmer software package is written primarily in C++ and wrapped in Python. It is released under the BSD license and is available at github.com/ged-lab/khmer.

    This talk was presented at PyData NYC 2012: nyc2012.pydata.org/.

    # vimeo.com/53108478
  4. Why use GPUs from Python? This workshop will provide a brief introduction to GPU programming with Python, including run-time code generation and the use of high-level tools like PyCUDA, PyOpenCL, and Loo.py.

    This talk was presented at PyData NYC 2012: nyc2012.pydata.org/.

    # vimeo.com/53052481
  5. Panelists: Andy Terrel, Thomas Wiecki, Andreas Klöckner, Brian Granger
    Moderator: Travis Oliphant

    # vimeo.com/53108179
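
The Counting Bloom filter approach to k-mer counting mentioned in talk 3 can be illustrated with a minimal sketch. This is not khmer's implementation or API: the class `CountingBloomFilter`, the helpers `kmers` and `count_kmers`, the table size, and the salted SHA-256 hashing are all assumptions chosen for illustration. The idea shown is that each k-mer increments a few counters picked by hash functions, and the reported count is the minimum of those counters, so counts can be overestimated on collisions but never underestimated.

```python
import hashlib


class CountingBloomFilter:
    """Hypothetical approximate counter (illustration only, not khmer's API)."""

    def __init__(self, num_counters=10007, num_hashes=4):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Derive num_hashes counter positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, item):
        # Increment every counter the item hashes to.
        for pos in self._positions(item):
            self.counters[pos] += 1

    def count(self, item):
        # The smallest counter is the tightest upper bound on the true count.
        return min(self.counters[pos] for pos in self._positions(item))


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def count_kmers(sequence, k, cbf=None):
    """Add all k-mers of a sequence to a (possibly shared) filter."""
    cbf = cbf if cbf is not None else CountingBloomFilter()
    for kmer in kmers(sequence, k):
        cbf.add(kmer)
    return cbf


if __name__ == "__main__":
    cbf = count_kmers("ACGTACGTAC", k=4)
    for kmer in ("ACGT", "TACG", "TTTT"):
        print(kmer, cbf.count(kmer))
```

The memory footprint is fixed by `num_counters` regardless of how many distinct k-mers are seen, which is the property that makes this family of structures attractive for very large sequencing data sets; the trade-off is a false-positive rate that grows as the table fills.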

PyData

Videos from PyData Conferences and related to PyData tools and topics
