1. This talk was presented at PyData NYC 2012: nyc2012.pydata.org/. If you are interested in this topic, be sure to check out PyData Silicon Valley in March of 2013: sv2013.pydata.org/

    # vimeo.com/53039281
  2. Wikipedia’s corpus makes it ideal for natural language processing (NLP) tasks. This talk will cover how to extract data from Wikipedia for your own use with Python, MongoDB, and Solr; it will also cover how to use this data for familiar NLP tasks such as named entity recognition and suggesting related articles.

    # vimeo.com/53091620
  3. Blaze is a next-generation NumPy sponsored by Continuum Analytics. It is designed as a foundational set of abstractions on which to build out-of-core and distributed algorithms. Blaze generalizes many of the ideas found in popular PyData projects such as NumPy, Pandas, and Theano into a single data structure. Together with a powerful array-oriented virtual machine and runtime, Blaze will be capable of performing efficient linear algebra and indexing operations on top of a wide variety of data backends.

    # vimeo.com/53031980
  4. Working with data at large scales requires parallel computing to access large amounts of RAM and CPU cycles. Users need a quick and easy way to leverage these resources without becoming experts in parallel computing. IPython has parallel computing support that addresses this need by providing a high-level parallel API that covers a wide range of use cases with excellent performance. This API enables Python functions, along with their arguments, to be scheduled and called on parallel computing resources using a number of different scheduling algorithms. Programs written using IPython Parallel scale across multicore CPUs, clusters, and supercomputers with no modification and can be run, shared, and monitored in a web browser using the IPython Notebook. In this talk I will cover the basics of this API and give examples of how it can be used to parallelize your own code.

    # vimeo.com/53056634
  5. Shapely is a Python library for performing geometric calculations. It is most commonly used to process and analyze geographic data, such as geo-tagged media or shapefiles. In this talk, we'll take publicly available geo-tagged data, visualize it, and perform spatial analysis to find trends.

    # vimeo.com/53041159

PyData

Videos from PyData Conferences and related to PyData tools and topics
