Python is quickly becoming the glue language that holds together data science and related fields like quantitative finance. Zipline is a new, BSD-licensed quantitative trading system that allows easy backtesting of investment algorithms on historical data. The system is fundamentally event-driven and closely approximates how live-trading systems operate. Moreover, Zipline comes "batteries included": many common statistics, such as moving averages and linear regression, can be readily accessed from within a user-written algorithm. Input of historical data and output of performance statistics are based on Pandas DataFrames to integrate nicely into the existing Python ecosystem. Furthermore, statistics, machine learning, and plotting libraries like matplotlib, scipy, statsmodels, and sklearn support development, analysis, and visualization of state-of-the-art trading systems.
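To make the event-driven idea concrete, here is a minimal sketch in plain Python. It does not use Zipline's own API; the class `MovingAverageAlgo`, the function `run_backtest`, and the price series are all hypothetical names and data invented for illustration. It assumes orders fill immediately at the price of the event that triggered them, which a real backtester would not take for granted.

```python
from collections import deque


class MovingAverageAlgo:
    """Hypothetical algorithm: hold one share while price is above its moving average."""

    def __init__(self, window):
        self.prices = deque(maxlen=window)  # rolling window of recent prices
        self.position = 0  # shares currently held (0 or 1)

    def handle_event(self, price):
        """Receive one price event; return the order size (+1 buy, -1 sell, 0 hold)."""
        self.prices.append(price)
        if len(self.prices) < self.prices.maxlen:
            return 0  # not enough history yet
        average = sum(self.prices) / len(self.prices)
        if price > average and self.position == 0:
            self.position = 1
            return 1
        if price < average and self.position == 1:
            self.position = 0
            return -1
        return 0


def run_backtest(prices, algo, cash=100.0):
    """Feed prices to the algorithm one event at a time and track portfolio value."""
    shares = 0
    history = []
    for price in prices:
        order = algo.handle_event(price)
        cash -= order * price  # buying spends cash, selling returns it
        shares += order
        history.append(cash + shares * price)
    return history


# Made-up price series, for illustration only.
prices = [10.0, 11.0, 12.0, 11.0, 9.0, 10.0, 12.0]
values = run_backtest(prices, MovingAverageAlgo(window=3))
print(values)
```

The algorithm only ever sees one event at a time, so it cannot peek at future prices. That is the property that makes an event-driven backtest resemble live trading.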
Zipline is currently used in production as the backtesting engine powering Quantopian.com -- a free, community-centered platform that allows development and real-time backtesting of trading algorithms in the web browser. Zipline will be released in time for PyData NYC'12.
The talk will be a hands-on IPython-notebook-style tutorial ranging from development of simple algorithms and their analysis to more advanced topics like portfolio and parameter optimization. While geared towards quantitative finance, the talk is a case study of how modern, general-purpose pydata tools support application-specific usage scenarios including statistical simulation, data analysis, optimization and visualization. We believe the talk to be of general interest to the diverse pydata community.
Video: vimeo.com/53064082
Have a data science problem in Python? Need to do some ML or NLP, but find the options daunting? In this whirlwind tour, we'll go over some common use-cases, and explain where to start. More importantly, you'll learn what to avoid, and what WON'T be a valuable use of your time.
Video: vimeo.com/53058140
Python's Natural Language Toolkit is one of the most widely used and actively developed natural language processing libraries in the open source community. This workshop will introduce the audience to NLTK -- what problems it aims to solve, how it differs from other natural language libraries in approach, and how it can be used for large-scale text analysis tasks. Concrete examples will be taken from Parse.ly's work on news article analysis, covering areas such as entity extraction, keyword collocations, and corpus-wide analysis.
Video: vimeo.com/53062324
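As a rough illustration of what keyword collocation means, the sketch below counts adjacent word pairs across a few headlines using only the standard library. It is not NLTK code and not Parse.ly's method; the headlines, the stopword list, and the function names are assumptions made up for this example. Raw pair frequency is the simplest possible score, and NLTK's collocation tools rank pairs with statistical association measures instead.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "and", "to", "on"}  # tiny illustrative list


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_bigrams(documents, n=3):
    """Count adjacent word pairs across all documents, skipping pairs with stopwords."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[(first, second)] += 1
    return counts.most_common(n)


# Hypothetical headlines, invented for illustration.
headlines = [
    "New York mayor visits the stock exchange",
    "Stock exchange opens higher in New York",
    "Mayor of New York praises the stock exchange",
]
print(top_bigrams(headlines, n=2))
```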
Are you interested in working with social data to map out communities and connections between friends, fans and followers? In this session I'll show ways in which we use the Python NetworkX library along with the open source Gephi visualization tool to make sense of social network data. We'll take a few examples from Twitter, look at how a hashtag spreads through the network, and then analyze the connections between users posting to the hashtag. We'll be constructing graphs, running stats on them and then visualizing the output.
Video: vimeo.com/53061411
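The "constructing graphs, running stats on them" step can be sketched with the standard library alone. This is a simplified stand-in, not NetworkX code: in NetworkX the same data would typically go into a directed graph object, which provides these statistics directly. The retweet pairs and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical (retweeter, original_author) pairs for one hashtag.
retweets = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "bob"),
    ("erin", "alice"),
    ("erin", "bob"),
]


def build_graph(edges):
    """Return adjacency sets for a directed graph: source -> set of targets."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
        graph[target]  # touch the key so nodes with no outgoing edges are present
    return graph


def in_degrees(graph):
    """Count incoming edges per node: how many distinct users retweeted each author."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def density(graph):
    """Fraction of all possible directed edges that are actually present."""
    n = len(graph)
    edge_count = sum(len(targets) for targets in graph.values())
    return edge_count / (n * (n - 1)) if n > 1 else 0.0


graph = build_graph(retweets)
degrees = in_degrees(graph)
print(sorted(degrees.items(), key=lambda item: -item[1]))
print(density(graph))
```

Users with a high in-degree are the ones whose posts were picked up by many others, which is one simple way to see where a hashtag spread from.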
The Message Passing Interface (MPI) has been called the assembly language of distributed parallel computing. It is the de facto message passing standard for effectively and portably utilizing the world's largest (and smallest) supercomputers. In this workshop, we will discuss how MPI can be used via several Python implementations, e.g., mpi4py and pupyMPI, as the messaging strategy between your parallel programs.
Video: vimeo.com/53060517
Videos from PyData Conferences and related to PyData tools and topics