Scikit-learn is a popular Python machine learning library. In this tutorial, I'll give an introduction to the core concepts of machine learning, using scikit-learn to demonstrate applications of these concepts on real-world datasets. We'll cover some of the most powerful and popular supervised and unsupervised learning techniques, including classification and regression models like Support Vector Machines and Random Forests, clustering models like K Means and Gaussian Mixtures, and dimensionality reduction models like PCA and manifold learning. Throughout, I'll emphasize the key features of the scikit-learn API, so that participants will be well-poised to begin exploring their own datasets using the wide array of algorithms implemented in scikit-learn.
Jake Vanderplas is an NSF Postdoctoral fellow working jointly in the Computer Science and Astronomy departments at the University of Washington. His research involves large-scale machine learning applications within astronomy and astrophysics. He is a maintainer of the Python packages Scikit-learn and Scipy, and regularly contributes to several of the other packages within the numpy/scipy ecosystem. He occasionally blogs about Python-related topics at Pythonic Perambulations - http://jakevdp.github.com.
What is PyData?
PyData.org is the home for all things related to the use of Python in data management and analysis. This site aims to make open source data science tools easily accessible by listing the links in one location. If you would like to submit a download link or any items to be listed in PyData News, please let us know at: email@example.com
PyData conferences are a gathering of users and developers of data analysis tools in Python. The goals are to provide Python enthusiasts a place to share ideas and learn from each other about how best to apply the language and tools to ever-evolving challenges in the vast realm of data management, processing, analytics, and visualization.
We aim to be an accessible, community-driven conference, with tutorials for novices, advanced topical workshops for practitioners, and opportunities for package developers and users to meet in person.
A major goal of PyData events and conferences is to provide a venue for users across all the various domains of data analysis to share their experiences and their techniques, as well as highlight the triumphs and potential pitfalls of using Python for certain kinds of problems.
PyData is organized by NumFOCUS with the generous help and support of our sponsors. Proceeds from PyData are donated to NumFOCUS and used for the continued development of the open-source tools used by data scientists If you would like to volunteer to be a part of the PyData team contact us at: firstname.lastname@example.org