Data mining is the study of efficiently finding structures and patterns in data sets. This class focuses on modern problems with a statistical and probabilistic basis, and on techniques that scale to extremely large data sets. It will stress the trade-offs between detailed modeling and scalable computation. Students will learn the fundamentals behind the science and will work on projects with large real data sets.
Video produced by the SCI Institute and Lexie Floor.