Presentation, Deep Learning for Natural Language Processing, by Stephen Pulman, University of Oxford and TheySay, at the March 6, 2014 Sentiment Analysis Symposium (sentimentsymposium.com) in New York.
Deep learning and unsupervised feature learning offer the potential to transform many domains such as vision, speech, and NLP. However, these methods have been fundamentally limited by available computing power, and have typically been applied to small problems. In this talk, I describe the key ideas that enabled scaling deep learning algorithms to train a very large model on a cluster of 16,000 CPU cores (2,000 machines). This network has 1.15 billion parameters, which is more than 100x larger than the next largest network reported in the literature.
Such a network, when applied at this scale, is able to learn abstract concepts in a much more general manner than previously demonstrated. Specifically, we find that by training on 10 million unlabeled images, the network produces features that are very selective for high-level concepts such as human faces and cats. Using these features, we also obtain significant leaps in recognition performance on several large-scale computer vision tasks.
Learn how to build and run a system that gets smarter over time by learning from data. If you have already taken a class in machine learning, this talk will show you how to translate those skills into a working system. If you don’t know any machine learning, you will get a gentle introduction and gain a better understanding of how the data-driven sausage is made.
This talk will teach you what goes into building and running a machine learning system, as seen from a Java developer’s perspective. This will prepare you for the next phase of Big Data, where data is used not just to inform decisions but to drive system behavior. Instead of going into the details of algorithms, we will discuss the overall system and project lifecycle (modeling, training, serving, re-training). Along the way you will get practical tips and advice about useful tools, some that you already know and some you probably haven’t heard of. Lastly, you will get some tips on how you can start playing around with this stuff even if nobody is paying you for it.
This talk mainly targets Java developers with an interest in machine learning or just an interest in learning something new. The talk should also be useful for entrepreneurs with ideas for data-driven apps or services, or for anyone working on such projects. To work with machine learning it is useful to know some statistics and probability theory, but for this talk we’ll keep the math to a minimum.
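To make the lifecycle named above (modeling, training, serving, re-training) a little more concrete, here is a minimal sketch in plain Java. It is not taken from the talk: the class `LifecycleSketch`, the nested `LinearModel`, the `train` method, and the toy data are all hypothetical, and a one-feature least-squares fit stands in for whatever model a real system would use.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of the model / train / serve / re-train loop. All names are hypothetical. */
public class LifecycleSketch {

    /** Modeling: a one-feature linear model, y = slope * x + intercept. */
    static final class LinearModel {
        final double slope;
        final double intercept;

        LinearModel(double slope, double intercept) {
            this.slope = slope;
            this.intercept = intercept;
        }

        /** Serving: answer a prediction request with the current parameters. */
        double predict(double x) {
            return slope * x + intercept;
        }
    }

    /** Training: ordinary least squares over every (x, y) pair collected so far. */
    static LinearModel train(List<double[]> data) {
        int n = data.size();
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (double[] point : data) {
            sumX += point[0];
            sumY += point[1];
            sumXY += point[0] * point[1];
            sumXX += point[0] * point[0];
        }
        double denom = n * sumXX - sumX * sumX;
        if (denom == 0) {
            throw new IllegalArgumentException("need at least two distinct x values");
        }
        double slope = (n * sumXY - sumX * sumY) / denom;
        double intercept = (sumY - slope * sumX) / n;
        return new LinearModel(slope, intercept);
    }

    public static void main(String[] args) {
        // Initial training on the data available at launch (made-up numbers).
        List<double[]> data = new ArrayList<>();
        data.add(new double[] {1.0, 2.0});
        data.add(new double[] {2.0, 4.0});
        LinearModel model = train(data);
        System.out.println("before re-training: " + model.predict(3.0));

        // Re-training: a new observation arrives, so the model is refit and swapped in.
        data.add(new double[] {3.0, 9.0});
        model = train(data);
        System.out.println("after re-training:  " + model.predict(3.0));
    }
}
```

The point of the sketch is the shape of the loop rather than the algorithm: the serving code only ever calls `predict`, so the model behind it can be refit and replaced as new data accumulates.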