1. Spotify uses a range of large-scale machine learning methods to find interesting music recommendations. Collaborative filtering on large amounts of implicit data is behind features such as radio, related artists, and a number of soon-to-be-released features. These are powered by matrix factorization and other methods that have been scaled up to hundreds of billions of data points.

    # vimeo.com/57900625 Uploaded 1,973 Plays 1 Comment
  2. Bill Pottenger of Rutgers was nice enough to come all the way from PA to speak to us on the topic

    # vimeo.com/72623007 Uploaded 193 Plays 0 Comments
  3. Natural Language Search in Solr

    Presented by Tommaso Teofili, Sourcesense

    This presentation shows how to build a search engine that understands queries written in a form much closer to spoken language than to keyword-based search, using Apache Lucene/Solr and Apache UIMA. A system that can recognize semantics in natural language can be very handy for non-expert users, e-learning systems, customer care systems, etc. With such a system it's possible to submit queries such as "hotels near Rome" or "people working at Google" without having to manually transform a user-entered natural language query into a Lucene/Solr query.
    The Solr - UIMA integration (available since Solr 3.1.0) can help build such intelligent systems by applying NLP / text mining algorithms to documents being indexed and to queries written by the user.
    This module gives Solr the ability to call UIMA pipelines when documents are indexed, triggering automatic extraction of metadata (e.g. named entities like people, places, organizations, etc.) using existing and custom algorithms as UIMA analysis engines. The talk will cover:
    - The Solr - UIMA integration
    - Introducing UIMA to Lucene's analysis phase
    - Running existing open source NLP algorithms in Lucene/Solr
    - Orchestrating blocks to build a sample system able to understand natural language queries
    We'll introduce these points using examples (architectures & code) and a sample demo system.

    # vimeo.com/32406329 Uploaded 1,804 Plays 0 Comments
  4. # vimeo.com/76211227 Uploaded 978 Plays 0 Comments
  5. Slides:

    Jeff Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus). His interests include database theory, database integration, data mining, and education using the information infrastructure.

    Some of the most profound ways in which the Web changes our lives would not have happened without a heavy dose of computer-science theory. PageRank, and how it makes Google work, is a well-known example, but there are many others. We shall explore briefly some of the interesting algorithms, such as PageRank variants, minhashing, and locality-sensitive hashing, that have given us surprising capabilities.

    # vimeo.com/2658924 Uploaded 1,785 Plays 0 Comments
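
The last talk description mentions minhashing without saying what it does. As a rough illustration only, and not taken from the talk or its slides, the sketch below estimates the Jaccard similarity of two texts by comparing minhash signatures. The function names, the word-shingle size, and the use of seeded BLAKE2b hashes in place of random permutations are all assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    count = max(len(words) - k + 1, 1)
    return {" ".join(words[i:i + k]) for i in range(count)}


def seeded_hash(seed, item):
    """Deterministic 64-bit hash of an item, varied by an integer seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash over the set.

    Assumes `items` is a non-empty set.
    """
    return [min(seeded_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets, for comparison."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog near the river bank")
    b = shingles("the quick brown fox jumps over the lazy cat near the river bank")
    print("exact:   ", exact_jaccard(a, b))
    print("estimate:", estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

The signatures have a fixed length regardless of document size, which is what makes them convenient as input to locality-sensitive hashing: similar documents can be bucketed by bands of their signatures instead of being compared pairwise.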


