I will present work on sentiment analysis for social media data. I will frame the sentiment task and discuss a commonly used, simple (yet effective) approach -- proximity sentiment. I will discuss challenges for proximity sentiment and motivate the use of syntactic information, coreference and meronymy. I will present a fast and accurate dependency parser and its use for sentiment. We challenge the assumption that sentiment expressions can be matched through direct dictionary lookup and demonstrate that a statistical system for identifying sentiment expressions performs better. To guide our sentiment work empirically, we have built and publicly released a large sentiment corpus containing detailed annotations -- semantic types of mentions, coreference, meronymy, comparison, sentiment expressions, intensifiers, negators, etc. I will also discuss our clustering and multilingual efforts. Time permitting, I will comment on NLP projects at Microsoft.
Dr. Nicolas Nicolov is a Principal Researcher at Microsoft, Seattle, currently working on technologies for local search. He was the Senior Science Director at the Web Intelligence division at J.D. Power and Associates (formerly Chief Scientist at Boulder-based Umbria, Inc.). The team focused on marketing intelligence by mining social media data for real-time insights into companies, products, people, and issues. Before joining JDPA/Umbria, he was a research staff member at IBM's T.J. Watson Research Center. His research has focused on robust, efficient and scalable techniques for processing social media forums: sentiment/opinion analysis, dependency parsing, spam identification, demographic analysis (age, gender), ranking, topic identification through clustering, and multilingual analysis. He received an MSci from the University of Sofia. He then joined the Department of Artificial Intelligence at the University of Edinburgh, where he completed his PhD in the area of Natural Language Generation (memoization in sentence generation from conceptual graphs using lexicalized grammars). He was the developer of the PROTECTOR NLG system. He was a Research Fellow at the School of Cognitive Science, University of Sussex, working on grammar engineering and wide-coverage parsing for English. He joined IBM to work on dialog systems, statistical multilingual automatic content extraction (entity extraction, coreference, relation identification), quote analysis, and time analysis. The time expression system he built for TERN-2004 was the best at the NIST-run evaluation. He has worked for Apple on OS localization and was a visiting scholar at LIMSI-CNRS, French National Center for Scientific Research, and IMS - Institute for NLP, University of Stuttgart. He is a co-organizer of the biennial European conference on Recent Advances in Natural Language Processing. He is on the NLP advisory board of John Benjamins Publishers.
He will be the general co-chair of the Social Media conference (ICWSM-2011) in Barcelona.