Keynote Title: Feature Selection for High-Dimensional Data Analysis
Keynote Lecturer: Dr. Michel Verleysen
Keynote Chair: Dr. Witold Pedrycz
Presented on: 24-10-2011
Abstract: Machine learning is used nowadays to build models for classification and regression tasks, among others. The learning principle consists in designing the models based on the information contained in the dataset, with as few a priori restrictions as possible on the class of models of interest.
While many paradigms exist and are widely used in the context of machine learning, most of them suffer from the "curse of dimensionality". The curse of dimensionality means that some strange phenomena appear when data are represented in a high-dimensional space. These phenomena are most often counter-intuitive: the conventional geometrical interpretation of data analysis in 2- or 3-dimensional spaces cannot be extended to much higher dimensions.
Among the problems related to the curse of dimensionality, feature redundancy and the concentration of the norm are probably those that have the largest impact on data analysis tools. Feature redundancy means that models lose the identifiability property (for example, they oscillate between equivalent solutions) and become difficult to interpret, among other problems; although redundancy is an advantage from the point of view of the information content of the data, it makes learning the model more difficult. The concentration of the norm is a more specific, unfortunate property of high-dimensional vectors: as the dimension of the space increases, norms and distances concentrate, making discrimination between data points more difficult.
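The concentration of the norm can be observed with a small simulation. The sketch below is only an illustration, not material from the talk: it assumes points drawn uniformly from the unit hypercube, and the function name `relative_contrast` and its parameters are hypothetical. It measures how far apart the largest and smallest Euclidean norms in a sample are, relative to the smallest one.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min of the Euclidean norms of
    n_points random points drawn uniformly from [0, 1]^dim."""
    rng = random.Random(seed)
    norms = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        norms.append(math.sqrt(sum(x * x for x in point)))
    return (max(norms) - min(norms)) / min(norms)


# The contrast shrinks as the dimension grows: all norms
# become nearly equal, so they discriminate poorly.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast decreases with the dimension, which is the effect that makes norm-based discrimination between data harder in high-dimensional spaces.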
Feature selection is a key challenge in machine learning that helps fight the curse of dimensionality. Feature selection allows us to reduce the number of features effectively used by the models, either beforehand (filter approaches) or during learning (wrapper and embedded approaches). This talk will present state-of-the-art approaches to feature selection, with a particular emphasis on information-theoretic filter approaches. It will also be shown that information-theoretic filter approaches are particularly suited to accommodate non-standard data (structured data, data with missing values, infinite-dimensional data, etc.), opening the way to new research and application areas in data analysis.
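As a rough illustration of what an information-theoretic filter can look like, the sketch below ranks discrete features by their estimated mutual information with the target before any model is trained. It is a minimal example under stated assumptions, not the method presented in the talk: it uses a simple plug-in (frequency-count) estimator, scores each feature on its own and therefore ignores redundancy between features, and the names `mutual_information` and `rank_features` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, target):
    """Score each feature column against the target, best first."""
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
target = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "irrelevant": [0, 1, 0, 1, 1, 0, 0, 1],   # independent of the target
    "noisy": [0, 1, 1, 1, 0, 1, 0, 0],        # agrees with the target on 6 of 8 samples
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],  # exact copy of the target
}

for name, score in rank_features(columns, target):
    print(name, round(score, 3))
```

A filter of this kind keeps the top-ranked features and passes only those to the model. Because the score depends only on the joint frequencies of feature and target values, the same idea can in principle be applied wherever such an estimate is available.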
Presented at the following Conference: IJCCI, International Joint Conference on Computational Intelligence
Conference Website: ijcci.org