Choi, Hee Youl (2010-05). Manifold Integration: Data Integration on Multiple Manifolds. Doctoral Dissertation.

abstract

  • In data analysis, data points are usually analyzed based on their relations to other points (e.g., distance or inner product). Such relations can be analyzed on the manifold of the data set, and manifold learning is an approach to understanding them. Various manifold learning methods have been developed, and their effectiveness has been demonstrated in many real-world problems in pattern recognition and signal processing. However, most existing manifold learning algorithms consider only one manifold based on one dissimilarity matrix. In practice, multiple measurements may be available and could be used. In pattern recognition systems, data integration has been an important consideration for improving accuracy given multiple measurements, and some data integration algorithms have been proposed to address this issue. These integration algorithms mostly use statistical information from the data set, such as the uncertainty of each data source, but they do not use the structural information (i.e., the geometric relations between data points). Such a structure is naturally described by a manifold.
    Even though manifold learning and data integration have both been used successfully for data analysis, they have not been considered in a single integrated framework. When multiple measurements are generated from the same data set and mapped onto different manifolds, those measurements can be integrated using the structural information on these multiple manifolds. Furthermore, we can better understand the structure of the data set by combining multiple measurements in each manifold using data integration techniques.
    In this dissertation, I present a new concept, manifold integration: a data integration method that uses the structure of data expressed in multiple manifolds. To achieve manifold integration, I formulated the concept and derived three manifold integration algorithms. Experimental results showed the algorithms' effectiveness in classification and dimension reduction. Moreover, I showed that manifold integration has good theoretical and neuroscientific applications. I expect the manifold integration approach to serve as an effective framework for analyzing multimodal data sets on multiple manifolds. I also expect that my research on manifold integration will catalyze both manifold learning and data integration research.

publication date

  • May 2010