A least squares formulation for canonical correlation analysis Conference Paper

abstract

  • Canonical Correlation Analysis (CCA) is a well-known technique for finding the correlations between two sets of multi-dimensional variables. It projects both sets of variables into a lower-dimensional space in which they are maximally correlated. CCA is commonly applied for supervised dimensionality reduction, in which one of the multi-dimensional variables is derived from the class label. It has been shown that CCA can be formulated as a least squares problem in the binary-class case. However, the relationship between CCA and least squares in the more general setting remains unclear. In this paper, we show that, under a mild condition which tends to hold for high-dimensional data, CCA in multi-label classification can be formulated as a least squares problem. Based on this equivalence relationship, we propose several CCA extensions, including sparse CCA using 1-norm regularization. Experiments on multi-label data sets confirm the established equivalence relationship. Results also demonstrate the effectiveness of the proposed CCA extensions. Copyright 2008 by the author(s)/owner(s).
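
For orientation, the two problems the abstract relates can be written in their standard textbook forms. This is only a sketch: the notation is assumed here and not taken from the paper. $X \in \mathbb{R}^{d \times n}$ and $Y \in \mathbb{R}^{k \times n}$ are taken to be centered data and label matrices over $n$ samples. $T$ is a placeholder target matrix derived from the labels, and $\lambda$ is a placeholder regularization weight. The specific target construction and the precise condition for equivalence are given in the paper and are not reproduced here.

```latex
% Classical CCA: find projection vectors w_x and w_y that maximize correlation
\rho \;=\; \max_{w_x,\, w_y}\;
  \frac{w_x^{\top} X Y^{\top} w_y}
       {\sqrt{\left(w_x^{\top} X X^{\top} w_x\right)\left(w_y^{\top} Y Y^{\top} w_y\right)}}

% Generic multivariate least squares with a label-derived target T (assumed form)
\min_{W}\; \left\lVert W^{\top} X - T \right\rVert_F^{2}

% Generic 1-norm regularized variant, which encourages sparse columns w_j of W (assumed form)
\min_{W}\; \left\lVert W^{\top} X - T \right\rVert_F^{2}
  \;+\; \lambda \sum_{j} \left\lVert w_j \right\rVert_1
```

Casting CCA as a regression problem of the second form is what makes regularizers such as the 1-norm penalty in the third form straightforward to attach.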

name of conference

  • Proceedings of the 25th international conference on Machine learning - ICML '08

published proceedings

  • Proceedings of the 25th international conference on Machine learning - ICML '08

author list (cited authors)

  • Sun, L., Ji, S., & Ye, J.

citation count

  • 63

complete list of authors

  • Sun, Liang; Ji, Shuiwang; Ye, Jieping

publication date

  • January 2008