Collaborative Research: Statistical Inference for Functional and High Dimensional Data with New Dependence Metrics
Due to the rapid development of information technologies and their applications in many scientific fields such as climate science, medical imaging, and finance, statistical analysis of high-dimensional data and infinite-dimensional functional data has become increasingly important. A key challenge in analyzing such big data is how to measure and infer complex dependence structure, a fundamental step in statistics that becomes more difficult owing to the data's high dimensionality and huge size. The main goal of this research project is to develop new dependence measures for quantifying dependence in large-scale data sets, such as temporally dependent functional data and high-dimensional data, and to use these new measures to develop novel statistical tools for sparse principal component analysis, dimension reduction, and simultaneous hypothesis testing. Because the new dependence metrics can capture nonlinear and non-monotonic dependence, the methodologies under development are expected to lead to more accurate prediction and inference, as well as more effective dimension reduction, in the analysis of functional and high-dimensional data.
The research consists of three projects addressing different challenges in the analysis of functional and high-dimensional data. In Project 1, the investigators introduce a new operator-valued quantity to characterize the conditional mean (in)dependence of one function-valued random element given another, and apply the newly developed dependence metrics to perform dimension reduction for functional time series under a new framework of finite dimensional functional data. In Project 2, the investigators explore a new dimension reduction framework for regression models with high-dimensional response, which requires less stringent linear model assumptions and is more flexible in capturing possible nonlinear dependence between the response and the covariates.
In Project 3, the investigators develop new tests for the mutual independence of high-dimensional data via distance covariance and rank distance covariance, using both sum-of-squares and maximum-type test statistics. Overall, the three lines of research all relate to big data and touch upon various aspects of modern statistics; the project aims to advance the current frontiers in areas including sparse principal component analysis, inference for dependent functional data, and high-dimensional multivariate analysis.
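As a rough illustration of the kind of dependence metric Project 3 builds on, the sketch below computes the standard squared sample distance covariance (the V-statistic form based on double-centered distance matrices) and then aggregates it over all pairs of coordinates in a sum-type and a maximum-type fashion. The function names (`dcov_sq`, `pairwise_dcov_stats`) and the plain sum/max aggregation are assumptions made for illustration only; they are not the investigators' test statistics, and no calibration or null distribution is attempted here.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered Euclidean distance matrix of a sample (rows = observations)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcov_sq(x, y):
    """Squared sample distance covariance (V-statistic form) between two samples."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    return float((a * b).mean())


def pairwise_dcov_stats(data):
    """Hypothetical sum-type and max-type aggregates over all coordinate pairs.

    `data` has shape (n, p). Returns (sum, max) of the squared sample distance
    covariances between columns j < k. Illustrative only, uncalibrated.
    """
    data = np.asarray(data, dtype=float)
    p = data.shape[1]
    centered = [_centered_distances(data[:, j]) for j in range(p)]
    values = [
        float((centered[j] * centered[k]).mean())
        for j in range(p)
        for k in range(j + 1, p)
    ]
    return sum(values), max(values)


# Non-monotonic dependence: y = x**2 on a symmetric grid is uncorrelated with x,
# yet the distance covariance is strictly positive.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
pearson = float(np.corrcoef(x, y)[0, 1])
dependence = dcov_sq(x, y)
```

In this example the Pearson correlation between `x` and `y` is zero up to rounding error, while `dependence` is positive, which is the sense in which such metrics capture nonlinear and non-monotonic dependence.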