Feature Selection with Interactions in Logistic Regression Models using Multivariate Synergies for a GWAS Application
Conference Paper


  • © 2017 Copyright held by the owner/author(s). Identifying "synergistic" interactions with respect to the outcome of interest can improve the accuracy of phenotypic prediction and help explain the underlying mechanism of system behavior. Many statistical measures for estimating synergistic interactions have been proposed in the literature for this purpose. However, beyond empirical performance, there is still no theoretical analysis of the power and limitations of these measures. In this paper, we provide a rigorous theoretical analysis of how the information-theoretic multivariate synergy helps identify genetic risk factors via synergistic interactions. When genotype-phenotype relationships can be modeled with logistic regression, we show that the multivariate synergy depends on a small subset of the interaction parameters in the model, sometimes on only one interaction parameter. We further conduct experiments on both a simulated data set and a real-world Genome-Wide Association Study (GWAS) data set to demonstrate the effectiveness of the approach.
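
As a rough illustration of the kind of quantity the abstract refers to, the sketch below estimates a two-factor synergy from discrete samples. It assumes the commonly used bivariate form, synergy = I(X1, X2; Y) - I(X1; Y) - I(X2; Y); the record does not state the paper's exact multivariate definition, so this form, the plug-in estimator, and the function names `mutual_information` and `bivariate_synergy` are illustrative assumptions rather than the authors' method.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    mi = 0.0
    for (x, y), c in joint_counts.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (x_counts[x] * y_counts[y]))
    return mi


def bivariate_synergy(x1, x2, y):
    """Assumed form: I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    joint = list(zip(x1, x2))  # treat the pair (X1, X2) as a single variable
    return (mutual_information(joint, y)
            - mutual_information(x1, y)
            - mutual_information(x2, y))


# Toy data: Y is the XOR of two binary factors, a purely synergistic relationship.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y_xor = [a ^ b for a, b in zip(x1, x2)]
synergy_xor = bivariate_synergy(x1, x2, y_xor)

# Toy data: both factors are copies of Y, a purely redundant relationship.
z = [0, 0, 1, 1]
synergy_redundant = bivariate_synergy(z, z, z)
```

In the XOR case neither factor alone carries information about the outcome, so the synergy is positive; in the redundant case the two factors duplicate each other and the value is negative. Real genotype data would need a larger sample and some care with estimator bias, which this sketch does not address.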

author list (cited authors)

  • Xu, E. L., Qian, X., Yu, Q., Zhang, H., & Cui, S.

citation count

  • 1

publication date

  • August 2017


publisher

  • ACM