Xing, Yue (2018-12). Identification of Copy Number Variants in the Nellore and Angus Founders of a Beef Cattle Mapping Population and Their Effects on Growth and Production Traits. Doctoral Dissertation.

abstract

  • Copy number variants (CNV) are insertions or deletions of 1 kb or larger whose number of copies varies relative to a reference genome and that can affect phenotypic expression. Methods for identifying and applying CNV are less well developed than those for single nucleotide polymorphisms (SNP). Because CNV can encompass genes or their regulatory regions and contribute to genetic variation in economically important traits in beef cattle, it is of interest to study their effects, for example on birth and weaning weights. This study identified and characterized bovine CNV in founders of a beef cattle mapping population, compared the performance of CNV identification methods, proposed ways to obtain CNV sets with fewer false discoveries, developed an approach that uses SNP in high linkage disequilibrium with CNV to test the association of CNV with economically important traits in genome-wide association studies (GWAS), and developed approaches to incorporate CNV into genomic selection for those traits. The performance of read-pair based methods relies heavily on the depth of coverage of the tested genome relative to the control genome, the choice of control animal, and the choice of window size. Using the consensus set of CNV regions (CNVR) from different control animals may lower the false discovery rate. Read-pair and split-read based methods were relatively more stable but could not identify large insertions. Split-read based methods also had difficulty identifying other kinds of large-scale structural variants. Because no single method was comprehensive enough, and any one alone may produce a high false discovery rate, it was better to focus on combined methods and the common set of CNVR. GWAS identified associations of CNVR with birth and weaning weights, and predictive modeling supported phenotype prediction from CNVR. Random forest and Bayesian sparse linear mixed models had the highest prediction accuracy. The additive SNP model had slight advantages over the dominance and recessive SNP models. Some novel genes that may affect birth and weaning weights were discovered. Further analysis will be required to determine whether the gene effects discovered are real and how they affect these traits.

publication date

  • December 2018