Journal of Glycomics & Lipidomics

Open Access

ISSN: 2153-0637

Abstract

Identification of clinically useful genomic and epigenomic variants

Xiong Momiao

Next-generation sequencing technologies will generate unprecedentedly massive (thousands or even tens of thousands of individuals) and high-dimensional (up to hundreds of millions of variants) genomic and epigenomic variation data. A fundamental question is how to efficiently extract genomic and epigenomic information of clinical significance. The traditional paradigm for identifying variants of clinical validity is to test the variants for association with disease. However, significantly associated genetic variants may or may not be useful for the diagnosis and prognosis of diseases. An alternative to association studies for finding genetic variants of predictive utility is to systematically search for variants that contain sufficient information for phenotype prediction. To achieve this, we introduce the concept of sufficient dimension reduction (SDR), which projects the original high-dimensional data onto a very low-dimensional space while preserving all information on the response phenotypes. We then formulate the discovery of clinically significant genetic and epigenetic variants as a sparse SDR problem and develop algorithms that can select significant genetic variants from up to tens of millions of predictors by dividing the whole-genome SDR into a number of sub-SDR problems defined for genomic regions. The sparse SDR is in turn formulated as a sparse optimal scoring problem. To speed up computation, we solve the sparse optimal scoring problem with the alternating direction method of multipliers (ADMM), which can easily be implemented in parallel. To illustrate its application, the proposed method is applied to the TCGA overall cancer dataset.
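As an illustration of the kind of computation involved, the sketch below applies ADMM to an L1-penalized least-squares (lasso) problem, which is the form the coefficient update takes in sparse optimal scoring once the class scores are held fixed. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `admm_lasso`, the penalty weight `lam`, the ADMM parameter `rho`, and the fixed iteration count are all hypothetical choices made for this example, and the score update and the per-region splitting described in the abstract are omitted.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_lasso(X, y, lam, rho=1.0, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * ||b||_1 with scaled-form ADMM.

    X   : (n, p) predictor matrix (e.g. genotypes in one genomic region)
    y   : (n,) response (e.g. a fixed vector of class scores)
    lam : L1 penalty weight (assumed value, chosen by the user)
    rho : ADMM penalty parameter (assumed value)
    """
    p = X.shape[1]
    z = np.zeros(p)  # sparse copy of the coefficients
    u = np.zeros(p)  # scaled dual variable
    # The linear system matrix is the same in every iteration, so form it once.
    A = X.T @ X + rho * np.eye(p)
    Xty = X.T @ y
    for _ in range(n_iter):
        b = np.linalg.solve(A, Xty + rho * (z - u))  # ridge-like b-update
        z = soft_threshold(b + u, lam / rho)          # sparsifying z-update
        u = u + b - z                                 # dual update
    return z


if __name__ == "__main__":
    # Toy data: with X equal to the identity, the lasso solution is the
    # soft-thresholded response, which gives an easy correctness check.
    X = np.eye(3)
    y = np.array([3.0, -0.5, 1.5])
    print(admm_lasso(X, y, lam=1.0))
```

Because each genomic region would lead to its own problem of this form, independent calls of this kind could be dispatched to separate workers, which is one way to read the abstract's remark that the method is easy to parallelize.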
