Accurate and fast multiple-testing correction in eQTL studies
In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. A permutation test is widely used for this multiple-testing correction. Because the sample sizes of eQTL studies are growing, however, the permutation test has become a computational bottleneck. In this paper, we propose an efficient approach for correcting for multiple testing and assessing eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We show that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset.
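The contrast between the two correction strategies described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Monte Carlo sampling scheme, and the use of the maximum absolute correlation as a stand-in for the minimum p value are all assumptions made for illustration, and the small-sample correction mentioned in the abstract is omitted. The sketch assumes complete genotype data with no monomorphic variants, so that ranking variants by absolute correlation is equivalent to ranking them by p value.

```python
import numpy as np


def permutation_gene_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Gene-level p value by permuting expression (hypothetical helper).

    genotypes: (n_samples, n_variants) array of cis-variant dosages.
    expression: (n_samples,) array of expression levels for one gene.
    Cost grows with the number of samples, since every permutation
    recomputes all variant-expression correlations.
    """
    rng = np.random.default_rng(seed)
    # Standardize each variant (assumes no monomorphic variants).
    g = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)

    def max_abs_corr(y):
        y = (y - y.mean()) / y.std()
        # Largest |correlation| corresponds to the minimum p value.
        return np.abs(g.T @ y / len(y)).max()

    observed = max_abs_corr(expression)
    null = np.array(
        [max_abs_corr(rng.permutation(expression)) for _ in range(n_perm)]
    )
    # Add-one estimate so the p value is never exactly zero.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


def mvn_gene_pvalue(ld_corr, observed_max_abs_z, n_samples=10000, seed=0):
    """Gene-level p value from a multivariate normal null (hypothetical helper).

    ld_corr: (n_variants, n_variants) correlation (LD) matrix among variants.
    observed_max_abs_z: largest |z| statistic observed across the variants.
    Cost depends on the number of variants and null draws, not on the
    number of individuals in the study.
    """
    rng = np.random.default_rng(seed)
    m = ld_corr.shape[0]
    # Under the null, the z statistics are approximately N(0, LD matrix).
    z = rng.multivariate_normal(np.zeros(m), ld_corr, size=n_samples)
    null = np.abs(z).max(axis=1)
    return (1 + np.sum(null >= observed_max_abs_z)) / (1 + n_samples)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    geno = rng.integers(0, 3, size=(200, 20)).astype(float)
    expr = 0.5 * geno[:, 0] + rng.normal(size=200)
    ld = np.corrcoef(geno, rowvar=False)
    print(permutation_gene_pvalue(geno, expr, n_perm=500))
    print(mvn_gene_pvalue(ld, observed_max_abs_z=4.0, n_samples=5000))
```

In this sketch the permutation routine touches every individual on every permutation, whereas the multivariate normal routine only needs the LD matrix and the observed statistic, which is why its cost does not grow with sample size.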