A statistical method for high-throughput screening of predicted orthologs
Orthologs are genes in different species that diverged from a common ancestral gene after speciation. Identifying them is critical for reliable prediction of gene function in newly sequenced genomes. Orthologous genes are usually identified by a high-throughput method called Reciprocal-Best BLAST-hit (RBH). Because RBH is prone to errors caused by incomplete sequencing or gene loss in a species, a bioinformatics tool called Ortholuge was developed to identify RBH-predicted orthologs with atypical genetic divergence. However, determining the cut-off for atypical divergence in Ortholuge is computationally intensive, so we propose a faster statistical procedure and examine its performance by simulation. We find that its performance depends on how well the assumed model fits the distribution of divergence measures in true orthologs.
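For readers unfamiliar with RBH, the idea can be illustrated with a minimal Python sketch: two genes are paired when each is the other's top-scoring hit in the opposite genome. This is only an illustration under stated assumptions, not the pipeline used in this work. The function names, gene identifiers, and scores below are hypothetical, and the scores stand in for whatever similarity score a BLAST search reports.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score.
    On a tie, the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Hypothetical toy scores: species A genes searched against species B, and vice versa.
a_vs_b = {
    ("a1", "b1"): 250.0,
    ("a1", "b2"): 90.0,
    ("a2", "b2"): 180.0,
    ("a3", "b2"): 120.0,
}
b_vs_a = {
    ("b1", "a1"): 245.0,
    ("b2", "a2"): 175.0,
    ("b2", "a3"): 110.0,
    ("b3", "a3"): 60.0,
}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy example, a3's best hit is b2, but b2's best hit is a2, so a3 is left unpaired. If the true partner of a gene is missing from one genome, through gene loss or incomplete sequencing, RBH can instead pair it with a more distant relative, which is the kind of error that screening by divergence is meant to catch.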
This type of interdisciplinary work is a hallmark of our program in Applied Statistics at Simon Fraser University. For more information, please contact Jeong Eun Min (firstname.lastname@example.org) or her supervisor Jinko Graham (email@example.com) or Brad McNeney (firstname.lastname@example.org), Department of Statistics and Actuarial Science,