EthSEQ: ethnicity annotations from whole exome sequencing data


Whole exome sequencing (WES) is widely used both in translational cancer genomics studies and in precision medicine. Stratifying individuals by ethnicity is fundamental for correctly interpreting the impact of personal genomic variation. We implemented EthSEQ to provide reliable and rapid ethnicity annotation from an individual’s whole exome sequencing data. EthSEQ can be integrated into any WES-based processing pipeline and exploits multi-core capabilities.

EthSEQ requires genotype data at SNP positions for a set of individuals with known ethnicity (the reference model) and either a list of BAM files or genotype data (in VCF format) for individuals with unknown ethnicity. EthSEQ annotates the ethnicity of each individual using an automated procedure and returns detailed information about each individual’s inferred ethnicity, including aggregated visual reports.


You can install EthSEQ v2 either from the GitHub repository using the devtools package or directly from CRAN.
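
A minimal sketch of the two installation routes in an R session. The CRAN call uses the package name as published; the GitHub call is left commented out because the repository owner is not stated here, so "<owner>/EthSEQ" is a placeholder to be replaced with the path shown on the EthSEQ GitHub page.

```r
# Option 1: install the released version from CRAN.
install.packages("EthSEQ")

# Option 2: install from GitHub with devtools.
# "<owner>/EthSEQ" is a placeholder, not the real repository path;
# replace <owner> with the account named on the EthSEQ GitHub page.
# install.packages("devtools")
# devtools::install_github("<owner>/EthSEQ")

# Load the package to confirm the installation succeeded.
library(EthSEQ)
```

Only one of the two routes is needed. The CRAN build is usually the simpler choice; the GitHub route is mainly useful when a newer development version is wanted.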

EthSEQ on GitHub


Alessandro Romanel, Tuo Zhang, Olivier Elemento, Francesca Demichelis. EthSEQ: ethnicity annotation from whole exome sequencing data. Bioinformatics 2017 btx165.