DSpace Repository

Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations

dc.contributor.author Martin, Alicia R.
dc.contributor.author Atkinson, Elizabeth G.
dc.contributor.author Ashaba, Fred K.
dc.contributor.author Atwoli, Lukoye
dc.contributor.author Gichuru, Stella
dc.contributor.author Injera, Wilfred E.
dc.contributor.author Kariuki, Symon M.
dc.contributor.author James, Roxanne
dc.contributor.author Kigen, Gabriel
dc.contributor.author Dodge, Sheila
dc.contributor.author Stevenson, Anne
dc.date.accessioned 2020-10-14T08:39:23Z
dc.date.available 2020-10-14T08:39:23Z
dc.date.issued 2020
dc.identifier.uri https://doi.org/10.1101/2020.04.27.064832
dc.identifier.uri http://ir.mu.ac.ke:8080/jspui/handle/123456789/3562
dc.description.abstract Background: Genetic studies of biomedical phenotypes in underrepresented populations identify disproportionate numbers of novel associations. However, current genomics infrastructure, including most genotyping arrays and sequenced reference panels, best serves populations of European descent. A critical step for facilitating genetic studies in underrepresented populations is to ensure that genetic technologies accurately capture variation in all populations. Here, we quantify the accuracy of low-coverage sequencing in diverse African populations. Results: We sequenced the whole genomes of 91 individuals to high coverage (≥20X) from the Neuropsychiatric Genetics of African Populations-Psychosis (NeuroGAP-Psychosis) study, in which participants were recruited from Ethiopia, Kenya, South Africa, and Uganda. We empirically tested two data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole genome sequencing data. We show that low-coverage sequencing at a depth of ≥4X captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1X) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation, with 4X sequencing detecting 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Conclusion: These results indicate that low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, including those that capture variation most common in Europeans and Africans. Low-coverage sequencing effectively identifies novel variation (particularly in underrepresented populations), and presents opportunities to enhance variant discovery at a similar cost to traditional approaches. en_US
dc.language.iso en en_US
dc.publisher bioRxiv en_US
dc.subject Low-coverage sequencing en_US
dc.subject Whole genome sequencing en_US
dc.subject Study design en_US
dc.title Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations en_US
dc.type Article en_US


Files in this item

Files Size Format View

There are no files associated with this item.
