Pf8: an open dataset of Plasmodium falciparum genome variation in 33,325 worldwide samples [version 1; peer review: awaiting peer review]
We describe the Pf8 data resource, the latest MalariaGEN release of curated genome variation data on over 33,000 Plasmodium falciparum samples from 99 partner studies and 122 locations over more than 50 years. This release provides open access to raw sequencing data and genotypes at over 12 million genomic positions. For the first time, it includes copy number variation (CNV) calls in the drug resistance-associated genes gch1 and crt. As in Pf7, CNV calls are provided for mdr1 and plasmepsin2/3, along with calls for deletions in hrp2 and hrp3, genes associated with rapid diagnostic test failures. This data resource additionally features derived datasets, interactive web applications for exploring patterns of drug resistance and variation in over 5,000 genes, an updated Python package providing methods for accessing and analysing the data, and open-access analysis notebooks that can be used as starting points for further analyses. In addition, informative example analyses show contrasting profiles of the decline of chloroquine resistance-associated mutations in Africa, and differences in copy number variation across 10 distinct sub-populations. To the best of our knowledge, Pf8 is the largest open dataset of genome variation in any eukaryotic species, making it an invaluable foundational resource for understanding evolution, including that of pathogens.
Item Type | Article |
---|---|
Elements ID | 241406 |
Official URL | https://doi.org/10.12688/wellcomeopenres.24031.1 |
Date Deposited | 04 Jul 2025 14:58 |