Bioinformatic analysis of Mycobacterium tuberculosis whole genome data

Coll I Cerezo, F; (2015) Bioinformatic analysis of Mycobacterium tuberculosis whole genome data. PhD thesis, London School of Hygiene & Tropical Medicine. DOI:

Text - Accepted Version


Tuberculosis (TB) caused by bacteria of the Mycobacterium tuberculosis complex (MTBC) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information of clinical isolates of MTBC. The objectives of this work include developing bioinformatic tools for processing and making accessible MTBC genomic data, as well as the identification of informative genetic markers, both strain-specific and associated with drug resistance (DR), to barcode MTBC isolates in research and clinical settings. SpolPred software was developed to accurately predict the spoligotype from raw sequence reads, and used to bridge the gap between classical genotyping and high-throughput sequencing. A genome variation discovery pipeline was implemented to derive genomic polymorphisms from MTBC raw sequence data. This pipeline was applied to >1,500 publicly available isolates and the characterised genomic variation hosted in PolyTB, a web-based tool where genetic variants can be investigated using a genome browser, a world map showing their global allele distribution, and an additional phylogenetic view. An extensive repertoire of strain-specific mutations was identified, of which a subset was proposed to accurately discriminate known MTBC circulating strains. A curated list of DR associated mutations was compiled from the literature and their diagnostic accuracy for predicting phenotypic resistance assessed. In addition, potentially novel genes involved in DR were discovered by applying genome-wide association approaches to a global population of more than 2,500 MTBC strains. Whole genome sequencing (WGS) promises to be transformative for the practice of clinical microbiology, and the rapidly falling cost and turnaround time mean that this will become a viable technology in clinical settings. In this new paradigm, the presented work will facilitate the transition to and applications of WGS in clinical settings as an important tool for TB control.

Item Type: Thesis
Thesis Type: Doctoral
Thesis Name: PhD
Contributors: Clark, TG (Thesis advisor);
Faculty and Department: Faculty of Infectious and Tropical Diseases > Dept of Pathogen Molecular Biology
Funders: Bloomsbury Colleges PhD Studentships
Copyright Holders: Francesc Coll I Cerezo

