A Bayesian semiparametric approach for the differential analysis of sequence counts data

Guindani, M; Sepulveda, N; Paulino, CD; Mueller, P; (2014) A Bayesian semiparametric approach for the differential analysis of sequence counts data. Applied statistics, 63 (3). pp. 385-404. ISSN 0035-9254 DOI: https://doi.org/10.1111/rssc.12041

Full text not available from this repository.


Data obtained by using modern sequencing technologies are often summarized by recording the frequencies of observed sequences. Examples include the analysis of T-cell counts in immunological research and studies of gene expression based on counts of RNA fragments. In both cases the items being counted are sequences, of proteins and base pairs respectively. The resulting sequence abundance distribution is usually characterized by overdispersion. We propose a Bayesian semiparametric approach to implement inference for such data. Besides modelling the overdispersion, the approach also takes into account two related sources of bias that are usually associated with sequence counts data: some sequence types may not be recorded during the experiment, and the total count may differ from one experiment to another. We illustrate our methodology with two data sets: one regarding the analysis of CD4+ T-cell counts in healthy and diabetic mice, and another concerning the comparison of messenger RNA fragments recorded in a serial analysis of gene expression experiment with gastrointestinal tissue of healthy and cancer patients.
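
As a rough illustration of what overdispersion means for count data, the sketch below compares simulated Poisson counts with counts drawn from a gamma-Poisson mixture that has the same mean. It is not the semiparametric model of the paper: the mixture is a simple parametric stand-in, and the function names (`sample_poisson`, `dispersion_index`), the sample size and the parameter values are all assumptions chosen for the example.

```python
import math
import random
from statistics import mean, variance


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate by Knuth's multiplication method.

    Adequate for the small rates used here; not meant for large rates.
    """
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def dispersion_index(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson data)."""
    return variance(counts) / mean(counts)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000                # hypothetical number of sequence types

# Counts with a common Poisson rate of 5: variance equals the mean.
poisson_counts = [sample_poisson(5.0, rng) for _ in range(n)]

# Counts whose rate varies across sequence types: rate ~ Gamma(shape=1, scale=5).
# The mean is still 5, but the variance is 5 + 25 = 30 in theory.
mixed_counts = [sample_poisson(rng.gammavariate(1.0, 5.0), rng) for _ in range(n)]

print("Poisson dispersion index:", round(dispersion_index(poisson_counts), 2))
print("Mixture dispersion index:", round(dispersion_index(mixed_counts), 2))
```

A dispersion index well above 1 for the mixture, against roughly 1 for the plain Poisson counts, is the pattern the abstract refers to as overdispersion. The two sources of bias mentioned in the abstract (unrecorded sequence types and differing total counts) are not represented in this sketch.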

Item Type: Article
Faculty and Department: Faculty of Infectious and Tropical Diseases > Dept of Immunology and Infection
PubMed ID: 24833809
Web of Science ID: 332206500002
URI: http://researchonline.lshtm.ac.uk/id/eprint/1635782


