


PhD projects in Bayesian statistics for high-dimensional data:

1) The project will be in the area of large scale, high-dimensional Bayesian models for data integration and variable selection. The research will be highly motivated by applications in chronic disease epidemiology, integrating multiple high-dimensional data sets simultaneously, enabling researchers to model mediation effects of intermediate exposures and biological markers.

Possible directions include:

  • Bayesian models for high-dimensional longitudinal data, with variable selection and latent variables
  • Modelling mediation effects in high-dimensional epidemiological big data sets
  • Methods for exploring latent structure in Bayesian graphical models

2) This project will look at Bayesian methods for simultaneous clustering and variable selection. This is a challenging problem, to find latent structure in high-dimensional data whilst simultaneously discovering the variables which best predict this structure.

The project will aim to extend Bayesian clustering, mixture or factor analysis models to incorporate automatic feature selection.

Areas of application may be in epidemiology, health economics or bioinformatics.

For further details, please contact me.


My main research area is developing Bayesian methods in statistical genomics and epidemiology, in particular Bayesian hierarchical models and variable selection models. I have worked on Bayesian models for analysing high-throughput molecular biology data, including gene expression microarrays, next-generation RNA-sequence data and metabolomics data. My current research is on methods for data integration and variable selection for multiple "omics" data sets.

I also work on methods in the Classical statistical framework, and apply these methods in genetic epidemiology and medical applications. I am particularly interested in variable selection and multiple testing issues.

I have a background in Mathematics and a PhD in Cosmology, where I worked on detecting non-Gaussianity in the cosmic microwave background and on analysis methods for Type Ia supernovae light curves.

  • Statistical methodology: Highly structured stochastic systems; Bayesian hierarchical models; Variable selection and prediction; Bayesian model criticism; Methods for multiple testing.
  • Statistical genomics and genetic epidemiology: Variable selection in high-dimensional modelling of genomics, epigenomics, transcriptomics, proteomics and metabolomics data.
  • Molecular Biology: Statistical methods for modelling high-throughput molecular biology data, including microarray and sequencing data.

Some recent publications:

Lewin A et al. (2015), MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues. Bioinformatics (in press).

Janes J, Hu F, Lewin AM, Turro E. (2015). A comparative study of RNA-seq analysis strategies. Briefings in Bioinformatics, 2015. doi: 10.1093/bib/bbv007.

Van der Valk et al. (2015), A novel common variant in DCST2 is associated with length in early life and height in adulthood. Hum Mol Genet. 24(4):1155-68.

Chambers J et al. (2014), The South Asian Genome. PLoS ONE 9(8): e102645.

Kirk P, Witkover A, Bangham CR, Richardson S, Lewin AM, Stumpf MP. (2013), Balancing the robustness and predictive performance of biomarkers. J. Comp. Biol. December 2013, 20(12): 979-989.

Thillai M, Eberhardt C, Lewin AM, Potiphar L, Hingley-Wilson S, et al. (2012), Sarcoidosis and Tuberculosis Cytokine Profiles: Indistinguishable in Bronchoalveolar Lavage but Different in Blood. PLoS ONE 7(7): e38083.

Kirk P, Witkover A, Courtney A, Lewin A, Wait R, Stumpf M, Richardson S, Taylor G and Bangham C (2011), Plasma proteome analysis in HTLV-1-associated myelopathy/tropical spastic paraparesis. Retrovirology. 2011 Oct 12;8:81.

Turro E, Su S-Y, Goncalves A, Coin L J M, Richardson S and Lewin A (2011), Haplotype and isoform specific expression estimation using multi-mapping RNA-seq reads. Genome Biology Vol. 12, R13.

Newest selected publications

Lewin, A. (Accepted) 'Free serum haemoglobin is associated with brain atrophy in secondary progressive multiple sclerosis'. Wellcome Open Research.

Journal article

Lewin, A., Saadi, H., Peters, JE., Moreno-Moral, A., Lee, JC., Smith, KGC., et al. (2016) 'MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues'. Bioinformatics. ISSN: 1367-4803

Journal article

Felix, JF., Bradfield, JP., Monnereau, C., van der Valk, RJP., Stergiakouli, E., Chesi, A., et al. (2016) 'Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index'. Human Molecular Genetics, 25 (2). pp. 389 - 403. ISSN: 0964-6906

Journal article

van der Valk, RJ., Kreiner-Møller, E., Kooijman, MN., Guxens, M., Stergiakouli, E., Sääf, A., et al. (2015) 'A novel common variant in DCST2 is associated with length in early life and height in adulthood'. Human Molecular Genetics, 24 (4). pp. 1155 - 1168.

Journal article

Jänes, J., Hu, F., Lewin, A. and Turro, E. (2015) 'A comparative study of RNA-seq analysis strategies'. Briefings in Bioinformatics. ISSN: 1467-5463

Journal article