Alex Lewin




PhD projects in Bayesian statistics for high-dimensional data:

The project will be in the area of large-scale, high-dimensional Bayesian models for data integration and variable selection. The research will be strongly motivated by applications in chronic disease epidemiology, where integrating multiple high-dimensional data sets simultaneously enables researchers to model mediation effects of intermediate exposures and biological markers.

Possible directions include:

Bayesian models for high-dimensional longitudinal data, with variable selection and latent variables

Modelling mediation effects in high-dimensional epidemiological big data sets

Methods for exploring latent structure in Bayesian graphical models

This project will look at Bayesian methods for simultaneous clustering and variable selection. The challenge is to find latent structure in high-dimensional data whilst simultaneously discovering the variables that best predict this structure.

The project will aim to extend Bayesian clustering, mixture or factor analysis models to incorporate automatic feature selection.

Areas of application may be in epidemiology, health economics or bioinformatics.

For further details, please contact me.


My main research area is developing Bayesian methods in statistical genomics and epidemiology, in particular Bayesian hierarchical models and variable selection models. I have worked on Bayesian models for analysing high-throughput molecular biology data, including gene expression microarrays, next-generation RNA-sequence data and metabolomics data. My current research is on methods for data integration and variable selection for multiple "omics" data sets.

I also work on methods in the classical statistical framework, and apply them in genetic epidemiology and medical settings. I am particularly interested in variable selection and multiple testing issues.

I have a background in Mathematics and a PhD in Cosmology, where I worked on detecting non-Gaussianity in the cosmic microwave background and on analysis methods for Type Ia supernovae light curves.

  • Statistical methodology: Highly structured stochastic systems; Bayesian hierarchical models;
    Variable selection and prediction; Bayesian model criticism; Methods for multiple testing.
  • Statistical genomics and genetic epidemiology: Variable selection in high-dimensional modelling of genomics, epigenomics, transcriptomics, proteomics and metabolomics data.
  • Molecular Biology: Statistical methods for modelling high-throughput molecular biology data, including microarray and sequencing data.

Some recent publications:

Lewin A et al. (2015), MT-HESS: an efficient Bayesian approach for simultaneous association detection in
OMICS datasets, with application to eQTL mapping in multiple tissues. Bioinformatics (in press).

Jänes J, Hu F, Lewin AM, Turro E. (2015), A comparative study of RNA-seq analysis strategies. Briefings in
Bioinformatics. doi: 10.1093/bib/bbv007.

Van der Valk et al. (2015), A novel common variant in DCST2 is associated with length in early life and height in
adulthood. Hum Mol Genet. 24(4):1155-68.

Chambers J et al. (2014), The South Asian Genome. PLoS ONE 9(8): e102645.

Kirk P, Witkover A, Bangham CR, Richardson S, Lewin AM, Stumpf MP. (2013), Balancing the robustness and
predictive performance of biomarkers. J. Comp. Biol. December 2013, 20(12): 979-989.

Thillai M, Eberhardt C, Lewin AM, Potiphar L, Hingley-Wilson S, et al. (2012), Sarcoidosis and Tuberculosis
Cytokine Profiles: Indistinguishable in Bronchoalveolar Lavage but Different in Blood.
PLoS ONE 7(7):

Kirk P, Witkover A, Courtney A, Lewin A, Wait R, Stumpf M, Richardson S, Taylor G and Bangham C (2011),
Plasma proteome analysis in HTLV-1-associated myelopathy/tropical spastic paraparesis. Retrovirology.
2011 Oct 12;8:81.

Turro E, Su S-Y, Goncalves A, Coin L J M, Richardson S and Lewin A (2011), Haplotype and isoform specific
expression estimation using multi-mapping RNA-seq reads. Genome Biology Vol. 12, R13.


Selected Publications

Journal articles

Lewin, A.   (Accepted)   'Free serum haemoglobin is associated with brain atrophy in secondary progressive multiple sclerosis'. Wellcome Open Research

Felix, JF. , Bradfield, JP. , Monnereau, C. , van der Valk, RJP. , et al.   (2016)   'Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index'. Human Molecular Genetics, 25 (2).  pp. 389 - 403. doi: 10.1093/hmg/ddv472

Lewin, A. , Saadi, H. , Peters, JE. , Moreno-Moral, A. , et al.   (2016)   'MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues'. Bioinformatics. doi: 10.1093/bioinformatics/btv568

Jänes, J. , Hu, F. , Lewin, A.  and Turro, E.   (2015)   'A comparative study of RNA-seq analysis strategies'. Briefings in Bioinformatics. doi: 10.1093/bib/bbv007

van der Valk, RJ. , Kreiner-Møller, E. , Kooijman, MN. , Guxens, M. , et al.   (2015)   'A novel common variant in DCST2 is associated with length in early life and height in adulthood'. Human Molecular Genetics, 24 (4).  pp. 1155 - 1168. doi: 10.1093/hmg/ddu510

Chambers, JC. , Abbott, J. , Zhang, W. , Turro, E. , et al.   (2014)   'The South Asian genome'. PLoS ONE, 9 (8).  doi: 10.1371/journal.pone.0102645

Kirk, P. , Witkover, A. , Bangham, CRM. , Richardson, S. , et al.   (2013)   'Balancing the robustness and predictive performance of biomarkers'. Journal of Computational Biology, 20 (12).  pp. 979 - 989. doi: 10.1089/cmb.2013.0018 

Thillai, M. , Eberhardt, C. , Lewin, AM. , Potiphar, L. , et al.   (2012)   'Sarcoidosis and tuberculosis cytokine profiles: Indistinguishable in bronchoalveolar lavage but different in blood'. PLoS ONE, 7 (7).  doi: 10.1371/journal.pone.0038083

Ikram, MA. , Fornage, M. , Smith, AV. , Seshadri, S. , et al.   (2012)   'Common variants at 6q22 and 17q21 are associated with intracranial volume'. Nature Genetics, 44 (5).  pp. 539 - 544. doi: 10.1038/ng.2245 

Taal, HR. , St Pourcain, B. , Thiering, E. , Das, S. , et al.   (2012)   'Common variants at 12q15 and 12q24 are associated with infant head circumference'. Nature Genetics, 44 (5).  pp. 532 - +. doi: 10.1038/ng.2238

Kirk, PDW. , Witkover, A. , Courtney, A. , Lewin, AM. , et al.   (2011)   'Plasma proteome analysis in HTLV-1-associated myelopathy/tropical spastic paraparesis'. Retrovirology, 8 (1).  pp. 81 - 81. doi: 10.1186/1742-4690-8-81 

Turro, E. , Su, SY. , Gonçalves, A. , Coin, LJM. , et al.   (2011)   'Haplotype and isoform specific expression estimation using multi-mapping RNA-seq reads'. Genome Biology, 12 (2).  doi: 10.1186/gb-2011-12-2-r13

Kulinskaya, E.  and Lewin, A.   (2009)   'Testing for linkage and Hardy-Weinberg disequilibrium'. Annals of Human Genetics, 73 (2).  pp. 253 - 262. doi: 10.1111/j.1469-1809.2008.00501.x 

Kulinskaya, E.  and Lewin, A.   (2009)   'On fuzzy familywise error rate and false discovery rate procedures for discrete distributions'. Biometrika, 96 (1).  pp. 201 - 211. doi: 10.1093/biomet/asn061 

Turro, E. , Lewin, A. , Rose, A. , Dallman, MJ.  and Richardson, S.   (2009)   'MMBGX: A method for estimating expression at the isoform level and detecting differential splicing using whole-transcript Affymetrix arrays'. Nucleic Acids Research, 38 (1).  pp. e4 - e4. doi: 10.1093/nar/gkp853

Lewin, A. , Bochkina, N.  and Richardson, S.   (2007)   'Fully Bayesian Mixture Model for Differential Gene Expression: Simulations and Model Checks'. Statistical Applications in Genetics and Molecular Biology, 6 (1).  doi: 10.2202/1544-6115.1314 

Lewin, A.  and Grieves, I.   (2006)   'Grouping Gene Ontology terms to improve the assessment of gene set enrichment in microarray data'. BMC Bioinformatics, 7.  doi: 10.1186/1471-2105-7-426

Lewin, A. , Richardson, S. , Marshall, C. , Glazier, A.  and Aitman, T.   (2006)   'Bayesian Modeling of Differential Gene Expression'. Biometrics, 62 (1).  pp. 10 - 18. doi: 10.1111/j.1541-0420.2005.00394.x 

Broet, P. , Lewin, A. , Richardson, S. , Dalmasso, C.  and Magdelenat, H.   (2004)   'A mixture model-based strategy for selecting sets of genes in multiclass response microarray experiments'. Bioinformatics, 20 (16).  pp. 2562 - 2571. doi: 10.1093/bioinformatics/bth285 

Jarup, L. , Briggs, D. , de Hoogh, C. , Morris, S. , et al.   (2002)   'Cancer risks in populations living near landfill sites in Great Britain'. Br J Cancer. doi: 10.1038/sj.bjc.6600311 

Lewin, A.  and Albrecht, A.   (2001)   'Can inflationary models of cosmic perturbations evade the secondary oscillation test?'. Physical Review D, 64 (2).  doi: 10.1103/PhysRevD.64.023514 

Lewin, A. , Albrecht, A.  and Magueijo, J.   (1999)   'A new statistic for picking out non-Gaussianity in the CMB'. Monthly Notices of the Royal Astronomical Society, 302 (1).  pp. 131 - 138. doi: 10.1046/j.1365-8711.1999.02104.x 

Page last updated: Tuesday 07 February 2017