Exploring dependence between categorical variables: Benefits and limitations of using variable selection within Bayesian clustering in relation to searching for interactions

Speaker: Dr Michail Papathomas, University of St Andrews


Detecting interactions when analysing data sets created by large cohort or association studies is becoming increasingly important in Biostatistics. Investigating complex dependence structures within a linear modelling framework is not straightforward, because the space of competing models is unwieldily large and difficult to search. One way to reduce the dimensionality of the problem is to use a Bayesian modelling approach based on the Dirichlet process. We investigate the relation between the Dirichlet process and linear modelling, and discuss the utility of the Dirichlet process for exploring high-order interactions, especially when sparse data are analysed.