The strain collections for synthetic biology

Comparative genomics and the preferential use of system-compatible bioParts are central to our Microbial BioEngineering strategies. Many determinants of the efficiency and stability of genes and pathways within living systems remain undefined. It is frequently possible to engineer desirable metabolic functions in bacterial systems, but the results are often not particularly efficient or translatable to industrial and other applications. One well-known determinant of system compatibility is codon usage, but there are many others, including methylation and restriction systems, regulatory RNA-mediated and metabolite-mediated processes, and protein-protein interactions. We believe that using parts primarily derived from systems with highly similar core regulatory and metabolic properties is an engineering strategy that allows us to account as far as possible for our ‘known unknowns’ and ‘unknown unknowns’, while exploiting as much applicable prior knowledge as possible in design.

Key concepts in this strategy are:

  • That different strains of the same species of bacteria contain significant diversity in both gene content and allelic variation 
  • That significant evolutionary time leads to exploitable diversification among different strains of the same species 
  • That different strains have evolved to be optimized to different environments and behaviours 
  • That internal and external differences have led to distinct evolutionary trajectories, and to the emergence of different ‘fit solutions’ over time, even for bacteria adapted to similar environments 
  • That differences in the ‘core’ genes (those common to all strains) contribute to metabolic and other behaviours, as well as differences in gene complement 
  • That characterization of these systems in terms of behavioural determinants can provide the framework for design and optimization of both chassis and functionally specialized cells 
  • That these system bioParts can become the basis of largely ‘molecular biology free’ strain engineering

We currently hold small collections and index strains for developing the underpinning informatics for several species, and have established collections and platform informatics for:

Escherichia coli with approximately 200 unrelated strains

The E. coli strain collection for synthetic biology

E. coli is the best-defined free-living organism. It is used in many synthetic biology and biotechnology settings and, based upon prior knowledge, it is the best system currently available for system modelling, metabolic engineering, and laboratory use in the areas to which it is applicable. Some strains are associated with infection, predominantly urinary and intestinal infections and, rarely, more serious infection, but the majority of strains are associated with animals (including birds), in which they are normally harmless commensals.

This strain collection contains 200 unrelated strains of E. coli obtained from a variety of sources, plus a sequence-verified representative of strain K12 MG1655. The strains are genetically highly diverse and are divided, on the basis of that diversity, into scalable sub-groups for comparative functional analyses.

For internal and collaborative analyses, the strains have been annotated for the presence or absence of 20,000 coding features using novel, highly consistent in-house annotation strategies that support functional comparative analyses. Further annotation and characterization of the collection is ongoing.

The strain collection has been used to re-determine the core genome of the species, which is the basis for building new metabolic models of the system. We have found that the core genome has been underestimated in current published reports, and the re-determined core genome is the basis for designing new optimized chassis strains of E. coli.
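As a rough illustration of how a core genome can be read off presence/absence annotation, the following minimal sketch splits features into core and accessory sets. It is not the in-house annotation or analysis pipeline; the function name `core_and_accessory`, the `min_fraction` parameter, and the toy strain and gene names are all hypothetical and chosen only for the example.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(
    presence: Dict[str, Set[str]], min_fraction: float = 1.0
) -> Tuple[Set[str], Set[str]]:
    """Split features into core and accessory sets.

    presence     maps a strain name to the set of feature IDs found in it.
    min_fraction is the fraction of strains that must carry a feature for
                 it to count as core (1.0 = present in every strain).
    """
    if not presence:
        raise ValueError("at least one strain is required")
    n_strains = len(presence)

    # Count how many strains carry each feature.
    counts: Dict[str, int] = {}
    for features in presence.values():
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1

    core = {f for f, c in counts.items() if c >= min_fraction * n_strains}
    accessory = set(counts) - core
    return core, accessory


# Hypothetical toy data: three strains, five features.
toy = {
    "strain_A": {"geneA", "geneB", "geneC"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneB", "geneC", "geneE"},
}

core, accessory = core_and_accessory(toy)
print(sorted(core))
print(sorted(accessory))
```

A strict threshold of 1.0 is sensitive to annotation inconsistencies: a single missed call in one strain removes a feature from the core. That is one reason a consistent annotation strategy matters for core genome estimates, and why a relaxed `min_fraction` is sometimes used as a sensitivity check.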

Given the current state of this collection and assessments of its diversity, future additions are planned to be made by targeted strain selection rather than random addition. This selection will address under-represented areas of species diversity and include strains with specifically selected behaviours or properties of interest.

Academic or commercial researchers interested in collaborative use of the collection, or project-focussed extension of the collection and behavioural analysis of strains should email Professor Saunders to discuss.

Klebsiella pneumoniae with approximately 60 unrelated strains

The Klebsiella pneumoniae strain collection for synthetic biology

Klebsiella pneumoniae has an unfortunate name, reflecting the ability of some strains to cause lung infections that normally only occur in patients with underlying physical damage or other problems with the normal clearance mechanisms of the lung. It is also of biomedical importance because this species is capable of acquiring resistance to multiple antibiotics, and is one of the species in which antibiotic resistance in clinical strains poses an increasing challenge to effective treatment. In this context we are using the collection to address the basis of physiological resistance in this species, in addition to its primary use in synthetic biology applications. However, most strains of Klebsiella, and most capsular serotypes, are rarely if ever associated with infection. It is a common plant-associated environmental species that ought correctly to be considered an opportunist that only very infrequently causes infection, rather than a primary pathogen.

In the synthetic biology context, K. pneumoniae is of interest because it can consume and convert a wider range of carbon sources than E. coli, and its metabolism of these alternatives is not as adversely affected by the presence of glucose. These carbon sources include the relatively underused C5 sugars that can be obtained from hemicellulose, and the glycerol generated as a waste product of biodiesel manufacture. It is also capable of generating a range of valuable fermentation products, and has features that allow at least some of what is known about E. coli to be useful in design for re-engineering and optimization.

This strain collection contains 60 unrelated strains of K. pneumoniae from a range of sources, including clinical, animal, and environmental. The strains have reasonable genetic diversity, and the collection will continue to be expanded to better represent the diversity of this species. Additions will specifically include strains associated with and isolated from plants, as well as strains selected for specific project target behaviours.

The strain collection is currently being used as the basis for the design of strains optimized for fermentation from C5 and glycerol feedstocks, and for direction of fermentation metabolism as part of the EU FP7 ValorPlus project.

We hope soon to embark on a collaborative project with SynbiCITE to generate collections of characterized strains, with the supporting bioinformatics necessary to apply the Comparative Behavioural Genomics methodology, for between 6 and 10 additional species relevant to partner research and development priorities.