The Babraham Institute Publications database contains details of all publications resulting from our research groups and scientific facilities. Pre-prints by Institute authors can be viewed on the Institute's bioRxiv channel. We believe that free and open access to the outputs of publicly funded research offers significant social and economic benefits, as well as aiding the development of new research. We are working to provide Open Access to as many publications as possible and these can be identified below by the padlock icon. Where this hasn't been possible, subscriptions may be required to view the full text.
Pancreatic cancer is a rare but fatal form of cancer, the fourth highest in absolute mortality. Known risk factors include obesity, diet, and type 2 diabetes; however, the low incidence rate and interconnection of these factors confound the isolation of individual effects. Here, we use epidemiological analysis of prospective human cohorts and parallel tracking of pancreatic cancer in mice to dissect the effects of obesity, diet, and diabetes on pancreatic cancer. Through longitudinal monitoring and multi-omics analysis in mice, we found distinct effects of protein, sugar, and fat dietary components, with dietary sugars increasing Mad2l1 expression and tumor proliferation. Using epidemiological approaches in humans, we find that dietary sugars give a MAD2L1 genotype-dependent increased susceptibility to pancreatic cancer. The translation of these results to a clinical setting could aid in the identification of the at-risk population for screening and potentially harness dietary modification as a therapeutic measure.
Acute myeloid leukemia (AML) is characterised by a series of genetic and epigenetic alterations that result in deregulation of transcriptional networks. One understudied source of transcriptional regulators is transposable elements (TEs), whose aberrant usage could contribute to oncogenic transcriptional circuits. However, the regulatory influence of TEs and their links to AML pathogenesis remain unexplored. Here we identify six endogenous retrovirus (ERV) families with AML-associated enhancer chromatin signatures that are enriched in binding of key regulators of hematopoiesis and AML pathogenesis. Using both locus-specific genetic editing and simultaneous epigenetic silencing of multiple ERVs, we demonstrate that ERV deregulation directly alters the expression of adjacent genes in AML. Strikingly, deletion or epigenetic silencing of an ERV-derived enhancer suppresses cell growth by inducing apoptosis in leukemia cell lines. This work reveals that ERVs are a previously unappreciated source of AML enhancers that may be exploited by cancer cells to help drive tumour heterogeneity and evolution.
In metazoans, the secreted proteome participates in intercellular signalling and innate immunity, and builds the extracellular matrix scaffold around cells. Compared with the relatively constant intracellular environment, conditions for proteins in the extracellular space are harsher, and low concentrations of ATP prevent the activity of intracellular components of the protein quality-control machinery. Until now, only a few bona fide extracellular chaperones and proteases have been shown to limit the aggregation of extracellular proteins. Here we performed a systematic analysis of the extracellular proteostasis network in Caenorhabditis elegans with an RNA interference screen that targets genes that encode the secreted proteome. We discovered 57 regulators of extracellular protein aggregation, including several proteins related to innate immunity. Because intracellular proteostasis is upregulated in response to pathogens, we investigated whether pathogens also stimulate extracellular proteostasis. Using a pore-forming toxin to mimic a pathogenic attack, we found that C. elegans responded by increasing the expression of components of extracellular proteostasis and by limiting aggregation of extracellular proteins. The activation of extracellular proteostasis was dependent on stress-activated MAP kinase signalling. Notably, the overexpression of components of extracellular proteostasis delayed ageing and rendered worms resistant to intoxication. We propose that enhanced extracellular proteostasis contributes to systemic host defence by maintaining a functional secreted proteome and avoiding proteotoxicity.
Zygotic genome activation (ZGA) is an essential transcriptional event in embryonic development that coincides with extensive epigenetic reprogramming. Complex manipulation techniques and maternal stores of proteins preclude large-scale functional screens for ZGA regulators within early embryos. Here, we combined pooled CRISPR activation (CRISPRa) with single-cell transcriptomics to identify regulators of ZGA-like transcription in mouse embryonic stem cells, which serve as a tractable, in vitro proxy of early mouse embryos. Using multi-omics factor analysis (MOFA+) applied to ∼200,000 single-cell transcriptomes comprising 230 CRISPRa perturbations, we characterized molecular signatures of ZGA and uncovered 24 factors that promote a ZGA-like response. Follow-up assays validated top screen hits, including the DNA-binding protein Dppa2, the chromatin remodeler Smarca5, and the transcription factor Patz1, and functional experiments revealed that Smarca5's regulation of ZGA-like transcription is dependent on Dppa2. Together, our single-cell transcriptomic profiling of CRISPRa-perturbed cells provides both system-level and molecular insights into the mechanisms that orchestrate ZGA.
Rule-based modeling is an approach that permits constructing reaction networks based on the specification of rules for molecular interactions and transformations. These rules can encompass details such as the interacting sub-molecular domains and the states and binding status of the involved components. Conceptually, fine-grained spatial information such as locations can also be provided. Through "wildcards" representing component states, entire families of molecule complexes sharing certain properties can be specified as patterns. This can significantly simplify the definition of models involving species with multiple components, multiple states, and multiple compartments. The systems biology markup language (SBML) Level 3 Multi Package Version 1 extends the SBML Level 3 Version 1 core with the "type" concept in the Species and Compartment classes. Therefore, reaction rules may contain species that can be patterns and exist in multiple locations. Multiple software tools such as Simmune and BioNetGen support this standard that thus also becomes a medium for exchanging rule-based models. This document provides the specification for Release 2 of Version 1 of the SBML Level 3 Multi package. No design changes have been made to the description of models between Release 1 and Release 2; changes are restricted to the correction of errata and the addition of clarifications.
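The wildcard idea described in the abstract above can be illustrated with a minimal Python sketch. It is not SBML Multi syntax and does not use the Simmune or BioNetGen APIs; the receptor, its sites `Y1` and `Y2`, the state labels, and the compartment names are all hypothetical and chosen only for illustration. A pattern names some component states exactly and leaves others as `*`, so one pattern selects a whole family of concrete species.

```python
from itertools import product

WILDCARD = "*"  # stands for "any state" of a component (illustrative convention)


def enumerate_species(components):
    """Expand every combination of component states into concrete species."""
    names = list(components)
    return [
        dict(zip(names, states))
        for states in product(*(components[name] for name in names))
    ]


def matches(pattern, species):
    """Return True if the species agrees with every non-wildcard entry of the pattern."""
    return all(
        name in species and (state == WILDCARD or species[name] == state)
        for name, state in pattern.items()
    )


# Hypothetical receptor: two phosphorylation sites and a location.
receptor_components = {
    "Y1": ["u", "p"],                        # unphosphorylated / phosphorylated
    "Y2": ["u", "p"],
    "compartment": ["membrane", "endosome"],
}

all_species = enumerate_species(receptor_components)

# One pattern: Y1 phosphorylated, Y2 in any state, located at the membrane.
pattern = {"Y1": "p", "Y2": WILDCARD, "compartment": "membrane"}
selected = [s for s in all_species if matches(pattern, s)]

print(len(all_species), "concrete species;", len(selected), "match the pattern")
```

A single rule written against `pattern` would therefore apply to every species in `selected`, which is how patterns keep models with many components, states, and compartments compact.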
Genomic imprinting is an epigenetic phenomenon leading to parental allele-specific expression. Dosage of imprinted genes is crucial for normal development and its dysregulation accounts for several human disorders. This unusual expression pattern is mostly dictated by differences in DNA methylation between parental alleles at specific regulatory elements known as imprinting control regions (ICRs). Although several approaches can be used for methylation inspection, we lack an easy and cost-effective method to simultaneously measure DNA methylation at multiple imprinted regions. Here, we present IMPLICON, a high-throughput method measuring DNA methylation levels at imprinted regions with base-pair resolution and over 1000-fold coverage. We adapted amplicon bisulfite-sequencing protocols to design IMPLICON for ICRs in adult tissues of inbred mice, validating it in hybrid mice from reciprocal crosses for which we could discriminate methylation profiles in the two parental alleles. Lastly, we developed a human version of IMPLICON and detected imprinting errors in embryonic and induced pluripotent stem cells. We also provide rules and guidelines to adapt this method for investigating the DNA methylation landscape of any set of genomic regions. In summary, IMPLICON is a rapid, cost-effective and scalable method, which could become the gold standard in both imprinting research and diagnostics.
Circular DNA can arise from all parts of eukaryotic chromosomes. In yeast, circular ribosomal DNA (rDNA) accumulates dramatically as cells age; however, little is known about the accumulation of other chromosome-derived circles or the contribution of such circles to genetic variation in aged cells. We profiled circular DNA in Saccharomyces cerevisiae populations sampled when young and after extensive aging. Young cells possessed highly diverse circular DNA populations, but 94% of the circular DNA was lost after ∼15 divisions, whereas rDNA circles underwent massive accumulation to >95% of circular DNA. Circles present in both young and old cells were characterized by replication origins, including circles from unique regions of the genome and from repetitive regions: rDNA and telomeric Y' regions. We further observed that circles can have flexible inheritance patterns: [HXT6/7circle] normally segregates to mother cells but in low glucose is present in up to 50% of cells, the majority of which must have inherited this circle from their mother. Interestingly, [HXT6/7circle] cells are eventually replaced by cells carrying stable chromosomal HXT6 HXT6/7 HXT7 amplifications, suggesting circular DNAs are intermediates in chromosomal amplifications. In conclusion, the heterogeneity of circular DNA offers flexibility in adaptation, but this heterogeneity is remarkably diminished with age.
The lipid kinase VPS34 orchestrates diverse processes, including autophagy, endocytic sorting, phagocytosis, anabolic responses and cell division. VPS34 forms various complexes that help adapt it to specific pathways, with complexes I and II being the most prominent ones. We found that physicochemical properties of membranes strongly modulate VPS34 activity. Greater unsaturation of both substrate and non-substrate lipids, negative charge and curvature activate VPS34 complexes, adapting them to their cellular compartments. Hydrogen/deuterium exchange mass spectrometry (HDX-MS) of complexes I and II on membranes elucidated structural determinants that enable them to bind membranes. Among these are the Barkor/ATG14L autophagosome targeting sequence (BATS), which makes autophagy-specific complex I more active than the endocytic complex II, and the Beclin1 BARA domain. Interestingly, even though Beclin1 BARA is common to both complexes, its membrane-interacting loops are critical for complex II, but have only a minor role for complex I.
Naïve human pluripotent stem cells (hPSC) resemble the embryonic epiblast at an earlier time-point in development than conventional, 'primed' hPSC. We present a comprehensive miRNA profiling of the naïve-to-primed transition in hPSC, a process recapitulating aspects of early in vivo embryogenesis. We identify miR-143-3p and miR-22-3p as markers of the naïve state, and miR-363-5p, several members of the miR-17 family, and the miR-302 family as primed markers. We uncover that miR-371-373 are highly expressed in naïve hPSC. MiR-371-373 are the human homologs of the mouse miR-290 family, which are the most highly expressed miRNAs in naïve mouse PSC. This aligns with the consensus that naïve hPSC resemble naïve mouse PSC, showing that the absence of miR-371-373 in conventional hPSC is due to cell state rather than a species difference.
An amendment to this paper has been published and can be accessed via the original article.
The receptor-linked protein tyrosine phosphatases (RPTPs) are key regulators of cell-cell communication through the control of cellular phosphotyrosine levels. Most human RPTPs possess an extracellular receptor domain and tandem intracellular phosphatase domains, comprising an active membrane-proximal (D1) domain and an inactive distal (D2) pseudophosphatase domain. Here we demonstrate that PTPRU is unique amongst the RPTPs in possessing two pseudophosphatase domains. The PTPRU-D1 displays no detectable catalytic activity against a range of phosphorylated substrates and we show that this is due to multiple structural rearrangements that destabilise the active site pocket and block the catalytic cysteine. Upon oxidation, this cysteine forms an intramolecular disulphide bond with a vicinal "backdoor" cysteine, a process thought to reversibly inactivate related phosphatases. Importantly, despite the absence of catalytic activity, PTPRU binds substrates of related phosphatases, strongly suggesting that this pseudophosphatase functions in tyrosine phosphorylation by competing with active phosphatases for the binding of substrates.
One of the major bottlenecks in building systems biology models is identification and estimation of model parameters for model calibration. Searching for model parameters from published literature and models is an essential, yet laborious task.
How the epigenetic landscape is established in development is still being elucidated. Here, we uncover developmental pluripotency associated 2 and 4 (DPPA2/4) as epigenetic priming factors that establish a permissive epigenetic landscape at a subset of developmentally important bivalent promoters characterized by low expression and poised RNA-polymerase. Differentiation assays reveal that Dppa2/4 double knockout mouse embryonic stem cells fail to exit pluripotency and differentiate efficiently. DPPA2/4 bind both H3K4me3-marked and bivalent gene promoters and associate with COMPASS- and Polycomb-bound chromatin. Comparing knockout and inducible knockdown systems, we find that acute depletion of DPPA2/4 results in rapid loss of H3K4me3 from key bivalent genes, while H3K27me3 is initially more stable but lost following extended culture. Consequently, upon DPPA2/4 depletion, these promoters gain DNA methylation and are unable to be activated upon differentiation. Our findings uncover a novel epigenetic priming mechanism at developmental promoters, poising them for future lineage-specific activation.
Vascular calcification, the formation of calcium phosphate crystals in the vessel wall, is mediated by vascular smooth muscle cells (VSMCs). However, the underlying molecular mechanisms remain elusive, precluding mechanism-based therapies.
Peripheral nervous system (PNS) neurons support axon regeneration into adulthood, whereas central nervous system (CNS) neurons lose regenerative ability after development. To better understand this decline whilst aiming to improve regeneration, we focused on phosphoinositide 3-kinase (PI3K) and its product phosphatidylinositol (3,4,5)-trisphosphate (PIP3). We demonstrate that adult PNS neurons utilise two catalytic subunits of PI3K for axon regeneration: p110α and p110δ. However, in the CNS, axonal PIP3 decreases with development at the time when axon transport declines and regenerative competence is lost. Overexpressing p110α in CNS neurons had no effect; however, expression of p110δ restored axonal PIP3 and increased regenerative axon transport. p110δ expression enhanced CNS regeneration in both rat and human neurons and in transgenic mice, functioning in the same way as the hyperactivating H1047R mutation of p110α. Furthermore, viral delivery of p110δ promoted robust regeneration after optic nerve injury. These findings establish a deficit of axonal PIP3 as a key reason for intrinsic regeneration failure and demonstrate that native p110δ facilitates axon regeneration by functioning in a hyperactive fashion.
Collagen I is a major tendon protein whose polypeptide chains are linked by covalent cross-links. It is unknown how the cross-linking contributes to the mechanical properties of tendon or whether cross-linking changes in response to stretching or relaxation. Since their discovery, imine bonds within collagen have been recognized as being important in both cross-link formation and collagen structure. They are often described as acidic or thermally labile, but no evidence is available from direct measurements of cross-link levels whether these bonds contribute to the mechanical properties of collagen. Here, we used MS to analyze these imine bonds after reduction with sodium borohydride while under tension and found that their levels are altered in stretched tendon. We studied the changes in cross-link bonding in tail tendon from 11-week-old C57Bl/6 mice at 4% physical strain, at 10% strain, and at breaking point. The cross-links hydroxy-lysino-norleucine (HLNL), dihydroxy-lysino-norleucine (DHLNL), and lysino-norleucine (LNL) increased or decreased depending on the specific cross-link and amount of mechanical strain. We also noted a decrease in glycated lysine residues in collagen, indicating that the imine formed between circulating glucose and lysine is also stress-labile. We also carried out mechanical testing, including cyclic testing at 4% strain, stress relaxation tests, and stress-strain profiles taken at breaking point, both with and without sodium borohydride reduction. The results from both the MS studies and mechanical testing provide insights into the chemical changes during tendon stretching and directly link these chemical changes to functional collagen properties.
Noncoding RNA plays essential roles in transcriptional control and chromatin silencing. At Arabidopsis thaliana FLC, antisense transcription quantitatively influences transcriptional output, but the mechanism by which this occurs is still unclear. Proximal polyadenylation of the antisense transcripts by FCA, an RNA-binding protein that physically interacts with RNA 3' processing factors, reduces transcription. This process genetically requires FLD, a homolog of the H3K4 demethylase LSD1. However, the mechanism linking RNA processing to FLD function had not been established. Here, we show that FLD tightly associates with LUMINIDEPENDENS (LD) and SET DOMAIN GROUP 26 (SDG26) in vivo, and, together, they prevent accumulation of monomethylated H3K4 (H3K4me1) over the gene body. SDG26 interacts with the RNA 3' processing factor FY (WDR33), thus linking activities for proximal polyadenylation of the antisense transcripts to FLD/LD/SDG26-associated H3K4 demethylation. We propose this demethylation antagonizes an active transcription module, thus reducing H3K36me3 accumulation and increasing H3K27me3. Consistent with this view, we show that Polycomb Repressive Complex 2 (PRC2) silencing is genetically required by FCA to repress FLC. Overall, our work provides insights into RNA-mediated chromatin silencing.
This paper presents a high-throughput reverse transcription quantitative PCR (RT-qPCR) assay for Caenorhabditis elegans that is fast, robust, and highly sensitive. This protocol obtains precise measurements of gene expression from single worms or from bulk samples. The protocol presented here provides a novel adaptation of existing methods for complementary DNA (cDNA) preparation coupled to a nanofluidic RT-qPCR platform. The first part of this protocol, named 'Worm-to-CT', allows cDNA production directly from nematodes without the need for prior mRNA isolation. It increases experimental throughput by allowing the preparation of cDNA from 96 worms in 3.5 h. The second part of the protocol uses existing nanofluidic technology to run high-throughput RT-qPCR on the cDNA. This paper evaluates two different nanofluidic chips: the first runs 96 samples and 96 targets, resulting in 9,216 reactions in approximately 1.5 days of benchwork. The second chip type consists of six 12 x 12 arrays, resulting in 864 reactions. Here, the Worm-to-CT method is demonstrated by quantifying mRNA levels of genes encoding heat shock proteins from single worms and from bulk samples. Provided is an extensive list of primers designed to amplify processed RNA for the majority of coding genes within the C. elegans genome.
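The reaction counts quoted in the abstract above follow directly from the chip layouts, as the short Python check below shows. The function name `reactions_per_chip` and its parameters are hypothetical and used only for this illustration; they do not correspond to any instrument software.

```python
def reactions_per_chip(samples: int, targets: int, arrays: int = 1) -> int:
    """Each sample is combined with each target once per array."""
    return samples * targets * arrays


# First chip type: 96 samples x 96 targets on a single array.
chip_96_by_96 = reactions_per_chip(96, 96)

# Second chip type: six independent 12 x 12 arrays.
chip_six_12_by_12 = reactions_per_chip(12, 12, arrays=6)

print(chip_96_by_96)      # 9216
print(chip_six_12_by_12)  # 864
```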
An issue often encountered when acquiring image data from fixed or anesthetized C. elegans is that worms cross and cluster with their neighbors. This problem is aggravated with increasing density of worms and creates challenges for imaging and quantification. We developed a FIJI-based workflow, Worm-align, that can be used to generate single- or multi-channel montages of user-selected, straightened and aligned worms from raw image data of C. elegans. Worm-align is a simple and user-friendly workflow that does not require prior training of either the user or the analysis algorithm. Montages generated with Worm-align can aid the visual inspection of worms, their classification and representation. In addition, the output of Worm-align can be used for subsequent quantification of fluorescence intensity in single worms, either in FIJI directly, or in other image analysis software platforms. We demonstrate this by importing the Worm-align output into Worm_CP, a pipeline that uses the open-source CellProfiler software. CellProfiler's flexibility enables the incorporation of additional modules for high-content screening. As a practical example, we have used the pipeline on two datasets: the first comprises images of heat shock reporter worms that express green fluorescent protein (GFP) under the control of the promoter of the heat shock-inducible gene hsp-70, and the second comprises images of fixed worms stained for fat stores with a fluorescent dye.
Axons are diverse. They have different lengths, different branching patterns, and different biological roles. Methods to study axon degeneration are also diverse. The result is a bewildering range of experimental systems in which to study mechanisms of axon degeneration, and it is difficult to extrapolate from one neuron type and one method to another. The purpose of this chapter is to help readers to do this and to choose the methods most appropriate for answering their particular research question.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Regulatory T (Treg) cell populations are composed of functionally quiescent resting Treg (rTreg) cells which differentiate into activated Treg (aTreg) cells upon antigen stimulation. How rTreg cells remain quiescent despite chronic exposure to cognate self- and foreign antigens is unclear. The transcription factor BACH2 is critical for early Treg lineage specification, but its function following lineage commitment is unresolved. Here, we show that BACH2 is repurposed following Treg lineage commitment and promotes the quiescence and long-term maintenance of rTreg cells. Bach2 is highly expressed in rTreg cells but is down-regulated in aTreg cells and during inflammation. In rTreg cells, BACH2 binds to enhancers of genes involved in aTreg differentiation and represses their TCR-driven induction by competing with AP-1 factors for DNA binding. This function promotes rTreg cell quiescence and long-term maintenance and is required for immune homeostasis and durable immunosuppression in cancer. Thus, BACH2 supports a "division of labor" between quiescent rTreg cells and their activated progeny in Treg maintenance and function, respectively.
Genetic variations underlying susceptibility to complex autoimmune and allergic diseases are concentrated within noncoding regulatory elements termed enhancers. The functions of a large majority of disease-associated enhancers are unknown, in part owing to their distance from the genes they regulate, a lack of understanding of the cell types in which they operate, and our inability to recapitulate the biology of immune diseases in vitro. Here, using shared synteny to guide loss-of-function analysis of homologues of human enhancers in mice, we show that the prominent autoimmune and allergic disease risk locus at chromosome 11q13.5 contains a distal enhancer that is functional in CD4+ regulatory T (Treg) cells and required for Treg-mediated suppression of colitis. The enhancer recruits the transcription factors STAT5 and NF-κB to mediate signal-driven expression of Lrrc32, which encodes the protein glycoprotein A repetitions predominant (GARP). Whereas disruption of the Lrrc32 gene results in early lethality, mice lacking the enhancer are viable but lack GARP expression in Foxp3+ Treg cells, which are unable to control colitis in a cell-transfer model of the disease. In human Treg cells, the enhancer forms conformational interactions with the promoter of LRRC32 and enhancer risk variants are associated with reduced histone acetylation and GARP expression. Finally, functional fine-mapping of 11q13.5 using CRISPR-activation (CRISPRa) identifies a CRISPRa-responsive element in the vicinity of risk variant rs11236797 capable of driving GARP expression. These findings provide a mechanistic basis for association of the 11q13.5 risk locus with immune-mediated diseases and identify GARP as a potential target in their therapy.
While colocalization within a bacterial operon enables coexpression of the constituent genes, the mechanistic logic of clustering of nonhomologous monocistronic genes in eukaryotes is not immediately obvious. Biosynthetic gene clusters that encode pathways for specialized metabolites are an exception to the classical eukaryote rule of random gene location and provide paradigmatic exemplars with which to understand eukaryotic cluster dynamics and regulation. Here, using 3C, Hi-C, and Capture Hi-C (CHi-C) organ-specific chromosome conformation capture techniques along with high-resolution microscopy, we investigate how chromosome topology relates to transcriptional activity of clustered biosynthetic pathway genes in Arabidopsis thaliana. Our analyses reveal that biosynthetic gene clusters are embedded in local hot spots of 3D contacts that segregate cluster regions from the surrounding chromosome environment. The spatial conformation of these cluster-associated domains differs between transcriptionally active and silenced clusters. We further show that silenced clusters associate with heterochromatic chromosomal domains toward the periphery of the nucleus, while transcriptionally active clusters relocate away from the nuclear periphery. Examination of chromosome structure at unrelated clusters in maize, rice, and tomato indicates that integration of clustered pathway genes into distinct topological domains is a common feature in plant genomes. Our results shed light on the potential mechanisms that constrain coexpression within clusters of nonhomologous eukaryotic genes and suggest that gene clustering in the one-dimensional chromosome is accompanied by compartmentalization of the 3D chromosome.