Nucleic Acids Research
Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods
[Apr 2013]
Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Furthermore, we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes. As a consequence of this, we suggest using a consensus scoring approach, based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation.
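The consensus scoring idea mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes that each GSA run yields one P-value per gene set and that the consensus is the mean rank across runs; the abstract does not specify how the consensus is computed, so this aggregation rule, the function names and the toy P-values are all illustrative assumptions, not the Piano implementation.

```python
from statistics import mean


def rank_gene_sets(p_values):
    """Rank gene sets by ascending P-value (1 = most significant).

    Tied P-values share the average of the ranks they span.
    """
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the block of gene sets tied with position i.
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average_rank
        i = j + 1
    return ranks


def consensus_scores(runs):
    """Mean rank per gene set over all runs (hypothetical aggregation rule).

    Only gene sets present in every run are scored.
    """
    per_run = [rank_gene_sets(run) for run in runs]
    shared = set(per_run[0])
    for ranks in per_run[1:]:
        shared &= set(ranks)
    return {gs: mean(ranks[gs] for ranks in per_run) for gs in sorted(shared)}


# Toy P-values from two hypothetical GSA runs (made-up numbers).
runs = [
    {"A": 0.01, "B": 0.20, "C": 0.05},
    {"A": 0.03, "B": 0.50, "C": 0.01},
]
scores = consensus_scores(runs)
```

A lower consensus score means that a gene set ranks consistently near the top across methods. Other aggregation rules, such as the median rank, would slot into `consensus_scores` in the same way.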
Large scale chromosomal mapping of human microRNA structural clusters
[Apr 2013]
MicroRNAs (miRNAs) can group together along the human genome to form stable secondary structures made of several hairpins hosting miRNAs in their stems. The few known examples of such structures are all involved in cancer development. A large-scale computational analysis of human chromosomes, combining sequence analysis with deep sequencing data, revealed the presence of >400 structural clusters of miRNAs in the human genome. An a posteriori analysis validates the predictions as bona fide miRNAs. A functional analysis of the positions of structural clusters along the chromosomes co-localizes them with genes involved in several key cellular processes such as immune systems, sensory systems, signal transduction and development. Immune system diseases, infectious diseases and neurodegenerative diseases are characterized by genes that are especially well organized around structural clusters of miRNAs. Functional analysis of target genes strongly supports a regulatory role for most predicted miRNAs and, notably, a strong involvement of predicted miRNAs in the regulation of cancer pathways. This analysis provides new fundamental insights into the genomic organization of miRNAs in human chromosomes.
Exon-phase symmetry and intrinsic structural disorder promote modular evolution in the human genome
[Apr 2013]
A key signature of module exchange in the genome is phase symmetry of exons, suggestive of exon shuffling events that occurred without disrupting the translation reading frame. At the protein level, intrinsic structural disorder may be another key element because disordered regions often serve as functional elements that can be effectively integrated into a protein structure. Therefore, we asked whether exon-phase symmetry in the human genome and structural disorder in the human proteome are connected, signalling such evolutionary mechanisms in the assembly of multi-exon genes. We found an elevated level of structural disorder in regions encoded by symmetric exons and a preferred symmetry of exons encoding mostly disordered regions (>70% predicted disorder). Alternatively spliced symmetric exons tend to correspond to the most disordered regions. The genes of mostly disordered proteins (>70% predicted disorder) tend to be assembled from symmetric exons, which often arise by internal tandem duplications. Preponderance of certain types of short motifs (e.g. SH3-binding motif) and domains (e.g. high-mobility group domains) suggests that certain disordered modules have been particularly effective in exon-shuffling events. Our observations suggest that structural disorder has facilitated modular assembly of complex genes during evolution of the human genome.
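To make the notion of exon-phase symmetry concrete, the following Python sketch derives the phases of the introns flanking each exon from the coding lengths of consecutive exons. It uses the standard definition that an exon is symmetric when its two flanking introns have the same phase, which for an internal exon is equivalent to a coding length divisible by three. The function names and the example lengths are illustrative assumptions and are not taken from the study.

```python
def exon_phases(coding_lengths):
    """Return (start_phase, end_phase) for each exon of a coding sequence.

    The start phase is the number of codon positions already filled when the
    exon begins (0, 1 or 2); the end phase is the phase of the intron that
    follows the exon. The first coding exon starts at phase 0 by definition.
    """
    phases = []
    start = 0
    for length in coding_lengths:
        end = (start + length) % 3
        phases.append((start, end))
        start = end  # the next exon begins in the phase where this one ended
    return phases


def symmetric_internal_exons(coding_lengths):
    """Indices of internal exons flanked by introns of identical phase."""
    phases = exon_phases(coding_lengths)
    return [
        i
        for i in range(1, len(phases) - 1)  # skip the first and last exons
        if phases[i][0] == phases[i][1]
    ]


# Hypothetical coding lengths (in nucleotides) of a five-exon gene.
lengths = [100, 90, 50, 120, 61]
phases = exon_phases(lengths)
symmetric = symmetric_internal_exons(lengths)
```

In this example the second and fourth exons are symmetric (classes 1-1 and 0-0), so either could be inserted, duplicated or removed without shifting the downstream reading frame.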
Comparative annotation of functional regions in the human genome using epigenomic data
[Apr 2013]
Epigenetic regulation is dynamic and cell-type dependent. The recently available epigenomic data in multiple cell types provide an unprecedented opportunity for a comparative study of the epigenetic landscape. We developed a machine-learning method called ChroModule to annotate the epigenetic states in eight ENCyclopedia Of DNA Elements (ENCODE) cell types. The trained model successfully captured the characteristic histone-modification patterns associated with regulatory elements, such as promoters and enhancers, and showed superior performance in identifying enhancers compared with state-of-the-art methods. In addition, given the fixed number of epigenetic states in the model, ChroModule allows straightforward illustration of epigenetic variability in multiple cell types. Using this feature, we found that invariable and variable epigenetic states across cell types correspond to housekeeping functions and stimulus response, respectively. In particular, we observed that enhancers, but not the other regulatory elements, dictate cell specificity, as similar cell types share common enhancers, and cell-type–specific enhancers are often bound by transcription factors playing critical roles in that cell type. More interestingly, we found that some genomic regions are dormant in one cell type but primed to become active in other cell types. These observations highlight the usefulness of ChroModule in comparative analysis and interpretation of multiple epigenomes.
The lysine demethylase, KDM4B, is a key molecule in androgen receptor signalling and turnover
[Apr 2013]
The androgen receptor (AR) is a key molecule involved in prostate cancer (PC) development and progression. Post-translational modification of the AR by co-regulator proteins can modulate its transcriptional activity. To identify which demethylases might be involved in AR regulation, we performed an siRNA screen, which revealed that the demethylase KDM4B may be an important co-regulator protein. KDM4B enzymatic activity is required to enhance AR transcriptional activity; however, independently of this activity, KDM4B can enhance AR protein stability via inhibition of AR ubiquitination. Importantly, knockdown of KDM4B in multiple cell lines results in almost complete depletion of AR protein levels. For the first time, we have identified KDM4B to be an androgen-regulated demethylase enzyme, which can influence AR transcriptional activity not only via demethylation activity but also via modulation of ubiquitination. Together, these findings demonstrate the close functional relationship between AR and KDM4B, which work together to amplify the androgen response. Furthermore, KDM4B expression in clinical PC specimens positively correlates with increasing cancer grade (P < 0.001). Consequently, KDM4B is a viable therapeutic target in PC.
PML bodies provide an important platform for the maintenance of telomeric chromatin integrity in embryonic stem cells
[Apr 2013]
We have previously shown that α-thalassemia mental retardation X-linked (ATRX) and histone H3.3 are key regulators of telomeric chromatin in mouse embryonic stem cells. The function of ATRX and H3.3 in the maintenance of telomere chromatin integrity is further demonstrated by recent studies that show the strong association of ATRX/H3.3 mutations with alternative lengthening of telomeres in telomerase-negative human cancer cells. Here, we demonstrate that ATRX and H3.3 co-localize with the telomeric DNA and associated proteins within the promyelocytic leukemia (PML) bodies in mouse ES cells. The assembly of these telomere-associated PML bodies is most prominent at S phase. RNA interference (RNAi)-mediated knockdown of PML expression induces the disassembly of these nuclear bodies and a telomere dysfunction phenotype in mouse ES cells. Loss of function of PML bodies in mouse ES cells also disrupts binding of ATRX/H3.3 and proper establishment of histone methylation pattern at the telomere. Our study demonstrates that PML bodies act as epigenetic regulators by serving as platforms for the assembly of the telomeric chromatin to ensure a faithful inheritance of epigenetic information at the telomere.
A far-upstream (-70 kb) enhancer mediates Sox9 auto-regulation in somatic tissues during development and adult regeneration
[Apr 2013]
SOX9 encodes a transcription factor that presides over the specification and differentiation of numerous progenitor and differentiated cell types, and although SOX9 haploinsufficiency and overexpression cause severe diseases in humans, including campomelic dysplasia, sex reversal and cancer, the mechanisms underlying SOX9 transcription remain largely unresolved. We identify here an evolutionarily conserved enhancer located 70 kb upstream of mouse Sox9 and call it SOM because it specifically activates a Sox9 promoter reporter in most Sox9-expressing somatic tissues in transgenic mice. Moreover, SOM-null fetuses and pups show Sox9 expression reduced by 18–37% in the pancreas, lung, kidney, salivary gland, gut and liver. Weanlings exhibit half-size pancreatic islets and underproduce insulin and glucagon, and adults recover slowly from acute pancreatitis owing to a 2-fold impairment in Sox9 upregulation. Molecular and genetic experiments reveal that Sox9 protein dimers bind to multiple recognition sites in the SOM sequence and are thereby both necessary and sufficient for enhancer activity. These findings thus reveal that Sox9 directly enhances its functions in somatic tissue development and adult regeneration through SOM-mediated positive auto-regulation. They thereby provide novel insights into the molecular mechanisms controlling developmental and disease processes and suggest new strategies to improve disease treatments.
miR-34 is maternally inherited in Drosophila melanogaster and Danio rerio
[Apr 2013]
MicroRNAs (miRNAs) are small, endogenous, regulatory RNA molecules that can bind to partially complementary regions on target messenger RNAs and impede their expression or translation. We reasoned that miRNAs, being localized to the cytoplasm, will be maternally inherited during fertilization and may play a role in early development. Although Dicer is known to be essential for the transition from single-celled zygote to two-cell embryo, a direct role for miRNAs has not yet been demonstrated. We identified miRNAs with targets in zygotically expressed transcripts in Drosophila using a combination of transcriptome analysis and miRNA target prediction. We experimentally established that Drosophila miRNA dme-miR-34, the fly homologue of the cancer-related mammalian miRNA miR-34, which is involved in somatic-cell reprogramming and has a critical role in early neuronal differentiation, is present in Drosophila embryos before initiation of zygotic transcription. We also show that Drosophila miR-34 is dependent on maternal Dicer-1 for its expression in oocytes. Further, we show that miR-34 is also abundant in unfertilized oocytes of zebrafish. Its temporal expression profile during early development showed abundant expression in unfertilized oocytes that gradually decreased by 5 days post-fertilization (dpf). We find that knocking down the maternal, but not the zygotic, miR-34 led to developmental defects in the neuronal system during early embryonic development in zebrafish. Here, we report for the first time the maternal inheritance of an miRNA involved in development of the neuronal system in a vertebrate model system.
HP1a, Su(var)3-9, SETDB1 and POF stimulate or repress gene expression depending on genomic position, gene length and expression pattern in Drosophila melanogaster
[Apr 2013]
Heterochromatin protein 1a (HP1a) is a chromatin-associated protein important for the formation and maintenance of heterochromatin. In Drosophila, the two histone methyltransferases SETDB1 and Su(var)3-9 mediate H3K9 methylation marks that initiate the establishment and spreading of HP1a-enriched chromatin. Although HP1a is generally regarded as a factor that represses gene transcription, several reports have linked HP1a binding to active genes, and in some cases, it has been shown to stimulate transcriptional activity. To clarify the function of HP1a in transcriptional regulation and its association with Su(var)3-9, SETDB1 and the chromosome 4-specific protein POF, we conducted genome-wide expression studies and combined the results with available binding data in Drosophila melanogaster. The results suggest that HP1a, SETDB1 and Su(var)3-9 repress genes on chromosome 4, where non-ubiquitously expressed genes are preferentially targeted, and stimulate genes in pericentromeric regions. Further, we showed that on chromosome 4, Su(var)3-9, SETDB1 and HP1a target the same genes. In addition, we found that transposons are repressed by HP1a and Su(var)3-9 and that the binding level and expression effects of HP1a are affected by gene length. Our results indicate that genes have adapted to be properly expressed in their local chromatin environment.
IL-1β-specific recruitment of GCN5 histone acetyltransferase induces the release of PAF1 from chromatin for the de-repression of inflammatory response genes
[Apr 2013]
To determine the functional specificity of inflammation, it is critical to orchestrate the timely activation and repression of inflammatory responses. Here, we explored the PAF1 (RNA polymerase II associated factor)-mediated signal- and locus-specific repression of genes induced through the pro-inflammatory cytokine interleukin (IL)-1β. Using microarray analysis, we identified the PAF1 target genes whose expression was further enhanced by PAF1 knockdown in IL-1β–stimulated HepG2 hepatocarcinoma cells. PAF1 bound near the transcription start sites of target genes and dissociated on stimulation. In PAF1-deficient cells, more elongating RNA polymerase II and acetylated histones were observed, although IL-1β–mediated activation and recruitment of nuclear factor κB (NF-κB) were not altered. Under basal conditions, PAF1 blocked histone acetyltransferase general control non-depressible 5 (GCN5)-mediated acetylation on H3K9 and H4K5 residues. On IL-1β stimulation, activated GCN5 discharged PAF1 from chromatin, allowing productive transcription to occur. PAF1 bound to histones but not to acetylated histones, and the chromatin-binding domain of PAF1 was essential for target gene repression. Moreover, IL-1β–induced cell migration was similarly controlled through counteraction between PAF1 and GCN5. These results suggest that the IL-1β signal-specific exchange of PAF1 and GCN5 on the target locus limits inappropriate gene induction and facilitates the timely activation of inflammatory responses.
The helicase-binding domain of Escherichia coli DnaG primase interacts with the highly conserved C-terminal region of single-stranded DNA-binding protein
[Apr 2013]
During bacterial DNA replication, DnaG primase and the χ subunit of DNA polymerase III compete for binding to single-stranded DNA-binding protein (SSB), thus facilitating the switch between priming and elongation. SSB proteins play an essential role in DNA metabolism by protecting single-stranded DNA and by mediating several important protein–protein interactions. Although an interaction of SSB with primase has been previously reported, it was unclear which domains of the two proteins are involved. This study identifies the C-terminal helicase-binding domain of DnaG primase (DnaG-C) and the highly conserved C-terminal region of SSB as interaction sites. By ConSurf analysis, it can be shown that an array of conserved amino acids on DnaG-C forms a hydrophobic pocket surrounded by basic residues, reminiscent of known SSB-binding sites on other proteins. Using protein–protein cross-linking, site-directed mutagenesis, analytical ultracentrifugation and nuclear magnetic resonance spectroscopy, we demonstrate that these conserved amino acid residues are involved in the interaction with SSB. Even though the C-terminal domain of DnaG primase also participates in the interaction with DnaB helicase, the respective binding sites on the surface of DnaG-C do not overlap, as SSB binds to the N-terminal subdomain, whereas DnaB interacts with the ultimate C-terminus.
DNA bending-induced phase transition of encapsidated genome in phage λ
[Apr 2013]
The DNA structure in phage capsids is determined by DNA–DNA interactions and bending energy. The effects of repulsive interactions on DNA interaxial distance were previously investigated, but not the effect of DNA bending on its structure in viral capsids. By varying the packaged DNA length and adding spermine ions, we transformed the interaction energy from net repulsive to net attractive. This allowed us to isolate the effect of bending on the resulting DNA structure. We used single particle cryo-electron microscopy reconstruction analysis to determine the interstrand spacing of double-stranded DNA encapsidated in phage capsids. The data reveal that stress and packing defects, both resulting from DNA bending in the capsid, are able to induce a long-range phase transition in the encapsidated DNA genome from a hexagonal to a cholesteric packing structure. This structural observation suggests significant changes in genome fluidity as a result of a phase transition affecting the rates of viral DNA ejection and packaging.
The Shu complex interacts with Rad51 through the Rad51 paralogues Rad55-Rad57 to mediate error-free recombination
[Apr 2013]
The Saccharomyces cerevisiae Shu complex, consisting of Shu1, Shu2, Csm2 and Psy3, promotes error-free homologous recombination (HR) by an unknown mechanism. Recent structural analysis of two Shu proteins, Csm2 and Psy3, has revealed that these proteins are Rad51 paralogues and mediate DNA binding of this complex. We show in vitro that the Csm2–Psy3 heterodimer preferentially binds synthetic forked DNA or 3'-DNA overhang substrates resembling structures used during HR in vivo. We find that Csm2 interacts with Rad51 and with the Rad51 paralogues (the Rad55–Rad57 heterodimer), and that the Shu complex functions in the same epistasis group as Rad55–Rad57. Importantly, Csm2’s interaction with Rad51 is dependent on Rad55, whereas Csm2’s interaction with Rad55 occurs independently of Rad51. Consistent with the Shu complex containing Rad51 paralogues, the methyl methanesulphonate sensitivity of Csm2 is exacerbated at colder temperatures. Furthermore, Csm2 and Psy3 are needed for efficient recruitment of Rad55 to DNA repair foci after DNA damage. Finally, we observe that the Shu complex preferentially promotes Rad51-dependent homologous recombination over Rad51-independent repair. Our data suggest a model in which Csm2–Psy3 recruit the Shu complex to HR substrates, where it interacts with Rad51 through Rad55–Rad57 to stimulate Rad51 filament assembly and stability, promoting error-free repair.
Concurrent V(D)J recombination and DNA end instability increase interchromosomal trans-rearrangements in ATM-deficient thymocytes
[Apr 2013]
During the CD4–CD8– (DN) stage of T-cell development, RAG-dependent DNA breaks and V(D)J recombination occur at three T-cell receptor (TCR) loci: TCRβ, TCRγ and TCRδ. During this stage, abnormal trans-rearrangements also take place between TCR loci, occurring at increased frequency in the absence of the DNA damage response mediator ataxia telangiectasia mutated (ATM). Here, we use this model of physiologic trans-rearrangement to study factors that predispose to rearrangement and the role of ATM in preventing chromosomal translocations. The frequency of DN thymocytes with DNA damage foci at multiple TCR loci simultaneously is increased 2- to 3-fold in the absence of ATM. However, trans-rearrangement is increased 10 000- to 100 000-fold, indicating that ATM function extends beyond timely resolution of DNA breaks. RAG-mediated synaptic complex formation occurs between recombination signal sequences with unequal 12 and 23 base spacer sequences (12/23 rule). TCR trans-rearrangements violate this rule, as we observed similar frequencies of 12/23 and aberrant 12/12 or 23/23 recombination products. This suggests that trans-rearrangements are not the result of trans-synaptic complex formation, but instead result from unstable cis synaptic complexes that form simultaneously at distinct TCR loci. Thus, ATM suppresses trans-rearrangement primarily through stabilization of DNA breaks at TCR loci.
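The 12/23 rule described above can be expressed as a short classification step. The Python sketch below labels each recombination product by the spacer lengths of the two recombination signal sequences involved and tallies the share of canonical 12/23 versus aberrant 12/12 or 23/23 products. The function names and the example pairs are hypothetical; the abstract reports no raw counts, so the numbers are purely illustrative.

```python
from collections import Counter

VALID_SPACERS = {12, 23}  # spacer lengths (in bases) of recombination signal sequences


def classify_junction(spacer_a, spacer_b):
    """Label a recombination product by the spacers of its two signal sequences.

    Returns "12/23" for a product that obeys the 12/23 rule, and "12/12" or
    "23/23" for an aberrant product joining two like spacers.
    """
    if spacer_a not in VALID_SPACERS or spacer_b not in VALID_SPACERS:
        raise ValueError("spacer lengths must be 12 or 23")
    if spacer_a != spacer_b:
        return "12/23"
    return f"{spacer_a}/{spacer_b}"


def junction_frequencies(pairs):
    """Fraction of products in each class for an iterable of spacer pairs."""
    counts = Counter(classify_junction(a, b) for a, b in pairs)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Made-up spacer pairs for four recombination products.
example = [(12, 23), (23, 12), (12, 12), (23, 23)]
freqs = junction_frequencies(example)
```

A data set in which the aberrant classes together are about as frequent as the canonical class, as in this toy example, would correspond to the rule violation the abstract describes for trans-rearrangements.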
A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium
[Apr 2013]
Salmonella Typhi and Typhimurium diverged only ~50 000 years ago, yet have very different host ranges and pathogenicity. Despite the availability of multiple whole-genome sequences, the genetic differences that have driven these changes in phenotype are only beginning to be understood. In this study, we use transposon-directed insertion-site sequencing to probe differences in gene requirements for competitive growth in rich media between these two closely related serovars. We identify a conserved core of 281 genes that are required for growth in both serovars, 228 of which are essential in Escherichia coli. We are able to identify active prophage elements through the requirement for their repressors. We also find distinct differences in requirements for genes involved in cell surface structure biogenesis and iron utilization. Finally, we demonstrate that transposon-directed insertion-site sequencing is not only applicable to the protein-coding content of the cell but also has sufficient resolution to generate hypotheses regarding the functions of non-coding RNAs (ncRNAs) as well. We are able to assign probable functions to a number of cis-regulatory ncRNA elements, as well as to infer likely differences in trans-acting ncRNA regulatory networks.
The σ70 region 1.2 regulates promoter escape by unwinding DNA downstream of the transcription start site
[Apr 2013]
The mechanisms of abortive synthesis and promoter escape during initiation of transcription are poorly understood. Here, we show that, after initiation of RNA synthesis, non-specific interaction of σ70 region 1.2, present in all σ70 family factors, with the non-template strand around position –4 relative to the transcription start site facilitates unwinding of the DNA duplex downstream of the transcription start site. This leads to stabilization of short RNA products and allows their extension, i.e. promoter escape. We show that this activity of σ70 region 1.2 is assisted by the β-lobe domain, but does not involve the β'-rudder or the β'-switch-2, earlier proposed to participate in promoter escape. DNA sequence independence of this function of σ70 region 1.2 suggests that it may be conserved in all σ70 family factors. Our results indicate that the abortive nature of initial synthesis is caused, at least in part, by failure to open the downstream DNA by the β-lobe and region 1.2.
Two-step model of stop codon recognition by eukaryotic release factor eRF1
[Apr 2013]
Release factor eRF1 plays a key role in the termination of protein synthesis in eukaryotes. eRF1 consists of three domains (N, M and C) that perform unique roles in termination. Previous studies of eRF1 point mutants and standard/variant code eRF1 chimeras unequivocally demonstrated a direct involvement of the highly conserved N-domain motifs (NIKS, YxCxxxF and GTx) in stop codon recognition. In the current study, we extend this work by investigating the role of the 41 invariant and conserved N-domain residues in stop codon decoding by human eRF1. Using a combination of conservative and non-conservative amino acid substitutions, we measured the functional activity of >80 mutant eRF1s in an in vitro reconstituted eukaryotic translation system and selected 15 amino acid residues essential for recognition of different stop codon nucleotides. Furthermore, toe-print analyses provide evidence of a conformational rearrangement of ribosomal complexes that occurs during binding of eRF1 to messenger RNA and reflects the stop codon decoding activity of eRF1. Based on our experimental data and molecular modelling of the N-domain at the ribosomal A site, we propose a two-step model of stop codon decoding in the eukaryotic ribosome.
Characterization of an unusual bipolar helicase encoded by bacteriophage T5
[Apr 2013]
Bacteriophage T5 has a 120 kb double-stranded linear DNA genome encoding most of the genes required for its own replication. This lytic bacteriophage has a burst size of ~500 new phage particles per infected cell, demonstrating that it is able to turn each infected bacterium into a highly efficient DNA manufacturing machine. To begin to understand DNA replication in this prodigious bacteriophage, we have characterized a putative helicase encoded by gene D2. We show that bacteriophage T5 D2 protein is the first viral helicase to be described with bipolar DNA unwinding activities that require the same core catalytic residues for unwinding in either direction. However, unwinding of partially single- and double-stranded DNA test substrates in the 3'–5' direction is more robust and can be distinguished from the 5'–3' activity by a number of features including helicase complex stability, salt sensitivity and the length of single-stranded DNA overhang required for initiation of helicase action. The presence of D2 in an early gene cluster, the identification of a putative helix-turn-helix DNA-binding motif outside the helicase core and homology with known eukaryotic and prokaryotic replication initiators suggest an involvement for this unusual helicase in DNA replication initiation.
Altered error specificity of RNase H-deficient HIV-1 reverse transcriptases during DNA-dependent DNA synthesis
[Apr 2013]
Asp443 and Glu478 are essential active site residues in the RNase H domain of human immunodeficiency virus type 1 (HIV-1) reverse transcriptase (RT). We have investigated the effects of substituting Asn for Asp443 or Gln for Glu478 on the fidelity of DNA-dependent DNA synthesis of phylogenetically diverse HIV-1 RTs. In M13mp2 lacZα-based forward mutation assays, HIV-1 group M (BH10) and group O RTs bearing substitutions D443N, E478Q, V75I/D443N or V75I/E478Q showed 2.0- to 6.6-fold increased accuracy in comparison with the corresponding wild-type enzymes. This was a consequence of their lower base substitution error rates. One-nucleotide deletions and insertions represented between 30 and 68% of all errors identified in the mutational spectra of RNase H-deficient HIV-1 group O RTs. In comparison with the wild-type RT, these enzymes showed higher frameshift error rates and higher dissociation rate constants (koff) for DNA/DNA template–primers. The effects on frameshift fidelity were similar to those reported for mutation E89G and suggest that in HIV-1 group O RT, RNase H inactivation could affect template/primer slippage. Our results support a role for the RNase H domain during plus-strand DNA polymerization and suggest that mutations affecting RNase H function could also contribute to retrovirus variability during the later steps of reverse transcription.
Translocation of Saccharomyces cerevisiae Pif1 helicase monomers on single-stranded DNA
[Apr 2013]
In Saccharomyces cerevisiae, Pif1 participates in a wide variety of DNA metabolic pathways both in the nucleus and in mitochondria. The ability of Pif1 to hydrolyse ATP and catalyse unwinding of duplex nucleic acid is proposed to be at the core of its functions. We recently showed that upon binding to DNA Pif1 dimerizes, and we proposed that a dimer of Pif1 might be the species poised to catalyse DNA unwinding. In this work we show that monomers of Pif1 are able to translocate on single-stranded DNA with 5' to 3' directionality. We provide evidence that the translocation activity of Pif1 could be used in activities other than unwinding, possibly to displace proteins from ssDNA. Moreover, we show that monomers of Pif1 retain some unwinding activity although a dimer is clearly a better helicase, suggesting that regulation of the oligomeric state of Pif1 could play a role in its functioning as a helicase or a translocase. Finally, although we show that Pif1 can translocate on ssDNA, the translocation profiles suggest the presence on ssDNA of two populations of Pif1, both able to translocate with 5' to 3' directionality.