Nucleic Acids Research
Genetic linkage may result in the expression of multiple products from a polycistronic transcript, under the control of a single promoter. In animals, protein-coding polycistronic transcripts are rare. However, microRNAs are frequently clustered in the genomes of animals, and these clusters are often transcribed as a single unit. The evolution of microRNA clusters has been the subject of much speculation, and a selective advantage of clusters of functionally related microRNAs is often proposed. However, the origin of microRNA clusters has not so far been explored. Here, we study the evolution of microRNA clusters in Drosophila melanogaster. We observed that the majority of microRNA clusters arose by the de novo formation of new microRNA-like hairpins in existing microRNA transcripts. Some clusters also emerged by tandem duplication of a single microRNA. Comparative genomics shows that these clusters are unlikely to split or undergo rearrangements. We did not find any instances of clusters appearing by rearrangement of pre-existing microRNA genes. We propose a model for microRNA cluster evolution in which selection on one of the microRNAs in the cluster interferes with the evolution of the other linked microRNAs. Our analysis suggests that the study of microRNAs and small RNAs must consider linkage associations.
LHX2 regulates the neural differentiation of human embryonic stem cells via transcriptional modulation of PAX6 and CER1
The LIM homeobox 2 transcription factor Lhx2 is known to control crucial aspects of neural development in various species. However, its function in human neural development is still elusive. Here, we demonstrate that LHX2 plays a critical role in human neural differentiation, using human embryonic stem cells (hESCs) as a model. In hESC-derived neural progenitors (hESC-NPs), LHX2 was found to be expressed before PAX6, and co-expressed with early neural markers. Conditional ectopic expression of LHX2 promoted neural differentiation, whereas disruption of LHX2 expression in hESCs significantly impaired neural differentiation. Furthermore, we have demonstrated that LHX2 regulates neural differentiation at two levels: first, it promotes expression of PAX6 by binding to its active enhancers, and second, it attenuates BMP and WNT signaling by promoting expression of the BMP and WNT antagonist Cerberus 1 gene (CER1), to inhibit non-neural differentiation. These findings indicate that LHX2 regulates the transcription of downstream intrinsic and extrinsic molecules that are essential for early neural differentiation in humans.
Zinc-finger-nucleases mediate specific and efficient excision of HIV-1 proviral DNA from infected and latently infected human T cells
HIV-infected individuals currently cannot be completely cured because existing antiviral therapy regimens do not address HIV proviral DNA, flanked by long terminal repeats (LTRs), already integrated into the host genome. Here, we present a possible alternative therapeutic approach to specifically and directly mediate deletion of the integrated full-length HIV provirus from infected and latently infected human T cell genomes by using specially designed zinc-finger nucleases (ZFNs) to target a sequence within the LTR that is well conserved across all clades. We designed and screened one pair of ZFNs to target the highly conserved HIV-1 5'-LTR and 3'-LTR DNA sequences, named ZFN-LTR. We found that ZFN-LTR can specifically target and cleave the full-length HIV-1 proviral DNA in several infected and latently infected cell types and also HIV-1 infected human primary cells in vitro. We observed that the frequency of excision was 45.9% in infected human cell lines after treatment with ZFN-LTR, without significant host-cell genotoxicity. Taken together, our data demonstrate that a single ZFN-LTR pair can specifically and effectively cleave integrated full-length HIV-1 proviral DNA and mediate antiretroviral activity in infected and latently infected cells, suggesting that this strategy could offer a novel approach to eradicate the HIV-1 virus from the infected host in the future.
The initiation factor 4E (eIF4E) is implicated in most of the crucial steps of the mRNA life cycle and is recognized as a pivotal protein in gene regulation. Many of these roles are mediated by its interaction with specific proteins generally known as eIF4E-interacting partners (4E-IPs), such as eIF4G and 4E-BP. To screen for new 4E-IPs, we developed a novel approach based on structural, in silico and biochemical analyses. We identified the protein Angel1, a member of the CCR4 deadenylase family. Immunoprecipitation experiments provided evidence that Angel1 is able to interact in vitro and in vivo with eIF4E. Point mutation variants of Angel1 demonstrated that the interaction of Angel1 with eIF4E is mediated through a consensus eIF4E-binding motif. Immunofluorescence and cell fractionation experiments showed that Angel1 is confined to the endoplasmic reticulum and Golgi apparatus, where it partially co-localizes with eIF4E and eIF4G, but not with 4E-BP. Furthermore, manipulating Angel1 levels in living cells had no effect on global translation rates, suggesting that the protein has a more specific function. Taken together, our results illustrate that we developed a powerful method for identifying new eIF4E partners and open new perspectives for understanding eIF4E-specific regulation.
TDP1 repairs nuclear and mitochondrial DNA damage induced by chain-terminating anticancer and antiviral nucleoside analogs
Chain-terminating nucleoside analogs (CTNAs) that cause stalling or premature termination of DNA replication forks are widely used as anticancer and antiviral drugs. However, it is not well understood how cells repair the DNA damage induced by these drugs. Here, we reveal the importance of tyrosyl–DNA phosphodiesterase 1 (TDP1) in the repair of nuclear and mitochondrial DNA damage induced by CTNAs. On investigating the effects of four CTNAs—acyclovir (ACV), cytarabine (Ara-C), zidovudine (AZT) and zalcitabine (ddC)—we show that TDP1 is capable of removing the covalently linked corresponding CTNAs from DNA 3'-ends. We also show that Tdp1–/– cells are hypersensitive and accumulate more DNA damage when treated with ACV and Ara-C, implicating TDP1 in repairing CTNA-induced DNA damage. As AZT and ddC are known to cause mitochondrial dysfunction, we examined whether TDP1 repairs the mitochondrial DNA damage they induced. We find that AZT and ddC treatment leads to greater depletion of mitochondrial DNA in Tdp1–/– cells. Thus, TDP1 seems to be critical for repairing nuclear and mitochondrial DNA damage caused by CTNAs.
Detailed mechanisms of DNA clamps in prokaryotic and eukaryotic systems were investigated by probing their mechanics with single-molecule force spectroscopy. Specifically, the mechanical forces required to open the Escherichia coli and Saccharomyces cerevisiae clamps were measured at the single-molecule level by optical tweezers. Steered molecular dynamics simulations further examined the forces involved in DNA clamp opening from the perspective of the interface binding energies associated with the clamp opening processes. In combination with additional molecular dynamics simulations, we identified the contact networks between the clamp subunits that contribute significantly to the interface stability of the S. cerevisiae and E. coli clamps. These studies provide a vivid picture of the mechanics and energy landscape of clamp opening and reveal how the prokaryotic and eukaryotic clamps function through different mechanisms.
Structural insight into negative DNA supercoiling by DNA gyrase, a bacterial type 2A DNA topoisomerase
Type 2A DNA topoisomerases (Topo2A) remodel DNA topology during replication, transcription and chromosome segregation. These multisubunit enzymes catalyze the transport of a double-stranded DNA through a transient break formed in another duplex. The bacterial DNA gyrase, a target for broad-spectrum antibiotics, is the sole Topo2A enzyme able to introduce negative supercoils. We reveal here for the first time the architecture of the full-length Thermus thermophilus DNA gyrase alone and in a cleavage complex with a 155 bp DNA duplex in the presence of the antibiotic ciprofloxacin, using cryo-electron microscopy. The structural organization of the subunits of the full-length DNA gyrase points to a central role of the ATPase domain acting like a ‘crossover trap’ that may help to sequester the DNA positive crossover before strand passage. Our structural data unveil how DNA is asymmetrically wrapped around the gyrase-specific C-terminal β-pinwheel domains and guided to introduce negative supercoils through cooperativity between the ATPase and β-pinwheel domains. The overall conformation of the drug-induced DNA binding–cleavage complex also suggests that ciprofloxacin traps a DNA pre-transport conformation.
Unlike other transfer RNA (tRNA)-modifying enzymes from the SPOUT methyltransferase superfamily, the tRNA (Um34/Cm34) methyltransferase TrmL lacks the usual extension domain for tRNA binding and consists only of a SPOUT domain. Both the catalytic and tRNA recognition mechanisms of this enzyme remain elusive. By using tRNAs purified from an Escherichia coli strain with the TrmL gene deleted, we found that TrmL can independently catalyze the methyl transfer from S-adenosyl-L-methionine to and isoacceptors without the involvement of other tRNA-binding proteins. We have solved the crystal structures of TrmL in apo form and in complex with S-adenosyl-homocysteine and identified the cofactor binding site and a possible active site. Methyltransferase activity and tRNA-binding affinity of TrmL mutants were measured to identify residues important for tRNA binding of TrmL. Our results suggest that TrmL functions as a homodimer by using the conserved C-terminal half of the SPOUT domain for catalysis, whereas residues from the less-conserved N-terminal half of the other subunit participate in tRNA recognition.
DNA polymerases must accurately replicate DNA to maintain genome integrity. Carcinogenic adducts, such as 2-aminofluorene (AF) and N-acetyl-2-aminofluorene (AAF), covalently bind DNA bases and promote mutagenesis near the adduct site. The mechanism by which carcinogenic adducts inhibit DNA synthesis and cause mutagenesis remains unclear. Here, we measure interactions between a DNA polymerase and carcinogenic DNA adducts in real-time by single-molecule fluorescence. We find the degree to which an adduct affects polymerase binding to the DNA depends on the adduct location with respect to the primer terminus, the adduct structure and the nucleotides present in the solution. Not only do the adducts influence the polymerase dwell time on the DNA but also its binding position and orientation. Finally, we have directly observed an adduct- and mismatch-induced intermediate state, which may be an obligatory step in the DNA polymerase proofreading mechanism.
Conservation of mRNA secondary structures may filter out mutations in Escherichia coli evolution
Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins.
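The idea of scoring a mutation by how much it perturbs predicted mRNA structure can be illustrated with a small sketch. The abstract relies on minimum free energy (MFE) values, which in practice come from a thermodynamic folding program; the code below swaps in a much cruder proxy, the Nussinov maximum count of nested base pairs, purely to show the wild-type versus mutant comparison. The function names, the minimum loop size of 3 and the toy sequences are assumptions made for illustration and are not taken from the study.

```python
from functools import lru_cache

# Watson-Crick and G-U wobble pairs allowed in RNA helices.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def max_pairs(seq: str) -> int:
    """Maximum number of nested base pairs (Nussinov recursion).

    This is a crude stand-in for an MFE calculation, suitable only for
    short sequences because the recursion depth grows with length.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        if j - i <= MIN_LOOP:
            return 0
        # Case 1: position j stays unpaired.
        score = best(i, j - 1)
        # Case 2: position j pairs with some k, leaving >= MIN_LOOP bases between.
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) in PAIRS:
                score = max(score, best(i, k - 1) + 1 + best(k + 1, j - 1))
        return score

    return best(0, n - 1) if n else 0


def pair_change(wild_type: str, mutant: str) -> int:
    """Change in the base-pair proxy caused by a mutation (negative = pairs lost)."""
    return max_pairs(mutant) - max_pairs(wild_type)


if __name__ == "__main__":
    wt = "GGGAAACCC"   # toy hairpin: three G-C pairs around an AAA loop
    mut = "GGGAAACAC"  # single substitution in the 3' arm of the stem
    print(pair_change(wt, mut))
```

A real analysis would replace `max_pairs` with MFE values from a folding package and would compare the distribution of changes between essential and nonessential genes.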
Whole-body scanning PCR; a highly sensitive method to study the biodistribution of mRNAs, noncoding RNAs and therapeutic oligonucleotides
Efficient tissue-specific delivery is a crucial factor in the successful development of therapeutic oligonucleotides. Screening for novel delivery methods with unique tissue-homing properties requires a rapid, sensitive, flexible and unbiased technique able to visualize the in vivo biodistribution of these oligonucleotides. Here, we present whole-body scanning PCR, a platform that relies on the local extraction of tissues from a mouse whole-body section followed by the conversion of target-specific qPCR signals into an image. This platform was designed to be compatible with a novel RT-qPCR assay for the detection of siRNAs and with an assay suitable for the detection of heavily chemically modified oligonucleotides, which we termed Chemical-Ligation qPCR (CL-qPCR). In addition, the platform can be used to investigate the global expression of endogenous mRNAs and non-coding RNAs. Incorporation of other detection systems, such as aptamers, could further expand the use of this technology.
Integrated analysis of microRNA and mRNA expression: adding biological significance to microRNA target predictions
Current microRNA target predictions are based on sequence information and empirically derived rules but do not make use of the expression of microRNAs and their targets. This study aimed to improve microRNA target predictions in a given biological context, using in silico predictions together with microRNA and mRNA expression data. We used target prediction tools to produce lists of predicted targets and used a gene set test designed to detect consistent effects of microRNAs on the joint expression of multiple targets. In a single test, the association between microRNA expression and target gene set expression as well as the contribution of the individual target genes to the association are determined. The strongest negatively associated mRNAs as measured by the test were prioritized. We applied our integration method to a well-defined muscle differentiation model. Validation of our predictions in C2C12 cells confirmed predicted targets of known as well as novel muscle-related microRNAs. We further studied associations between microRNA–mRNA pairs in human prostate cancer, finding some pairs that have been recently experimentally validated by others. Using the same study, we showed the advantages of the global test over Pearson correlation and lasso. We conclude that our integrated approach successfully identifies regulated microRNAs and their targets.
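To make the prioritization step concrete, the sketch below ranks predicted targets of one microRNA by how negatively their expression tracks the microRNA across samples. It uses plain Pearson correlation, which the abstract names only as a baseline that the global test outperforms, so this is not the gene set test itself. The function names, gene labels and expression values are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_negative_targets(
    mirna_expr: Sequence[float],
    target_expr: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return predicted targets with negative correlation, most negative first."""
    scored = [(gene, pearson(mirna_expr, expr)) for gene, expr in target_expr.items()]
    negative = [(gene, r) for gene, r in scored if r < 0]
    return sorted(negative, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical expression values across four samples.
    mirna = [1.0, 2.0, 3.0, 4.0]
    predicted_targets = {
        "geneA": [4.0, 3.0, 2.0, 1.0],  # falls as the microRNA rises
        "geneB": [1.0, 2.0, 3.0, 4.0],  # rises with the microRNA
        "geneC": [2.0, 1.0, 2.0, 1.0],  # weak negative trend
    }
    for gene, r in rank_negative_targets(mirna, predicted_targets):
        print(gene, round(r, 3))
```

Per-gene correlation treats each target independently, whereas the approach described in the abstract tests the target set jointly and then decomposes the contribution of each gene.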
High-throughput sequencing for microRNA (miRNA) profiling has revealed a vast complexity of miRNA processing variants, but these are difficult to discern for those without bioinformatics expertise and large computing capability. In this article, we present miRNA Sequence Profiling (miRspring) (http://mirspring.victorchang.edu.au), a software solution that creates a small portable research document that visualizes, calculates and reports on the complexities of miRNA processing. We designed an index-compression algorithm that allows the miRspring document to reproduce a complete miRNA sequence data set while retaining a small file size (typically <3 MB). Through analysis of 73 public data sets, we demonstrate miRspring’s features in assessing quality parameters, miRNA cluster expression levels and miRNA processing. Additionally, we report on a new class of miRNA variants, which we term seed-isomiRs, identified through the novel visualization tools of the miRspring document. Further investigation identified that ~30% of human miRBase entries are likely to have a seed-isomiR. We believe that miRspring will be a highly useful research tool that will enhance the analysis of miRNA data sets and thus increase our understanding of miRNA biology.
Template-driven chemical ligation of fluorogenic probes represents a powerful method for DNA and RNA detection and imaging. Unfortunately, previous techniques have been hampered by requiring chemistry with sluggish kinetics and background side reactions. We have developed fluorescent DNA probes containing quenched fluorophore-tetrazine and methyl-cyclopropene groups that rapidly react by bioorthogonal cycloaddition in the presence of complementary DNA or RNA templates. Ligation increases fluorescence with negligible background signal in the absence of hybridization template. Reaction kinetics depend heavily on template length and linker structure. Using this technique, we demonstrate rapid discrimination between single template mismatches both in buffer and cell media. Fluorogenic bioorthogonal ligations offer a promising route towards the fast and robust fluorescent detection of specific DNA or RNA sequences.
SECISearch3 and Seblastian: new tools for prediction of SECIS elements and selenoproteins
Selenoproteins are proteins containing the uncommon amino acid selenocysteine (Sec). Sec is inserted by a specific translational machinery that recognizes a stem-loop structure, the SECIS element, in the 3' UTR of selenoprotein genes and recodes a UGA codon within the coding sequence. As UGA is normally a translational stop signal, selenoproteins are generally misannotated and dedicated tools have to be developed for this class of proteins. Here, we present two new computational methods for selenoprotein identification and analysis, which we provide publicly through the web servers at http://gladyshevlab.org/SelenoproteinPredictionServer or http://seblastian.crg.es. SECISearch3 replaces its predecessor SECISearch as a tool for prediction of eukaryotic SECIS elements. Seblastian is a new method for selenoprotein gene detection that uses SECISearch3 and then predicts selenoprotein sequences encoded upstream of SECIS elements. Seblastian is able to both identify known selenoproteins and predict new selenoproteins. By applying these tools to diverse eukaryotic genomes, we provide a ranked list of newly predicted selenoproteins together with their annotated cysteine-containing homologues. An analysis of a representative candidate belonging to the AhpC family shows how the use of Sec in this protein evolved in bacterial and eukaryotic lineages.
Co-expression of RNA-protein complexes in Escherichia coli and applications to RNA biology
RNA has emerged as a major player in many cellular processes. Understanding these processes at the molecular level requires homogeneous RNA samples for structural, biochemical and pharmacological studies. We previously devised a generic approach that allows efficient in vivo expression of recombinant RNA in Escherichia coli. In this work, we have extended this method to RNA/protein co-expression. We have engineered several plasmids that allow overexpression of RNA–protein complexes in E. coli. We have investigated the potential of these tools in many applications, including the production of nuclease-sensitive RNAs encapsulated in viral protein pseudo-particles, the co-production of non-coding RNAs with chaperone proteins, the incorporation of a post-transcriptional RNA modification by co-production with the appropriate modifying enzyme and finally the production and purification of an RNA–His-tagged protein complex by nickel affinity chromatography. We show that this last application easily provides pure material for crystallographic studies. The new tools we report will pave the way to large-scale structural and molecular investigations of RNA function and interactions with proteins.
Computational methods to detect conserved non-genic elements in phylogenetically isolated genomes: application to zebrafish
Many important model organisms for biomedical and evolutionary research have sequenced genomes, but occupy a phylogenetically isolated position, evolutionarily distant from other sequenced genomes. This phylogenetic isolation is exemplified by zebrafish, a vertebrate model for cis-regulation, development and human disease, whose evolutionary distance to all other currently sequenced fish exceeds the distance between human and chicken. Such large distances make it difficult to align genomes and use them for comparative analysis beyond gene-focused questions. In particular, detecting conserved non-genic elements (CNEs) as promising cis-regulatory elements with biological importance is challenging. Here, we develop a general comparative genomics framework to align isolated genomes and to comprehensively detect CNEs. Our approach integrates highly sensitive and quality-controlled local alignments and uses alignment transitivity and ancestral reconstruction to bridge large evolutionary distances. We apply our framework to zebrafish and demonstrate substantially improved CNE detection and quality compared with previous sets. Our zebrafish CNE set comprises 54 533 CNEs, of which 11 792 (22%) are conserved to human or mouse. Our zebrafish CNEs (http://zebrafish.stanford.edu) are highly enriched in known enhancers and extend existing experimental (ChIP-Seq) sets. The same framework can now be applied to the isolated genomes of frog, amphioxus, Caenorhabditis elegans and many others.
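Alignment transitivity, mentioned above as one way to bridge large evolutionary distances, can be sketched in a few lines: if genome A aligns to an intermediate genome B, and B aligns to genome C, the overlapping parts of the two alignments imply an A-to-C alignment. The example below is a minimal illustration under strong simplifying assumptions (ungapped blocks, same strand, a single chromosome per genome); the `Block` layout and all coordinates are hypothetical and do not describe the authors' pipeline.

```python
from typing import List, Tuple

# One ungapped, same-strand alignment block:
# (src_start, src_end, dst_start), with half-open source interval [src_start, src_end).
Block = Tuple[int, int, int]


def compose(a_to_b: List[Block], b_to_c: List[Block]) -> List[Block]:
    """Derive A-to-C blocks from A-to-B and B-to-C blocks via their overlap on B."""
    result: List[Block] = []
    for a_start, a_end, b_start in a_to_b:
        b_end = b_start + (a_end - a_start)  # extent of this block on genome B
        for q_start, q_end, c_start in b_to_c:
            # Intersect the two intervals on genome B.
            lo = max(b_start, q_start)
            hi = min(b_end, q_end)
            if lo < hi:
                result.append(
                    (
                        a_start + (lo - b_start),  # map overlap start back to A
                        a_start + (hi - b_start),  # map overlap end back to A
                        c_start + (lo - q_start),  # map overlap start forward to C
                    )
                )
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical coordinates: A = isolated genome, B = intermediate, C = distant genome.
    a_to_b = [(100, 150, 1000)]   # A[100:150] aligns to B[1000:1050]
    b_to_c = [(1020, 1080, 5000)]  # B[1020:1080] aligns to C[5000:5060]
    print(compose(a_to_b, b_to_c))
```

Only the portion of A that survives both hops is reported, which is why transitive alignments tend to be shorter than either direct alignment.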