PLoS Computational Biology
AprioriGWAS, a New Pattern Mining Strategy for Detecting Genetic Variants Associated with Disease through Interaction Effects
by Qingrun Zhang, Quan Long, Jurg Ott
Identifying gene-gene interaction is a hot topic in genome-wide association studies. Two fundamental challenges are: (1) how to efficiently identify combinations of variants that may be associated with the trait from the astronomical number of all possible combinations; and (2) how to test epistatic interaction when all potential combinations are available. We developed AprioriGWAS, which brings two innovations. (1) Based on Apriori, a successful method in the field of Frequent Itemset Mining (FIM) in which a pattern growth strategy is leveraged to effectively and accurately reduce the search space, AprioriGWAS can efficiently identify genetically associated genotype patterns. (2) To test the hypotheses of epistasis, we adopt a new conditional permutation procedure to obtain reliable statistical inference of Pearson's chi-square test for the contingency table generated by associated variants. By applying AprioriGWAS to age-related macular degeneration (AMD) data, we found that: (1) angiopoietin 1 (ANGPT1) and four retinal genes interact with Complement Factor H (CFH); and (2) the GO term “glycosaminoglycan biosynthetic process” was enriched in AMD interacting genes. The epistatic interactions newly found by AprioriGWAS on AMD data are likely true interactions, since genes interacting with CFH are retinal genes, and GO term enrichment also verified that interaction between glycosaminoglycans (GAGs) and CFH plays an important role in disease pathology of AMD. By applying AprioriGWAS to bipolar disorder in WTCCC data, we found that variants without marginal effect show significant interactions, for example multiple-SNP genotype patterns inside the genes GABRB2 and GRIA1 (AMPA subunit 1 receptor gene). AMPARs are found in many parts of the brain and are the most commonly found receptor in the nervous system. GABRB2 mediates the fastest inhibitory synaptic transmission in the central nervous system. Multiple lines of evidence support the relevance of GRIA1 and GABRB2 to mental disorders.
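The pattern growth idea behind Apriori can be illustrated with a minimal sketch of the generic frequent-itemset algorithm. This is not the AprioriGWAS implementation: the function name `apriori`, the encoding of each individual as a set of hypothetical (SNP, genotype) pairs, and the plain support threshold are all assumptions made for illustration. The point is only that a k-item pattern is counted if and only if every one of its (k-1)-item sub-patterns was already frequent, which is what shrinks the search space.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    `min_support` transactions (generic Apriori, not AprioriGWAS itself)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data: each individual is a set of (SNP, genotype) items.
cases = [
    {("rs1", "AA"), ("rs2", "AG"), ("rs3", "GG")},
    {("rs1", "AA"), ("rs2", "AG")},
    {("rs1", "AA"), ("rs3", "GG")},
    {("rs2", "AG"), ("rs3", "GG")},
]
patterns = apriori(cases, min_support=2)
```

In this toy data the three-item pattern is generated as a candidate but dropped, because it occurs in only one individual.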
Messages Do Diffuse Faster than Messengers: Reconciling Disparate Estimates of the Morphogen Bicoid Diffusion Coefficient
by Lorena Sigaut, John E. Pearson, Alejandro Colman-Lerner, Silvina Ponce Dawson
The gradient of Bicoid (Bcd) is key for the establishment of the anterior-posterior axis in Drosophila embryos. The gradient properties are compatible with the SDD model in which Bcd is synthesized at the anterior pole and then diffuses into the embryo and is degraded with a characteristic time. Within this model, the Bcd diffusion coefficient is critical to set the timescale of gradient formation. This coefficient has been measured using two optical techniques, Fluorescence Recovery After Photobleaching (FRAP) and Fluorescence Correlation Spectroscopy (FCS), obtaining estimates in which the FCS value is an order of magnitude larger than the FRAP one. This discrepancy raises the following questions: which estimate is "correct"; what is the reason for the disparity; and can the SDD model explain Bcd gradient formation within the experimentally observed times? In this paper, we use a simple biophysical model in which Bcd diffuses and interacts with binding sites to show that both the FRAP and the FCS estimates may be correct and compatible with the observed timescale of gradient formation. The discrepancy arises from the fact that FCS and FRAP report on different effective (concentration dependent) diffusion coefficients, one of which describes the spreading rate of the individual Bcd molecules (the messengers) and the other one that of their concentration (the message). The latter is the one that is more relevant for the gradient establishment and is compatible with its formation within the experimentally observed times.
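For orientation, the SDD (synthesis, diffusion, degradation) model named above is commonly written in one dimension as follows. The symbols are assumptions introduced here, not notation from the abstract: c(x,t) is the Bcd concentration, D the diffusion coefficient, τ the characteristic degradation time, and c_0 the steady-state concentration at the anterior source (x = 0) on a domain much longer than the gradient.

```latex
\[
\frac{\partial c}{\partial t} = D\,\frac{\partial^{2} c}{\partial x^{2}} - \frac{c}{\tau},
\qquad
c_{\mathrm{ss}}(x) = c_{0}\, e^{-x/\lambda},
\qquad
\lambda = \sqrt{D\,\tau}.
\]
```

Under these assumptions, D and τ together fix the length scale of the gradient and τ sets the time needed to approach steady state, which is why an order-of-magnitude disagreement in D matters for whether the gradient can form in the observed time.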
Histone Modifications Are Associated with Transcript Isoform Diversity in Normal and Cancer Cells
by Ondrej Podlaha, Subhajyoti De, Mithat Gonen, Franziska Michor
Mechanisms that generate transcript diversity are of fundamental importance in eukaryotes. Although a large fraction of human protein-coding genes and lincRNAs produce more than one mRNA isoform each, the regulation of this phenomenon is still incompletely understood. Much progress has been made in deciphering the role of sequence-specific features as well as DNA- and RNA-binding proteins in alternative splicing. Recently, however, several experimental studies of individual genes have revealed a direct involvement of epigenetic factors in alternative splicing and transcription initiation. While histone modifications are generally correlated with overall gene expression levels, it remains unclear how histone modification enrichment affects relative isoform abundance. Therefore, we sought to investigate the associations between histone modifications and transcript diversity levels measured by the rates of transcription start-site switching and alternative splicing on a genome-wide scale across protein-coding genes and lincRNAs. We found that the relationship between enrichment levels of epigenetic marks and transcription start-site switching is similar for protein-coding genes and lincRNAs. Furthermore, we found associations between splicing rates and enrichment levels of H2az, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me3, H3K27ac, H3K27me3, H3K36me3, H3K79me2, and H4K20me, marks traditionally associated with enhancers, transcription initiation, transcriptional repression, and others. These patterns were observed in both normal and cancer cell lines. Additionally, we developed a novel computational method that identified 840 epigenetically regulated candidate genes and predicted transcription start-site switching and alternative exon splicing with up to 92% accuracy based on epigenetic patterning alone. Our results suggest that the epigenetic regulation of transcript isoform diversity may be a relatively common genome-wide phenomenon representing an avenue of deregulation in tumor development.
by Xin Tang, Alireza Tofangchi, Sandeep V. Anand, Taher A. Saif
Traction forces exerted by adherent cells on their microenvironment can mediate many critical cellular functions. Accurate quantification of these forces is essential for mechanistic understanding of mechanotransduction. However, most existing methods of quantifying cellular forces are limited to single cells in isolation, whereas most physiological processes are inherently multi-cellular in nature where cell-cell and cell-microenvironment interactions determine the emergent properties of cell clusters. In the present study, a robust finite-element-method-based cell traction force microscopy technique is developed to estimate the traction forces produced by multiple isolated cells as well as cell clusters on soft substrates. The method accounts for the finite thickness of the substrate. Hence, cell cluster size can be larger than substrate thickness. The method allows computing the traction field from the substrate displacements within the cells' and clusters' boundaries. The displacement data outside these boundaries are not necessary. The utility of the method is demonstrated by computing the traction generated by multiple monkey kidney fibroblasts (MKF) and human colon cancerous (HCT-8) cells in close proximity, as well as by large clusters. It is found that cells act as individual contractile groups within clusters for generating traction. There may be multiple such groups in the cluster, or the entire cluster may behave as a single group. Individual cells do not form dipoles, but serve as a conduit of force (transmission lines) over long distances in the cluster. The cell-cell force can be either tensile or compressive depending on the cell-microenvironment interactions.
The Self-Limiting Dynamics of TGF-β Signaling In Silico and In Vitro, with Negative Feedback through PPM1A Upregulation
by Junjie Wang, Lisa Tucker-Kellogg, Inn Chuan Ng, Ruirui Jia, P. S. Thiagarajan, Jacob K. White, Hanry Yu
The TGF-β/Smad signaling system decreases its activity through strong negative regulation. Several molecular mechanisms of negative regulation have been published, but the relative impact of each mechanism on the overall system is unknown. In this work, we used computational and experimental methods to assess multiple negative regulatory effects on Smad signaling in HaCaT cells. Previously reported negative regulatory effects were classified by time-scale: degradation of phosphorylated R-Smad and I-Smad-induced receptor degradation were slow-mode effects, and dephosphorylation of R-Smad was a fast-mode effect. We modeled combinations of these effects, but found no combination capable of explaining the observed dynamics of TGF-β/Smad signaling. We then proposed a negative feedback loop with upregulation of the phosphatase PPM1A. The resulting model was able to explain the dynamics of Smad signaling, under both short and long exposures to TGF-β. Consistent with this model, immuno-blots showed PPM1A levels to be significantly increased within 30 min after TGF-β stimulation. Lastly, our model was able to resolve an apparent contradiction in the published literature, concerning the dynamics of phosphorylated R-Smad degradation. We conclude that the dynamics of Smad negative regulation cannot be explained by the negative regulatory effects that had previously been modeled, and we provide evidence for a new negative feedback loop through PPM1A upregulation. This work shows that tight coupling of computational and experimental approaches can yield improved understanding of complex pathways.
by Trygve Solstad, Hosam N. Yousif, Terrence J. Sejnowski
Episodic-like memory is thought to be supported by attractor dynamics in the hippocampus. A possible neural substrate for this memory mechanism is rate remapping, in which the spatial map of place cells encodes contextual information through firing rate variability. To test whether memories are stored as multimodal attractors in populations of place cells, recent experiments morphed one familiar context into another while observing the responses of CA3 cell ensembles. Average population activity in CA3 was reported to transition gradually rather than abruptly from one familiar context to the next, suggesting a lack of attractive forces associated with the two stored representations. On the other hand, individual CA3 cells showed a mix of gradual and abrupt transitions at different points along the morph sequence, and some displayed hysteresis which is a signature of attractor dynamics. To understand whether these seemingly conflicting results are commensurate with attractor network theory, we developed a neural network model of the CA3 with attractors for both position and discrete contexts. We found that for memories stored in overlapping neural ensembles within a single spatial map, position-dependent context attractors made transitions at different points along the morph sequence. Smooth transition curves arose from averaging across the population, while a heterogeneous set of responses was observed on the single unit level. In contrast, orthogonal memories led to abrupt and coherent transitions on both population and single unit levels as experimentally observed when remapping between two independent spatial maps. Strong recurrent feedback entailed a hysteretic effect on the network which diminished with the amount of overlap in the stored memories. These results suggest that context-dependent memory can be supported by overlapping local attractors within a spatial map of CA3 place cells. Similar mechanisms for context-dependent memory may also be found in other regions of the cerebral cortex.
by Michael Kuhn, Anthony A. Hyman, Andreas Beyer
Repurposing existing proteins for new cellular functions is recognized as a main mechanism of evolutionary innovation, but its role in organelle evolution is unclear. Here, we explore the mechanisms that led to the evolution of the centrosome, an ancestral eukaryotic organelle that expanded its functional repertoire through the course of evolution. We developed a refined sequence alignment technique that is more sensitive to coiled coil proteins, which are abundant in the centrosome. For proteins with high coiled-coil content, our algorithm identified 17% more reciprocal best hits than BLAST. Analyzing 108 eukaryotic genomes, we traced the evolutionary history of centrosome proteins. In order to assess how these proteins formed the centrosome and adopted new functions, we computationally emulated evolution by iteratively removing the most recently evolved proteins from the centrosomal protein interaction network. Coiled-coil proteins that first appeared in the animal–fungi ancestor act as scaffolds and recruit ancestral eukaryotic proteins such as kinases and phosphatases to the centrosome. This process created a signaling hub that is crucial for multicellular development. Our results demonstrate how ancient proteins can be co-opted to different cellular localizations, thereby becoming involved in novel functions.
by Steven W. Sowa, Michael Baldea, Lydia M. Contreras
Methods for improving microbial strains for metabolite production remain the subject of constant research. Traditionally, metabolic tuning has been mostly limited to knockouts or overexpression of pathway genes and regulators. In this paper, we establish a new method to control metabolism by inducing optimally tuned time-oscillations in the levels of selected clusters of enzymes, as an alternative strategy to increase the production of a desired metabolite. Using an established kinetic model of the central carbon metabolism of Escherichia coli, we formulate this concept as a dynamic optimization problem over an extended, but finite time horizon. Total production of a metabolite of interest (in this case, phosphoenolpyruvate, PEP) is established as the objective function and time-varying concentrations of the cellular enzymes are used as decision variables. We observe that by varying, in an optimal fashion, levels of key enzymes in time, PEP production increases significantly compared to the unoptimized system. We demonstrate that oscillations can improve metabolic output in experimentally feasible synthetic circuits.
by Johanni Brea, Robert Urbanczik, Walter Senn
Recent experiments revealed that the fruit fly Drosophila melanogaster has a dedicated mechanism for forgetting: blocking the G-protein Rac leads to slower forgetting, and activating Rac leads to faster forgetting. This active form of forgetting lacks a satisfactory functional explanation. We investigated optimal decision making for an agent adapting to a stochastic environment where a stimulus may switch between being indicative of reward or punishment. Like Drosophila, an optimal agent shows forgetting with a rate that is linked to the time scale of changes in the environment. Moreover, to reduce the odds of missing future reward, an optimal agent may trade the risk of immediate pain for information gain and thus forget faster after aversive conditioning. A simple neuronal network reproduces these features. Our theory shows that forgetting in Drosophila appears as an optimal adaptive behavior in a changing environment. This is in line with the view that forgetting is adaptive rather than a consequence of limitations of the memory system.
Correction: Impact of Different Oseltamivir Regimens on Treating Influenza A Virus Infection and Resistance Emergence: Insights from a Modelling Study
by The PLOS Computational Biology Staff
by Deborah Chasman, Brandi Gancarz, Linhui Hao, Michael Ferris, Paul Ahlquist, Mark Craven
Systematic, genome-wide loss-of-function experiments can be used to identify host factors that directly or indirectly facilitate or inhibit the replication of a virus in a host cell. We present an approach that combines an integer linear program and a diffusion kernel method to infer the pathways through which those host factors modulate viral replication. The inputs to the method are a set of viral phenotypes observed in single-host-gene mutants and a background network consisting of a variety of host intracellular interactions. The output is an ensemble of subnetworks that provides a consistent explanation for the measured phenotypes, predicts which unassayed host factors modulate the virus, and predicts which host factors are the most direct interfaces with the virus. We infer host-virus interaction subnetworks using data from experiments screening the yeast genome for genes modulating the replication of two RNA viruses. Because a gold-standard network is unavailable, we assess the predicted subnetworks using both computational and qualitative analyses. We conduct a cross-validation experiment in which we predict whether held-aside test genes have an effect on viral replication. Our approach is able to make high-confidence predictions more accurately than several baselines, and about as well as the best baseline, which does not infer mechanistic pathways. We also examine two kinds of predictions made by our method: which host factors are nearest to a direct interaction with a viral component, and which unassayed host genes are likely to be involved in viral replication. Multiple predictions are supported by recent independent experimental data, or are components or functional partners of confirmed relevant complexes or pathways. Integer program code, background network data, and inferred host-virus subnetworks are available at http://www.biostat.wisc.edu/~craven/chasman_host_virus/.
A New Tool to Quantify Receptor Recruitment to Cell Contact Sites during Host-Pathogen Interaction
by Matthew S. Graus, Carolyn Pehlke, Michael J. Wester, Lisa B. Davidson, Stanly L. Steinberg, Aaron K. Neumann
To understand the process of innate immune fungal recognition, we developed computational tools for the rigorous quantification and comparison of receptor recruitment and distribution at cell-cell contact sites. We used these tools to quantify pattern recognition receptor spatiotemporal distributions in contacts between primary human dendritic cells and the fungal pathogens C. albicans, C. parapsilosis and the environmental yeast S. cerevisiae, imaged using 3D multichannel laser scanning confocal microscopy. The detailed quantitative analysis of contact sites shows that, despite considerable biochemical similarity in the composition and structure of these species' cell walls, the receptor spatiotemporal distribution in host-microbe contact sites varies significantly between these yeasts. Our findings suggest a model where innate immune cells discriminate fungal microorganisms based on differential mobilization and coordination of receptor networks. Our analysis methods are also broadly applicable to a range of cell-cell interactions central to many biological problems.
Exploration of the Dynamic Properties of Protein Complexes Predicted from Spatially Constrained Protein-Protein Interaction Networks
by Eric A. Yen, Aaron Tsay, Jerome Waldispuhl, Jackie Vogel
Protein complexes are not static, but rather highly dynamic with subunits that undergo 1-dimensional diffusion with respect to each other. Interactions within protein complexes are modulated through regulatory inputs that alter interactions and introduce new components and deplete existing components through exchange. While it is clear that the structure and function of any given protein complex is coupled to its dynamical properties, it remains a challenge to predict the possible conformations that complexes can adopt. Protein-fragment Complementation Assays detect physical interactions between protein pairs constrained to ≤8 nm from each other in living cells. This method has been used to build networks composed of 1000s of pair-wise interactions. Significantly, these networks contain a wealth of dynamic information, as the assay is fully reversible and the proteins are expressed in their natural context. In this study, we describe a method that extracts this valuable information in the form of predicted conformations, allowing the user to explore the conformational landscape, to search for structures that correlate with an activity state, and to estimate the abundance of conformations in the living cell. The generator is based on a Markov Chain Monte Carlo simulation that uses the interaction dataset as input and is constrained by the physical resolution of the assay. We applied this method to an 18-member protein complex composed of the seven core proteins of the budding yeast Arp2/3 complex and 11 associated regulators and effector proteins. We generated 20,480 output structures and identified conformational states using principal component analysis. We interrogated the conformation landscape and found evidence of symmetry breaking, a mixture of likely active and inactive conformational states and dynamic exchange of the core protein Arc15 between core and regulatory components. Our method provides a novel tool for prediction and visualization of the hidden dynamics within protein interaction networks.
by Jisoo Park, Heather C. Wick, Daniel E. Kee, Keith Noto, Jill L. Maron, Donna K. Slonim
Identifying molecular connections between developmental processes and disease can lead to new hypotheses about health risks at all stages of life. Here we introduce a new approach to identifying significant connections between gene sets and disease genes, and apply it to several gene sets related to human development. To overcome the limits of incomplete and imperfect information linking genes to disease, we pool genes within disease subtrees in the MeSH taxonomy, and we demonstrate that such pooling improves the power and accuracy of our approach. Significance is assessed through permutation. We created a web-based visualization tool to facilitate multi-scale exploration of this large collection of significant connections (http://gda.cs.tufts.edu/development). High-level analysis of the results reveals expected connections between tissue-specific developmental processes and diseases linked to those tissues, and widespread connections to developmental disorders and cancers. Yet interesting new hypotheses may be derived from examining the unexpected connections. We highlight and discuss the implications of three such connections, linking dementia with bone development, polycystic ovary syndrome with cardiovascular development, and retinopathy of prematurity with lung development. Our results provide additional evidence that plays a key role in the early pathogenesis of polycystic ovary syndrome. Our evidence also suggests that the VEGF pathway and downstream NFKB signaling may explain the complex relationship between bronchopulmonary dysplasia and retinopathy of prematurity, and may form a bridge between two currently-competing hypotheses about the molecular origins of bronchopulmonary dysplasia. Further data exploration and similar queries about other gene sets may generate a variety of new information about the molecular relationships between additional diseases.
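Assessing significance "through permutation" can be sketched as follows. The abstract does not describe the actual permutation scheme, so this is only one plausible reading: the overlap between a gene set and a pooled disease gene list is compared with the overlaps of random gene sets of the same size drawn from a background. The function name, the gene identifiers, and the background definition are all hypothetical.

```python
import random


def overlap_permutation_pvalue(gene_set, disease_genes, background,
                               n_perm=1000, seed=0):
    """Empirical p-value for the overlap between `gene_set` and
    `disease_genes`, using random same-size gene sets drawn from
    `background` as the null (an assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    gene_set = set(gene_set)
    disease_genes = set(disease_genes)
    background = sorted(background)  # fixed order keeps the seed reproducible

    observed = len(gene_set & disease_genes)
    hits = 0
    for _ in range(n_perm):
        sample = set(rng.sample(background, len(gene_set)))
        if len(sample & disease_genes) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 100 background genes, 10 of them disease-linked.
background = [f"g{i}" for i in range(100)]
disease = background[:10]
development_set = background[:5]  # every member is disease-linked
observed, p_value = overlap_permutation_pvalue(development_set, disease, background)
```

The add-one correction is a common convention for permutation p-values; whether the authors use it is not stated in the abstract.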
Learn from the Best
by Virginie Bernard, Sebastian J. Schultheiss, Magali Michaut
What is more inspiring than a discussion with the leading scientists in your field? As a student or a young researcher, you have likely been influenced by mentors guiding you in your career and leading you to your current position. Any discussion with or advice from an expert is certainly very helpful for young people. But how often do we have the opportunity to meet experts? Do we make the most out of these situations? Meetings organized for young scientists are a great opportunity not only for the attendees: they are an opportunity for experts to meet bright students and learn from them in return. In this article, we introduce several successful events organized by Regional Student Groups all around the world, bridging the gap between experts and young scientists. We highlight how rewarding it is for all participants: young researchers, experts, and organizers. We then discuss the various benefits and emphasize the importance of organizing and attending such meetings. As a young researcher, seeking mentorship and additional skills training is a crucial step in career development. Keep in mind that one day, you may be an inspiring mentor, too.
by Neetika Nath, John B. O. Mitchell, Gustavo Caetano-Anollés
Phylogenomic analysis of the occurrence and abundance of protein domains in proteomes has recently shown that the α/β architecture is probably the oldest fold design. This holds important implications for the origins of biochemistry. Here we explore structure-function relationships addressing the use of chemical mechanisms by ancestral enzymes. We test the hypothesis that the oldest folds used the most mechanisms. We start by tracing biocatalytic mechanisms operating in metabolic enzymes along a phylogenetic timeline of the first appearance of homologous superfamilies of protein domain structures from CATH. A total of 335 enzyme reactions were retrieved from MACiE and were mapped over fold age. We define a mechanistic step type as one of the 51 mechanistic annotations given in MACiE, and each step of each of the 335 mechanisms was described using one or more of these annotations. We find that the first two folds, the P-loop containing nucleotide triphosphate hydrolase and the NAD(P)-binding Rossmann-like homologous superfamilies, were α/β architectures responsible for introducing 35% (18/51) of the known mechanistic step types. We find that these two oldest structures in the phylogenomic analysis of protein domains introduced many mechanistic step types that were later combinatorially spread in catalytic history. The most common mechanistic step types included fundamental building blocks of enzyme chemistry: “Proton transfer,” “Bimolecular nucleophilic addition,” “Bimolecular nucleophilic substitution,” and “Unimolecular elimination by the conjugate base.” They were associated with the most ancestral fold structure typical of P-loop containing nucleotide triphosphate hydrolases. Over half of the mechanistic step types were introduced in the evolutionary timeline before the appearance of structures specific to diversified organisms, during a period of architectural diversification. The other half unfolded gradually after organismal diversification and during a period that spanned ∼2 billion years of evolutionary history.
by Tal Neiman, Yonatan Loewenstein
In operant learning, behaviors are reinforced or inhibited in response to the consequences of similar actions taken in the past. However, because in natural environments the “same” situation never recurs, it is essential for the learner to decide what “similar” is so that he can generalize from experience in one state of the world to future actions in different states of the world. The computational principles underlying this generalization are poorly understood, in particular because natural environments are typically too complex to study quantitatively. In this paper we study the principles underlying generalization in operant learning of professional basketball players. In particular, we utilize detailed information about the spatial organization of shot locations to study how players adapt their attacking strategy in real time according to recent events in the game. To quantify this learning, we study how a make/miss from one location in the court affects the probabilities of shooting from different locations. We show that generalization is not a spatially-local process, nor is it governed by the difficulty of the shot. Rather, to a first approximation, players use a simplified binary representation of the court into 2 pt and 3 pt zones. This result indicates that rather than using low-level features, generalization is determined by high-level cognitive processes that incorporate the abstract rules of the game.
In Silico Screening of the Key Cellular Remodeling Targets in Chronic Atrial Fibrillation
by Jussi T. Koivumäki, Gunnar Seemann, Mary M. Maleckar, Pasi Tavi
Chronic atrial fibrillation (AF) is a complex disease with underlying changes in electrophysiology, calcium signaling and the structure of atrial myocytes. How these individual remodeling targets and their emergent interactions contribute to cell physiology in chronic AF is not well understood. To approach this problem, we performed in silico experiments in a computational model of the human atrial myocyte. The remodeled function of cellular components was based on a broad literature review of in vitro findings in chronic AF, and these were integrated into the model to define a cohort of virtual cells. Simulation results indicate that while the altered function of calcium and potassium ion channels alone causes a pronounced decrease in action potential duration, remodeling of intracellular calcium handling also has a substantial impact on the chronic AF phenotype. We additionally found that the reduction in amplitude of the calcium transient in chronic AF as compared to normal sinus rhythm is primarily due to the remodeling of calcium channel function, calcium handling and cellular geometry. Finally, we found that decreased electrical resistance of the membrane together with remodeled calcium handling synergistically decreased cellular excitability and the subsequent inducibility of repolarization abnormalities in the human atrial myocyte in chronic AF. We conclude that the presented results highlight the complexity of both intrinsic cellular interactions and emergent properties of human atrial myocytes in chronic AF. Therefore, reversing remodeling for a single remodeled component does little to restore the normal sinus rhythm phenotype. These findings may have important implications for developing novel therapeutic approaches for chronic AF.
by Aistis Stankevicius, Quentin J. M. Huys, Aditi Kalra, Peggy Seriès
Optimists hold positive a priori beliefs about the future. In Bayesian statistical theory, a priori beliefs can be overcome by experience. However, optimistic beliefs can at times appear surprisingly resistant to evidence, suggesting that optimism might also influence how new information is selected and learned. Here, we use a novel Pavlovian conditioning task, embedded in a normative framework, to directly assess how trait optimism, as classically measured using self-report questionnaires, influences choices between visual targets, by learning about their association with reward progresses. We find that trait optimism relates to an a priori belief about the likelihood of rewards, but not losses, in our task. Critically, this positive belief behaves like a probabilistic prior, i.e. its influence reduces with increasing experience. Contrary to findings in the literature related to unrealistic optimism and self-beliefs, it does not appear to influence the iterative learning process directly.
by Devin Greene, Kristina Crona
It has recently been noted that the relative prevalence of the various kinds of epistasis varies along an adaptive walk. This has been explained as a result of mean regression in NK model fitness landscapes. Here we show that this phenomenon occurs quite generally in fitness landscapes. We propose a simple and general explanation for this phenomenon, confirming the role of mean regression. We provide support for this explanation with simulations, and discuss the empirical relevance of our findings.