Proteins: Structure, Function, and Bioinformatics
Characterization of the differences in the cyclopiazonic acid binding mode to mammalian and P. falciparum Ca2+ pumps: A computational study
Despite the investments in malaria research, an effective vaccine has not yet been developed and the causative parasites are becoming increasingly resistant to most of the available drugs. PfATP6, the sarco/endoplasmic reticulum Ca2+ pump (SERCA) of P. falciparum, has been recently genetically validated as a potential antimalarial target and cyclopiazonic acid (CPA) has been found to be a potent inhibitor of SERCAs in several organisms, including P. falciparum. In position 263, PfATP6 displays a leucine residue, whilst the corresponding position in the mammalian SERCA is occupied by a glutamic acid. The PfATP6 L263E mutation has been studied in relation to the artemisinin inhibitory effect on P. falciparum and recent studies have provided evidence that the parasite with this mutation is more susceptible to CPA. Here, we characterized, for the first time, the interaction of CPA with PfATP6 and its mammalian counterpart to understand similarities and differences in the mode of binding of the inhibitor to the two Ca2+ pumps. We found that, even though CPA does not directly interact with the residue in position 263, the presence of a hydrophobic residue in this position in PfATP6 rather than a negatively charged one, as in the mammalian SERCA, entails a conformational arrangement of the binding pocket which, in turn, determines a relaxation of CPA leading to a different binding mode of the compound. Our findings highlight differences between the plasmodial and human SERCA CPA-binding pockets that may be exploited to design CPA derivatives more selective toward PfATP6. Proteins 2015; 83:564–574. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
During CASP10 in summer 2012, we tested BCL::Fold for prediction of free modeling (FM) and template-based modeling (TBM) targets. BCL::Fold assembles the tertiary structure of a protein from predicted secondary structure elements (SSEs), omitting the more flexible loop regions early on. This approach enables the sampling of conformational space for larger proteins with more complex topologies. In preparation for CASP11, we analyzed the quality of CASP10 models throughout the prediction pipeline to understand BCL::Fold's ability to sample the native topology and to identify native-like models by scoring and/or clustering approaches, as well as our ability to add loop regions and side chains to initial SSE-only models. The standout observation is that BCL::Fold sampled topologies de novo with a GDT_TS score > 33% for 12 of 18 test cases and with a topology score > 0.8 for 11 of 18. Despite the sampling success of BCL::Fold, significant challenges remain in the clustering and loop generation stages of the pipeline. The clustering approach employed for model selection often failed to identify the most native-like assembly of SSEs for further refinement and submission. For some β-strand proteins, model refinement also failed because β-strands were not properly aligned to form hydrogen bonds, which removed otherwise accurate models from the pool. Further, BCL::Fold frequently samples non-natural topologies that require loop regions to pass through the center of the protein. Proteins 2015; 83:547–563. © 2015 Wiley Periodicals, Inc.
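As a rough illustration of the GDT_TS threshold quoted above, the sketch below computes the score from per-residue Cα distances between a model and the native structure. It assumes the two structures have already been superimposed once; the full GDT procedure searches over many superpositions and takes the best result for each cutoff, so this simplified version can only underestimate the true score. The function name and the toy distances are hypothetical.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Return a GDT_TS-style score (0-100) from per-residue distances in angstroms.

    Simplification: a single, fixed superposition is assumed, so the value is a
    lower bound on the score produced by a full GDT search.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # GDT_TS is the mean of the four fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a five-residue toy model.
toy_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
print(f"GDT_TS = {gdt_ts(toy_distances):.1f}")
```

A model would pass the 33% threshold mentioned in the abstract when this value exceeds 33.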
Hierarchical domain-motion analysis of conformational changes in sarcoplasmic reticulum Ca2+-ATPase
Sarco(endo)plasmic reticulum Ca2+-ATPase transports two Ca2+ per ATP hydrolyzed across biological membranes against a large concentration gradient by undergoing large conformational changes. Structural studies with X-ray crystallography revealed functional roles of coupled motions between the cytoplasmic domains and the transmembrane helices in individual reaction steps. Here, we employed “Motion Tree (MT),” a tree diagram that describes a conformational change between two structures, and applied it to representative Ca2+-ATPase structures. MT provides information on coupled rigid-body motions of the ATPase in individual reaction steps. Fourteen rigid structural units, “common rigid domains (CRDs),” are identified from seven MTs throughout the whole enzymatic reaction cycle. CRDs likely act not only as structural units but also as functional units. Some of their functional importance is newly revealed by the analysis. The stability of each CRD is examined on the morphing trajectories that cover seven conformational transitions. We confirmed that the large conformational changes are realized by motions only in the flexible regions that connect CRDs. The Ca2+-ATPase efficiently utilizes its intrinsic flexibility and rigidity to respond to different switches such as ligand binding/dissociation or ATP hydrolysis. The analysis detects functional motions without extensive biological knowledge of experts, suggesting its general applicability to domain movements in other membrane proteins to deepen the understanding of protein structure and function. Proteins 2015. © 2015 Wiley Periodicals, Inc.
Conformational and connotational heterogeneity: A surprising relationship between protein structural flexibility and puns
Protein structures are often thought of as static objects, and indeed, the bulk of a protein's sequence forms α-helices, β-sheets, and other generally well-ordered substructures. These portions of the molecule pre-pay the entropic price of maintaining a globally unique fold, freeing other regions to adopt multiple alternative conformations. In many cases, this localized flexibility is biologically interesting: it may be important for catalytic turnover or for conformational selection before forming an intermolecular complex, for example. Similarly, most of written language is carefully tuned to avoid ambiguity and convey a singular meaning, a cohesive message. This linguistic scaffolding in some sense pre-pays a rhetorical price, paving the way for punctuated instances in which a given word or phrase can simultaneously adopt multiple alternative connotations—in other words, for puns. Proteins 2015. © 2015 Wiley Periodicals, Inc.
The ability of bacteria to use cGMP as a second messenger has been controversial for decades. Recently, nucleotide cyclases from Rhodospirillum centenum, GcyA, and Xanthomonas campestris, GuaX, have been shown to possess guanylate cyclase activities. Enzymatic activities of these guanylate cyclases measured in vitro were low, which makes interpretation of the assays ambiguous. Protein sequence analysis at present is insufficient to distinguish between bacterial adenylate and guanylate cyclases, both of which belong to nucleotide cyclases of type III. We developed a simple method for discriminating between guanylate and adenylate cyclase activities in a physiologically relevant bacterial system. The method relies on the use of a mutant cAMP receptor protein, CRPG, constructed here. While wild-type CRP is activated exclusively by cAMP, CRPG can be activated by either cAMP or cGMP. Using CRP- and CRPG-dependent lacZ expression in two E. coli strains, we verified that R. centenum GcyA and X. campestris GuaX have primarily guanylate cyclase activities. Of two other bacterial nucleotide cyclases tested, one, GuaA from Azospirillum sp. B510, proved to have guanylate cyclase activity, while the other, Bradyrhizobium japonicum CyaA, turned out to function as an adenylate cyclase. The results obtained with this reporter system were in excellent agreement with direct measurements of cyclic nucleotides secreted by E. coli expressing nucleotide cyclase genes. The simple genetic screen developed here is expected to facilitate identification of bacterial guanylate cyclases and engineering of guanylate cyclases with desired properties. Proteins 2015. © 2015 Wiley Periodicals, Inc.
Structural mapping of the coiled-coil domain of a bacterial condensin and comparative analyses across all domains of life suggest conserved features of SMC proteins
The structural maintenance of chromosomes (SMC) proteins form the cores of multisubunit complexes that are required for the segregation and global organization of chromosomes in all domains of life. These proteins share a common domain structure in which N- and C-terminal regions pack against one another to form a globular ATPase domain. This “head” domain is connected to a central, globular, “hinge” or dimerization domain by a long, antiparallel coiled coil. To date, most efforts for structural characterization of SMC proteins have focused on the globular domains. Recently, however, we developed a method to map inter-strand interactions in the 50 nm coiled-coil domain of MukB, the divergent SMC protein found in γ-proteobacteria. Here, we apply that technique to map the structure of the B. subtilis SMC (BsSMC) coiled-coil domain. We find that, in contrast to the relatively complicated coiled-coil domain of MukB, the BsSMC domain is nearly continuous, with only two detectable coiled-coil interruptions. Near the middle of the domain is a break in coiled-coil structure where there are three more residues on the C-terminal strand than on the N-terminal strand. Close to the head domain, there is a second break with a significantly longer insertion on the same strand. These results provide an experience base that allows an informed interpretation of the output of coiled-coil prediction algorithms for this family of proteins. A comparison of such predictions suggests that these coiled-coil deviations are highly conserved across SMC types in a wide variety of organisms, including humans. This article is protected by copyright. All rights reserved.
AbDesign: An algorithm for combinatorial backbone design guided by natural conformations and sequences
Computational design of protein function has made substantial progress, generating new enzymes, binders, inhibitors, and nanomaterials not previously seen in nature. However, the ability to design new protein backbones for function – essential to exert control over all polypeptide degrees of freedom – remains a critical challenge. Most previous attempts to design new backbones computed the mainchain from scratch. Here, instead, we describe a combinatorial backbone and sequence optimization algorithm called AbDesign, which leverages the large number of sequences and experimentally determined molecular structures of antibodies to construct new antibody models, dock them against target surfaces and optimize their sequence and backbone conformation for high stability and binding affinity. We used the algorithm to produce antibody designs that target the same molecular surfaces as nine natural, high-affinity antibodies; in five cases interface sequence identity is above 30%, and in four of those the backbone conformation at the core of the antibody binding surface is within 1 Å root-mean square deviation from the natural antibodies. Designs recapitulate polar interaction networks observed in natural complexes, and amino acid sidechain rigidity at the designed binding surface, which is likely important for affinity and specificity, is high compared to previous design studies. In designed anti-lysozyme antibodies, complementarity-determining regions (CDRs) at the periphery of the interface, such as L1 and H2, show greater backbone conformation diversity than the CDRs at the core of the interface, and increase the binding surface area compared to the natural antibody, potentially enhancing affinity and specificity. This article is protected by copyright. All rights reserved.
Molecular modelling reveals binding interface of γ-tubulin with GCP4 and interactions with noscapinoids
The initiation of microtubule assembly within cells is guided by a cone-shaped multi-protein complex, the γ-tubulin ring complex (γTuRC), containing γ-tubulin and at least five other γ-tubulin-complex proteins (GCPs), i.e., GCP2, GCP3, GCP4, GCP5, and GCP6. The rim of γTuRC is a ring of γ-tubulin molecules that interacts, via one of its longitudinal interfaces, with GCP2, GCP3 or GCP4 and, via the other interface, with α/β-tubulin dimers recruited for microtubule lattice formation. These interactions, however, are not well understood in the absence of a crystal structure of a functional reconstitution of γTuRC subunits. In this study, we elucidate the atomic interactions between γ-tubulin and GCP4 through computational techniques. We simulated two γ-tubulin-GCP4 complexes (termed dimer 1 and dimer 2) for 25 ns to obtain a stable complex and calculated ensemble-averaged binding free energies of -158.82 kcal/mol and -170.19 kcal/mol for dimer 1 and -79.53 kcal/mol and -101.50 kcal/mol for dimer 2 using the MM-PBSA and MM-GBSA methods, respectively. These highly favourable binding free energy values point to very robust interactions between GCP4 and γ-tubulin. From the results of the free-energy decomposition and the computational alanine scanning calculation, we identified the amino acids crucial for the interaction of γ-tubulin with GCP4, called hotspots. Furthermore, in the endeavour to identify chemical leads that might interact at the interface of the γ-tubulin-GCP4 complex, we found a class of compounds based on the plant alkaloid noscapine that bind with high affinity in a cavity close to the γ-tubulin-GCP4 interface compared to previously reported compounds. All noscapinoids displayed stable interaction throughout the simulation; however, the most robust interaction was observed for bromo-noscapine, followed by noscapine and amino-noscapine. This offers a novel chemical scaffold for γ-tubulin binding drugs near the γ-tubulin-GCP4 interface.
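The ensemble-averaged binding free energies reported above can be illustrated with a minimal sketch. It assumes the common single-trajectory end-point scheme, in which each snapshot contributes G(complex) - G(receptor) - G(ligand) and the per-snapshot values are then averaged; the abstract does not state which protocol the authors used. The dictionary keys and the energy values are hypothetical.

```python
from statistics import mean, stdev


def binding_free_energy(frames):
    """Return (mean, sample standard deviation) of per-snapshot binding energies.

    Each frame is a dict with hypothetical keys 'complex', 'receptor' and
    'ligand', holding free energies in kcal/mol for one trajectory snapshot.
    """
    if not frames:
        raise ValueError("need at least one snapshot")
    per_frame = [f["complex"] - f["receptor"] - f["ligand"] for f in frames]
    spread = stdev(per_frame) if len(per_frame) > 1 else 0.0
    return mean(per_frame), spread


# Hypothetical snapshot energies, not taken from the study.
snapshots = [
    {"complex": -1000.0, "receptor": -600.0, "ligand": -300.0},
    {"complex": -1010.0, "receptor": -605.0, "ligand": -301.0},
]
average, spread = binding_free_energy(snapshots)
print(f"dG_bind = {average:.2f} +/- {spread:.2f} kcal/mol")
```

Reporting the spread alongside the mean makes it easier to judge whether differences such as those between dimer 1 and dimer 2 exceed the snapshot-to-snapshot fluctuation.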
Substrate Tunnels in Enzymes: Structure-Function Relationships and Computational Methodology
In enzymes, the active site is the location where incoming substrates are chemically converted to products. In some enzymes, this site is deeply buried within the core of the protein and in order to access the active site, substrates must pass through the body of the protein via a tunnel. In many systems, these tunnels act as filters and have been found to influence both substrate specificity and catalytic mechanism. Identifying and understanding how these tunnels exert such control has been of growing interest over the past several years due to implications in fields such as protein engineering and drug design. This growing interest has spurred the development of several computational methods to identify and analyze tunnels and how ligands migrate through these tunnels. The goal of this review is to outline how tunnels influence substrate specificity and catalytic efficiency in enzymes with tunnels and to provide a brief summary of the computational tools used to identify and evaluate these tunnels. This article is protected by copyright. All rights reserved.
Unprecedented access of phenolic substrates to the heme active site of a catalase: Substrate binding and peroxidase-like reactivity of Bacillus pumilus catalase monitored by X-ray crystallography and EPR spectroscopy
Heme-containing catalases and catalase-peroxidases catalyze the dismutation of hydrogen peroxide as their predominant catalytic activity, but in addition, individual enzymes support low levels of peroxidase and oxidase activities, produce superoxide and activate isoniazid as an anti-tubercular drug. The recent report of a heme enzyme with catalase, peroxidase and penicillin oxidase activities in Bacillus pumilus and its categorization as an unusual catalase-peroxidase led us to investigate the enzyme for comparison with other catalase-peroxidases, catalases and peroxidases. Characterization revealed a typical homotetrameric catalase with one pentacoordinated heme b per subunit (Tyr340 being the axial ligand), albeit in two orientations, and a very fast catalatic turnover rate (kcat = 339,000s−1). In addition, the enzyme supported a much slower (kcat ∼20s−1) peroxidatic activity utilizing substrates as diverse as ABTS and polyphenols, but no oxidase activity. Two binding sites, one in the main access channel and the other on the protein surface, accommodating pyrogallol, catechol, resorcinol, guaiacol, hydroquinone and 2-chlorophenol were identified in crystal structures at 1.65-1.95 Å. A third site, in the heme distal side, accommodating only pyrogallol and catechol, interacting with the heme iron and the catalytic His and Arg residues, was also identified. This site was confirmed in solution by EPR spectroscopy, which also showed that the phenolic oxygen was not directly coordinated to the heme iron (no low-spin conversion of the heme FeIII high-spin EPR signal upon substrate binding). This is the first demonstration of phenolic substrates directly accessing the heme distal side of a catalase. This article is protected by copyright. All rights reserved.
Crystal structure of YwpF from Staphylococcus aureus reveals its architecture comprised of a β-barrel core domain resembling type VI secretion system proteins and a two-helix pair
The ywpF gene (SAV2097) of the Staphylococcus aureus strain Mu50 encodes the YwpF protein, which may play a role in antibiotic resistance. Here, we report the first crystal structure of the YwpF superfamily from S. aureus at 2.5 Å resolution. The YwpF structure consists of two regions: an N-terminal core β-barrel domain that shows structural similarity to type VI secretion system (T6SS) proteins (e.g. Hcp1, Hcp3, and EvpC) and a C-terminal two-helix pair. Although the monomer structure of S. aureus YwpF resembles those of T6SS proteins, the dimer/tetramer model of S. aureus YwpF is distinct from the functionally important hexameric ring of T6SS proteins. We therefore suggest that the S. aureus YwpF may have a different function compared to T6SS proteins. This article is protected by copyright. All rights reserved.
Conserved movement of TMS11 between occluded conformations of LacY and XylE of the major facilitator superfamily suggests a similar hinge-like mechanism
The Δ-distance maps can detect local remodeling that is difficult to accurately determine using superimpositions. Transmembrane segments (TMSs) 11 in both LacY and XylE of the major facilitator superfamily uniquely contribute the greatest amount of mobile surface area in the outward-occluded state and undergo analogous movements. The intracellular part of TMS11 moves away from the C-terminal domain and into the substrate cavity during the conformational change from the outward-occluded to the inward-occluded state. A difference was noted between LacY and XylE when they assumed the inward open state after releasing a substrate to the inside in which TMS11 of LacY moved further into the substrate release space, whereas in XylE, TMS11 slightly retracted into the C-terminal domain. Independent movement of the N-terminal half of TMS11 suggests that it is flexible in the middle. Repeat-swapped homology modeling was used to discover that a loop connecting TMSs 10 and 11 in LacY probably moves during the transition between the unavailable outward-open state and the outward-occluded state. TMSs 11 and the other elements displaying a notable domain-independent movement colocalize with the interdomain linker, suggesting that these elements could drive the alternating access movement between the domain halves. Preliminary evidence indicates that analogous movements occur in other members of the major facilitator superfamily. Proteins 2015. © 2015 Wiley Periodicals, Inc.
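A Δ-distance map of the kind mentioned above can be sketched as the element-wise difference between the pairwise distance matrices of two conformations. Because only internal distances are used, no superposition is needed. This is a generic illustration with hypothetical coordinates, not the authors' implementation.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between points given as (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def delta_distance_map(coords_a, coords_b):
    """Return D_b - D_a for two conformations with matching atom order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must contain the same number of atoms")
    d_a = distance_matrix(coords_a)
    d_b = distance_matrix(coords_b)
    n = len(coords_a)
    return [[d_b[i][j] - d_a[i][j] for j in range(n)] for i in range(n)]


# Hypothetical three-atom conformations: the third atom swings out of line.
state_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
state_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
for row in delta_distance_map(state_a, state_b):
    print(["%+.1f" % value for value in row])
```

Positive entries mark pairs that move apart and negative entries mark pairs that approach each other, so a segment that moves independently shows up as a band of non-zero values.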
How to compare the structures of an ensemble of protein conformations is a fundamental problem in structural biology. As has been previously observed, the widely used RMSD measure due to Kabsch, in which a rigid-body superposition minimising the least-squares positional deviations is performed, has its drawbacks when comparing and visualising a set of flexible protein structures.
Here, we develop a method, fleximatch, of protein structure comparison that takes flexibility into account. Based on a distance matrix measure of flexibility, a weighted superposition of distance matrices rather than of atomic coordinates is performed. Subsequently, this allows a consistent determination of a) a superposition of structures for visualisation b) a partitioning of the protein structure into rigid molecular components (core atoms) and c) an atomic mobility measure. The method is suitable for highlighting both particularly flexible and rigid parts of a protein from structures derived from NMR, X-ray diffraction or molecular simulation. This article is protected by copyright. All rights reserved.
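To make the idea of a distance-matrix measure of flexibility concrete, the sketch below computes, for every atom pair, the standard deviation of the inter-atomic distance across an ensemble and then averages those values per atom to obtain a simple mobility score. This is an assumed, simplified stand-in and not the fleximatch algorithm, whose weighting scheme is not described in the abstract. All names and coordinates are hypothetical.

```python
import math
from statistics import mean, pstdev


def pair_distance_spread(ensemble):
    """Standard deviation of each pairwise distance across conformations.

    `ensemble` is a list of conformations; each conformation is a list of
    (x, y, z) tuples with the same atom order.
    """
    n = len(ensemble[0])
    if any(len(conf) != n for conf in ensemble):
        raise ValueError("all conformations must have the same number of atoms")
    spread = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distances = [math.dist(conf[i], conf[j]) for conf in ensemble]
            spread[i][j] = spread[j][i] = pstdev(distances)
    return spread


def atom_mobility(spread):
    """Mean distance spread of each atom with respect to all other atoms."""
    n = len(spread)
    return [mean(spread[i][j] for j in range(n) if j != i) for i in range(n)]


# Hypothetical two-member ensemble in which only the third atom moves.
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]
print(atom_mobility(pair_distance_spread(ensemble)))
```

Atoms with low scores are candidates for a rigid core, and such scores could in principle serve as weights in a subsequent superposition.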
Large efforts have been made in classifying residues as binding sites in proteins using machine learning methods. The prediction task can be translated into the computational challenge of assigning each residue the label binding site or non-binding site. Observational data comes from various possibly highly correlated sources. It includes the structure of the protein but not the structure of the complex. The model class of conditional random fields (CRFs) has previously successfully been used for protein binding site prediction. Here, a new CRF-approach is presented that models the dependencies of residues using a general graphical structure defined as a neighborhood graph and thus our model makes fewer independence assumptions on the labels than sequential labeling approaches. A novel node feature “change in free energy” is introduced into the model which is then denoted by ΔF-CRF. Parameters are trained with an Online Large-Margin algorithm. Using the standard feature class relative accessible surface area alone, the general graph-structure CRF already achieves higher prediction accuracy than the linear chain CRF of Li et al. ΔF-CRF performs significantly better on a large range of false positive rates than the support-vector-machine-based program PresCont of Zellner et al. on a homodimer set containing 128 chains. ΔF-CRF has a broader scope than PresCont since it is not constrained to protein subgroups and requires no multiple sequence alignment. The improvement is attributed to the advantageous combination of the novel node feature with the standard feature and to the adopted parameter training method. This article is protected by copyright. All rights reserved.
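The neighborhood graph on which such a CRF is defined can be sketched as follows: residues become nodes, and an edge joins any two residues whose representative atoms lie within a distance cutoff. The 8 Å cutoff and the use of one coordinate per residue are assumptions made for illustration; the abstract does not specify how the authors define neighbors.

```python
import math


def neighborhood_graph(coords, cutoff=8.0):
    """Return {residue index: set of neighbor indices} for a distance cutoff.

    `coords` holds one (x, y, z) tuple per residue (for example a C-alpha
    position); `cutoff` is in angstroms and is an assumed value.
    """
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                graph[i].add(j)
                graph[j].add(i)
    return graph


# Hypothetical residue positions: residue 2 is far from the others.
positions = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 6.0, 0.0)]
for residue, neighbors in neighborhood_graph(positions).items():
    print(residue, sorted(neighbors))
```

Unlike a linear chain, this graph can connect residues that are distant in sequence but close in space, which is why it imposes fewer independence assumptions on the labels.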
The near-symmetry of proteins
The majority of protein oligomers form clusters which are nearly symmetric. Understanding that imperfection, its origins, and perhaps also its advantages requires converting the vague, qualitative language currently used to describe near-symmetry into an accurate quantitative measure that allows us to answer questions such as: “What is the degree of symmetry deviation of the protein?,” “how do these deviations compare within a family of proteins?,” and so on. We developed quantitative methods to answer questions of this type. The methods can analyze the whole protein, its backbone, or selected portions of it, down to the comparison of specific symmetry-related amino acids, and can visualize the various levels of symmetry deviation in the form of symmetry maps. We applied these methods to an extensive list of homomers and heteromers and found that apparently no protein reaches perfect symmetry. Strikingly, even homomeric protein clusters are never ideally symmetric. We also found that the main burden of symmetry distortion falls on the amino acids near the symmetry axis, and that it is mainly the more hydrophilic amino acids that take part in symmetry-distortive interactions, among other findings. The remarkable ability of heteromers to preserve near-symmetry, despite their different sequences, was also shown and analyzed. The comprehensive literature on the suggested advantages of symmetric oligomerization raises a yet-unsolved key question: If symmetry is so advantageous, why do proteins stop shy of perfect symmetry? Some tentative answers to be tested in further studies are suggested in a concluding outlook. Proteins 2014. © 2014 Wiley Periodicals, Inc.
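One simple way to put a number on symmetry deviation, offered only as an illustration, is to apply the ideal symmetry operation to one subunit and measure how far it lands from its partner. The sketch below does this for a homodimer with a two-fold (C2) axis. It assumes the axis is already known and aligned with the z-axis through the origin; the abstract does not specify the authors' measure, and this is not it. Coordinates are hypothetical.

```python
import math


def c2_deviation(chain_a, chain_b):
    """RMSD between chain A and chain B after a 180-degree rotation about z.

    Both chains are lists of (x, y, z) tuples with equivalent atoms in the
    same order. A perfectly C2-symmetric dimer gives 0.
    """
    if not chain_a or len(chain_a) != len(chain_b):
        raise ValueError("chains must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(chain_a, chain_b):
        # A two-fold rotation about z maps (x, y, z) to (-x, -y, z).
        rx, ry, rz = -bx, -by, bz
        total += (ax - rx) ** 2 + (ay - ry) ** 2 + (az - rz) ** 2
    return math.sqrt(total / len(chain_a))


# Hypothetical dimer in which one atom of chain B is displaced by 1 angstrom.
chain_a = [(1.0, 0.0, 0.0), (2.0, 1.0, 3.0)]
chain_b = [(-1.0, 0.0, 0.0), (-2.0, -1.0, 4.0)]
print(f"C2 deviation: {c2_deviation(chain_a, chain_b):.3f} angstrom")
```

Computing the same quantity per residue instead of per chain would give the kind of position-resolved view that a symmetry map provides.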
Structure of the terminal PCP domain of the non-ribosomal peptide synthetase in teicoplanin biosynthesis
The biosynthesis of the glycopeptide antibiotics, of which teicoplanin and vancomycin are representative members, relies on the combination of non-ribosomal peptide synthesis and modification of the peptide by cytochrome P450 (Oxy) enzymes while the peptide remains bound to the peptide synthesis machinery. We have structurally characterized the final peptidyl carrier protein domain of the teicoplanin non-ribosomal peptide synthetase machinery: this domain is believed to mediate the interactions with tailoring Oxy enzymes in addition to its function as a shuttle for intermediates between multiple non-ribosomal peptide synthetase domains. Using solution state NMR, we have determined structures of this PCP domain in two states, the apo and the post-translationally modified holo state, both of which conform to a four-helix bundle assembly. The structures exhibit the same general fold as the majority of known carrier protein structures, in spite of the complex biosynthetic role that PCP domains from the final non-ribosomal peptide synthetase module must play in glycopeptide antibiotic biosynthesis. These structures thus support the hypothesis that it is subtle rearrangements, rather than dramatic conformational changes, which govern carrier protein interactions and selectivity during non-ribosomal peptide synthesis. Proteins 2015. © 2015 Wiley Periodicals, Inc.
Flexibility in the N-terminal actin-binding domain: Clues from in silico mutations and molecular dynamics
Dystrophin is a long, rod-shaped cytoskeleton protein implicated in muscular dystrophy (MDys). Utrophin is the closest autosomal homolog of dystrophin. Both proteins have an N-terminal actin-binding domain (N-ABD), a central rod domain and a C-terminal region. The N-ABD, composed of two calponin homology (CH) subdomains joined by a helical linker, harbors a few disease-causing missense mutations. Although the two proteins share considerable homology (>72%) in the N-ABD, recent structural and biochemical studies have shown that there are significant differences (including stability and mode of actin binding) and that their functions are not completely interchangeable. In this investigation, we have used extensive molecular dynamics simulations to understand the differences and the similarities of these two proteins, along with another actin-binding protein, fimbrin. In silico mutations were performed to identify two key residues that might be responsible for the dynamical difference between the molecules. The simulations point to the inherent flexibility of the linker region, which adopts different conformations in wild-type dystrophin. Mutations T220V and G130D in dystrophin constrain the flexibility of the central helical region, while in the two known disease-causing mutants, K18N and L54R, the helicity of the region is compromised. Phylogenetic tree and sequence analysis revealed that the dystrophin and utrophin genes have probably originated from the same ancestor. The investigation provides insight into the functional diversity of two closely related proteins and fimbrin, and contributes to our understanding of the mechanism of MDys. Proteins 2015. © 2015 Wiley Periodicals, Inc.
The antigen-binding site of antibodies forms at the interface of their two variable domains, VH and VL, making VH–VL domain orientation a factor that codetermines antibody specificity and affinity. Preserving VH–VL domain orientation in the process of antibody engineering is important in order to retain the original antibody properties, and predicting the correct VH–VL orientation has also been recognized as an important factor in antibody homology modeling. In this article, we present a fast sequence-based predictor that predicts VH–VL domain orientation with Q2 values ranging from 0.54 to 0.73 on the evaluation set. We describe VH–VL orientation in terms of the six absolute ABangle parameters that have recently been proposed as a means to separate the different degrees of freedom of VH–VL domain orientation. In order to assess the impact of adjusting VH–VL orientation according to our predictions, we use the set of antibody structures of the recently published Antibody Modeling Assessment (AMA) II study. In comparison to the original AMAII homology models, we find an improvement in the accuracy of VH–VL orientation modeling, which also translates into an improvement in the average root-mean-square deviation with regard to the crystal structures. Proteins 2015. © 2015 Wiley Periodicals, Inc.
Molecular dynamics investigation of the ionic liquid/enzyme interface: Application to engineering enzyme surface charge
Molecular simulations of the enzymes Candida rugosa lipase and Bos taurus α-chymotrypsin in aqueous ionic liquids 1-butyl-3-methylimidazolium chloride and 1-ethyl-3-methylimidazolium ethyl sulfate were used to study the change in enzyme–solvent interactions induced by modification of the enzyme surface charge. The enzymes were altered by randomly mutating lysine surface residues to glutamate, effectively decreasing the net surface charge by two for each mutation. These mutations resemble succinylation of the enzyme by chemical modification, which has been shown to enhance the stability of both enzymes in ILs. After establishing that the enzymes were stable on the simulated time scales, we focused the analysis on the organization of the ionic liquid substituents about the enzyme surface. Calculated solvent charge densities show that, for both enzymes and in both solvents, changing positively charged residues to negatively charged ones does indeed increase the charge density of the solvent near the enzyme surface. The radial distribution of IL constituents with respect to the enzyme reveals that decreased interactions with the anion are prevalent in the modified systems when compared to the wild type, largely accompanied by an increase in cation contact. Additionally, the radial dependence of the charge density and ion distribution indicates that the effect of altering enzyme charge is confined to short-range (≤1 nm) ordering of the IL. Ultimately, these results, which are consistent with those from prior experiments, provide molecular insight into the effect of enzyme surface charge on enzyme stability in ILs. Proteins 2015. © 2015 Wiley Periodicals, Inc.
Evaluation of conformational changes in diabetes-associated mutation in insulin A chain: A molecular dynamics study
Insulin plays a central role in the regulation of metabolism in humans. Mutations in the insulin gene can impair the folding of its precursor protein, proinsulin, and cause permanent neonatal-onset diabetes mellitus known as Mutant INS-gene induced Diabetes of Youth (MIDY) with insulin deficiency. To gain insights into the molecular basis of this diabetes-associated mutation, we perform molecular dynamics simulations of the wild-type and mutant (CysA7 to Tyr or C(A7)Y) insulin A chain in aqueous solution. The C(A7)Y mutation is one of the identified mutations that impair protein folding by substituting a cysteine residue that is required for disulfide bond formation. A comparative analysis reveals structural differences between the wild-type and the mutant conformations. The analyzed mutant insulin A chain forms a metastable state with major effects on its N-terminal region. This suggests that the MIDY mutant forms a partially folded intermediate in which a conformational change in the N-terminal region of the A chain generates a flexible N-terminal domain. This may lead to abnormal interactions with other proinsulins in the aggregation process. Proteins 2015. © 2015 Wiley Periodicals, Inc.
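The statement that each lysine-to-glutamate mutation lowers the net charge by two can be checked with a small sketch. It assumes a simple integer charge model in which Lys and Arg count as +1 and Asp and Glu as -1, ignoring histidine, the termini, and pKa shifts. The sequence is a hypothetical toy peptide, not one of the enzymes studied.

```python
POSITIVE = frozenset("KR")  # lysine, arginine (assumed fully protonated)
NEGATIVE = frozenset("DE")  # aspartate, glutamate (assumed fully deprotonated)


def net_charge(sequence):
    """Integer net charge of a one-letter amino acid sequence."""
    positives = sum(residue in POSITIVE for residue in sequence)
    negatives = sum(residue in NEGATIVE for residue in sequence)
    return positives - negatives


def lys_to_glu(sequence, positions):
    """Return a copy of `sequence` with lysines at 0-based `positions` set to Glu."""
    residues = list(sequence)
    for pos in positions:
        if residues[pos] != "K":
            raise ValueError(f"position {pos} is {residues[pos]}, not lysine")
        residues[pos] = "E"
    return "".join(residues)


wild_type = "MKKLDE"  # hypothetical toy peptide
mutant = lys_to_glu(wild_type, [1])
print(net_charge(wild_type), net_charge(mutant))
```

Each K to E substitution removes one positive charge and adds one negative charge, which is where the change of two per mutation comes from.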