
    LEE-WEI YANG

    Since the 1990s, keyword-based search engines have helped people locate relevant web content with a simple query, as have the more recent full-text search engines used mainly for plagiarism detection after an article is uploaded. However, these "free" or paid services operate by storing users' search queries and preferences for personal profiling and targeted ad delivery, while user-uploaded articles further profit the service providers by expanding their databases. In short, search engine privacy has not been an option for web exploration in the past decades. Here we demonstrate that a database or internet search that takes an entire article as its query can be carried out correctly without revealing users' sensitive queries, by means of an irreversible encoding scheme and an efficient FM-index search routine of the kind generally used in next-generation sequencing (NGS) of genomes. In our solution, Sapiens Aperio Veritas Engine (S.A.V.E.), every word in the query is encoded into on...
    Copyright information: Taken from "GNM: online computation of structural dynamics using the Gaussian Network Model", Nucleic Acids Research 2006;34(Web Server issue):W24-W31. Published online 14 Jul 2006. PMCID: PMC1538811. © The Author 2006. Published by Oxford University Press. All rights reserved. The computational times (seconds) are plotted on a log-log scale against the number of residues for 13 test proteins. The amount of time required to calculate all the GNM modes and theoretical B-factors () by the standard SVD approach (black circles) scales as = 2.2 × 10 . The calculation (blue diamonds) scales as = 7.2 × 10 . The computation of the 101 slowest modes using (red squares) exhibits a power law of = 5.9 × 10 . Using the latter two methods sequentially results in a dramatic decrease in computing time without loss of accuracy. The improvement is especially significant for large structures (more than 2000 residues), permitting us to release on-line results in GNM.
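A power-law scaling of this kind appears as a straight line on a log-log plot, so the prefactor and exponent can be estimated by ordinary least squares on the logarithms. The sketch below is only an illustration of that idea: the function name `fit_power_law` is hypothetical, and the timings are synthetic values generated from an assumed law, not measurements from the paper.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of t = a * N**b, done as a straight line in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope of the log-log line is the exponent b
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # Intercept of the log-log line is log(a)
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Synthetic timings (NOT data from the paper), generated from t = 1e-6 * N**2
sizes = [100, 200, 500, 1000, 2000, 5000]
times = [1e-6 * n ** 2 for n in sizes]

a, b = fit_power_law(sizes, times)
print(f"prefactor a = {a:.3g}, exponent b = {b:.3f}")
```

With real benchmark timings the points scatter around the line, and the fitted exponent summarizes how steeply the cost grows with the number of residues.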
    Copyright information: Taken from "GNM: online computation of structural dynamics using the Gaussian Network Model", Nucleic Acids Research 2006;34(Web Server issue):W24-W31. Published online 14 Jul 2006. PMCID: PMC1538811. © The Author 2006. Published by Oxford University Press. All rights reserved. (a) Color-coded ribbon diagram illustrating the mobilities in the lowest frequency GNM mode using Jmol. The structure is colored from blue, white, to red in the order of increasing mobility. (b) Chime representation of the lowest mode for 1J1V; the structure is now colored from blue, green, orange, to red. (c) Comparison of experimental and theoretical B-factors with each chain shown as a different curve. In this example the correlation coefficient between computed and experimental data is 0.642. (d) Cross-correlation map between residue fluctuations, plotted as a function of residue indices on the abscissa and the ordinate. The pairs subject to fully correlated motions (correlation = +1) are colored dark red; those undergoing anti-correlated motions (i.e. correlation < 0) are colored blue, and moderately correlated and uncorrelated (correlation ≈ 0) regions are yellow and cyan, respectively. Note that the residue numbers in (d) refer to the index of EN nodes, 1–94 for the protein and 95–118 for the DNA double strands. The mapping of these indices to PDB file residue numbers can be found in the output files delivered by GNM.
    The systematic design of functional peptides has technological and therapeutic applications. However, there is a need for pattern-based search engines that help locate desired functional motifs in primary sequences regardless of their evolutionary conservation. Existing databases such as the Protein Secondary Structure database (PSS) no longer serve the community, while the Dictionary of Protein Secondary Structure (DSSP) annotates secondary structures when the tertiary structures of proteins are provided. Here, we extract 1.7 million helices from the PDB and compile them into a database (Therapeutic Peptide Design database; TP-DB) that allows queries of compounded patterns to facilitate the identification of sequence motifs of helical structures. We show how TP-DB helps us identify a known purification-tag-specific antibody that can be repurposed into a diagnostic kit for Helicobacter pylori. We also show how the database can be used to design a new antimicrobial peptide that show...
    Copyright © 2012 Lee-Wei Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. X-ray crystallography captures protein conformational states that are stable enough to be captured. These observable states are simply part of the functional event in the context of protein dynamics. Life at the molecular level has a vibrant nature at physiological temperature, governed by the laws of physics. Molecular dynamics (MD) simulations have been used to study biomolecular systems since the 1970s [1]. Today, it is common practice for researchers with access to supercomputers to obtain trajectories of hundreds of nanoseconds to two microseconds for small proteins [2, 3] and a few tens of nanoseconds for supramolecular assemblies such as the ribosome of a size larger than 200 Å
    Opioids account for 69,000 overdose deaths per annum worldwide and cause serious side effects. Safer analgesics are urgently needed. The endogenous opioid peptide Leu-Enkephalin (Leu-ENK) is ineffective when introduced peripherally due to poor stability and limited membrane permeability. We developed a focused library of Leu-ENK analogs containing small hydrophobic modifications. N-pivaloyl analog KK-103 showed the highest binding affinity to the delta opioid receptor (68% relative to Leu-ENK) and an extended plasma half-life of 37 h. In the murine hot-plate model, subcutaneous KK-103 showed 10-fold improved antinociception (142%MPE·h) compared to Leu-ENK (14%MPE·h). In the formalin model, KK-103 reduced the licking and biting time to ~50% relative to the vehicle group. KK-103 was shown to act through the opioid receptors in the central nervous system. In contrast to morphine, KK-103 was longer-lasting and did not induce breathing depression, physical dependence, or tolerance, showi...
    The Lon protease is ubiquitous in nature. Its proteolytic activity is associated with diverse cellular functions ranging from maintaining proteostasis under normal and stress conditions to regulating cell metabolism. Although Lon was originally identified as an ATP-dependent protease with fused AAA+ (ATPases associated with diverse cellular activities) and protease domains, analyses have recently identified LonC as a class of Lon-like proteases with no intrinsic ATPase activity. In contrast to the canonical ATP-dependent Lon present in eukaryotic organelles and prokaryotes, LonC contains an AAA-like domain that lacks the conserved ATPase motifs. Moreover, the LonC AAA-like domain is inserted with a large domain predicted to be largely α-helical; intriguingly, this unique Lon-insertion domain (LID) was disordered in the recently determined full-length crystal structure of Meiothermus taiwanensis LonC (MtaLonC). Here, the crystal structure of the N-terminal AAA-like α/β subdomain of MtaLonC containing an intact LID, which forms a large α-helical hairpin protruding from the AAA-like domain, is reported. The structure of the LID is remarkably similar to the tentacle-like prong of the periplasmic chaperone Skp. It is shown that the LID of LonC is involved both in Skp-like chaperone activity and in recognition of unfolded protein substrates. The structure allows the construction of a complete model of LonC with six helical hairpin extensions defining a basket-like structure atop the AAA ring and encircling the entry portal to the barrel-like degradation chamber of Lon.
    Motivation: Quaternary structure determination for transmembrane/soluble proteins requires a reliable computational protocol that leverages observed distance restraints and/or cyclic symmetry (Cn symmetry) found in most homo-oligomeric transmembrane proteins. Results: We survey 118 X-ray crystallographically solved structures of homo-oligomeric transmembrane proteins (HoTPs) and find that ∼97% are Cn symmetric. Given the prevalence of Cn symmetric HoTPs and the benefits of incorporating geometry restraints in aiding quaternary structure determination, we introduce two new filters, the distance-restraints (DR) and the Symmetry-Imposed Packing (SIP) filters. SIP relies on a new method that can rebuild the closest ideal Cn symmetric complex from docking poses containing a homo-dimer without prior knowledge of the number (n) of monomers. Using only the geometrical filter, SIP, near-native poses of 7 HoTPs in their monomeric states can be correctly identified in the top-10 for 71% of all c...
    In this study, we provide a time-dependent (td-) mechanical model, taking advantage of molecular dynamics (MD) simulations, quasiharmonic analysis of MD trajectories and td-linear response theories (td-LRT) to describe vibrational energy redistribution within the protein matrix. The theoretical description explains the observed biphasic responses of specific residues in myoglobin to CO-photolysis and photoexcitation on heme. The fast responses are found to be triggered by impulsive forces and propagated mainly by principal modes <40 cm⁻¹. The predicted fast responses for individual atoms are then used to study signal propagation within the protein matrix, and signals are found to propagate ~8 times faster across helices (4076 m/s) than within the helices, suggesting the importance of tertiary packing in proteins' sensitivity to external perturbations. We further develop a method to integrate multiple intramolecular signal pathways and discover frequent "communicators". These ...
    It has become an established idea in recent years that a protein is a physicochemically connected network. Allostery, understood in this new context, is a manifestation of residues communicating with remote parts of a molecule, hence the rising interest in identifying communication pathways within such a network. Recently, we developed the time-dependent linear response theories (td-LRTs), which were shown to interpret well the ligand photodissociation relaxation dynamics of myoglobin. In this article, we apply the td-LRT with impulse force to investigate allostery in an enzyme system, dihydrofolate reductase (DHFR), which catalyzes the hydride transfer (HT) reaction: the reduction of dihydrofolate (DHF) to tetrahydrofolate (THF) using the cofactor nicotinamide adenine dinucleotide phosphate (NADPH). We developed a connection matrix to record the counts of mechanical signals propagating through donor-acceptor pairs launched from the active sites with the td-LRT. Through the connec...
    Learning from experimentally determined interacting secondary structural motifs, we compiled a database to facilitate a data-driven design of therapeutic peptides (TPs). 1.7 million helical peptides (HPs) in…
    The clock of life ticks as fast as proteins can efficiently perform their functional dynamics. Protein complexes execute functions via several large-scale intrinsic motions across multiple conformational states, which occur at a timescale of nano- to milliseconds for well-folded proteins. Computationally expensive molecular dynamics (MD) simulation has been the only theoretical tool to time and size these motions, though barely to their slowest ends. Here, we convert a simple elastic network model (ENM), which takes a few seconds (ubiquitin) to hours (ribosome) for the analysis, into a molecular timer and sizer to gauge the slowest functional motions of proteins. Quasi-harmonic analysis, fluctuation-profile matching (FPM) and the Wiener-Khintchine theorem (WKT) are used to define the "time-periods", t, for anharmonic principal components (PCs), which are validated by NMR order parameters. The PCs with their respective "time-periods" are mapped to the eigenva...
    Ca2+-binding human S100A1 protein is a type of S100 protein. S100A1 is a significant mediator during inflammation when Ca2+ binds to its EF-hand motifs. The receptor for advanced glycation end products (RAGE) comprises five domains: the cytoplasmic, transmembrane, C2, C1, and V domains. The V domain of RAGE is one of the most important target proteins for S100A1. It binds to the hydrophobic surface and triggers signal transduction cascades that induce cell growth, cell proliferation, and tumorigenesis. We used nuclear magnetic resonance (NMR) spectroscopy to characterize the interaction between S100A1 and the RAGE V domain. We found that S100B could interact with S100A1 via NMR 1H-15N HSQC titrations. We used the HADDOCK program to generate the following two binary complexes based on the NMR titration results: S100A1-RAGE V domain and S100A1-S100B. After overlapping these two complex structures, we found that S100B plays a crucial role in blocking the interaction site between RAGE ...
    Background: Tumor cells require proficient autophagy to meet high metabolic demands and resist chemotherapy, which suggests that reducing autophagic flux might be an attractive route for cancer therapy. However, this theory in clinical cancer research remains controversial due to the limited number of drugs that specifically inhibit autophagy-related (ATG) proteins. Methods: We screened FDA-approved drugs using a novel platform that integrates computational docking and simulations as well as biochemical and cellular reporter assays to identify potential drugs that inhibit autophagy-required cysteine proteases of the ATG4 family. The effects of ATG4 inhibitors on autophagy and tumor suppression were examined using cell culture and a tumor xenograft mouse model. Results: Tioconazole was found to inhibit activities of ATG4A and ATG4B with an IC50 of 1.3 µM and 1.8 µM, respectively. Further studies based on docking and molecular dynamics (MD) simulations supported that tioconazole can s...
    Structure-encoded conformational dynamics are crucial for biomolecular functions. However, there is insufficient evidence to support the notion that dynamics play a role in guiding protein-nucleic acid interactions. Here, we show that protein-DNA docking orientation is a function of protein intrinsic dynamics, but the binding site itself does not display unique patterns in the examined spectrum of motions. This revelation is made possible by a novel technique that locates "dynamics interfaces" in proteins across which protein parts are anticorrelated in their slowest dynamics. A striking statistic is that such interfaces intersect the DNA in 97% of the 104 examined cases. These findings were then used to screen decoys generated by rigid-body docking of DNA molecules onto DNA-binding proteins. Using our method, the chance to discern near-native poses from non-native decoys increased by 2.5- and 1.6-fold, as compared to a random guess and methods based on surface complementa...
    DynOmics (dynomics.pitt.edu) is a portal developed to leverage rapidly growing structural proteomics data by efficiently and accurately evaluating the dynamics of structurally resolved systems, from individual molecules to large complexes and assemblies, in the context of their physiological environment. At the core of the portal is a newly developed server, ENM 1.0, which permits users to efficiently generate information on the collective dynamics of any structure in PDB format, user-uploaded or database-retrieved. ENM 1.0 integrates two widely used elastic network models (ENMs), the Gaussian Network Model (GNM) and the Anisotropic Network Model (ANM), extended to take account of the molecular environment. It enables users to assess potentially functional sites, signal transduction or allosteric communication mechanisms, and protein-protein and protein-DNA interaction poses, in addition to delivering ensembles of accessible conformers reconstructed at atomic details based on the global ...
    Human S100A9 (Calgranulin B) is a Ca2+-binding protein from the S100 family that often presents as a homodimer in myeloid cells. It becomes an important mediator during inflammation once calcium binds to its EF-hand motifs. Human RAGE protein (receptor for advanced glycation end products) is one of its target proteins. RAGE binds to a hydrophobic surface on S100A9. Interactions between these proteins trigger signal transduction cascades, promoting cell growth, proliferation, and tumorigenesis. Here, we present the solution structure of the mutant S100A9 (C3S) homodimer, determined by multi-dimensional NMR experiments. We further characterize the solution interactions between mS100A9 and the RAGE V domain via NMR spectroscopy. CHAPS is a zwitterionic and non-denaturing molecule widely used for protein solubilization and stabilization. Using 1H-15N HSQC NMR titrations, we found that CHAPS and the RAGE V domain interact with mS100A9. Therefore, using the HADDOCK program, we s...
    Receptor-binding and subsequent signal-activation of interleukin-1 beta (IL-1β) are essential to immune and proinflammatory responses. We mutated 12 residues to identify sites important for biological activity and/or receptor binding. Four of these mutants with mutations in loop 9 (T117A, E118K, E118A, E118R) displayed significantly reduced biological activity. Neither the T117A nor the E118K mutation substantially affected receptor binding, yet both mutants lack IL-1β signaling in vitro and can antagonize wild-type (WT) IL-1β. Crystal structures of T117A, E118A, and E118K revealed that the secondary structure or surface charge of loop 9 is dramatically altered compared with that of wild-type chicken IL-1β. Molecular dynamics simulations of IL-1β bound to its receptor (IL-1RI) and receptor accessory protein (IL-1RAcP) revealed that loop 9 lies in a pocket that is formed at the IL-1RI/IL-1RAcP interface. This pocket is also observed in the human ternary structure. The conformations of...
    The Gaussian network model (GNM) is a simple yet powerful model for investigating the dynamics of proteins and their complexes. GNM analysis became a broadly used method for assessing the conformational dynamics of biomolecular structures with the development of a user-friendly interface and database, iGNM, in 2005. We present here an updated version, iGNM 2.0 (http://gnmdb.csb.pitt.edu/), which covers more than 95% of the structures currently available in the Protein Data Bank (PDB). Advanced search and visualization capabilities, both 2D and 3D, permit users to retrieve information on inter-residue and inter-domain cross-correlations, cooperative modes of motion, the location of hinge sites and energy localization spots. The ability of iGNM 2.0 to provide structural dynamics data on the large majority of PDB structures and, in particular, on their biological assemblies makes it a useful resource for establishing the bridge between structure, dynamics and function.
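In the GNM, a structure is reduced to a network of nodes (typically one per residue), and every pair of nodes closer than a cutoff distance is joined by an identical spring. The connectivity is stored in the Kirchhoff matrix, whose off-diagonal entries are -1 for contacting pairs and whose diagonal entries count each node's contacts. The following is a minimal sketch of that construction only, not the iGNM implementation: the function name `kirchhoff_matrix`, the 7 Å cutoff, and the four-node toy coordinates are all assumptions made for illustration.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the GNM Kirchhoff (connectivity) matrix for a list of 3D node positions.

    cutoff is in the same units as coords; 7.0 is an assumed illustrative value.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Contacting pair: off-diagonal -1, and one more contact on each diagonal
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma

# Hypothetical toy "structure": four nodes on a line, 3.8 apart (not a real protein)
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(coords)
for row in gamma:
    print(row)
```

Each row of the matrix sums to zero, which is why the matrix has one zero eigenvalue; residue fluctuations and cross-correlations in the GNM are then obtained from its pseudo-inverse, a step that is omitted here.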
    Although the dynamic motions and peptidyl transferase activity seem to be embedded in the rRNAs, the ribosome contains more than 50 ribosomal proteins (r-proteins), whose functions remain largely elusive. Also, the precise forms that some of these r-proteins adopt within the ribosome have not been structurally solved because of their high flexibility, which hinders efforts to elucidate their functions. Owing to recent advances in cryo-electron microscopy, single-molecule techniques, and theoretical modeling, much has been learned about the dynamics of these r-proteins. Surprisingly, allosteric regulation has been found between spatially separated components as distant as those on opposite sides of the ribosome. Here, we focus on the functional roles and intricate regulations of the mobile L1 and L12 stalks and L9 and S1 proteins. Conformational flexibility also enables versatile functions for r-proteins beyond translation. The arrangement of r-proteins may be under evoluti...
    In this review, we summarize the progress on coarse-grained elastic network models (CG-ENMs) in the past decade. Theories were formulated to allow the study of conformational dynamics in time/space frames of biological interest. Several highlighted models and their underlying hypotheses are introduced in physical depth. Important ENM offshoots, motivated to reproduce experimental data as well as to address the slow-mode-encoded configurational transitions, are also introduced. With the theoretical developments, computational cost is significantly reduced due to simplified potentials and coarse-grained schemes. An accumulating wealth of data suggests that ENMs agree equally well with experiment in describing equilibrium dynamics despite their distinct potentials and levels of coarse-graining. They do differ, however, in the slowest motional components that are essential to address large conformational changes of functional significance. The difference stems from the dissimilar curvatures of th...
    We provide evidence supporting that protein-protein and protein-ligand docking poses are functions of protein shape and intrinsic dynamics. Over sets of 68 protein-protein complexes and 240 nonhomologous enzymes, we recognize common predispositions for binding sites to have minimal vibrations and angular momenta, while two interacting proteins orient so as to maximize the angle between their rotation/bending axes (>65°). The findings are then used to define quantitative criteria to filter out docking decoys less likely to be the near-native poses; hence, the chances to find near-native hits can be doubled. With the novel approach to partition a protein into "domains" of robust but disparate intrinsic dynamics, 90% of catalytic residues in enzymes can be found within the first 50% of the residues closest to the interface of these dynamics domains. The results suggest an anisotropic rather than isotropic distribution of catalytic residues near the mass centers of enzymes.
