ABSTRACT Choline is abundant in association with eukaryotes and plays roles in osmoprotection, thermoprotection, and membrane biosynthesis in many bacteria. Aerobic catabolism of choline is widespread among soil proteobacteria, particularly those associated with eukaryotes. Catabolism of choline as a carbon, nitrogen, and/or energy source may play important roles in association with eukaryotes, including pathogenesis, symbioses, and nutrient cycling.
"It is the responsibility of those of us involved in today's biomedical research enterprise to translate the remarkable scientific innovations we are witnessing into health gains for the nation... At no other time has the need for a robust, bidirectional information flow between basic and translational scientists been so necessary."
Background GenBank® is a public repository of all publicly available molecular sequence data from a range of sources. In addition to relevant metadata (eg, sequence description, source organism and taxonomy), publication information is recorded in the GenBank data file. The identification of literature associated with a given molecular sequence may be an essential first step in developing research hypotheses.
Background Authority and year information have been attached to taxonomic names since Linnaean times. The systematic structure of taxonomic nomenclature facilitates the ability to develop tools that can be used to explore historical trends that may be associated with taxonomy. Results From the over 10.7 million taxonomic names that are part of the uBio system [4], approximately 3 million names were identified to have taxonomic authority information from the years 1750 to 2004.
Background The availability of sequences from whole genomes to reconstruct the tree of life has the potential to enable the development of phylogenomic hypotheses in ways that have not been possible before. A significant bottleneck in the analysis of genomic-scale views of the tree of life is the time required for manual curation of genomic data into multi-gene phylogenetic matrices.
Abstract Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the “translational barriers” associated with translational medicine.
When novel gene sequences are discovered, they are usually identified, classified, and annotated based on aggregate measures of sequence similarity. This method is prone to errors, however. Phylogenetic analysis is a more accurate basis for gene classification and ortholog identification, but it is relatively labor-intensive and computationally demanding. Here we report and demonstrate a rapid new method for gene classification based on phylogenetic principles.
Abstract.--Identification of organism names in biological texts is essential for the management of archival resources to facilitate comparative biological investigation. Because organism nomenclature conforms closely to prescribed rules, automated techniques may be useful for identifying organism names from existing documents, and may also support the completion of comprehensive indices of taxonomic names; such comprehensive lists are not yet available.
Abstract Social and behavioral history is increasingly recognized as integral for understanding important determinants of disease and critical for patient care, research, clinical guidelines, and public health policies. Social and behavioral history information in the public health domain, specifically large public health surveys, has not been well described.
Background In recent years, there have been numerous initiatives undertaken to describe critical information needs related to the collection, management, analysis, and dissemination of data in support of biomedical research (J Investig Med 54: 327-333, 2006); (J Am Med Inform Assoc 16: 316–327, 2009); (Physiol Genomics 39: 131-140, 2009); (J Am Med Inform Assoc 18: 354–357, 2011).
Abstract Plants have been used as a source of medicine since historic times, and several commercially important drugs are of plant-based origin. The traditional approach to the discovery of plant-based drugs often involves a significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high-volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role.
Abstract Within Electronic Health Records (EHRs), the social history section contains information relevant to social, behavioral, and environmental determinants of health. While social history is playing an increasingly important role in patient care, biomedical research, and public health, little analysis has been done to describe content in the EHR or the adequacy of existing standards for representing this information.
Recent years have seen great technological advances that have helped usher in a new generation of approaches to understand and share knowledge about the planet in which we live. A number of major initiatives that aim to catalyze necessary technological and biological advances synergistically have emerged globally. Of these enabling initiatives, the Encyclopedia of Life (EOL; http://www.eol.org) and the Barcode of Life (BOL; http://barcoding. si.
Abstract Biomedical literature can offer valuable information for organizing genes associated with the etiology and pathogenesis of disease. In this study, we demonstrate the utility of existing phylogenetic methods for organizing 375 genes associated with Breast Cancer using the MeSH annotations from over 35,000 Medline articles.
Abstract The detailed collection of family history information is becoming increasingly important for patient care and biomedical research. Recent reports have highlighted the need for efforts to better understand collection and use of this information in resources such as the Electronic Health Record (EHR). This two-part study involved characterizing the use and contents of free-text comments within the family history section of an EHR.
Abstract Within large sequence repositories such as GenBank there is a wealth of metadata providing contextual information that may enhance search and retrieval of relevant sequences for a range of subsequent analyses. One challenge is the use of free-text in these metadata fields where approaches are needed to extract, structure, and encode essential information.
Objective The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Further exacerbating the challenges in their study is that many genes may be causally related to multiple diseases. This study explored the relationship between diseases through the adaptation of an approach pioneered in the context of information retrieval: vector space models.
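As a rough illustration of the vector space idea mentioned above, each disease can be written as a vector over a fixed gene list and compared with cosine similarity. This is a minimal sketch only: the gene and disease names are hypothetical placeholders, and the binary weighting and cosine measure are assumptions, since the abstract does not state how the study built or compared its vectors.

```python
import math

# Hypothetical gene list and disease-to-gene associations (illustration only).
GENES = ["g1", "g2", "g3", "g4"]
DISEASE_GENES = {
    "disease_x": {"g1", "g2"},
    "disease_y": {"g2", "g3"},
    "disease_z": {"g4"},
}


def to_vector(associated, genes=GENES):
    """Binary vector: 1 if the gene is associated with the disease, else 0."""
    return [1 if g in associated else 0 for g in genes]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


VECTORS = {name: to_vector(gs) for name, gs in DISEASE_GENES.items()}

# Diseases sharing a gene score above zero; diseases sharing none score zero.
sim_xy = cosine(VECTORS["disease_x"], VECTORS["disease_y"])
sim_xz = cosine(VECTORS["disease_x"], VECTORS["disease_z"])
```

In this toy setup, two diseases that share one of their two genes score about 0.5, while diseases with no genes in common score 0.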
Abstract Mining biological networks can be an effective means to uncover system level knowledge out of micro level associations, such as encapsulated in genetic pathways. Analysis of human disease genetic pathways can lead to the identification of major mechanisms that may underlie disorders at an abstract functional level. The focus of this study was to develop an approach for structural pattern analysis and classification of genetic pathways of diseases.
METHODS Codification of SNOMED Data SNOMED disease terms were broken down into individual components (eg, Topology & Morphology, etc.) and codified in a simple square matrix. A '1' in the matrix represented the existence of specific components for a particular disease term. A '0' represented cases where there was no indication for the specific disease component. For this study, we codified over 1000 diseases with over 100 individual terms.
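The 1/0 codification described above can be sketched as a small presence/absence matrix with one row per disease term and one column per component. The disease and component names below are hypothetical placeholders, not actual SNOMED content, and the row-by-column layout is an assumption about how the matrix was arranged.

```python
# Hypothetical disease terms mapped to the components they contain
# (illustration only; not real SNOMED codes or terms).
DISEASE_COMPONENTS = {
    "disease_a": {"topography", "morphology"},
    "disease_b": {"morphology", "etiology"},
}


def codify(disease_components):
    """Build a binary matrix: rows are diseases, columns are components.

    A 1 marks a component present in the disease term, a 0 marks its absence.
    """
    diseases = sorted(disease_components)
    components = sorted(set().union(*disease_components.values()))
    matrix = [
        [1 if c in disease_components[d] else 0 for c in components]
        for d in diseases
    ]
    return diseases, components, matrix


diseases, components, matrix = codify(DISEASE_COMPONENTS)
# components -> ['etiology', 'morphology', 'topography']
# matrix     -> [[0, 1, 1], [1, 1, 0]]
```

Sorting the row and column labels keeps the matrix layout reproducible across runs, which matters once rows are compared against one another.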
In the biomedical domain, researchers strive to organize and describe organisms within the context of health and epidemiology. In the biodiversity domain, researchers seek to understand how organisms relate to one another, either historically through evolution or spatially and geographically in the environment. Currently, there is limited cross-communication between these domains. As a result, valuable knowledge that could inform studies in either domain often goes unnoticed.
Bioinformatics competency has become a key element in much of contemporary biology. The increasing potential that is to be realized with modern technologies, such as 'next-generation' sequencing, parallel and 'cloud' computing, and continually increasing data banks, requires a fundamental understanding of bioinformatics concepts and applications. On par with the complexity of biological inquiry, acquiring bioinformatics competency can become a complex endeavor of undecipherable jargon, mathematics and frustration.
Background The apicomplexan parasite Toxoplasma gondii is an obligate intracellular human pathogen causing toxoplasmosis predominantly in immune-compromised hosts such as cancer and transplant patients as well as patients with AIDS [1]. A specific cGMP-dependent protein kinase (TgPKG) which appears to be crucial for host invasion has been identified in T. gondii and related coccidial protozoa [2].
Abstract The course of treatment and ultimate clinical outcome often depend on a holistic understanding of the patient's status, which in turn requires cataloguing of concomitant conditions (“comorbidities”). A number of approaches have been developed to quantify the effect of comorbidities (eg, the Charlson Comorbidity Index); however, reported metrics have been based on pair-wise analyses of co-occurring conditions.
Abstract Biology has become a large-scale comparative science. Significant advances in computational and sequencing techniques have resulted in massive amounts of data that need to be deciphered, organized, and correctly annotated. Existing manual curation methods are often laborious, time-consuming, and error-prone. Furthermore, current automated techniques are either computationally prohibitive for rapid curation or may be misleading for accurate curation. New automated techniques are needed to alleviate the ...
Content Management Systems (CMS) have been used for over a decade to organize knowledge within a community framework. They are often developed as customizable applications that cater to the storage, processing, management and delivery of document contents within the scope of an enterprise. CMS applications thus make for the easy distribution and management of documents, with modern implementations including things like user management and portal technology. Here, we examine the utility of Open-Source ...
Web technologies that promote information-sharing and collaboration are known as Web 2.0 [1]. These technologies include the popular social networks and wikis, Web services (Web application programming interfaces [APIs] such as SOAP or REST) that support interoperability between computers [2], and mashup and workflow environments such as Yahoo! Pipes (Yahoo! Inc., Sunnyvale, CA USA) and Taverna (University of Manchester, Manchester, UK). Geospatial applications such as Google Earth (Google, Mountain View, ...
Pairwise comparisons of disagreement in phylogenetic datasets offer a powerful tool for isolating historical incongruence for closer analysis. Statistically significant phylogenetic character incongruence may reflect important differences in evolutionary history, such as horizontal gene transfer. Such testing can also be used to specify possible combinations of datasets for further phylogenetic analysis. The process of comparing multiple datasets can be very time consuming, and it is sometimes unclear how to combine data partitions given the observed patterns of incongruence. Here we present an application that automates the process of making pairwise comparisons between large numbers of phylogenetic datasets using the Incongruence Length Difference (ILD) test. The application also implements strategies for data combination based on the patterns of incongruence observed in pairwise comparisons.
