New Data Available! Access Avian Influenza A (H5N1) Virus Sequences at NCBI

Sequence data from the ongoing avian influenza A (H5N1) virus outbreak in cattle are now available through two of NLM’s NCBI resources: NCBI Virus and NCBI Datasets.

These data were submitted by the U.S. Department of Agriculture (USDA), U.S. Centers for Disease Control and Prevention (CDC), the World Health Organization (WHO), Iowa State University, and St. Jude Children’s Research Hospital.

Ortholog Groups Added for ~2 Million Insect Genes

Find evolutionarily related genes across insects and other arthropods on our new Ortholog webpages

NCBI recently released a set of orthologs for approximately 2 million insect genes. You can now find and access the orthologous genes, transcripts, and proteins by searching a species and gene name in NCBI All Databases, NCBI Gene, or NCBI Datasets. As previously described, these orthologs are based on comparisons to the Drosophila melanogaster annotated genome. Using Drosophila gene nomenclature for orthologs should lead to more informative gene symbols for insects and other arthropods.

International Nucleotide Sequence Database Collaboration (INSDC) Introduces Enhanced Website

Aims to broaden INSDC membership and attract diverse new members

The National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) and the other Founding Members of the International Nucleotide Sequence Database Collaboration (INSDC) have enhanced the INSDC website, www.insdc.org, to provide comprehensive information on how interested parties from around the world can evaluate their readiness to participate in the INSDC. This effort supports INSDC’s aim to broaden membership and attract qualified nucleotide sequence databases. Web content now includes a formalized Founders Arrangement and a Membership Arrangement, along with other updated information about the INSDC mission, vision, governance, and technical documentation. Interested parties are encouraged to visit the INSDC website to learn more.

New! RefSeq Release 224

Check out RefSeq release 224, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of May 6, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 435,879,646 records
  • 324,246,652 proteins
  • 62,348,147 RNAs
  • Sequences from 150,742 organisms

The release is provided in several directories, both as a complete dataset and divided into logical groupings.

Automated Lineage Definitions Now Available in NCBI Virus SARS-CoV-2 Variants Overview

Recently, the NCBI Virus SARS-CoV-2 Variants Overview moved from a manual to an automated process for selecting the mutations required to define a lineage (e.g., Omicron, BA.2, or JN.1). With this update, the SARS-CoV-2 Variants Overview covers all SARS-CoV-2 lineages and is no longer limited to lineages with CDC status. The SARS-CoV-2 Variants Overview website reports results from analyzing both GenBank and unassembled Sequence Read Archive (SRA) sequence data. It allows you to view geographic and frequency trends of records assigned to Pango lineages and to search for sequence records using lineage-defining or other mutations (example shown in Figure 1).

NCBI Pathogen Detection Presents the Antibiotic Susceptibility Test (AST) Browser

Have you ever wanted to compare antibiotic resistance data and resistance gene calls in bacteria? Now you can! Easily access and browse antibiotic susceptibility testing (AST) data and link to other NCBI resources using the new AST Browser. NCBI has collected AST data for many isolates in the Pathogen Detection system.  

Features and Benefits 
  • Data is in a searchable, tabular format 
  • Download data for further analysis 
  • Use the Cross-browser selection tool to link out to the Isolates Browser or MicroBIGG-E to identify the isolates and the genetic elements associated with each AST result 

GenBank Release 260.0 is Available!

GenBank release 260.0 (4/19/2024) is now available on the NCBI FTP site. This release has 31.18 trillion bases and 4.46 billion records.

The current release has:

  • 250,803,006 traditional records containing 3,213,818,003,787 base pairs of sequence data
  • 3,333,621,823 WGS records containing 27,225,116,587,937 base pairs of sequence data
  • 741,066,498 bulk-oriented TSA records containing 689,648,317,082 base pairs of sequence data
  • 135,115,766 bulk-oriented TLS records containing 53,492,243,256 base pairs of sequence data
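
As a quick sanity check, the headline totals can be recomputed from the four per-division figures listed above. The snippet below is only an illustrative sketch: the dictionary and variable names are made up for this example, and the numbers are copied directly from the list.

```python
# Per-division counts copied from the release summary: (records, base pairs).
# The dictionary layout and names are illustrative, not an NCBI format.
divisions = {
    "traditional": (250_803_006, 3_213_818_003_787),
    "WGS": (3_333_621_823, 27_225_116_587_937),
    "TSA": (741_066_498, 689_648_317_082),
    "TLS": (135_115_766, 53_492_243_256),
}

# Sum records and base pairs across all four divisions.
total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

# Report in the same units as the headline figures.
print(f"{total_records / 1e9:.2f} billion records")
print(f"{total_bases / 1e12:.2f} trillion bases")
```

Rounded to two decimal places, the sums agree with the 4.46 billion records and 31.18 trillion bases quoted for the release.
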

Now Available! Updated Bacterial and Archaeal Reference Genomes Collection

Download the updated bacterial and archaeal reference genome collection! We built this collection of 19,328 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as references).

What’s New?
  • 413 species are represented in this collection for the first time
  • 198 species are represented by a better assembly
  • 27 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment 

NCBI Hidden Markov Models (HMM) Release 15.0 Now Available!

Download release 15.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP)! Search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.

What’s New?

Release 15.0 contains:

  • 16,667 HMMs maintained by NCBI
  • 279 new HMMs since release 14.0
  • Several hundred HMMs with better names, EC numbers, Gene Ontology (GO) terms, gene symbols, or publications
