Content | |
---|---|
Description | Nucleotide sequences for more than 300,000 organisms with supporting bibliographic and biological annotation. |
Data types captured | |
Organisms | All |
Contact | |
Research center | NCBI |
Primary citation | PMID 21071399 |
Release date | 1982 |
Access | |
Data format | |
Website | NCBI |
Download URL | ncbi ftp |
Web service URL | |
Tools | |
Web | BLAST |
Standalone | BLAST |
Miscellaneous | |
License | Unclear[1] |
The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC).
In October 2024, GenBank contained 34 trillion base pairs from over 4.7 billion nucleotide sequences and more than 580,000 formally described species.[2][3]
The database was started in 1982 by Walter Goad and Los Alamos National Laboratory. GenBank has become an important database for research in biological fields and has grown exponentially in recent years, doubling roughly every 18 months.[4][5][3]
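To make the doubling figure concrete, the following is a minimal sketch of the arithmetic behind "doubling roughly every 18 months". It assumes an idealized, perfectly steady doubling time, which real growth only approximates. The function names and the constant are illustrative and are not part of any GenBank or NCBI tool.

```python
import math

# Assumption: a constant doubling time of 18 months, taken from the
# "roughly every 18 months" figure in the text. Real growth varies.
DOUBLING_TIME_MONTHS = 18.0


def growth_factor(months: float, doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Return the multiplicative growth after `months` of steady doubling."""
    return 2.0 ** (months / doubling_months)


def annual_growth_rate(doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Return the equivalent fractional growth per 12 months."""
    return growth_factor(12.0, doubling_months) - 1.0


def months_to_multiply(factor: float, doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Return how many months it takes to grow by `factor` (must be > 0)."""
    return doubling_months * math.log2(factor)
```

Under this assumption, `growth_factor(36)` is 4 (two doublings in three years), `annual_growth_rate()` is about 0.59 (roughly 59% growth per year), and `months_to_multiply(10)` is about 60, so a tenfold increase takes close to five years.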
GenBank is built from direct submissions by individual laboratories and from bulk submissions by large-scale sequencing centers.