GenBank

GenBank
Content
Description	Nucleotide sequences for more than 300,000 organisms with supporting bibliographic and biological annotation.
Data types captured	Nucleotide sequence; Protein sequence
Organisms	All
Contact
Research center	NCBI
Primary citation	PMID 21071399
Release date	1982
Access
Data format	XML; ASN.1; GenBank format
Website	NCBI
Download URL	ncbi ftp
Web service URL	eutils; soap
Tools
Web	BLAST
Standalone	BLAST
Miscellaneous
License	Unclear^[1]

The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC).

As of October 2024, GenBank contained 34 trillion base pairs in over 4.7 billion nucleotide sequences, representing more than 580,000 formally described species.^[2]^[3]

The database was started in 1982 by Walter Goad and Los Alamos National Laboratory. GenBank has become an important database for research in biological fields and has grown exponentially in recent years, doubling in size roughly every 18 months.^[4]^[5]^[3]
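To make the doubling figure concrete, the following minimal sketch converts a doubling time into a growth factor over a given period. It assumes a perfectly steady doubling every 18 months, which the text gives only as a rough figure; the function name `growth_factor` is hypothetical and used purely for illustration.

```python
def growth_factor(elapsed_months: float, doubling_months: float = 18.0) -> float:
    """Return how many times larger a quantity becomes after elapsed_months.

    Assumes idealised exponential growth with a constant doubling time
    (18 months by default, the approximate figure quoted in the text).
    """
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # Print the idealised growth over a few example periods.
    for years in (1.5, 3, 6, 9):
        factor = growth_factor(years * 12)
        print(f"{years} years: about {factor:.0f}x the starting size")
```

Under this idealised assumption, three years corresponds to two doublings (a fourfold increase) and nine years to six doublings (a 64-fold increase). Actual release sizes will deviate from such a smooth curve.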

GenBank is built from direct submissions by individual laboratories, as well as bulk submissions from large-scale sequencing centers.

[1] The download page at UCSC says "NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. NCBI is not in a position to assess the validity of such claims, and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information contained in GenBank."
[2] Sayers, Eric W; Cavanaugh, Mark; Clark, Karen; Pruitt, Kim D; Schoch, Conrad L; Sherry, Stephen T; Karsch-Mizrachi, Ilene (7 January 2022). "GenBank". Nucleic Acids Research. 50 (D1): D161–D164. doi:10.1093/nar/gkab1135. PMC 8690257. PMID 34850943.
[3] Sayers, Eric W; Cavanaugh, Mark; Frisse, Linda; Pruitt, Kim D; Schneider, Valerie A; Underwood, Beverly A; Yankie, Linda; Karsch-Mizrachi, Ilene (6 January 2025). "GenBank 2025 update". Nucleic Acids Research. 53 (D1): D56–D61. doi:10.1093/nar/gkae1114. ISSN 0305-1048. PMC 11701615. PMID 39558184.
[4] Benson, D.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Wheeler, D. L.; et al. (2008). "GenBank". Nucleic Acids Research. 36 (Database): D25–D30. doi:10.1093/nar/gkm929. PMC 2238942. PMID 18073190.
[5] Benson, D.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Sayers, E. W.; et al. (2009). "GenBank". Nucleic Acids Research. 37 (Database): D26–D31. doi:10.1093/nar/gkn723. PMC 2686462. PMID 18940867.


