Document Type


Publication Date




Publication Title

Scientific Data






411 (1-10)


Accurate identification of fishes is essential for understanding their biology and for ensuring food safety for consumers. DNA barcoding is an important tool because it can verify identifications of both whole and processed fishes that have had key morphological characters removed (e.g., filets, fish meal); however, DNA reference libraries are incomplete, and public repositories for sequence data contain incorrectly identified sequences. During a nine-year sampling program in the Philippines, a global biodiversity hotspot for marine fishes, we developed a verified reference library of cytochrome c oxidase subunit I (COI) sequences for 2,525 specimens representing 984 species. Specimens were primarily purchased from markets, with additional diversity collected using rotenone or fishing gear. Species identifications were verified based on taxonomic, phenotypic, and genotypic data, and sequences are associated with voucher specimens, live-color photographs, and genetic samples catalogued at Smithsonian Institution, National Museum of Natural History. The Biodiversity of Philippine Marine Fishes dataset is released herein to increase knowledge of species diversity and distributions and to facilitate accurate identification of market fishes.


This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2023.

This article is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original authors and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Data Availability

Article states: "The verified COI sequence library for Biodiversity of Philippine Marine Fishes includes (1) voucher specimens, (2) tissue samples and DNA extracts, (3) voucher collection information, (4) live-color photographs, and (5) COI sequences of at least 500 bp. All photographs, voucher catalog numbers, DNA sequences, and collection data are publicly available through FigShare [30]. Data is also available on BOLD [39], GenBank (BioProject PRJNA947503 [40]), through the Fish Collection at the National Museum of Natural History, Smithsonian Institution, and the FDA Reference Standard Sequence Library for Seafood Identification (RSSL). The library follows the BARCODE data standard requirements [41,42] for (1) species name, (2) voucher data, (3) collection data, (4) sequence length, (5) PCR primers used to generate the amplicon, and (6) trace files."

Links to data as shown in references 30, 39, and 40 are as follows:

Original Publication Citation

Bemis, K. E., Girard, M. G., Santos, M. D., Carpenter, K. E., Deeds, J. R., Pitassy, D. E., Flores, N. A. L., Hunter, E. S., Driskell, A. C., Macdonald, K. S., III, Weigt, L. A., & Williams, J. T. (2023). Biodiversity of Philippine marine fishes: A DNA barcode reference library based on voucher specimens. Scientific Data, 10(1), 1-10, Article 411.


0000-0003-3618-1811 (Carpenter)


Article Location