The BIOSCAN-5M dataset contains five million specimens from 47 countries, each with a paired high-resolution image and DNA barcode. Its hierarchical taxonomic labels, geographic data, and long-tailed distribution of rare species make it a valuable resource for ecological research and AI model training. Facilitated by the International Barcode of Life and the BIOSCAN project, BIOSCAN-5M represents a significant advancement in biodiversity informatics and is publicly available for download via Hugging Face and PyPI.