Reference Databases
Refers to libraries of DNA sequences (usually from barcode genes) that have been generated from species of known identity. Sequences from unidentified organisms – obtained by either Sanger sequencing or high-throughput sequencing – are compared against a reference database to make species identifications. Databases can be curated (e.g. the Barcode of Life Data System – BOLD – www.boldsystems.org) or uncurated (e.g. GenBank – www.ncbi.nlm.nih.gov). In curated databases, identifications are scrutinised and verified; in uncurated databases they are not. Because submissions are accepted without this verification, GenBank is far more extensive than BOLD, but contains many errors.
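
The comparison step can be illustrated with a minimal sketch. It assumes a toy reference library of invented, equal-length sequences and a hypothetical 95% identity threshold; none of these names or values come from BOLD or GenBank. Real identifications rely on alignment-based searches (e.g. BLAST) against far larger databases, so this only shows the principle of matching a query to its closest reference.

```python
from typing import Dict, Optional, Tuple

# Toy reference library: species name -> barcode fragment.
# Sequences are invented for illustration, not real barcode data.
REFERENCE_LIBRARY: Dict[str, str] = {
    "Species A": "ATGCGTACGTTAGCCTAGGA",
    "Species B": "ATGCGTTCGTTAGGCTAGCA",
    "Species C": "TTGCATACGATAGCCTTGGA",
}


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identify(
    query: str,
    library: Dict[str, str],
    threshold: float = 95.0,  # hypothetical cut-off, chosen for the example
) -> Tuple[Optional[str], float]:
    """Return the best-matching species and its score.

    The species is None when no reference reaches the threshold.
    """
    best_name: Optional[str] = None
    best_score = -1.0
    for name, reference in library.items():
        score = percent_identity(query, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# A query that differs from "Species A" at a single position.
unknown = "ATGCGTACGTTAGCCTAGGT"
species, score = identify(unknown, REFERENCE_LIBRARY)
print(species, score)
```

The result is only as reliable as the library itself: if a reference sequence carries a wrong species label, as can happen in an uncurated database, the query inherits that error.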