Biodiversity data is messy. How do we transform messy data into reusable data?
We attended the OBIS node training course in November 2017 to learn about the methods and tools available for cleaning biodiversity data in the Darwin Core standard.
The teaching materials are available online to everyone. The Ocean Biodiversity Information System (OBIS) and the World Register of Marine Species (WoRMS) have put a lot of effort into facilitating biodiversity data cleaning.
One of the most user-friendly R packages we learned about is obistools. It lets the user achieve each of the following objectives by calling a single command:
- Perform taxon match using WoRMS API
    names <- c("Abra alva", "Buccinum fusiforme", "Buccinum fusiforme", "Buccinum fusiforme")
    match_taxa(names)
- Plot points on map
- Check required fields before publishing data through IPT.
- Remark: The IPT (Integrated Publishing Toolkit) was developed by GBIF. OBIS also uses the IPT to harvest data into its database. The required Darwin Core fields differ slightly between OBIS and GBIF.
- There is a lot more on offer from obistools. A more comprehensive tutorial is available on its GitHub page.
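
To make the last two objectives concrete, here is a minimal sketch of plotting points and checking required fields. The `occ` data frame and its values are made up for illustration. `plot_map()` and `check_fields()` are obistools functions as we understand the package, but please confirm the exact arguments and output against the package documentation. The calls are guarded so the script still runs if obistools is not installed.

```r
# Hypothetical occurrence records, for illustration only (not real survey data).
occ <- data.frame(
  scientificName   = c("Abra alba", "Buccinum undatum"),
  decimalLongitude = c(2.9, 3.1),
  decimalLatitude  = c(51.3, 51.4),
  stringsAsFactors = FALSE
)

# obistools is installed from GitHub, e.g. devtools::install_github("iobis/obistools").
if (requireNamespace("obistools", quietly = TRUE)) {
  # Plot the points on a map; zoom = TRUE is assumed to zoom in on the data extent.
  print(obistools::plot_map(occ, zoom = TRUE))

  # Report problems with required fields before publishing through the IPT.
  problems <- obistools::check_fields(occ)
  print(problems)
}
```

Because `occ` only carries a name and coordinates, the field check should flag the required terms it lacks, which is the kind of gap worth catching before publishing.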