# Load libraries
library(rfishbase)
library(tidyverse)
library(rredlist)
library(janitor)
library(knitr)Cleaning and Joining Data from FishBase and IUCN RedList in R
Description: In this qmd, I showcase how to access, clean, and join data from FishBase and the IUCN Red List.
Introduction
As a data scientist, it is frequently necessary to join data from two or more sources. Sometimes this can be trickier than others depending on the state of the original data sources. Here, I will access data from FishBase and the The International Union for Conservation of Nature (IUCN) Red List. Luckily, these data sources already have fairly thoughtful layouts and clean data. Both data sources have quite a lot of data on fish and other animal species, so I will need to select and filter for what I am interested in. Here, I am interested in fish in Hawaii. This data accessing, cleaning, and join, was done in preparation for a logistic regression analysis to identify key traits in Hawaiian fish that predict risk of extinction (Not shown here).
The Data
I use the IUCN Red List data accessed via the IUCN Red List API 1 and package rredlist. The Red List categorizes species by Not Evaluated, Data Deficient, Least Concern, Near Threatened, Vulnerable, Endangered, Critically Endangered, Extinct in the Wild and Extinct. Extinct species were not evaluated in this analysis. The IUCN Red List data are limited in that many marine species have not been listed yet or have been identified as too data deficient to be evaluated.
Fish ecological data were accessed from FishBase 2 via package rfishbase. Different species in the FishBase data were originally described by different people, possibly leading to errors or biases.
Methods
To get started, there are several packages I will be using. I will use rfishbase to access the FishBase data, rredlist to access the IUCN Red List data, tidyverse for data cleaning and joining, janitor for data cleaning, and knitr for displaying data tables.
I access the species and country data from FishBase. I filter for fish in Hawaii then join the species information to the Hawaiian fish with a left join.
# Load all species in FishBase
species <- fb_tbl("species") %>%
janitor::clean_names()
species %>%
select(-"comments") %>%
head() %>% kable()| spec_code | genus | species | species_ref_no | author | f_bname | pic_preferred_name | pic_preferred_name_m | pic_preferred_name_f | pic_preferred_name_j | fam_code | subfamily | gen_code | sub_gen_code | body_shape_i | source | author_ref | remark | tax_issue | fresh | brack | saltwater | demers_pelag | air_breathing | air_breathing_ref | ana_cat | migrat_ref | depth_range_shallow | depth_range_deep | depth_range_ref | depth_range_com_shallow | depth_range_com_deep | depth_com_ref | longevity_wild | longevity_wild_ref | longevity_captive | longevity_cap_ref | vulnerability | vulnerability_climate | length | l_type_max_m | length_female | l_type_max_f | max_length_ref | common_length | l_type_com_m | common_length_f | l_type_com_f | common_length_ref | weight | weight_female | max_weight_ref | pic | picture_female | larva_pic | egg_pic | importance_ref | importance | price_categ | price_reliability | remarks7 | landing_statistics | landings | main_catching_method | ii | m_seines | m_gillnets | m_castnets | m_traps | m_spears | m_trawls | m_dredges | m_liftnets | m_hooks_lines | m_other | usedfor_aquaculture | life_cycle | aquaculture_ref | usedas_bait | bait_ref | aquarium | aquarium_fish_ii | aquarium_ref | game_fish | game_ref | dangerous | dangerous_ref | electrogenic | electro_ref | complete | google_image | profile | pd50 | emblematic | entered | date_entered | modified | date_modified | expert | date_checked | ts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 24523 | Aborichthys | kempi | 39226 | Chaudhuri, 1913 | NA | Abkem_u1.jpg | NA | NA | NA | 692 | NA | 784 | NA | elongated | O | NA | NA | 0 | 1 | 0 | 0 | demersal | WaterAssumed | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10.00 | NA | 8.10 | SL | NA | NA | 39226 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 4832 | of no interest | unknown | NA | NA | NA | NA | NA | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | never/rarely | NA | NA | never/rarely | NA | never/rarely | NA | NA | 0 | NA | harmless | NA | no special ability | NA | NA | 1 | NA | 0.5039 | 0 | 10 | 1996-07-18 | 65 | 2013-07-18 | 10 | 2001-09-08 | NA |
| 65802 | Aborichthys | rosammai | 95217 | Sen, 2009 | NA | NA | NA | NA | NA | 692 | NA | 784 | NA | Elongated | O | NA | NA | 0 | 1 | 0 | 0 | demersal | WaterAssumed | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10.00 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | never/rarely | NA | NA | never/rarely | NA | never/rarely | NA | NA | 0 | NA | harmless | NA | no special ability | NA | NA | 1 | NA | 0.5039 | 0 | 10 | 2010-10-05 | 65 | 2021-10-25 | NA | NA | NA |
| 24524 | Aborichthys | tikaderi | 4832 | Barman, 1985 | NA | NA | NA | NA | NA | 692 | NA | 784 | NA | Elongated | O | NA | NA | 0 | 1 | 0 | 0 | demersal | WaterAssumed | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10.00 | NA | 10.90 | SL | NA | NA | 95217 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 4832 | of no interest | unknown | NA | NA | NA | NA | NA | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | never/rarely | NA | NA | never/rarely | NA | never/rarely | NA | NA | 0 | NA | harmless | NA | no special ability | NA | NA | 1 | NA | 0.5039 | 0 | 10 | 1996-07-18 | 65 | 2021-10-25 | 10 | 2003-01-09 | NA |
| 67300 | Aborichthys | verticauda | 95164 | Arunachalam, Raja, Malaiammal & Mayden, 2014 | NA | NA | NA | NA | NA | 692 | NA | 784 | NA | elongated | O | NA | NA | NA | 1 | 0 | 0 | demersal | WaterAssumed | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10.00 | NA | 7.75 | SL | NA | NA | 95164 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | never/rarely | NA | NA | never/rarely | NA | never/rarely | NA | NA | 0 | NA | harmless | NA | no special ability | NA | NA | 1 | NA | 0.5039 | 0 | 10 | 2014-04-05 | 10 | 2014-09-04 | NA | NA | NA |
| 67609 | Aborichthys | waikhomi | 95217 | Kosygin, 2012 | NA | NA | NA | NA | NA | 692 | NA | 784 | NA | elongated | O | NA | NA | NA | 1 | 0 | 0 | demersal | WaterAssumed | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10.00 | NA | 6.80 | SL | NA | NA | 95217 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | never/rarely | NA | NA | never/rarely | NA | never/rarely | NA | NA | 0 | NA | harmless | NA | no special ability | NA | NA | 1 | NA | 0.5039 | 0 | 10 | 2015-02-02 | NA | 2015-02-03 | NA | NA | NA |
| 268 | Abramis | brama | 59043 | (Linnaeus, 1758) | Freshwater bream | Abbra_ue.jpg | NA | NA | NA | 756 | Leuciscinae | 50 | NA | fusiform / normal | O | NA | NA | 0 | 1 | 1 | 0 | benthopelagic | WaterAssumed | NA | potamodromous | 51243 | 1 | NA | 9696 | NA | NA | NA | 23 | 796 | 17 | 72462 | 61.85 | NA | 82.00 | TL | NA | NA | 6114 | 25 | TL | NA | NA | 3561 | 6010 | NA | 4699 | ABBRA_U0 | NA | NA | NA | 4931 | highly commercial | unknown | NA | NA | from 50,000 to 100,000 | Russia (Ref. 9988) | seines | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | commercial | 12108 | usually | NA | never/rarely | 274 | 1 | 4699 | harmless | NA | no special ability | NA | NA | 1 | NA | 1.0000 | 0 | 2 | 1990-10-17 | 2 | 2013-08-28 | 97 | 2003-01-03 | NA |
# Load all species by country
country <- fb_tbl("country") %>%
janitor::clean_names()
country %>%
select(-"comments") %>%
head() %>% kable()| autoctr | stockcode | c_code | spec_code | country_ref_no | also_ref | status | current_presence | freshwater | brackish | saltwater | land | abundance | ref_abundance | importance | ref_importance | ex_vessel_price | aquaculture | ref_aquaculture | live_export_org | live_export | ref_live_export | game | bait | regulations | ref_regulations | threatened | country_sub_comp | entered | date_entered | modified | date_modified | expert | date_checked | ts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 435 | 1 | 638 | 2 | 13246 | 1739 | introduced | present | 1 | 1 | 0 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 19 | 1992-08-27 | 1 | 2000-11-21 | NA | NA | NA |
| 436 | 1 | 646 | 2 | 2 | 35818 | native | present | 1 | 0 | 0 | 0 | NA | NA | NA | NA | NA | commercial | 7306 | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 19 | 1992-08-27 | 10 | 2013-02-05 | NA | NA | NA |
| 437 | 1 | 662 | 2 | 12228 | NA | introduced | present | 1 | 0 | 0 | 0 | NA | NA | NA | NA | NA | commercial | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 97 | 1997-03-12 | 97 | 1998-02-02 | NA | NA | NA |
| 438 | 1 | 670 | 2 | 26042 | NA | introduced | present | 1 | 0 | 0 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 97 | 1998-02-02 | 1 | 2000-11-21 | 409 | 2000-09-28 | NA |
| 439 | 1 | 678 | 2 | 28179 | NA | introduced | present | 1 | 0 | 0 | 0 | NA | NA | NA | NA | NA | experimental | 28179 | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 97 | 2002-10-25 | NA | 2002-10-25 | NA | NA | NA |
| 440 | 1 | 682 | 2 | 104 | 12228 | introduced | present | 1 | 1 | 0 | 0 | NA | NA | minor commercial | 8984 | NA | commercial | 7306 | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 18 | 1993-08-30 | 1 | 1999-03-22 | NA | NA | NA |
# Find all species that might live in Hawaii
hi_fish_c <- country %>%
filter(c_code == "840B") # Found in documentation
hi_fish_c %>%
select(-"comments") %>%
head() %>% kable()| autoctr | stockcode | c_code | spec_code | country_ref_no | also_ref | status | current_presence | freshwater | brackish | saltwater | land | abundance | ref_abundance | importance | ref_importance | ex_vessel_price | aquaculture | ref_aquaculture | live_export_org | live_export | ref_live_export | game | bait | regulations | ref_regulations | threatened | country_sub_comp | entered | date_entered | modified | date_modified | expert | date_checked | ts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 75 | 3 | 840B | 3 | 5360 | 6362 | introduced | present | 1 | 1 | 0 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 19 | 1992-08-27 | 1472 | 2008-01-07 | 56 | 1993-04-08 | NA |
| 582 | 9 | 840B | 7 | 58302 | NA | native | present | 0 | 0 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 2 | 1990-12-11 | 1472 | 2008-01-17 | NA | NA | NA |
| 154 | 16 | 840B | 10 | 583 | NA | native | present | 0 | 1 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 10 | 1993-02-24 | NA | 1996-11-15 | NA | NA | NA |
| 1323 | 50 | 840B | 40 | 58302 | NA | native | present | 0 | 0 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 10 | 1993-02-16 | 1472 | 2008-01-09 | NA | NA | NA |
| 1524 | 56 | 840B | 46 | 58302 | NA | native | present | 0 | 0 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 1472 | 2007-06-13 | NA | 2007-06-13 | NA | NA | NA |
| 2574 | 87 | 840B | 77 | 43 | NA | native | present | 0 | 0 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 | NA | NA | 0 | 0 | 1 | 1990-11-10 | 1472 | 2008-01-24 | NA | NA | NA |
# Join hi_fish with species info
hi_fish <- left_join(hi_fish_c, species, by = "spec_code") I do some cleaning, isolate the species list, and further explore what is offered in FishBase. FishBase has a LOT of information available. I want to add some ecological and common name information to my data frame of Hawaiian fish. Then, I select for the columns and rows I need for my later analysis that contain currently present fish in Hawaii and their ecological traits of interest.
# Concatenate Genus and Species
hi_fish$genus_species <- paste(hi_fish$genus, hi_fish$species)
hi_fish <- hi_fish %>% relocate(genus_species, .after = spec_code) %>%
relocate(length, weight, .after = genus_species)
# Isolate list of species
hi_species_list <- as.character(hi_fish$genus_species)
# Check out what is available in the various tables
tables <- docs()
ecology <- fb_tbl("ecology")
com_names <- fb_tbl("comnames")
# Grab the ecological traits I'm looking for with my Hawaiian species list
hi_ecol <- ecology(species_list = hi_species_list,
fields = c("SpecCode", "CoralReefs", "FeedingType", "Schooling")) %>%
clean_names()# Theres duplicates here but I remove them later
hi_ecol %>% head() %>% kable()| spec_code | coral_reefs | feeding_type | schooling |
|---|---|---|---|
| 6652 | -1 | variable | -1 |
| 5689 | -1 | variable | 0 |
| 6630 | -1 | grazing on aquatic plants | 0 |
| 5402 | -1 | hunting macrofauna (predator) | 0 |
| 5403 | -1 | hunting macrofauna (predator) | 0 |
| 972 | 0 | hunting macrofauna (predator) | -1 |
# Grab the common names I'm looking for with my Hawaiian species list
hi_com_names <- common_names(species_list = hi_species_list,
fields = c("SpecCode", "ComName")) %>%
clean_names()
hi_com_names %>% head() %>% kable()| spec_code | species | com_name | language |
|---|---|---|---|
| 3 | Oreochromis mossambicus | African mouthbrooder | English |
| 3 | Oreochromis mossambicus | African perch | English |
| 3 | Oreochromis mossambicus | Blue bream | English |
| 3 | Oreochromis mossambicus | Blue tilapia | English |
| 3 | Oreochromis mossambicus | Common tilapia | English |
| 3 | Oreochromis mossambicus | Hawaiian perch | English |
# Check to see if any hi_ecol species are duplicated, then remove the duplicate
sum(duplicated(hi_ecol$spec_code))[1] 1
duplicated_species <- hi_ecol$spec_code[duplicated(hi_ecol$spec_code)]
hi_ecol[hi_ecol$spec_code %in% duplicated_species, ]# A tibble: 2 × 4
spec_code coral_reefs feeding_type schooling
<int> <int> <chr> <int>
1 1412 0 grazing on aquatic plants 0
2 1412 0 <NA> 0
hi_ecol <- hi_ecol[!duplicated(hi_ecol$spec_code), ]
sum(duplicated(hi_ecol$spec_code))[1] 0
# Combine data sets then select and filter
hi_fish_chars <- left_join(hi_fish, hi_ecol, by = "spec_code") %>%
select(c("spec_code", "genus_species", "length", "l_type_max_m", "weight",
"status", "current_presence", "genus", "species", "importance.y",
"price_categ", "coral_reefs", "feeding_type", "schooling")) %>%
filter(current_presence == "present") %>% # Data frame full of fish characteristics
rename(length_cm = length)
hi_fish_chars %>% head() %>% kable()| spec_code | genus_species | length_cm | l_type_max_m | weight | status | current_presence | genus | species | importance.y | price_categ | coral_reefs | feeding_type | schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | Oreochromis mossambicus | 39.0 | SL | 1130 | introduced | present | Oreochromis | mossambicus | highly commercial | unknown | 0 | browsing on substrate | 0 |
| 7 | Coryphaena equiselis | 145.7 | TL | NA | native | present | Coryphaena | equiselis | minor commercial | unknown | 0 | hunting macrofauna (predator) | 0 |
| 10 | Alectis indica | 165.0 | TL | 25000 | native | present | Alectis | indica | commercial | medium | -1 | hunting macrofauna (predator) | 0 |
| 40 | Argyropelecus affinis | 8.4 | SL | NA | native | present | Argyropelecus | affinis | of no interest | unknown | 0 | selective plankton feeding | 0 |
| 46 | Melanolagus bericoides | 20.0 | SL | NA | native | present | Melanolagus | bericoides | of no interest | unknown | 0 | selective plankton feeding | 0 |
| 77 | Istiophorus platypterus | 348.0 | FL | 100240 | native | present | Istiophorus | platypterus | commercial | very high | -1 | hunting macrofauna (predator) | 0 |
Next, I access the IUCN Red List data through the IUCN Red List API. I load the entire list of Red List species in 16 requests – the number of available pages at the time of this analysis.
# Identify token for accessing IUCN API
iucn_token <- Sys.getenv("IUCN_KEY")
# Create an empty list to store the data frames
species_list <- list()
# Import all species on IUCN Redlist
for (i in 0:15) {
# Get data from API and assign to a variable with a name
species_list[[i + 1]] <- rl_sp(page = i, key = iucn_token)$result
}
# Combine all data frames into one and select only columns I need
all_iucn_species <- do.call(rbind, species_list) %>%
select(c("scientific_name",
"category",
"main_common_name")) %>%
rename(genus_species = scientific_name)
all_iucn_species %>% head() %>% kable()| genus_species | category | main_common_name |
|---|---|---|
| Aaadonta angaurana | CR | NA |
| Aaadonta constricta | EN | NA |
| Aaadonta fuscozonata | EN | NA |
| Aaadonta irregularis | CR | NA |
| Aaadonta kinlochi | CR | NA |
| Aaadonta pelewana | CR | NA |
Now I have a large data frame with all species on the Red List, their status, and their common name. I don’t need ALL species, though, so I need to combine these data with my Hawaii fish data from FishBase. I do this by left joining the Red List data to the Hawaii fish characteristics. I remove species that show up multiple times in the Red List data and might not be reliable.
# Join data
hi_fish_status <- left_join(hi_fish_chars, all_iucn_species,
by = "genus_species") # Duplicate rows introduced
# Identify which rows are here multiple times
status_unique <- as.data.frame(table(hi_fish_status$spec_code)) %>%
setNames(c("spec_code", "freq")) %>%
filter(!freq != 1) # remove rows w freq > 1
#view(status_unique)
# Recombine with status df
hi_fish_status$spec_code <- as.factor(hi_fish_status$spec_code)
hi_fish_status <- left_join(status_unique, hi_fish_status,
by = "spec_code")
hi_fish_status %>% head() %>% kable()| spec_code | freq | genus_species | length_cm | l_type_max_m | weight | status | current_presence | genus | species | importance.y | price_categ | coral_reefs | feeding_type | schooling | category | main_common_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 1 | Oreochromis mossambicus | 39.0 | SL | 1130 | introduced | present | Oreochromis | mossambicus | highly commercial | unknown | 0 | browsing on substrate | 0 | VU | Mozambique Tilapia |
| 6 | 1 | Coryphaena hippurus | 210.0 | TL | 40000 | native | present | Coryphaena | hippurus | highly commercial | high | -1 | hunting macrofauna (predator) | 0 | LC | Common Dolphinfish |
| 7 | 1 | Coryphaena equiselis | 145.7 | TL | NA | native | present | Coryphaena | equiselis | minor commercial | unknown | 0 | hunting macrofauna (predator) | 0 | LC | Pompano Dolphinfish |
| 10 | 1 | Alectis indica | 165.0 | TL | 25000 | native | present | Alectis | indica | commercial | medium | -1 | hunting macrofauna (predator) | 0 | LC | Indian Threadfish |
| 40 | 1 | Argyropelecus affinis | 8.4 | SL | NA | native | present | Argyropelecus | affinis | of no interest | unknown | 0 | selective plankton feeding | 0 | LC | Pacific Hatchet Fish |
| 46 | 1 | Melanolagus bericoides | 20.0 | SL | NA | native | present | Melanolagus | bericoides | of no interest | unknown | 0 | selective plankton feeding | 0 | LC | Bigscale Deepsea Smelt |
I remove columns where fish were data deficient or not assessed. Then, consistent with Munstermann et al., living species listed as ‘Vulnerable’, ‘Endangered’, or ‘Critically Endangered’ were categorized as ‘Threatened’. Living species listed as ‘Least Concern’ and ‘Near Threatened’ were categorized as ‘Nonthreatened’ 3.
# Drop all rows with na values of interest
hi_status_drop_na <- hi_fish_status %>%
filter(!category == "NA") %>%
filter(!category == "DD")
# Make a binary column with 1 as some level of concern and 0 as least concern
tidy_fish_data <- hi_status_drop_na %>%
mutate(is_of_concern = case_when(category == "CR" |
category == "EN" |
category == "VU" ~ 1,
category == "LR/nt" |
category == "NT" |
category == "LC" ~ 0)) %>%
mutate(coral_reefs = coral_reefs * - 1) %>%
mutate(reef_associated = case_when(coral_reefs == 1 ~ "yes",
coral_reefs == 0 ~ "no")) %>%
mutate(is_endemic = case_when(status == "endemic" ~ "yes",
status == "native" |
status == "introduced" ~ "no")) %>%
select(-c("freq"))Results
hi_status_drop_na %>%
arrange(genus_species) %>%
slice_head(n = 20) %>% kable()| spec_code | freq | genus_species | length_cm | l_type_max_m | weight | status | current_presence | genus | species | importance.y | price_categ | coral_reefs | feeding_type | schooling | category | main_common_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11144 | 1 | Abantennarius analis | 7.8 | SL | NA | native | present | Abantennarius | analis | of no interest | unknown | NA | NA | NA | LC | Tailjet Anglerfish |
| 5402 | 1 | Abantennarius coccineus | 13.0 | TL | NA | native | present | Abantennarius | coccineus | of no interest | unknown | -1 | hunting macrofauna (predator) | 0 | LC | Freckled Anglerfish |
| 69666 | 1 | Abantennarius drombus | 12.0 | SL | NA | endemic | present | Abantennarius | drombus | NA | NA | NA | NA | NA | LC | Hawaiian Freckled Frogfish |
| 27298 | 1 | Abantennarius duescus | 3.0 | SL | NA | native | present | Abantennarius | duescus | NA | unknown | NA | NA | NA | LC | Side-Jet Frogfish |
| 972 | 1 | Ablennes hians | 140.0 | TL | 4800 | native | present | Ablennes | hians | minor commercial | high | 0 | hunting macrofauna (predator) | -1 | LC | Flat Needlefish |
| 6652 | 1 | Abudefduf abdominalis | 30.0 | TL | NA | native | present | Abudefduf | abdominalis | NA | unknown | -1 | variable | -1 | LC | Green Damselfish |
| 5689 | 1 | Abudefduf sordidus | 24.0 | TL | NA | native | present | Abudefduf | sordidus | minor commercial | unknown | -1 | variable | 0 | LC | Yellow-banded Sergeant-major |
| 6630 | 1 | Abudefduf vaigiensis | 20.0 | TL | NA | native | present | Abudefduf | vaigiensis | subsistence fisheries | unknown | -1 | grazing on aquatic plants | 0 | LC | Five-banded Sergeant-major |
| 89 | 1 | Acanthocybium solandri | 250.0 | TL | 83000 | native | present | Acanthocybium | solandri | commercial | very high | 0 | hunting macrofauna (predator) | 0 | LC | Wahoo |
| 4306 | 1 | Acanthurus achilles | 24.0 | TL | NA | native | present | Acanthurus | achilles | commercial | medium | -1 | grazing on aquatic plants | 0 | LC | Achilles Tang |
| 4750 | 1 | Acanthurus blochii | 45.0 | SL | NA | native | present | Acanthurus | blochii | commercial | medium | -1 | grazing on aquatic plants | 0 | LC | Ringtail Surgeonfish |
| 1256 | 1 | Acanthurus dussumieri | 54.0 | TL | NA | native | present | Acanthurus | dussumieri | minor commercial | medium | -1 | browsing on substrate | 0 | LC | Eyestripe Surgeonfish |
| 4736 | 1 | Acanthurus guttatus | 26.0 | TL | NA | native | present | Acanthurus | guttatus | commercial | medium | -1 | browsing on substrate | 0 | LC | Whitespotted Surgeonfish |
| 4737 | 1 | Acanthurus leucopareius | 25.0 | TL | NA | native | present | Acanthurus | leucopareius | minor commercial | medium | -1 | browsing on substrate | 0 | LC | Whitebar Surgeonfish |
| 1258 | 1 | Acanthurus lineatus | 38.0 | TL | NA | native | present | Acanthurus | lineatus | commercial | medium | -1 | browsing on substrate | 0 | LC | Lined Surgeonfish |
| 6011 | 1 | Acanthurus nigricans | 36.0 | FL | 620 | native | present | Acanthurus | nigricans | subsistence fisheries | medium | -1 | browsing on substrate | 0 | LC | Golden Rim Surgeonfish |
| 4739 | 1 | Acanthurus nigrofuscus | 21.0 | TL | NA | native | present | Acanthurus | nigrofuscus | commercial | medium | -1 | browsing on substrate | 0 | LC | Brown Surgeonfish |
| 4738 | 1 | Acanthurus nigroris | 25.0 | TL | NA | native | present | Acanthurus | nigroris | NA | medium | -1 | browsing on substrate | 0 | LC | Bluelined Surgeonfish |
| 4744 | 1 | Acanthurus olivaceus | 35.0 | TL | NA | native | present | Acanthurus | olivaceus | commercial | medium | -1 | grazing on aquatic plants | 0 | LC | Orange band Surgeonfish |
| 4734 | 1 | Acanthurus thompsoni | 27.0 | TL | NA | native | present | Acanthurus | thompsoni | NA | medium | -1 | selective plankton feeding | 0 | LC | Thompson’s Surgeonfish |
Conclusion
Now I have a data frame of extant fish species in Hawaii, whether or not they are threatened, and some of their associated characteristics. These data are ready for logistic regression or classification analyses. The data are ready to be exported and are much smaller than all of the existing data in FishBase and the Red List.
Sources
1 “IUCN,” IUCN Red List of Threatened Species. Version 2022-1, 2022. https://www.iucnredlist.org/ (accessed Dec. 02, 2022).
2 R. Froese and D. Pauly, “FishBase,” 2022. www.fishbase.org
3 B. J. Cardinale et al., “Biodiversity loss and its impact on humanity,” Nature, vol. 486, no. 7401, Art. no. 7401, Jun. 2012, doi: 10.1038/nature11148. [3] M. J. Munstermann et al., “A global ecological signal of extinction risk in terrestrial vertebrates,” Conserv. Biol., vol. 36, no. 3, p. e13852, 2022, doi: 10.1111/cobi.13852.