Quantifying uncertainty is crucial for robust biodiversity assessments. The b3gbi package provides a flexible and powerful way to calculate confidence intervals (CIs) for biodiversity indicators using bootstrapping methods, leveraging the dubicube package for robust cube-level resampling.
To maintain a clean and efficient workflow, uncertainty calculation is decoupled from the initial indicator calculation. This “two-step” process allows users to first explore their data and indicators quickly, and then perform computationally intensive bootstrapping only when needed.
The standard workflow for adding uncertainty to an indicator is as follows:
1. Calculate the indicator with a standard function (e.g., total_occ_ts(), pielou_evenness_ts()).
2. Pass the resulting object to the add_ci() function.
In this example, we calculate total occurrences for mammals in Denmark and then add 95% confidence intervals using cube-level bootstrapping.
library(b3gbi)
# 1. Process the cube
denmark_cube <- process_cube(system.file("extdata",
                                         "denmark_mammals_cube_eqdgc.csv",
                                         package = "b3gbi"))
# 2. Calculate a time series of total occurrences
# By default, CIs are NOT calculated during this step
occ_ts <- total_occ_ts(denmark_cube)
# 3. Add confidence intervals using add_ci()
# This step uses the dubicube package for robust bootstrapping
occ_ts_with_ci <- add_ci(occ_ts, num_bootstrap = 100) # Using 100 for speed in this example
# 4. Plot the result
plot(occ_ts_with_ci, title = "Total Occurrences with 95% CI")
The add_ci() function supports two levels of bootstrapping, selectable via the bootstrap_level argument:
- Cube-level (bootstrap_level = "cube"): This is the default and recommended method. It resamples the raw occurrence records within the data cube. The function automatically chooses between group-specific and whole-cube resampling based on the indicator type. It leverages the dubicube package to ensure that indicators are recalculated correctly for each bootstrap replicate.
- Indicator-level (bootstrap_level = "indicator"): This method resamples the calculated indicator values themselves.
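To make the idea of cube-level resampling concrete, here is a minimal base R sketch. It does not use b3gbi or dubicube and is not their implementation: the toy_cube data frame and the pielou() helper are hypothetical, introduced only for illustration. The sketch resamples the rows of a small occurrence table with replacement, recalculates the indicator for every replicate, and takes percentiles of the results.

```r
# Toy occurrence cube (hypothetical data): one row per record,
# with a species label and a number of observations
set.seed(42)
toy_cube <- data.frame(
  species = sample(c("sp_a", "sp_b", "sp_c", "sp_d"), size = 200,
                   replace = TRUE, prob = c(0.5, 0.3, 0.15, 0.05)),
  obs = rpois(200, lambda = 3) + 1
)

# Hypothetical helper: Pielou's evenness, i.e. Shannon entropy divided
# by the log of the number of species present
pielou <- function(cube) {
  totals <- tapply(cube$obs, as.character(cube$species), sum)
  p <- totals / sum(totals)
  -sum(p * log(p)) / log(length(p))
}

observed <- pielou(toy_cube)

# Cube-level bootstrap: resample the records, then recompute the
# indicator from scratch for each replicate
n_boot <- 500
replicates <- replicate(n_boot, {
  idx <- sample(nrow(toy_cube), replace = TRUE)
  pielou(toy_cube[idx, ])
})

# Percentile interval at the 95% level
ci <- quantile(replicates, probs = c(0.025, 0.975))
print(observed)
print(ci)
```

The key point is that the indicator function is applied to each resampled table, rather than the indicator values being resampled directly.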
The add_ci() function automatically applies appropriate
bootstrapping strategies based on the indicator type. This ensures
statistically correct confidence intervals without requiring manual
configuration:
- Group-Specific Bootstrapping: This method isolates resampling strictly within each grouping variable. For indicators like Total Occurrences (total_occ), which calculate a raw count per year, resampling only within that year provides mathematically correct variance without cross-contamination between years.
- Whole-Cube Bootstrapping: This method resamples the entire dataset at once. This is automatically applied to aggregate indicators (evenness, rarity, density metrics) as well as species-level indicators (spec_occ, spec_range). For species-level indicators, whole-cube resampling is computationally optimized to process thousands of species simultaneously in a single pass while correctly capturing changes in species composition during resampling.
- Bounded Indicators: Evenness metrics (pielou_evenness, williams_evenness) automatically use logit transformation to ensure confidence intervals remain within valid [0, 1] bounds.
These defaults are applied internally, but can be overridden using
boot_args if needed.
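The difference between group-specific and whole-cube resampling described above can be illustrated with a small base R sketch. This is an assumption-laden toy, not the code that add_ci() or dubicube runs: toy_cube, total_by_year(), resample_within_groups() and resample_whole_cube() are hypothetical names. It compares the bootstrap spread of yearly totals under the two schemes.

```r
# Toy cube (hypothetical data): records per year with observation counts
set.seed(1)
toy_cube <- data.frame(
  year = rep(2001:2003, times = c(30, 60, 90)),
  obs  = rpois(180, lambda = 4) + 1
)

# Indicator: total occurrences per year (an empty year sums to 0)
total_by_year <- function(cube) {
  groups <- split(cube$obs, factor(cube$year, levels = 2001:2003))
  vapply(groups, sum, numeric(1))
}

# Group-specific: resample rows within each year, so every replicate
# keeps the original number of records per year
resample_within_groups <- function(cube) {
  rows_by_year <- split(seq_len(nrow(cube)), cube$year)
  idx <- unlist(lapply(rows_by_year, function(rows) {
    rows[sample.int(length(rows), replace = TRUE)]
  }), use.names = FALSE)
  cube[idx, ]
}

# Whole-cube: resample all rows at once, so the number of records
# drawn for a given year varies between replicates
resample_whole_cube <- function(cube) {
  cube[sample.int(nrow(cube), replace = TRUE), ]
}

n_boot <- 300
group_reps <- replicate(n_boot, total_by_year(resample_within_groups(toy_cube)))
whole_reps <- replicate(n_boot, total_by_year(resample_whole_cube(toy_cube)))

# Bootstrap standard deviation of each yearly total under the two schemes
sd_group <- apply(group_reps, 1, sd)
sd_whole <- apply(whole_reps, 1, sd)
print(rbind(group_specific = sd_group, whole_cube = sd_whole))
```

In this toy setting, whole-cube resampling lets records from other years displace records from the year being summarized, which adds variation to a per-year count that group-specific resampling avoids.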
The add_ci() function provides several arguments for
fine-tuning the CI calculation:
| Argument | Description | Default |
|---|---|---|
| num_bootstrap | Number of bootstrap replicates. | 1000 |
| ci_type | Type of bootstrap interval. See Types of Confidence Intervals below. | "perc" |
| confidence_level | The confidence level (e.g., 0.95). | 0.95 |
| boot_args | A list of additional arguments for dubicube::bootstrap_cube(). | list() |
| ci_args | A list of additional arguments for dubicube::calculate_bootstrap_ci(). | list() |
The ci_type argument allows you to choose between different methods for calculating bootstrap confidence intervals. These are passed directly to the dubicube package:
- Percentile ("perc"): (Default) The simplest and most commonly used method. It uses the ordered bootstrap replicates to find the interval limits (e.g., the 2.5th and 97.5th percentiles for a 95% CI). It is transformation-invariant and generally robust for biodiversity indicators.
- Bias-corrected and accelerated ("bca"): An improvement over the percentile method that adjusts for both bias and skewness in the bootstrap distribution. It is highly recommended for accuracy but requires more computation.
- Normal ("norm"): Assumes the bootstrap distribution is approximately normal. It uses the standard error of the bootstrap replicates to calculate the interval. While fast, it may be less accurate for skewed distributions typical of biodiversity metrics.
- Basic ("basic"): Also known as the “reverse percentile” method. It uses the percentiles of the distribution of the difference between the bootstrap replicates and the original estimate.
If you need to pass specific parameters to the underlying dubicube functions (e.g., setting a specific seed), you can use the boot_args and ci_args parameters.
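As a rough illustration of how the interval types differ, the following base R sketch computes percentile, basic, and normal intervals from the same set of bootstrap replicates. It is a textbook-style sketch on hypothetical toy data, not dubicube's code. The normal interval here subtracts the bootstrap bias estimate, which is a common convention but an assumption about what any given package does. The BCa interval is omitted because it additionally needs an acceleration estimate (typically from a jackknife).

```r
set.seed(123)
x <- rexp(80, rate = 1 / 5)   # skewed toy sample (hypothetical data)
estimate <- mean(x)           # statistic of interest on the original data

# Bootstrap replicates of the statistic
replicates <- replicate(2000, mean(sample(x, replace = TRUE)))

conf  <- 0.95
alpha <- 1 - conf
q <- quantile(replicates, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

# Percentile: quantiles of the replicates themselves
ci_perc <- q

# Basic (reverse percentile): reflect the quantiles around the estimate
ci_basic <- c(2 * estimate - q[2], 2 * estimate - q[1])

# Normal: bias-corrected estimate plus/minus z times the bootstrap SE
bias <- mean(replicates) - estimate
z <- qnorm(1 - alpha / 2)
ci_norm <- (estimate - bias) + c(-1, 1) * z * sd(replicates)

print(rbind(perc = ci_perc, basic = ci_basic, norm = ci_norm))
```

For a skewed bootstrap distribution the percentile and basic intervals are mirror images of each other around the estimate, while the normal interval is symmetric by construction.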
While add_ci() automatically selects appropriate
settings, you can override these defaults using boot_args
and ci_args. This is useful when you need specialized
behavior:
# Example: Calculate CIs for evenness on raw scale (no logit transformation)
evenness_raw <- add_ci(my_evenness_indicator,
                       num_bootstrap = 1000,
                       boot_args = list(trans = identity,
                                        inv_trans = identity))
# Example: Force bias correction for total_occ
occ_with_bias <- add_ci(my_occ_indicator,
                        num_bootstrap = 1000,
                        ci_args = list(no_bias = FALSE))
Common override parameters include:
- trans / inv_trans: Transformation functions
- no_bias: Logical, disable bias correction (default varies by indicator)
- group_specific: Logical, force group-specific vs whole-cube bootstrapping
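To show why a transformation matters for bounded indicators, here is a small base R sketch of a transform and back-transform pair applied to a normal-approximation interval. It is only an illustration of the general technique under stated assumptions: the normal_ci() helper and the simulated replicates are hypothetical, and the way add_ci() applies trans and inv_trans internally may differ.

```r
set.seed(7)
# Hypothetical bootstrap replicates of an evenness-like value near 1
replicates <- rbeta(1000, shape1 = 38, shape2 = 2)
z <- qnorm(0.975)

# Hypothetical helper: build the interval on a transformed scale,
# then map the limits back to the original scale
normal_ci <- function(values, trans = identity, inv_trans = identity) {
  t_values <- trans(values)
  inv_trans(mean(t_values) + c(-1, 1) * z * sd(t_values))
}

# Raw scale: nothing prevents the limits from leaving [0, 1]
ci_raw <- normal_ci(replicates)

# Logit scale: qlogis() maps (0, 1) to the real line and plogis() maps
# back, so the back-transformed limits always fall inside (0, 1)
ci_logit <- normal_ci(replicates, trans = qlogis, inv_trans = plogis)

print(rbind(raw = ci_raw, logit = ci_logit))
```

Passing identity for both functions, as in the override example above, corresponds to the raw-scale case in this sketch.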
Confidence intervals can be added post-hoc for the following indicators:
- total_occ
- occ_density
- newness
- williams_evenness
- pielou_evenness
- ab_rarity
- area_rarity
- spec_occ
- spec_range
While most indicators rely on dubicube for calculating
confidence intervals, the Hill diversity indicators (hill0,
hill1, hill2) compute their metrics using the
iNEXT package, which runs an intensive coverage-based
rarefaction algorithm. iNEXT possesses its own highly
optimized, native bootstrapping engine.
When you pass a Hill indicator to add_ci(), the function
will automatically detect it, bypass dubicube’s cube-level
resampling to prevent computationally disastrous double-bootstrapping,
and seamlessly delegate to iNEXT’s internal engine (forcing
bootstrap_level = "indicator").
Certain indicators cannot have confidence intervals added via add_ci():
- obs_richness: Bootstrapping observed richness is often not statistically sensible; consider using Hill numbers for estimated richness instead.
- cum_richness: The cumulative nature of this metric makes standard bootstrapping inappropriate.
- occ_turnover: Similar to cumulative richness, the temporal dependency requires specialized methods.
- tax_distinct: Requires specialized randomization tests rather than standard bootstrapping.
While b3gbi focuses on indicator calculation and visualization, the underlying dubicube package offers powerful functions for initial data exploration and quality assessment of your biodiversity data cubes.
To learn more about the full capabilities of dubicube, including in-depth tutorials and advanced data processing techniques, please visit the dubicube documentation.
The add_ci() function provides a unified and robust
interface for adding uncertainty to your biodiversity assessments. By
leveraging the power of dubicube, b3gbi
ensures that your results are not just numbers, but scientifically
grounded estimates with clearly defined confidence boundaries.
With automatic indicator configuration, most users can rely on
sensible defaults while advanced users retain full control over the
bootstrapping process through the boot_args and
ci_args parameters.