Title: | Simulating Biodiversity Data Cubes |
Description: | This R package provides a simulation framework for biodiversity data cubes. The simulation starts from multiple species distributed in a landscape over a temporal scope. In a second phase, a variety of observation processes and levels of effort can be simulated to generate actual occurrence datasets. Based on their (simulated) spatial uncertainty, occurrences can then be designated to a grid to form a data cube. |
Authors: | Ward Langeraert [aut, cre] (<https://orcid.org/0000-0002-5900-8109>,
Research Institute for Nature and Forest (INBO)),
Wissam Barhdadi [ctb] |
Maintainer: | Ward Langeraert <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.2 |
Built: | 2025-02-13 11:44:41 UTC |
Source: | https://github.com/b-cubed-eu/gcube |
This function adds a column to the input dataframe or sf object containing the coordinate uncertainty for each observation, measured in meters.
add_coordinate_uncertainty(observations, coords_uncertainty_meters = 25)
observations | An sf object with POINT geometry or a simple dataframe representing the observations. This object contains the observation points to which the coordinate uncertainty will be added. |
coords_uncertainty_meters | A numeric value or a vector of numeric values representing the coordinate uncertainty (in meters) associated with each observation. If a single numeric value is provided, it will be applied to all observations. If a numeric vector is provided, it must be the same length as the number of observations. |
The input data frame or an sf object with POINT geometry, with an additional column named coordinateUncertaintyInMeters that contains the coordinate uncertainty values in meters.
Other main: filter_observations(), grid_designation(), sample_observations(), simulate_occurrences()
# Create dataframe with a sampling probability column
observations_data <- data.frame(
  time_point = 1,
  sampling_prob = seq(0.5, 1, 0.1)
)

# Provide a fixed uncertainty for all points
add_coordinate_uncertainty(
  observations_data,
  coords_uncertainty_meters = 1000
)

# Add variability in uncertainty, e.g. a different value per observation
uncertainty_vec <- seq(50, 100, 10)
add_coordinate_uncertainty(
  observations_data,
  coords_uncertainty_meters = uncertainty_vec
)
This function adds a sampling bias weight column to an sf object containing occurrences. The sampling probabilities are based on bias weights within each cell of a provided grid layer.
apply_manual_sampling_bias(occurrences_sf, bias_weights)
occurrences_sf | An sf object with POINT geometry representing the occurrences. |
bias_weights | A grid layer: an sf object with POLYGON geometry containing a bias_weight column with the sampling bias weight for each cell. |
An sf object with POINT geometry that includes a bias_weight column containing the sampling probabilities based on the sampling bias.
Other detection: apply_polygon_sampling_bias()
# Load packages
library(sf)
library(dplyr)
library(ggplot2)

# Create polygon
plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))

# Get occurrence points
occurrences_sf <- simulate_occurrences(plgn)

# Create grid with bias weights
grid <- st_make_grid(
    plgn,
    n = c(10, 10),
    square = TRUE
  ) %>%
  st_sf()
grid$bias_weight <- runif(nrow(grid), min = 0, max = 1)

# Calculate occurrence bias
occurrence_bias <- apply_manual_sampling_bias(occurrences_sf, grid)
occurrence_bias

# Visualise where the bias is
ggplot() +
  geom_sf(data = plgn) +
  geom_sf(data = grid, alpha = 0) +
  geom_sf(data = occurrence_bias, aes(colour = bias_weight)) +
  geom_sf_text(data = grid, aes(label = round(bias_weight, 2))) +
  theme_minimal()
This function adds a sampling bias weight column to an sf object containing occurrences based on a given polygonal area. The bias is determined by the specified bias strength, which adjusts the probability of sampling within the polygonal area.
apply_polygon_sampling_bias(occurrences_sf, bias_area, bias_strength = 1)
occurrences_sf | An sf object with POINT geometry representing the occurrences. |
bias_area | An sf object with POLYGON geometry specifying the area where sampling will be biased. |
bias_strength | A positive numeric value that represents the strength of the bias to be applied within the bias_area polygon. Default is 1 (no bias). |
An sf object with POINT geometry that includes a bias_weight column containing the sampling probabilities based on the bias area and strength.
Other detection: apply_manual_sampling_bias()
# Load packages
library(sf)
library(dplyr)
library(ggplot2)

# Simulate some occurrence data with coordinates and time points
num_points <- 10
occurrences <- data.frame(
  lon = runif(num_points, min = -180, max = 180),
  lat = runif(num_points, min = -90, max = 90),
  time_point = 1
)

# Convert the occurrence data to an sf object
occurrences_sf <- st_as_sf(occurrences, coords = c("lon", "lat"))

# Create bias_area polygon overlapping at least two of the points
selected_observations <- st_union(occurrences_sf[2:3, ])
bias_area <- st_convex_hull(selected_observations) %>%
  st_buffer(dist = 50) %>%
  st_as_sf()

occurrence_bias_sf <- apply_polygon_sampling_bias(
  occurrences_sf,
  bias_area,
  bias_strength = 2
)
occurrence_bias_sf

# Visualise where the bias is
occurrence_bias_sf %>%
  mutate(bias_weight = as.factor(round(bias_weight, 3))) %>%
  ggplot() +
    geom_sf(data = bias_area) +
    geom_sf(aes(colour = bias_weight)) +
    theme_minimal()
This function creates a raster with a spatial pattern for the area of a polygon.
create_spatial_pattern(
  polygon,
  resolution,
  spatial_pattern = c("random", "clustered"),
  seed = NA,
  n_sim = 1
)
polygon | An sf object with POLYGON geometry. |
resolution | A numeric value defining the resolution of the raster cells. |
spatial_pattern | Specifies the desired spatial pattern. Either a character string ("random", the default, or "clustered"), or a numeric value of 1 or larger, where larger values produce larger-scale clustering (see Details). |
seed | A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
n_sim | Number of simulations. Each simulation is a different layer in the raster. Default is 1. |
The spatial_pattern argument changes the range parameter of the spherical variogram model. spatial_pattern = 1 means the range has the same size as the grid cell, which is defined in the resolution argument. The function gstat::vgm() is used to implement the spherical variogram model.
An object of class SpatRaster with a spatial pattern for the area of the given polygon, containing n_sim layers (sampling_p1, sampling_p2, ...) with the sampling probabilities from the raster grid for each simulation.
See also: gstat::vgm() and its range argument
Other occurrence: sample_occurrences_from_raster(), simulate_random_walk(), simulate_timeseries()
# Load packages
library(sf)
library(ggplot2)
library(tidyterra)

# Create polygon
plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))

# 1. Random spatial pattern
rs_pattern_random <- create_spatial_pattern(
  polygon = plgn,
  resolution = 0.1,
  spatial_pattern = "random",
  seed = 123
)

ggplot() +
  geom_spatraster(data = rs_pattern_random) +
  scale_fill_continuous(type = "viridis") +
  theme_minimal()

# 2. Clustered spatial pattern
rs_pattern_clustered <- create_spatial_pattern(
  polygon = plgn,
  resolution = 0.1,
  spatial_pattern = "clustered",
  seed = 123
)

ggplot() +
  geom_spatraster(data = rs_pattern_clustered) +
  scale_fill_continuous(type = "viridis") +
  theme_minimal()

# 3. User-defined spatial pattern: large-scale clustering
rs_pattern_large <- create_spatial_pattern(
  polygon = plgn,
  resolution = 0.1,
  spatial_pattern = 100,
  seed = 123
)

ggplot() +
  geom_spatraster(data = rs_pattern_large) +
  scale_fill_continuous(type = "viridis") +
  theme_minimal()
This function filters observations from all occurrences based on the sampling_status column, typically created by the sample_observations() function.
filter_observations(observations_total, invert = FALSE)
observations_total | An sf object with POINT geometry or a simple dataframe with a sampling_status column, e.g. as created by sample_observations(). |
invert | Logical. If FALSE (the default), detected occurrences are returned; if TRUE, all other occurrences are returned. |
A data frame or an sf object with POINT geometry containing the filtered observations. If invert = FALSE, the function returns detected occurrences. If invert = TRUE, it returns all other occurrences.
Other main: add_coordinate_uncertainty(), grid_designation(), sample_observations(), simulate_occurrences()
# Create dataframe with sampling status column
occurrences_data <- data.frame(
  time_point = 1,
  sampling_prob = seq(0.5, 1, 0.1),
  sampling_status = rep(c("undetected", "detected"), each = 3)
)

# Keep detected occurrences
filter_observations(occurrences_data)

# Keep undetected occurrences
filter_observations(occurrences_data, invert = TRUE)
This function generates a random taxonomic hierarchy for a specified number of species, genera, families, orders, classes, phyla, and kingdoms. The output is a data frame with the hierarchical classification for each species.
generate_taxonomy(
  num_species,
  num_genera,
  num_families,
  num_orders = 1,
  num_classes = 1,
  num_phyla = 1,
  num_kingdoms = 1,
  seed = NA
)
num_species | Number of species to generate, or a dataframe. With a dataframe, the function will create a species with taxonomic hierarchy for each row. The original columns of the dataframe will be retained in the output. |
num_genera | Number of genera to generate. |
num_families | Number of families to generate. |
num_orders | Number of orders to generate. Defaults to 1. |
num_classes | Number of classes to generate. Defaults to 1. |
num_phyla | Number of phyla to generate. Defaults to 1. |
num_kingdoms | Number of kingdoms to generate. Defaults to 1. |
seed | A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
The function works by randomly assigning species to genera, genera to families, families to orders, orders to classes, classes to phyla, and phyla to kingdoms. Sampling is done with replacement, allowing multiple lower-level taxa (e.g., species) to be assigned to the same higher-level taxon (e.g., genus).
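The assignment logic described above can be sketched in base R. This is a hypothetical illustration of sampling with replacement across two levels of the hierarchy, not the package's internal code; all names below are invented:

```r
# Illustrative sketch: assign each species a random genus and each genus
# a random family, sampling with replacement so several lower-level taxa
# can share the same higher-level taxon.
set.seed(123)
species  <- paste0("species", 1:5)
genera   <- paste0("genus", 1:3)
families <- paste0("family", 1:2)

genus_of_species <- sample(genera, length(species), replace = TRUE)
family_of_genus  <- setNames(
  sample(families, length(genera), replace = TRUE),
  genera
)

taxonomy <- data.frame(
  species = species,
  genus   = genus_of_species,
  family  = family_of_genus[genus_of_species]
)
taxonomy
```

Because sampling is done with replacement, some genera may end up with several species while others receive none.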
A data frame with the taxonomic classification of each species. If num_species is a dataframe, the taxonomic classification is added to this input dataframe. The original columns of the dataframe will be retained in the output.
Other multispecies: map_add_coordinate_uncertainty(), map_filter_observations(), map_grid_designation(), map_sample_observations(), map_simulate_occurrences()
# 1. Create simple taxonomic hierarchy
generate_taxonomy(
  num_species = 5,
  num_genera = 3,
  num_families = 2,
  seed = 123
)

# 2. Add taxonomic hierarchy to a dataframe
existing_df <- data.frame(
  count = c(1, 2, 5, 4, 8, 9, 3),
  det_prob = c(0.9, 0.9, 0.9, 0.8, 0.5, 0.2, 0.2)
)

generate_taxonomy(
  num_species = existing_df,
  num_genera = 4,
  num_families = 2,
  seed = 125
)
This function designates observations to cells of a given grid to create an aggregated data cube.
grid_designation(
  observations,
  grid,
  id_col = "row_names",
  seed = NA,
  aggregate = TRUE,
  randomisation = c("uniform", "normal"),
  p_norm = ifelse(tolower(randomisation[1]) == "uniform", NA, 0.95)
)
observations | An sf object with POINT geometry and a time_point column, and optionally a coordinateUncertaintyInMeters column with the coordinate uncertainty of each observation. |
grid | An sf object with POLYGON geometry (usually a grid) to which observations should be designated. |
id_col | The column name containing unique IDs for each grid cell. If "row_names" (the default), the row names of the grid are used as cell IDs. |
seed | A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
aggregate | Logical. If TRUE (the default), observations are aggregated per grid cell; if FALSE, the sampled points within the uncertainty circles are returned. |
randomisation | Character. Method used for sampling within the uncertainty circle around each observation: "uniform" (the default) or "normal". |
p_norm | A numeric value between 0 and 1, used only if randomisation = "normal". Default is 0.95. |
If aggregate = TRUE, an sf object with POLYGON geometry containing the grid cells, an n column with the number of observations per grid cell, and a min_coord_uncertainty column with the minimum coordinate uncertainty per grid cell. If aggregate = FALSE, an sf object with POINT geometry containing the sampled observations within the uncertainty circles, and a coordinateUncertaintyInMeters column with the coordinate uncertainty for each observation.
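For intuition on the "uniform" randomisation option, drawing a point uniformly within an uncertainty circle can be sketched in base R. This is a hypothetical helper for illustration, not the package's implementation:

```r
# Draw n points uniformly within a circle of radius r around (x0, y0).
# Taking sqrt() of the uniform draw for the radius is what makes the
# density uniform in area rather than concentrated near the centre.
sample_in_circle <- function(x0, y0, r, n = 1) {
  theta <- runif(n, 0, 2 * pi)
  rad   <- r * sqrt(runif(n))
  data.frame(x = x0 + rad * cos(theta), y = y0 + rad * sin(theta))
}

set.seed(123)
pts <- sample_in_circle(0, 0, 25, n = 1000)
all(sqrt(pts$x^2 + pts$y^2) <= 25)  # every sample stays inside the circle
```

Each sampled point is then assigned to the grid cell it falls in, which is why larger coordinate uncertainty spreads an observation's possible cell designation over more cells.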
Other main: add_coordinate_uncertainty(), filter_observations(), sample_observations(), simulate_occurrences()
# Load packages
library(sf)
library(dplyr)

# Create four random points
n_points <- 4
xlim <- c(3841000, 3842000)
ylim <- c(3110000, 3112000)
coordinate_uncertainty <- rgamma(n_points, shape = 5, rate = 0.1)

observations_sf <- data.frame(
  lat = runif(n_points, ylim[1], ylim[2]),
  long = runif(n_points, xlim[1], xlim[2]),
  time_point = 1,
  coordinateUncertaintyInMeters = coordinate_uncertainty
) %>%
  st_as_sf(coords = c("long", "lat"), crs = 3035)

# Add buffer uncertainty in meters around points
observations_buffered <- observations_sf %>%
  st_buffer(observations_sf$coordinateUncertaintyInMeters)

# Create grid
grid_df <- st_make_grid(
    observations_buffered,
    square = TRUE,
    cellsize = c(200, 200)
  ) %>%
  st_sf()

# Create occurrence cube
grid_designation(
  observations = observations_sf,
  grid = grid_df,
  seed = 123
)
Map add_coordinate_uncertainty() over multiple species
This function executes add_coordinate_uncertainty() over multiple rows of a dataframe, representing different species, with potentially different function arguments over multiple columns.
map_add_coordinate_uncertainty(df, nested = TRUE, arg_list = NA)
df | A dataframe containing multiple rows, each representing a different species. The columns are function arguments with values used for mapping add_coordinate_uncertainty() for each species. |
nested | Logical. If TRUE (the default), the output is returned as a nested dataframe; if FALSE, the list-column is expanded into additional rows and columns. |
arg_list | A named list for argument conversion, where the names are function argument names and the values the corresponding column names of df, or NA (the default) if the column names match the argument names. |
In case of nested = TRUE, a dataframe identical to df, but each sf object with POINT geometry in the list-column observations now has an additional column coordinateUncertaintyInMeters added by add_coordinate_uncertainty(). In case of nested = FALSE, this list-column is expanded into additional rows and columns.
Other multispecies: generate_taxonomy(), map_filter_observations(), map_grid_designation(), map_sample_observations(), map_simulate_occurrences()
# Load packages
library(sf)
library(dplyr)

# Create polygon
plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))

## Example with simple column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df <- tibble(
  taxonID = c("species1", "species2", "species3"),
  species_range = rep(list(plgn), 3),
  initial_average_occurrences = c(50, 100, 200),
  n_time_points = rep(6, 3),
  temporal_function = c(simulate_random_walk, simulate_random_walk, NA),
  sd_step = c(1, 1, NA),
  spatial_pattern = "random",
  detection_probability = c(0.8, 0.9, 1),
  invert = FALSE,
  coords_uncertainty_meters = c(25, 30, 50),
  seed = 123
)

# Simulate occurrences
sim_occ1 <- map_simulate_occurrences(df = species_dataset_df)

# Sample observations
samp_obs1 <- map_sample_observations(df = sim_occ1)

# Filter observations
filter_obs1 <- map_filter_observations(df = samp_obs1)

# Add coordinate uncertainty
obs_uncertainty_nested <- map_add_coordinate_uncertainty(df = filter_obs1)
obs_uncertainty_nested

## Example with deviating column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df2 <- species_dataset_df %>%
  rename(
    polygon = species_range,
    sd = sd_step,
    det_prob = detection_probability,
    inv = invert,
    coord_uncertainty = coords_uncertainty_meters
  )

# Create named list for argument conversion
arg_conv_list <- list(
  species_range = "polygon",
  sd_step = "sd",
  detection_probability = "det_prob",
  invert = "inv",
  coords_uncertainty_meters = "coord_uncertainty"
)

# Simulate occurrences
sim_occ2 <- map_simulate_occurrences(
  df = species_dataset_df2,
  arg_list = arg_conv_list
)

# Sample observations
samp_obs2 <- map_sample_observations(
  df = sim_occ2,
  arg_list = arg_conv_list
)

# Filter observations
filter_obs2 <- map_filter_observations(
  df = samp_obs2,
  arg_list = arg_conv_list
)

# Add coordinate uncertainty
map_add_coordinate_uncertainty(
  df = filter_obs2,
  arg_list = arg_conv_list
)
Map filter_observations() over multiple species
This function executes filter_observations() over multiple rows of a dataframe, representing different species, with potentially different function arguments over multiple columns.
map_filter_observations(df, nested = TRUE, arg_list = NA)
df | A dataframe containing multiple rows, each representing a different species. The columns are function arguments with values used for mapping filter_observations() for each species. |
nested | Logical. If TRUE (the default), the output is returned as a nested dataframe; if FALSE, the list-column is expanded into additional rows and columns. |
arg_list | A named list for argument conversion, where the names are function argument names and the values the corresponding column names of df, or NA (the default) if the column names match the argument names. |
In case of nested = TRUE, a dataframe identical to df, with an extra list-column called observations containing an sf object with POINT geometry or simple dataframe for each row computed by filter_observations(). In case of nested = FALSE, this list-column is expanded into additional rows and columns.
Other multispecies: generate_taxonomy(), map_add_coordinate_uncertainty(), map_grid_designation(), map_sample_observations(), map_simulate_occurrences()
# Load packages
library(sf)
library(dplyr)

# Create polygon
plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))

## Example with simple column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df <- tibble(
  taxonID = c("species1", "species2", "species3"),
  species_range = rep(list(plgn), 3),
  initial_average_occurrences = c(50, 100, 200),
  n_time_points = rep(6, 3),
  temporal_function = c(simulate_random_walk, simulate_random_walk, NA),
  sd_step = c(1, 1, NA),
  spatial_pattern = "random",
  detection_probability = c(0.8, 0.9, 1),
  invert = FALSE,
  seed = 123
)

# Simulate occurrences
sim_occ1 <- map_simulate_occurrences(df = species_dataset_df)

# Sample observations
samp_obs1 <- map_sample_observations(df = sim_occ1)

# Filter observations
filter_obs_nested <- map_filter_observations(df = samp_obs1)
filter_obs_nested

## Example with deviating column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df2 <- species_dataset_df %>%
  rename(
    polygon = species_range,
    sd = sd_step,
    det_prob = detection_probability,
    inv = invert
  )

# Create named list for argument conversion
arg_conv_list <- list(
  species_range = "polygon",
  sd_step = "sd",
  detection_probability = "det_prob",
  invert = "inv"
)

# Simulate occurrences
sim_occ2 <- map_simulate_occurrences(
  df = species_dataset_df2,
  arg_list = arg_conv_list
)

# Sample observations
samp_obs2 <- map_sample_observations(
  df = sim_occ2,
  arg_list = arg_conv_list
)

# Filter observations
map_filter_observations(
  df = samp_obs2,
  arg_list = arg_conv_list
)
Map grid_designation() over multiple species
This function executes grid_designation() over multiple rows of a dataframe, representing different species, with potentially different function arguments over multiple columns.
map_grid_designation(df, nested = TRUE, arg_list = NA)
df | A dataframe containing multiple rows, each representing a different species. The columns are function arguments with values used for mapping grid_designation() for each species. |
nested | Logical. If TRUE (the default), the output is returned as a nested dataframe; if FALSE, the list-column is expanded into additional rows and columns. |
arg_list | A named list for argument conversion, where the names are function argument names and the values the corresponding column names of df, or NA (the default) if the column names match the argument names. |
In case of nested = TRUE, a dataframe identical to df, but with a list-column containing, for each row, the output of grid_designation() for that species. In case of nested = FALSE, this list-column is expanded into additional rows and columns.
Other multispecies: generate_taxonomy(), map_add_coordinate_uncertainty(), map_filter_observations(), map_sample_observations(), map_simulate_occurrences()
# Load packages
library(sf)
library(dplyr)

# Create polygon
plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))

# Create grid
cube_grid <- st_make_grid(
    st_buffer(plgn, 25),
    n = c(20, 20),
    square = TRUE
  ) %>%
  st_sf()

## Example with simple column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df <- tibble(
  taxonID = c("species1", "species2", "species3"),
  species_range = rep(list(plgn), 3),
  initial_average_occurrences = c(50, 100, 200),
  n_time_points = rep(6, 3),
  temporal_function = c(simulate_random_walk, simulate_random_walk, NA),
  sd_step = c(1, 1, NA),
  spatial_pattern = "random",
  detection_probability = c(0.8, 0.9, 1),
  invert = FALSE,
  coords_uncertainty_meters = c(25, 30, 50),
  grid = rep(list(cube_grid), 3),
  seed = 123
)

# Simulate occurrences
sim_occ1 <- map_simulate_occurrences(df = species_dataset_df)

# Sample observations
samp_obs1 <- map_sample_observations(df = sim_occ1)

# Filter observations
filter_obs1 <- map_filter_observations(df = samp_obs1)

# Add coordinate uncertainty
obs_uncertainty1 <- map_add_coordinate_uncertainty(df = filter_obs1)

# Grid designation
occ_cube_nested <- map_grid_designation(df = obs_uncertainty1)
occ_cube_nested

## Example with deviating column names
# Specify dataframe for 3 species with custom function arguments
species_dataset_df2 <- species_dataset_df %>%
  rename(
    polygon = species_range,
    sd = sd_step,
    det_prob = detection_probability,
    inv = invert,
    coord_uncertainty = coords_uncertainty_meters,
    raster = grid
  )

# Create named list for argument conversion
arg_conv_list <- list(
  species_range = "polygon",
  sd_step = "sd",
  detection_probability = "det_prob",
  invert = "inv",
  coords_uncertainty_meters = "coord_uncertainty",
  grid = "raster"
)

# Simulate occurrences
sim_occ2 <- map_simulate_occurrences(
  df = species_dataset_df2,
  arg_list = arg_conv_list
)

# Sample observations
samp_obs2 <- map_sample_observations(
  df = sim_occ2,
  arg_list = arg_conv_list
)

# Filter observations
filter_obs2 <- map_filter_observations(
  df = samp_obs2,
  arg_list = arg_conv_list
)

# Add coordinate uncertainty
obs_uncertainty2 <- map_add_coordinate_uncertainty(
  df = filter_obs2,
  arg_list = arg_conv_list
)

# Grid designation
map_grid_designation(
  df = obs_uncertainty2,
  arg_list = arg_conv_list
)
Map sample_observations() over multiple species

This function executes sample_observations() over multiple rows of a dataframe, representing different species, with potentially different function arguments over multiple columns.
map_sample_observations(df, nested = TRUE, arg_list = NA)
df |
A dataframe containing multiple rows, each representing a
different species. The columns are function arguments with values used for
mapping sample_observations(). |
nested |
Logical. If TRUE (the default), the output is returned as a nested dataframe with a list-column; if FALSE, this list-column is expanded into additional rows and columns (see Value). |
arg_list |
A named list, or NA (the default). If the column names of df deviate from the default argument names, provide a named list mapping the default argument names to the deviating column names (see the Examples). |
In case of nested = TRUE, a dataframe identical to df, with an extra list-column called occurrences containing an sf object with POINT geometry for each row computed by sample_observations(). In case of nested = FALSE, this list-column is expanded into additional rows and columns.
Other multispecies:
generate_taxonomy()
,
map_add_coordinate_uncertainty()
,
map_filter_observations()
,
map_grid_designation()
,
map_simulate_occurrences()
# Load packages library(sf) library(dplyr) # Create polygon plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2)))) ## Example with simple column names # Specify dataframe for 3 species with custom function arguments species_dataset_df <- tibble( taxonID = c("species1", "species2", "species3"), species_range = rep(list(plgn), 3), initial_average_occurrences = c(50, 100, 200), n_time_points = rep(6, 3), temporal_function = c(simulate_random_walk, simulate_random_walk, NA), sd_step = c(1, 1, NA), spatial_pattern = "random", detection_probability = c(0.8, 0.9, 1), seed = 123) # Simulate occurrences sim_occ1 <- map_simulate_occurrences(df = species_dataset_df) # Sample observations samp_obs_nested <- map_sample_observations(df = sim_occ1) samp_obs_nested ## Example with deviating column names # Specify dataframe for 3 species with custom function arguments species_dataset_df2 <- species_dataset_df %>% rename(polygon = species_range, sd = sd_step, det_prob = detection_probability) # Create named list for argument conversion arg_conv_list <- list( species_range = "polygon", sd_step = "sd", detection_probability = "det_prob" ) # Simulate occurrences sim_occ2 <- map_simulate_occurrences( df = species_dataset_df2, arg_list = arg_conv_list) # Sample observations map_sample_observations( df = sim_occ2, arg_list = arg_conv_list)
Map simulate_occurrences() over multiple species

This function executes simulate_occurrences() over multiple rows of a dataframe, representing different species, with potentially different function arguments over multiple columns.
map_simulate_occurrences(df, nested = TRUE, arg_list = NA)
df |
A dataframe containing multiple rows, each representing a
different species. The columns are function arguments with values used for
mapping simulate_occurrences(). |
nested |
Logical. If TRUE (the default), the output is returned as a nested dataframe with a list-column; if FALSE, this list-column is expanded into additional rows and columns (see Value). |
arg_list |
A named list, or NA (the default). If the column names of df deviate from the default argument names, provide a named list mapping the default argument names to the deviating column names (see the Examples). |
In case of nested = TRUE, a dataframe identical to df, with an extra list-column called occurrences containing an sf object with POINT geometry for each row computed by simulate_occurrences(). In case of nested = FALSE, this list-column is expanded into additional rows and columns.
Other multispecies:
generate_taxonomy()
,
map_add_coordinate_uncertainty()
,
map_filter_observations()
,
map_grid_designation()
,
map_sample_observations()
# Load packages library(sf) library(dplyr) # Create polygon plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2)))) ## Example with simple column names # Specify dataframe for 3 species with custom function arguments species_dataset_df <- tibble( taxonID = c("species1", "species2", "species3"), species_range = rep(list(plgn), 3), initial_average_occurrences = c(50, 100, 200), n_time_points = rep(6, 3), temporal_function = c(simulate_random_walk, simulate_random_walk, NA), sd_step = c(1, 1, NA), spatial_pattern = "random", seed = 123) # Simulate occurrences sim_occ_nested <- map_simulate_occurrences(df = species_dataset_df) sim_occ_nested ## Example with deviating column names # Specify dataframe for 3 species with custom function arguments species_dataset_df2 <- species_dataset_df %>% rename(polygon = species_range, sd = sd_step) # Create named list for argument conversion arg_conv_list <- list( species_range = "polygon", sd_step = "sd" ) # Simulate occurrences map_simulate_occurrences( df = species_dataset_df2, arg_list = arg_conv_list)
This function executes a cube simulation function (simulate_occurrences()
,
sample_observations()
, filter_observations()
,
add_coordinate_uncertainty()
, or grid_designation()
) over multiple rows
of a dataframe with potentially different function arguments over multiple
columns.
map_simulation_functions(f, df, nested = TRUE)
f |
One of five cube simulation functions: simulate_occurrences(), sample_observations(), filter_observations(), add_coordinate_uncertainty(), or grid_designation(). |
df |
A dataframe containing multiple rows, each representing a
different species. The columns are function arguments with values used for
mapping the function specified in f. |
nested |
Logical. If TRUE (the default), the output is returned as a nested dataframe with a list-column; if FALSE, this list-column is expanded into additional rows and columns (see Value). |
In case of nested = TRUE, a dataframe identical to df, with an extra list-column called mapped_col containing an sf object for each row computed by the function specified in f. In case of nested = FALSE, this list-column is expanded into additional rows and columns.
# Load packages library(sf) library(dplyr) # Create polygon plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2)))) ## Example with simple column names # Specify dataframe for 3 species with custom function arguments species_dataset_df <- tibble( taxonID = c("species1", "species2", "species3"), species_range = rep(list(plgn), 3), initial_average_occurrences = c(50, 100, 500), n_time_points = rep(6, 3), temporal_function = c(simulate_random_walk, simulate_random_walk, NA), sd_step = c(1, 1, NA), spatial_pattern = "random", seed = 123) # Simulate occurrences sim_occ_raw <- map_simulation_functions( f = simulate_occurrences, df = species_dataset_df) sim_occ_raw # Unnest output and create sf object sim_occ_raw_unnested <- map_simulation_functions( f = simulate_occurrences, df = species_dataset_df, nested = FALSE) sim_occ_raw_unnested %>% st_sf()
This function samples a new observation point of a species within the uncertainty circle around each observation, assuming a bivariate Normal distribution.
sample_from_binormal_circle(observations, p_norm = 0.95, seed = NA)
observations |
An sf object with POINT geometry and a coordinateUncertaintyInMeters column. |
p_norm |
A numeric value between 0 and 1. The proportion of all possible samples from a bivariate Normal distribution that fall within the uncertainty circle. Default is 0.95. See Details. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
A new observation point is sampled from a bivariate Normal distribution with means equal to the X and Y coordinates of its original observation point and variances equal to (-coordinateUncertaintyInMeters^2) / (2 * log(1 - p_norm)), ensuring that a proportion p_norm of all possible samples falls within the uncertainty circle.
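This variance formula can be checked with a quick base R simulation (an illustration only, using no gcube functions): with this variance, about a proportion p_norm of bivariate Normal samples fall within a circle of radius coordinateUncertaintyInMeters.

```r
# Sanity check of the variance formula (illustration only)
u <- 100        # coordinateUncertaintyInMeters
p_norm <- 0.95
sigma <- sqrt(-u^2 / (2 * log(1 - p_norm)))

set.seed(123)
dx <- rnorm(1e5, mean = 0, sd = sigma)
dy <- rnorm(1e5, mean = 0, sd = sigma)

# Proportion of sampled points within the uncertainty circle
mean(sqrt(dx^2 + dy^2) <= u)  # close to 0.95
```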
An sf object with POINT geometry containing the locations of the
sampled occurrences and a coordinateUncertaintyInMeters
column containing
the coordinate uncertainty for each observation.
Other designation:
sample_from_uniform_circle()
library(sf) library(dplyr) # Create four random points n_points <- 4 xlim <- c(3841000, 3842000) ylim <- c(3110000, 3112000) coordinate_uncertainty <- rgamma(n_points, shape = 5, rate = 0.1) observations_sf <- data.frame( lat = runif(n_points, ylim[1], ylim[2]), long = runif(n_points, xlim[1], xlim[2]), time_point = 1, coordinateUncertaintyInMeters = coordinate_uncertainty ) %>% st_as_sf(coords = c("long", "lat"), crs = 3035) # Sample points within uncertainty circles according to normal rules sample_from_binormal_circle( observations = observations_sf, p_norm = 0.95, seed = 123 )
This function samples a new observation point of a species within the uncertainty circle around each observation, assuming a Uniform distribution.
sample_from_uniform_circle(observations, seed = NA)
observations |
An sf object with POINT geometry and a coordinateUncertaintyInMeters column. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
An sf object with POINT geometry containing the locations of the
sampled occurrences and a coordinateUncertaintyInMeters
column containing
the coordinate uncertainty for each observation.
Other designation:
sample_from_binormal_circle()
library(sf) # Create four random points n_points <- 4 xlim <- c(3841000, 3842000) ylim <- c(3110000, 3112000) coordinate_uncertainty <- rgamma(n_points, shape = 5, rate = 0.1) observations_sf <- data.frame( lat = runif(n_points, ylim[1], ylim[2]), long = runif(n_points, xlim[1], xlim[2]), time_point = 1, coordinateUncertaintyInMeters = coordinate_uncertainty ) %>% st_as_sf(coords = c("long", "lat"), crs = 3035) # Sample points within uncertainty circles according to uniform rules sample_from_uniform_circle( observations = observations_sf, seed = 123 )
The function computes observations from occurrences based on detection probability and sampling bias by implementing a Bernoulli trial.
sample_observations( occurrences, detection_probability = 1, sampling_bias = c("no_bias", "polygon", "manual"), bias_area = NA, bias_strength = 1, bias_weights = NA, seed = NA )
occurrences |
An sf object with POINT geometry representing the occurrences. |
detection_probability |
A numeric value between 0 and 1 representing the probability of detecting the species. |
sampling_bias |
A character string specifying the method to generate a sampling bias. Options are "no_bias" (the default), "polygon", or "manual". |
bias_area |
An sf object with POLYGON geometry specifying the area in which sampling is biased, or NA (the default). Used when sampling_bias = "polygon". |
bias_strength |
A positive numeric value (default 1) giving the strength of the sampling bias within bias_area. Used when sampling_bias = "polygon". |
bias_weights |
A grid layer (an sf object with POLYGON geometry) containing a column bias_weight with sampling weights, or NA (the default). Used when sampling_bias = "manual". |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
An sf object with POINT geometry containing the locations of the occurrences with their detection status. The object includes the following columns:
detection_probability: The detection probability for each occurrence (will be the same for all).
bias_weight: The sampling probability based on sampling bias for each occurrence.
sampling_probability: The combined sampling probability from detection probability and sampling bias for each occurrence.
sampling_status: Indicates whether the occurrence was detected ("detected") or not ("undetected"). Detected occurrences are called observations.
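The Bernoulli trial can be sketched in base R as follows (an illustration, not the package implementation; combining detection probability and bias weight by multiplication is an assumption for this sketch):

```r
# Illustrative Bernoulli trial per occurrence (not the gcube implementation)
detection_probability <- 0.8
bias_weight <- c(1, 1, 0.5, 0.2)  # e.g. lower weights outside a bias polygon
sampling_probability <- detection_probability * bias_weight

set.seed(123)
detected <- rbinom(length(sampling_probability), size = 1,
                   prob = sampling_probability)
sampling_status <- ifelse(detected == 1, "detected", "undetected")
sampling_status
```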
Other main:
add_coordinate_uncertainty()
,
filter_observations()
,
grid_designation()
,
simulate_occurrences()
# Load packages library(sf) library(dplyr) # Simulate some occurrence data with coordinates and time points num_points <- 10 occurrences <- data.frame( lon = runif(num_points, min = -180, max = 180), lat = runif(num_points, min = -90, max = 90), time_point = 0 ) # Convert the occurrence data to an sf object occurrences_sf <- st_as_sf(occurrences, coords = c("lon", "lat")) # 1. Sample observations without sampling bias sample_observations( occurrences_sf, detection_probability = 0.8, sampling_bias = "no_bias", seed = 123 ) # 2. Sample observations with sampling bias in a polygon # Create bias_area polygon overlapping two of the points selected_observations <- st_union(occurrences_sf[2:3,]) bias_area <- st_convex_hull(selected_observations) %>% st_buffer(dist = 50) %>% st_as_sf() sample_observations( occurrences_sf, detection_probability = 0.8, sampling_bias = "polygon", bias_area = bias_area, bias_strength = 2, seed = 123 ) # 3. Sample observations with sampling bias given manually in a grid # Create raster grid with bias weights between 0 and 1 grid <- st_make_grid(occurrences_sf) %>% st_sf() %>% mutate(bias_weight = runif(n(), min = 0, max = 1)) sample_observations( occurrences_sf, detection_probability = 0.8, sampling_bias = "manual", bias_weights = grid, seed = 123 )
This function draws point occurrences from a spatial random field represented by a raster. Points are sampled based on the values in the raster, with the number of occurrences specified for each time step.
sample_occurrences_from_raster(raster, time_series, seed = NA)
raster |
A SpatRaster object (see terra::rast()). |
time_series |
A vector with the number of occurrences per time point. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
An sf object with POINT geometry containing the locations of the
simulated occurrences, a time_point
column indicating the associated
time point for each occurrence and columns used as weights for sampling.
If the raster is created with create_spatial_pattern()
, the column
sampling_p1
is used.
Other occurrence:
create_spatial_pattern()
,
simulate_random_walk()
,
simulate_timeseries()
# Load packages library(sf) library(ggplot2) library(tidyterra) # Create polygon plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2)))) ## Medium scale clustering # Create the random field rs_pattern_clustered <- create_spatial_pattern( polygon = plgn, resolution = 0.1, spatial_pattern = "clustered", seed = 123) # Sample 200 occurrences from random field pts_occ_clustered <- sample_occurrences_from_raster( raster = rs_pattern_clustered, time_series = 200, seed = 123) ggplot() + geom_spatraster(data = rs_pattern_clustered) + geom_sf(data = pts_occ_clustered) + scale_fill_continuous(type = "viridis") + theme_minimal() ## Large scale clustering # Create the random field rs_pattern_large <- create_spatial_pattern( polygon = plgn, resolution = 0.1, spatial_pattern = 100, seed = 123) # Sample 200 occurrences from random field pts_occ_large <- sample_occurrences_from_raster( raster = rs_pattern_large, time_series = 200, seed = 123) ggplot() + geom_spatraster(data = rs_pattern_large) + geom_sf(data = pts_occ_large) + scale_fill_continuous(type = "viridis") + theme_minimal()
This function simulates occurrences of a species within a specified spatial and/or temporal extent.
simulate_occurrences( species_range, initial_average_occurrences = 50, spatial_pattern = c("random", "clustered"), n_time_points = 1, temporal_function = NA, ..., seed = NA )
species_range |
An sf object with POLYGON geometry indicating the spatial extent within which to simulate occurrences. |
initial_average_occurrences |
A positive numeric value indicating the
average number of occurrences to be simulated within the extent of species_range. |
spatial_pattern |
Specifies the spatial pattern of occurrences. It can be a character string ("random", the default, or "clustered") or a numeric value, where higher values result in larger-scale clustering (see the Examples). |
n_time_points |
A positive integer specifying the number of time points to simulate. |
temporal_function |
A function generating a trend in number of
occurrences over time, or NA (the default). |
... |
Additional arguments to be passed to temporal_function. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
An sf object with POINT geometry containing the locations of the
simulated occurrences, a time_point
column indicating the associated
time point for each occurrence and a sampling_p1
column indicating the
sampling probability associated with the spatial pattern (see
create_spatial_pattern()
).
Other main:
add_coordinate_uncertainty()
,
filter_observations()
,
grid_designation()
,
sample_observations()
# Load packages library(sf) library(ggplot2) # Create polygon plgn <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2)))) # 1. Random spatial pattern with 4 time points occ_sf <- simulate_occurrences( species_range = plgn, n_time_points = 4, initial_average_occurrences = 100, seed = 123) ggplot() + geom_sf(data = occ_sf) + geom_sf(data = plgn, fill = NA) + facet_wrap("time_point") + labs( title = "Occurrences with random spatial and temporal pattern", subtitle = "4 time steps") + theme_minimal() # 2. Highly clustered spatial pattern with 6 time points occ_sf_100 <- simulate_occurrences( species_range = plgn, spatial_pattern = 100, n_time_points = 6, initial_average_occurrences = 100, seed = 123) ggplot() + geom_sf(data = occ_sf_100) + geom_sf(data = plgn, fill = NA) + facet_wrap("time_point") + labs( title = "Occurrences with structured spatial and temporal pattern", subtitle = "6 time steps") + theme_minimal()
This function simulates a timeseries for the average number of occurrences of a species using a random walk over time.
simulate_random_walk( initial_average_occurrences = 50, n_time_points = 10, sd_step = 0.05, seed = NA )
initial_average_occurrences |
A positive numeric value indicating the average number of occurrences to be simulated at the first time point. |
n_time_points |
A positive integer specifying the number of time points to simulate. |
sd_step |
A positive numeric value indicating the standard deviation of the random steps. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
A vector of integers of length n_time_points
with the average
number of occurrences.
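The random-walk mechanism can be sketched in base R as cumulative Normal steps added to the initial average (an illustration, not the package implementation):

```r
# Illustrative random walk in base R (not the gcube implementation)
initial_average_occurrences <- 50
n_time_points <- 10
sd_step <- 1

set.seed(123)
steps <- rnorm(n_time_points - 1, mean = 0, sd = sd_step)
lambdas <- cumsum(c(initial_average_occurrences, steps))
pmax(round(lambdas), 0)  # average occurrences cannot be negative
```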
Other occurrence:
create_spatial_pattern()
,
sample_occurrences_from_raster()
,
simulate_timeseries()
simulate_random_walk( initial_average_occurrences = 50, n_time_points = 10, sd_step = 1, seed = 123 )
This function simulates a timeseries for the number of occurrences of a species.
simulate_timeseries( initial_average_occurrences = 50, n_time_points = 1, temporal_function = NA, ..., seed = NA )
initial_average_occurrences |
A positive numeric value indicating the average number of occurrences to be simulated at the first time point. This value serves as the mean (lambda) of a Poisson distribution. |
n_time_points |
A positive integer specifying the number of time points to simulate. |
temporal_function |
A function generating a trend in number of
occurrences over time, or NA (the default). |
... |
Additional arguments to be passed to temporal_function. |
seed |
A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (the default), no seed is used. |
A vector of integers of length n_time_points
with the number of
occurrences.
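The final step can be sketched in base R: the average occurrences serve as Poisson means (lambdas) from which the counts per time point are drawn (an illustration, not the package implementation; the lambda values below are made up):

```r
# Illustrative Poisson draw in base R (not the gcube implementation)
set.seed(123)
lambdas <- c(50, 52, 49, 55, 60)  # hypothetical averages over 5 time points
rpois(length(lambdas), lambda = lambdas)
```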
Other occurrence:
create_spatial_pattern()
,
sample_occurrences_from_raster()
,
simulate_random_walk()
# 1. Use the function simulate_random_walk() simulate_timeseries( initial_average_occurrences = 50, n_time_points = 10, temporal_function = simulate_random_walk, sd_step = 1, seed = 123 ) # 2. Using your own custom function, e.g. this linear function my_own_linear_function <- function( initial_average_occurrences = initial_average_occurrences, n_time_points = n_time_points, coef) { # Calculate new average abundances over time time <- seq_len(n_time_points) - 1 lambdas <- initial_average_occurrences + (coef * time) # Identify where the lambda values become 0 or lower zero_or_lower_index <- which(lambdas <= 0) # If any lambda becomes 0 or lower, set all subsequent lambdas to 0 if (length(zero_or_lower_index) > 0) { zero_or_lower_indices <- zero_or_lower_index[1]:n_time_points lambdas[zero_or_lower_indices] <- 0 } # Return average abundances return(lambdas) } # Draw n_sim number of occurrences from Poisson distribution using # the custom function n_sim <- 10 n_time_points <- 50 slope <- 1 list_abundances <- vector("list", length = n_sim) # Loop n_sim times over simulate_timeseries() for (i in seq_len(n_sim)) { abundances <- simulate_timeseries( initial_average_occurrences = 50, n_time_points = n_time_points, temporal_function = my_own_linear_function, coef = slope ) list_abundances[[i]] <- data.frame( time = seq_along(abundances), abundance = abundances, sim = i ) } # Combine list of dataframes data_abundances <- do.call(rbind.data.frame, list_abundances) # Plot the simulated abundances over time using ggplot2 library(ggplot2) ggplot(data_abundances, aes(x = time, y = abundance, colour = factor(sim))) + geom_line() + labs( x = "Time", y = "Species abundance", title = paste( n_sim, "simulated trends using custom linear function", "with slope", slope ) ) + scale_y_continuous(limits = c(0, NA)) + scale_x_continuous(breaks = seq(0, n_time_points, 5)) + theme_minimal() + theme(legend.position = "")