| Title: | Workflow to visualise trait dispersion and assess invasibility |
|---|---|
| Description: | Utilities for assembling invader–resident interaction tensors, computing trait-based competition kernels from distance matrices, deriving environmental suitability kernels, calculating invasion fitness, and summarising trait dispersion metrics. Includes helpers to predict site × species responses for residents and simulated invaders. |
| Authors: | Sandra MacFadyen [aut, cre] |
| Maintainer: | Sandra MacFadyen <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-31 07:14:38 UTC |
| Source: | https://github.com/b-cubed-eu/invasimapr |
assemble_matrices() creates the standard inputs used throughout the
invasion-fitness pipeline from either a single long-format table or a set of
pre-assembled tables. It returns aligned objects including:
site_df: site by x,y coordinates
env_df: site by environment numeric matrix
comm_res: site by resident abundance matrix
pa_res: site by resident presence-absence matrix
traits_res: resident by trait data frame (mixed types allowed)
assemble_matrices( long_df = NULL, site_df = NULL, env_df = NULL, comm_res = NULL, traits_res = NULL, comm_long = c("auto", "long", "wide"), site_col = "site", x_col = "x", y_col = "y", species_col = "species", count_col = "count", env_cols = NULL, env_prefix = "^env", trait_cols = NULL, trait_prefix = "^trait", drop_empty_sites = TRUE, drop_empty_species = TRUE, return_diversity = TRUE, make_plots = FALSE )assemble_matrices( long_df = NULL, site_df = NULL, env_df = NULL, comm_res = NULL, traits_res = NULL, comm_long = c("auto", "long", "wide"), site_col = "site", x_col = "x", y_col = "y", species_col = "species", count_col = "count", env_cols = NULL, env_prefix = "^env", trait_cols = NULL, trait_prefix = "^trait", drop_empty_sites = TRUE, drop_empty_species = TRUE, return_diversity = TRUE, make_plots = FALSE )
long_df |
Optional long-format data frame with at least the columns
|
site_df |
Optional data frame with columns |
env_df |
Optional site by environment numeric data frame or matrix. Row names must correspond to site identifiers. |
comm_res |
Optional site by resident numeric matrix or data frame (wide),
or a long-format table with columns |
traits_res |
Optional resident by trait data frame with row names corresponding to species identifiers. |
comm_long |
Character string indicating how to interpret a separately
supplied |
site_col, x_col, y_col, species_col, count_col
|
Column names used when
building from |
env_cols |
Character vector of environment column names to extract. |
env_prefix |
Regular expression used to select environment columns. |
trait_cols |
Character vector of trait column names to extract. |
trait_prefix |
Regular expression used to select trait columns. |
drop_empty_sites |
Logical. If |
drop_empty_species |
Logical. If |
return_diversity |
Logical. If |
make_plots |
Logical. If |
Build standardized invader predictors r_is_z, C_is_z, S_is_z (expects env/traits frames already projected to model columns)
build_invader_predictors( fit_r, env_df_model, traits_inv_model, sites, inv_ids, r_mu_s, r_sd_s, W_site, gower_all, res_ids, sigma_alpha, C_mu_s, C_sd_s, S_s_z, verbose = TRUE )build_invader_predictors( fit_r, env_df_model, traits_inv_model, sites, inv_ids, r_mu_s, r_sd_s, W_site, gower_all, res_ids, sigma_alpha, C_mu_s, C_sd_s, S_s_z, verbose = TRUE )
fit_r |
Fitted residents-only |
env_df_model |
Data frame of environment predictors used by the model
(rows = sites). Either |
traits_inv_model |
Data frame of invader traits used by the model
(rows = invaders). Includes |
sites |
Character vector of site identifiers (must match row names of
|
inv_ids |
Character vector of invader identifiers (must match row names
of |
r_mu_s |
Numeric vector of site-wise means used to standardize abiotic
suitability ( |
r_sd_s |
Numeric vector of site-wise standard deviations used to
standardize abiotic suitability ( |
W_site |
Site × resident weighting matrix (e.g. relative abundance or presence–absence weights). |
gower_all |
Square matrix or |
res_ids |
Character vector of resident species identifiers. |
sigma_alpha |
Numeric kernel bandwidth for crowding effects. If |
C_mu_s |
Numeric vector of site-wise means used to standardize crowding
( |
C_sd_s |
Numeric vector of site-wise standard deviations used to
standardize crowding ( |
S_s_z |
Numeric vector of standardized site saturation values. |
verbose |
Logical; if |
build_model_formula() assembles a right-hand side (RHS) for a GLMM (or LM/GLM)
from environment terms, trait terms, and (optionally) all pairwise
environment-by-trait interactions. It also appends random-effects structures,
such as (1 | site) + (1 | species) and optional zero-correlation random
slopes like (0 + r_z || site).
You can pass the terms directly as character vectors, or let the function
derive them from env_df and/or trait_df column names.
build_model_formula( response = "abundance", env_terms = NULL, trait_terms = NULL, env_df = NULL, trait_df = NULL, include_intercept = TRUE, include_env_main = TRUE, include_trait_main = TRUE, include_env_trait_interactions = TRUE, extra_fixed = NULL, random_intercepts = c("site", "species"), random_slopes = NULL, backend = c("glmmTMB", "lme4"), verbose = FALSE )build_model_formula( response = "abundance", env_terms = NULL, trait_terms = NULL, env_df = NULL, trait_df = NULL, include_intercept = TRUE, include_env_main = TRUE, include_trait_main = TRUE, include_env_trait_interactions = TRUE, extra_fixed = NULL, random_intercepts = c("site", "species"), random_slopes = NULL, backend = c("glmmTMB", "lme4"), verbose = FALSE )
response |
Character scalar. Name of the response on the LHS
(default |
env_terms |
Optional character vector of environment term names to include
as fixed effects. If |
trait_terms |
Optional character vector of trait term names to include
as fixed effects. If |
env_df |
Optional data frame containing environment predictors; used only
to infer |
trait_df |
Optional data frame containing trait predictors; used only
to infer |
include_intercept |
Logical. Include the fixed-effect intercept?
If |
include_env_main |
Logical. Include environment main effects?
Default |
include_trait_main |
Logical. Include trait main effects?
Default |
include_env_trait_interactions |
Logical. Include all pairwise
environment-by-trait interactions? Implemented as
|
extra_fixed |
Optional character vector of additional fixed-effect terms
to append verbatim (e.g., |
random_intercepts |
Character vector of grouping factors for random
intercepts (e.g., |
random_slopes |
Named list of the form
|
backend |
Character flag used only for messaging; both lme4
and glmmTMB accept the same syntax here. Default |
verbose |
Logical. If |
Build a GLMM-ready model formula from trait and environment terms
An object of class formula, e.g.:
abundance ~ env1 + env2 + tr1 + tr2 + (env1 + env2):(tr1 + tr2) + (1 | site) + (1 | species) + (0 + r_z || site)
## Not run: # Toy data set.seed(1) env_df_z = data.frame(env1 = rnorm(10), env2 = rnorm(10)) traits_res_glmm = data.frame(tr1 = rnorm(5), tr2 = rnorm(5)) fml = build_model_formula( response = "abundance", env_df = env_df_z, trait_df = traits_res_glmm, random_intercepts = c("site","species"), random_slopes = list(site = c("r_z","C_z")) ) fml # Fit a GLMM (example; requires your long residents×sites table `dat_r`) # library(glmmTMB) # fit = glmmTMB::glmmTMB(fml, family = glmmTMB::tweedie(link="log"), data = dat_r) # summary(fit) ## End(Not run)## Not run: # Toy data set.seed(1) env_df_z = data.frame(env1 = rnorm(10), env2 = rnorm(10)) traits_res_glmm = data.frame(tr1 = rnorm(5), tr2 = rnorm(5)) fml = build_model_formula( response = "abundance", env_df = env_df_z, trait_df = traits_res_glmm, random_intercepts = c("site","species"), random_slopes = list(site = c("r_z","C_z")) ) fml # Fit a GLMM (example; requires your long residents×sites table `dat_r`) # library(glmmTMB) # fit = glmmTMB::glmmTMB(fml, family = glmmTMB::tweedie(link="log"), data = dat_r) # summary(fit) ## End(Not run)
Given 2D trait coordinates for residents and invaders (typically PCoA scores), this function:
estimates a robust centre and covariance for residents (MCD via MASS::cov.rob, fallback to classical),
computes Mahalanobis distances for residents and invaders,
converts distances to centrality using the resident distance CDF
(centrality = 1 - F(d); 1 = core, 0 = peripheral),
determines whether invaders sit inside the resident convex hull (familiar) or outside (novel),
returns a tidy table and three ggplot figures.
compute_centrality_hull( Q_res, Q_inv, ellipse_level = 0.5, point_size = 2.8, alpha = 0.95, stroke = 0.7, rank_by = c("centrality", "distance"), peripheral_first = TRUE, palette = "viridis" )compute_centrality_hull( Q_res, Q_inv, ellipse_level = 0.5, point_size = 2.8, alpha = 0.95, stroke = 0.7, rank_by = c("centrality", "distance"), peripheral_first = TRUE, palette = "viridis" )
Q_res |
Data frame with resident coordinates; must contain columns |
Q_inv |
Data frame with invader coordinates; must contain |
ellipse_level |
Numeric in (0,1); confidence level for the resident normal ellipse
in the trait-plane plot. Default |
point_size |
Numeric; point size in the trait-plane plot. Default |
alpha |
Numeric in (0,1]; point alpha. Default |
stroke |
Numeric; outline stroke width for points. Default |
rank_by |
Character; one of |
peripheral_first |
Logical; if |
palette |
Centrality fill palette; passed to |
Centrality and convex-hull membership in trait space
Mahalanobis distances may fail if the covariance is nearly singular; a tiny ridge is added
internally if needed. Hull membership is computed with sp when available,
otherwise with sf if available; if neither is installed and residents < 3,
hull membership is returned as NA.
A list with:
df — tidy table with columns:
id, grp ("resident"|"invader"), tr1, tr2,
d_md (Mahalanobis), d_eu (Euclidean), centrality (0–1),
in_hull (logical for invaders; residents = TRUE).
center, cov — robust centre and covariance used.
hull_df — closed ring (tr1,tr2) of resident convex hull, or NULL if <3 residents.
plots — list with:
p_trait (trait-plane scatter),
p_dist (distance distributions),
p_rank (invader ranking with hull flags; NULL if no invaders).
## Not run: # Assume you already computed PCoA coordinates: # Q_res, Q_inv each with columns tr1, tr2 and rownames as IDs. out = compute_centrality_hull(Q_res, Q_inv) head(out$df) out$plots$p_trait out$plots$p_dist out$plots$p_rank ## End(Not run)## Not run: # Assume you already computed PCoA coordinates: # Q_res, Q_inv each with columns tr1, tr2 and rownames as IDs. out = compute_centrality_hull(Q_res, Q_inv) head(out$df) out$plots$p_trait out$plots$p_dist out$plots$p_rank ## End(Not run)
Invasion fitness integrates trait-space geometry
(distances, overlaps, convex hulls, centroids) with abiotic suitability
(alignment of invader traits to the local environment), niche crowding
(overlap with resident trait space weighted by composition), and resident
competition (site saturation).
compute_establishment_probability() maps to
probabilities of establishment using a unified interface:
Probit: , where is a
scalar, the residual standard deviation from a fitted auxiliary model,
or a cell-wise predictive standard deviation.
Logistic: , where
is a scale parameter.
Hard rule: , yielding a binary map.
If lambda_is is not supplied, the function builds it from standardized
predictors using
.
compute_establishment_probability( lambda_is = NULL, r_is_z = NULL, C_is_z = NULL, S_is_z = NULL, gamma = 1, alpha = NULL, beta = NULL, kappa = 0, method = c("probit", "logit", "hard"), sigma = NULL, tau = 1, fit = NULL, predictive = FALSE, sigma_mat = NULL, use_vcov = FALSE, Q_inv = NULL, site_df = NULL, return_long = TRUE, make_plots = TRUE, option_label = NULL )compute_establishment_probability( lambda_is = NULL, r_is_z = NULL, C_is_z = NULL, S_is_z = NULL, gamma = 1, alpha = NULL, beta = NULL, kappa = 0, method = c("probit", "logit", "hard"), sigma = NULL, tau = 1, fit = NULL, predictive = FALSE, sigma_mat = NULL, use_vcov = FALSE, Q_inv = NULL, site_df = NULL, return_long = TRUE, make_plots = TRUE, option_label = NULL )
lambda_is |
Optional matrix of invasion fitness values with dimensions
|
r_is_z, C_is_z, S_is_z
|
Optional matrices of standardized abiotic suitability,
niche crowding, and saturation. Used only when |
gamma |
Optional vector of length |
alpha |
Optional vector or matrix of crowding penalties. By convention, these values are constrained to be non-negative. |
beta |
Optional vector of saturation penalties. Signed values allow facilitation. |
kappa |
Optional scalar offset added to invasion fitness (default 0). |
method |
Character string; one of |
sigma |
Numeric scalar standard deviation for the probit transform. |
tau |
Numeric scalar scale parameter for the logistic transform. |
fit |
Optional fitted model object used to obtain a residual standard
deviation when |
predictive |
Logical; if |
sigma_mat |
Optional matrix of predictive standard deviations with the
same dimensions as |
use_vcov |
Logical; if |
Q_inv |
Optional data frame of invader trait scores with columns
|
site_df |
Optional site metadata table with columns |
return_long |
Logical; if |
make_plots |
Logical; if |
option_label |
Optional label attached to the output. |
Convert invasion fitness to probabilistic establishment (probit, logit, hard)
For the probit method, probabilities are computed as
. A scalar or cell-wise standard
deviation may be used.
For the logistic method, probabilities are computed as
.
The hard rule returns a binary indicator equal to one when
.
A list with components:
p_is: matrix of establishment probabilities
lambda_is: invasion fitness matrix
sigma_used: standard deviation used by the probit transform
method: transformation method
option_label: label used for summaries
prob_long: long-format table (optional)
plots: list of plots (optional)
## Minimal example (toy shapes) set.seed(1) S = 6; I = 4 sites = paste0("s", 1:S) inv = paste0("i", 1:I) r_is_z = matrix(rnorm(S*I), S, I, dimnames=list(sites, inv)) C_is_z = matrix(rnorm(S*I), S, I, dimnames=dimnames(r_is_z)) S_is_z = matrix(rep(scale(rnorm(S)), each=I), S, I, dimnames=dimnames(r_is_z)) gamma = setNames(runif(I, 0.5, 1.2), inv) alpha = setNames(runif(I, 0.2, 1.0), inv) beta = setNames(runif(I, 0.1, 0.6), inv) # Build lambda internally, then get logistic probabilities out_logit = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="logit", tau=1, return_long=FALSE, make_plots=FALSE ) str(out_logit$p_is) # Probit with a scalar sigma out_probit = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="probit", sigma=1 ) # View site-mean probability map (requires ggplot2) if (requireNamespace("ggplot2", quietly=TRUE)) print(out_probit$plots$site_mean) # Hard rule (lambda>0) out_hard = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="hard", return_long=TRUE, make_plots=FALSE ) table(out_hard$prob_long$val) # 0/1 counts## Minimal example (toy shapes) set.seed(1) S = 6; I = 4 sites = paste0("s", 1:S) inv = paste0("i", 1:I) r_is_z = matrix(rnorm(S*I), S, I, dimnames=list(sites, inv)) C_is_z = matrix(rnorm(S*I), S, I, dimnames=dimnames(r_is_z)) S_is_z = matrix(rep(scale(rnorm(S)), each=I), S, I, dimnames=dimnames(r_is_z)) gamma = setNames(runif(I, 0.5, 1.2), inv) alpha = setNames(runif(I, 0.2, 1.0), inv) beta = setNames(runif(I, 0.1, 0.6), inv) # Build lambda internally, then get logistic probabilities out_logit = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="logit", tau=1, return_long=FALSE, make_plots=FALSE ) str(out_logit$p_is) # Probit with a scalar sigma out_probit = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="probit", sigma=1 ) # View site-mean probability map (requires ggplot2) if (requireNamespace("ggplot2", quietly=TRUE)) print(out_probit$plots$site_mean) # Hard rule (lambda>0) out_hard = compute_establishment_probability( r_is_z=r_is_z, C_is_z=C_is_z, S_is_z=S_is_z, gamma=gamma, alpha=alpha, beta=beta, method="hard", return_long=TRUE, make_plots=FALSE ) table(out_hard$prob_long$val) # 0/1 counts
for multiple model optionscompute_invasion_fitness() evaluates the invasion-fitness surface
over sites () and invaders () using three
standardized predictors:
Abiotic suitability (alignment between invader traits
and the local environment),
Niche crowding (overlap with resident trait space,
weighted by composition),
Resident competition (site-level saturation).
Five model variants are supported:
Option A:
Option B: global slope
Option C: invader-specific slopes
Option D: site-varying slopes and penalties
Option E: signed saturation effect
Optionally, an offset can be calibrated so that the mean resident
invasion fitness is approximately zero when resident standardized matrices
are supplied.
compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = c("A", "B", "C", "D", "E"), alpha_i = NULL, beta_i = NULL, theta0 = 1, theta_i = NULL, Gamma_is = NULL, alpha_is = NULL, beta_signed_i = NULL, calibrate_kappa = FALSE, r_js_z = NULL, C_js_z = NULL, S_js_z = NULL, Q_res = NULL, a0 = NULL, a1 = NULL, a2 = NULL, b0 = NULL, b1 = NULL, b2 = NULL, site_df = NULL, return_long = TRUE, label = NULL )compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = c("A", "B", "C", "D", "E"), alpha_i = NULL, beta_i = NULL, theta0 = 1, theta_i = NULL, Gamma_is = NULL, alpha_is = NULL, beta_signed_i = NULL, calibrate_kappa = FALSE, r_js_z = NULL, C_js_z = NULL, S_js_z = NULL, Q_res = NULL, a0 = NULL, a1 = NULL, a2 = NULL, b0 = NULL, b1 = NULL, b2 = NULL, site_df = NULL, return_long = TRUE, label = NULL )
r_is_z |
Matrix of standardized abiotic suitability with dimensions
|
C_is_z |
Matrix of standardized niche crowding with dimensions
|
S_is_z |
Matrix of standardized site saturation with dimensions
|
option |
Character string specifying the model option. One of
|
alpha_i |
Optional named vector of invader-level crowding sensitivities. |
beta_i |
Optional named vector of invader-level saturation sensitivities. |
theta0 |
Global abiotic slope used in options A, B, and E. |
theta_i |
Optional invader-specific abiotic slopes used in option C. |
Gamma_is |
Optional site by invader matrix of abiotic slopes used in option D. |
alpha_is |
Optional site by invader matrix of crowding penalties used in option D. |
beta_signed_i |
Optional signed saturation sensitivities used in option E. |
calibrate_kappa |
Logical; if |
r_js_z, C_js_z, S_js_z
|
Optional resident standardized matrices required
when |
Q_res |
Optional data frame of resident trait-plane scores. |
a0, a1, a2, b0, b1, b2
|
Optional numeric coefficients used to derive resident
analog slopes when calibrating |
site_df |
Optional site metadata table with columns |
return_long |
Logical; if |
label |
Optional character label attached to the output. |
Compute invasion fitness with multiple model options
The invasion fitness is computed as:
Option A:
Option B:
Option C:
Option D:
Option E:
When calibrate_kappa = TRUE, the offset is chosen so that the mean
resident invasion fitness equals zero.
A list containing:
lambda_is: invasion fitness matrix
GI: abiotic slope used
AI: crowding penalties used
BI: saturation penalties used
kappa: calibration offset
option: option label
lambda_long: tidy table (optional)
## --------------------------------------------------------------- ## Minimal reproducible example (toy dimensions, no real ecology) ## --------------------------------------------------------------- ## S = number of sites, I = number of invaders S = 5 I = 3 set.seed(1) ## r_is_z : intrinsic growth component ## rows = sites (s), columns = invaders (i) r_is_z = matrix( rnorm(S * I), nrow = S, ncol = I, dimnames = list(paste0("s", 1:S), paste0("i", 1:I)) ) ## C_is_z : crowding / competition component (same shape as r_is_z) C_is_z = matrix( rnorm(S * I), nrow = S, ncol = I, dimnames = dimnames(r_is_z) ) ## S_is_z : site-only environmental gradient ## generated per-site, then broadcast across invaders S_is_z = matrix( rep(scale(rnorm(S)), each = I), nrow = S, ncol = I, dimnames = dimnames(r_is_z) ) ## Invader-specific baseline parameters alpha_i = setNames(runif(I, 0.2, 1.0), colnames(r_is_z)) # baseline crowding sensitivity beta_i = setNames(runif(I, 0.1, 0.5), colnames(r_is_z)) # strength of S effect ## --------------------------------------------------------------- ## Option A: fixed gamma = 1, no S effect (kappa = 0) ## --------------------------------------------------------------- outA = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "A", alpha_i = alpha_i, beta_i = beta_i, theta0 = 1, return_long = FALSE ) ## --------------------------------------------------------------- ## Option B: gamma shared across invaders (gamma = theta_0) ## --------------------------------------------------------------- outB = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "B", alpha_i = alpha_i, beta_i = beta_i, theta0 = 0.8, return_long = FALSE ) ## --------------------------------------------------------------- ## Option C: invader-specific gamma_i ## --------------------------------------------------------------- theta_i = setNames( runif(I, 0.5, 1.2), colnames(r_is_z) ) outC = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "C", alpha_i = alpha_i, beta_i = beta_i, theta_i = theta_i, return_long = FALSE ) ## --------------------------------------------------------------- ## Option D: fully site-varying Gamma_is and alpha_is ## --------------------------------------------------------------- ## gamma_is : site x invader matrix of density-dependence scalars ## constructed by repeating gamma_i across sites Gamma_is = matrix( rep(theta_i, each = nrow(r_is_z)), nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) ## alpha_is : site x invader crowding sensitivity ## start from invader-specific alpha_i and allow small site-level variation alpha_is = matrix( NA_real_, nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) for (j in seq_len(ncol(r_is_z))) { alpha_is[, j] = alpha_i[j] } ## Add site-level noise and enforce positivity. ## pmax() drops matrix attributes, so we re-wrap explicitly. alpha_is = { tmp = pmax( 0, alpha_is + matrix( rnorm(length(alpha_is), 0, 0.1), nrow = nrow(r_is_z), ncol = ncol(r_is_z) ) ) matrix( tmp, nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) } outD = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "D", alpha_is = alpha_is, beta_i = beta_i, Gamma_is = Gamma_is, return_long = FALSE ) ## --------------------------------------------------------------- ## Option E: signed S effect (positive or negative beta_i) ## --------------------------------------------------------------- beta_signed_i = setNames( rnorm(I, 0, 0.3), colnames(r_is_z) ) outE = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "E", alpha_i = alpha_i, beta_signed_i = beta_signed_i, theta0 = 1, return_long = FALSE )## --------------------------------------------------------------- ## Minimal reproducible example (toy dimensions, no real ecology) ## --------------------------------------------------------------- ## S = number of sites, I = number of invaders S = 5 I = 3 set.seed(1) ## r_is_z : intrinsic growth component ## rows = sites (s), columns = invaders (i) r_is_z = matrix( rnorm(S * I), nrow = S, ncol = I, dimnames = list(paste0("s", 1:S), paste0("i", 1:I)) ) ## C_is_z : crowding / competition component (same shape as r_is_z) C_is_z = matrix( rnorm(S * I), nrow = S, ncol = I, dimnames = dimnames(r_is_z) ) ## S_is_z : site-only environmental gradient ## generated per-site, then broadcast across invaders S_is_z = matrix( rep(scale(rnorm(S)), each = I), nrow = S, ncol = I, dimnames = dimnames(r_is_z) ) ## Invader-specific baseline parameters alpha_i = setNames(runif(I, 0.2, 1.0), colnames(r_is_z)) # baseline crowding sensitivity beta_i = setNames(runif(I, 0.1, 0.5), colnames(r_is_z)) # strength of S effect ## --------------------------------------------------------------- ## Option A: fixed gamma = 1, no S effect (kappa = 0) ## --------------------------------------------------------------- outA = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "A", alpha_i = alpha_i, beta_i = beta_i, theta0 = 1, return_long = FALSE ) ## --------------------------------------------------------------- ## Option B: gamma shared across invaders (gamma = theta_0) ## --------------------------------------------------------------- outB = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "B", alpha_i = alpha_i, beta_i = beta_i, theta0 = 0.8, return_long = FALSE ) ## --------------------------------------------------------------- ## Option C: invader-specific gamma_i ## --------------------------------------------------------------- theta_i = setNames( runif(I, 0.5, 1.2), colnames(r_is_z) ) outC = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "C", alpha_i = alpha_i, beta_i = beta_i, theta_i = theta_i, return_long = FALSE ) ## --------------------------------------------------------------- ## Option D: fully site-varying Gamma_is and alpha_is ## --------------------------------------------------------------- ## gamma_is : site x invader matrix of density-dependence scalars ## constructed by repeating gamma_i across sites Gamma_is = matrix( rep(theta_i, each = nrow(r_is_z)), nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) ## alpha_is : site x invader crowding sensitivity ## start from invader-specific alpha_i and allow small site-level variation alpha_is = matrix( NA_real_, nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) for (j in seq_len(ncol(r_is_z))) { alpha_is[, j] = alpha_i[j] } ## Add site-level noise and enforce positivity. ## pmax() drops matrix attributes, so we re-wrap explicitly. alpha_is = { tmp = pmax( 0, alpha_is + matrix( rnorm(length(alpha_is), 0, 0.1), nrow = nrow(r_is_z), ncol = ncol(r_is_z) ) ) matrix( tmp, nrow = nrow(r_is_z), ncol = ncol(r_is_z), dimnames = dimnames(r_is_z) ) } outD = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "D", alpha_is = alpha_is, beta_i = beta_i, Gamma_is = Gamma_is, return_long = FALSE ) ## --------------------------------------------------------------- ## Option E: signed S effect (positive or negative beta_i) ## --------------------------------------------------------------- beta_signed_i = setNames( rnorm(I, 0, 0.3), colnames(r_is_z) ) outE = compute_invasion_fitness( r_is_z, C_is_z, S_is_z, option = "E", alpha_i = alpha_i, beta_signed_i = beta_signed_i, theta0 = 1, return_long = FALSE )
via Hellinger composition and Gower–Gaussian trait kernelGiven a site × resident community matrix and a resident traits table, this function:
normalizes site composition with Hellinger transformation (),
computes Gower distances among residents (mixed types supported),
converts distances to a Gaussian similarity kernel with bandwidth ,
returns the resident crowding matrix (sites × residents),
optionally row-z-scores for within-site contrasts,
optionally draws a heatmap of (row-z) and a geographic map of site-level summaries.
compute_resident_crowding( comm_res, traits_res, site_df = NULL, overlay_sf = NULL, method_comp = "hellinger", distance_metric = "gower", sigma_alpha = NULL, row_z = TRUE, quantile_probs = 0.95, do_heatmap = TRUE, show_names = TRUE, cluster_rows = TRUE, cluster_cols = TRUE, heatmap_palette = viridisLite::viridis, do_map = TRUE )compute_resident_crowding( comm_res, traits_res, site_df = NULL, overlay_sf = NULL, method_comp = "hellinger", distance_metric = "gower", sigma_alpha = NULL, row_z = TRUE, quantile_probs = 0.95, do_heatmap = TRUE, show_names = TRUE, cluster_rows = TRUE, cluster_cols = TRUE, heatmap_palette = viridisLite::viridis, do_map = TRUE )
comm_res |
A numeric matrix or data.frame of sites × resident species (non-negative abundances or transformed counts). Row names must be site IDs. |
traits_res |
A data.frame of resident species × traits. Row names must be
resident IDs that match the column names of |
site_df |
Optional data.frame with columns |
overlay_sf |
Optional |
method_comp |
Character method for |
distance_metric |
Character distance name for |
sigma_alpha |
Optional numeric bandwidth for the Gaussian kernel. If |
row_z |
Logical; if |
quantile_probs |
Numeric vector of probabilities to summarize site-level crowding quantiles
(e.g., |
do_heatmap |
Logical; if |
show_names |
Logical; show row/column names on the heatmap. Default |
cluster_rows |
Logical; cluster sites on the heatmap. Default |
cluster_cols |
Logical; cluster residents on the heatmap. Default |
heatmap_palette |
Function returning colors (e.g., |
do_map |
Logical; if |
Compute resident crowding from community composition and trait similarity
The Hellinger transformation places compositions in a Euclidean-friendly space.
Gower distance handles mixed trait types. Distances are converted to similarities with
, and the diagonal is set to 0 to remove
self-crowding. Crowding for resident at site is the weighted sum of trait-similar
resident composition at that site: .
A named list with:
W_site — Hellinger composition matrix (sites × residents).
D_res — resident × resident Gower distance matrix.
sigma_alpha — Gaussian bandwidth used.
K_res_res — Gaussian similarity kernel with zero diagonal.
C_js — resident crowding matrix (sites × residents).
C_js_z — row-z-scored version of C_js (if row_z=TRUE).
site_summary — data.frame with per-site summaries (mean and requested quantiles).
heatmap — pheatmap object or NULL if not drawn.
map — ggplot object or NULL if not drawn.
## Not run: # Fake data: 6 sites × 5 residents set.seed(42) comm_res = matrix(rpois(30, lambda = 5), nrow = 6, dimnames = list(paste0("s", 1:6), paste0("sp", 1:5))) traits_res = data.frame( size = rnorm(5, 10, 2), diet = factor(sample(c("A","B"), 5, TRUE)), row.names = paste0("sp", 1:5) ) site_df = data.frame(site = paste0("s", 1:6), x = rep(1:3, each=2), y = rep(1:2, 3)) out = compute_resident_crowding( comm_res = comm_res, traits_res = traits_res, site_df = site_df, quantile_probs = 0.95, do_heatmap = TRUE, do_map = TRUE ) str(out, max.level = 1) out$heatmap # pheatmap object out$map # ggplot map of mean C per site ## End(Not run)## Not run: # Fake data: 6 sites × 5 residents set.seed(42) comm_res = matrix(rpois(30, lambda = 5), nrow = 6, dimnames = list(paste0("s", 1:6), paste0("sp", 1:5))) traits_res = data.frame( size = rnorm(5, 10, 2), diet = factor(sample(c("A","B"), 5, TRUE)), row.names = paste0("sp", 1:5) ) site_df = data.frame(site = paste0("s", 1:6), x = rep(1:3, each=2), y = rep(1:2, 3)) out = compute_resident_crowding( comm_res = comm_res, traits_res = traits_res, site_df = site_df, quantile_probs = 0.95, do_heatmap = TRUE, do_map = TRUE ) str(out, max.level = 1) out$heatmap # pheatmap object out$map # ggplot map of mean C per site ## End(Not run)
and global z-score
Implements three site-only saturation definitions and returns (i) the raw
site statistic , (ii) its global mean/SD, (iii) the global
z-scored site statistic , and (iv) a broadcast
site-resident matrix for downstream modelling.
compute_site_saturation(mode, comm_res, res_ids, mu_js = NULL)compute_site_saturation(mode, comm_res, res_ids, mu_js = NULL)
mode |
One of |
comm_res |
Site x resident matrix (or frame) of observed counts. Will be
coerced to a numeric matrix; non-numeric entries become |
res_ids |
Character vector of resident IDs (column names to consider).
Columns not found in |
mu_js |
Optional site x resident matrix (or frame) of expected abundances,
required for |
A list with components:
S_sNumeric vector (sites) of saturation values.
S_muGlobal mean of S_s.
S_sdGlobal SD of S_s (set to 1 if 0/NA).
S_s_zGlobal z-scores of S_s.
S_js_zSite x resident matrix broadcasting S_s_z across res_ids.
Let be observed resident counts and expected counts.
"opportunity_penalty"Abundance penalty: .
"modelled_dominance"Model-based crowding: .
Requires mu_js.
"evenness_deficit"Pielou's evenness deficit:
with Shannon and richness
; then .
After computing , a global z-score is applied:
,
and this is broadcast across residents to form (same value
for all within a site ).
Package reference: compute_site_saturation()
## Not run: # Evenness deficit from observed community sat = compute_site_saturation("evenness_deficit", comm_res, colnames(comm_res)) head(sat$S_s_z) # Modelled dominance using expected abundances mu_js sat2 = compute_site_saturation("modelled_dominance", comm_res, colnames(comm_res), mu_js = mu_js) dim(sat2$S_js_z) ## End(Not run)## Not run: # Evenness deficit from observed community sat = compute_site_saturation("evenness_deficit", comm_res, colnames(comm_res)) head(sat$S_s_z) # Modelled dominance using expected abundances mu_js sat2 = compute_site_saturation("modelled_dominance", comm_res, colnames(comm_res), mu_js = mu_js) dim(sat2$S_js_z) ## End(Not run)
Builds a unified trait space for resident and invader species by:
(1) computing a Gower distance on the stacked trait table,
(2) projecting to 2D via classical MDS/PCoA (stats::cmdscale),
(3) deriving the resident convex hull (realised niche region),
(4) estimating a kernel density over the 2D space, and
(5) optionally rendering a base R filled.contour map that overlays
the hull, all points (residents black, invaders red), and the cloud centroid
(white square with black outline).
Optionally it also computes a functional dendrogram (hclust) and a
pretty dendrogram plot (factoextra::fviz_dend) when available.
compute_trait_space( traits_res, traits_inv, k = 2, kde_n = 100, pad_prop = 0.1, main_title = "Trait space density with convex hull and centroid", legend_line = "Hull = realised niche; white square = centroid; black dots = residents; red dots = invaders", cex_main = 1, cex_sub = 0.72, cex_lab = 0.85, cex_axis = 0.75, highlight_level = 0.5, do_plot = TRUE, do_dend = FALSE )compute_trait_space( traits_res, traits_inv, k = 2, kde_n = 100, pad_prop = 0.1, main_title = "Trait space density with convex hull and centroid", legend_line = "Hull = realised niche; white square = centroid; black dots = residents; red dots = invaders", cex_main = 1, cex_sub = 0.72, cex_lab = 0.85, cex_axis = 0.75, highlight_level = 0.5, do_plot = TRUE, do_dend = FALSE )
traits_res |
A data.frame of resident species (rows) by traits (columns). Mixed types are supported (numeric, integer, factor, ordered) via Gower distance. Row names are used as resident IDs; if absent, sequential IDs are created. |
traits_inv |
A data.frame of invader species (rows) by traits (columns).
Must share the same trait columns (names and types) as |
k |
Integer embedding dimension for PCoA; default 2. |
kde_n |
Integer grid size for 2D kernel density estimation (per axis). Default |
pad_prop |
Numeric padding proportion added to the plotting range on each axis. Default |
main_title |
Character main title for the contour plot. |
legend_line |
Character subtitle used as an inline legend in the title area. |
cex_main, cex_sub, cex_lab, cex_axis
|
Numeric text scaling for main title, subtitle,
axis labels, and axis tick labels, respectively. Defaults |
highlight_level |
Numeric in (0,1]; draws a bold contour at this proportion
of the maximum density (e.g., |
do_plot |
Logical; if |
do_dend |
Logical; if |
Compute and plot shared trait space for residents and invaders
This function assumes that the trait columns in traits_inv are compatible with those in
traits_res. Gower distance (cluster::daisy) supports mixed data types and handles
factors/ordered factors appropriately. The PCoA (cmdscale) is computed on the Gower distances
and the first two axes are returned by default for visualisation. The resident convex hull is
computed on the resident scores only and is drawn as the "realised niche region".
A list with:
gower - Gower distance matrix (all species).
scores - Data frame of 2D PCoA scores (Q_all) with columns tr1,tr2.
Q_res, Q_inv - Subsets of scores for residents and invaders.
hull_res - Data frame (tr1,tr2) of the resident convex hull ring (closed), or NULL if < 3 residents.
centroid - Numeric vector of the overall (res+inv) centroid c(tr1, tr2).
density - List list(x, y, z) from MASS::kde2d.
colors - Named vector with color per species ID (residents black, invaders red).
hc - hclust object (if do_dend=TRUE), else NULL.
dend_plot - A ggplot object from factoextra::fviz_dend when available, else NULL.
xlim, ylim - Numeric length-2 ranges used for plotting with padding.
## Not run: # Minimal reproducible example with numeric traits set.seed(1) traits_res = data.frame( trait_1 = rnorm(20), trait_2 = rnorm(20, 2), row.names = paste0("sp", 1:20) ) traits_inv = data.frame( trait_1 = rnorm(5, 0.5), trait_2 = rnorm(5, 2.5), row.names = paste0("inv", 1:5) ) out = compute_trait_space( traits_res = traits_res, traits_inv = traits_inv, do_plot = TRUE, do_dend = TRUE, main_title = "Trait space density with hull and centroid", legend_line = "Hull = realised niche region; white square = centroid; black = residents; red = invaders" ) # Access returned objects head(out$scores) out$centroid if (!is.null(out$dend_plot)) print(out$dend_plot) ## End(Not run)## Not run: # Minimal reproducible example with numeric traits set.seed(1) traits_res = data.frame( trait_1 = rnorm(20), trait_2 = rnorm(20, 2), row.names = paste0("sp", 1:20) ) traits_inv = data.frame( trait_1 = rnorm(5, 0.5), trait_2 = rnorm(5, 2.5), row.names = paste0("inv", 1:5) ) out = compute_trait_space( traits_res = traits_res, traits_inv = traits_inv, do_plot = TRUE, do_dend = TRUE, main_title = "Trait space density with hull and centroid", legend_line = "Hull = realised niche region; white square = centroid; black = residents; red = invaders" ) # Access returned objects head(out$scores) out$centroid if (!is.null(out$dend_plot)) print(out$dend_plot) ## End(Not run)
From an auxiliary GLMM on standardized predictors, extract fixed-effect
systems for r_z, C_z, and S_z, evaluate them at invader trait
coordinates, and (optionally) test whether the abiotic slope on r_z
should vary with traits (tr1, tr2). Biotic terms are enforced as
nonnegative penalties on invasion fitness.
derive_sensitivities(fit_coeffs, Q_inv, inv_ids, lrt = TRUE)derive_sensitivities(fit_coeffs, Q_inv, inv_ids, lrt = TRUE)
fit_coeffs |
A fitted |
Q_inv |
Data frame with invader trait coordinates |
inv_ids |
Character vector of invader IDs specifying the output order. |
lrt |
Logical or character. If logical:
|
Biotic penalties (prior/constraint). Invasion fitness is modeled as
We therefore set and
, where
and are the fitted (possibly trait-varying) slopes
for C_z and S_z. Signed versions (before clamping) are returned for
diagnostics.
Testing trait-varying abiotic slope. The argument lrt controls
whether and how to test the joint null .
For backward compatibility, lrt = TRUE performs a Wald joint χ²
test using the fixed-effects variance–covariance (no refit).
lrt = FALSE performs no test.
Alternatively, pass a character string: "wald", "lrt", or "none".
"lrt" refits a reduced model (dropping both interactions) and runs a
likelihood-ratio test (anova(..., test = "Chisq")).
If the chosen test is significant at , the function
uses the trait-varying theta_i; otherwise a constant theta0 is used
for gamma_i.
A list with components:
Nonnegative penalties for C_z and S_z.
Signed (pre-clamp) slopes.
Intercept for r_z slope system.
Trait-varying r_z slope evaluated at invader traits.
Either theta_i (if test is significant) or a constant
vector equal to theta0 (if not).
Coefficient components for C_z and S_z
systems: main effect and interactions with tr1,tr2.
Per-invader tidy data frame with traits, signed slopes,
clamping flags, and theta_i/gamma_i.
Counts and proportions of clamped and .
Logical; TRUE if the chosen test was significant.
Back-compat alias of test_tab (data frame with test result).
Data frame with test statistic, df, and p-value (or NULL).
One of "wald", "lrt", or "none".
Reminder of the nonnegative-penalty prior.
glmmTMB::glmmTMB(), stats::anova()
fit_auxiliary_residents_glmm() assembles a long table of resident
site by species cells with standardized predictors
, , and , together with
two-dimensional trait scores (tr1, tr2). It then fits a Gaussian
generalized linear mixed model to estimate how slopes on abiotic suitability,
niche crowding, and site saturation vary across the trait plane.
Optional site-level random slopes can be included for r_z and
C_z. Random slopes for S_z are intentionally excluded because
S_z is site-only and has no within-site variation across residents.
This function is intentionally forgiving: non-matrix inputs are coerced using
as.matrix(), missing dimension names are repaired when possible, and
inputs are aligned to comm_res.
fit_auxiliary_residents_glmm( comm_res, r_js_z, C_js_z, S_js_z, Q_res, use_site_random_slopes = TRUE, family = gaussian(), control = glmmTMB::glmmTMBControl(optCtrl = list(iter.max = 1e+05, eval.max = 1e+05)), na_action = c("drop", "error"), verbose = TRUE )fit_auxiliary_residents_glmm( comm_res, r_js_z, C_js_z, S_js_z, Q_res, use_site_random_slopes = TRUE, family = gaussian(), control = glmmTMB::glmmTMBControl(optCtrl = list(iter.max = 1e+05, eval.max = 1e+05)), na_action = c("drop", "error"), verbose = TRUE )
comm_res |
Numeric matrix of site by resident abundances. Row names are site identifiers and column names are resident identifiers. |
r_js_z |
Numeric matrix of standardized abiotic suitability with the same
dimensions and names as |
C_js_z |
Numeric matrix of standardized niche crowding with the same
dimensions and names as |
S_js_z |
Numeric matrix of standardized site saturation broadcast across
residents, with the same dimensions and names as |
Q_res |
Data frame of resident trait scores. Must contain numeric columns
|
use_site_random_slopes |
Logical; if |
family |
GLM family for the conditional model. Default is
|
control |
Control object passed to |
na_action |
How to handle rows with missing values. Either
|
verbose |
Logical; if |
Fit an auxiliary residents-only GLMM on standardized predictors
The fixed-effects component of the model is
with random intercepts for species and site always included.
When use_site_random_slopes = TRUE, random slopes
and are added. Random slopes
for S_z are omitted because S_z is constant within each site
and therefore not identifiable.
A list with components:
fit: the fitted glmmTMB model
data: the long-format data frame used for fitting
formula: the model formula
args: resolved arguments used in the fit
This auxiliary model estimates trait-dependent slope systems that can be
mapped to invader-level parameters such as
, , and , and optionally to
site-varying counterparts via random slopes. These parameters enter directly
into the linear invasion-fitness decomposition
linking trait position to abiotic suitability, niche crowding, and resident competition.
## Not run: # Toy dimensions set.seed(1) S = 8 # sites J = 6 # residents sites = paste0("s", 1:S) res_ids = paste0("sp", 1:J) # Site x resident abundance comm_res = matrix(rpois(S*J, lambda = 2), S, J, dimnames = list(sites, res_ids)) # Standardized predictors (row-wise z by site for r and C; site-only for S) r_raw = matrix(rnorm(S*J), S, J, dimnames = dimnames(comm_res)) C_raw = matrix(rnorm(S*J), S, J, dimnames = dimnames(comm_res)) S_raw = matrix(rep(scale(rnorm(S))[,1], each = J), S, J, dimnames = dimnames(comm_res)) # Simple per-site z for demo r_js_z = t(scale(t(r_raw))); r_js_z[!is.finite(r_js_z)] = 0 C_js_z = t(scale(t(C_raw))); C_js_z[!is.finite(C_js_z)] = 0 S_js_z = S_raw # Resident trait-plane scores (PCoA on Gower in real workflow) Q_res = data.frame(tr1 = rnorm(J), tr2 = rnorm(J)) rownames(Q_res) = res_ids aux = fit_auxiliary_residents_glmm( comm_res = comm_res, r_js_z = r_js_z, C_js_z = C_js_z, S_js_z = S_js_z, Q_res = Q_res, use_site_random_slopes = TRUE ) summary(aux$fit) head(aux$data) aux$formula ## End(Not run)## Not run: # Toy dimensions set.seed(1) S = 8 # sites J = 6 # residents sites = paste0("s", 1:S) res_ids = paste0("sp", 1:J) # Site x resident abundance comm_res = matrix(rpois(S*J, lambda = 2), S, J, dimnames = list(sites, res_ids)) # Standardized predictors (row-wise z by site for r and C; site-only for S) r_raw = matrix(rnorm(S*J), S, J, dimnames = dimnames(comm_res)) C_raw = matrix(rnorm(S*J), S, J, dimnames = dimnames(comm_res)) S_raw = matrix(rep(scale(rnorm(S))[,1], each = J), S, J, dimnames = dimnames(comm_res)) # Simple per-site z for demo r_js_z = t(scale(t(r_raw))); r_js_z[!is.finite(r_js_z)] = 0 C_js_z = t(scale(t(C_raw))); C_js_z[!is.finite(C_js_z)] = 0 S_js_z = S_raw # Resident trait-plane scores (PCoA on Gower in real workflow) Q_res = data.frame(tr1 = rnorm(J), tr2 = rnorm(J)) rownames(Q_res) = res_ids aux = fit_auxiliary_residents_glmm( comm_res = comm_res, r_js_z = r_js_z, C_js_z = C_js_z, S_js_z = S_js_z, Q_res = Q_res, use_site_random_slopes = TRUE ) summary(aux$fit) head(aux$data) aux$formula ## End(Not run)
Given a binomial species name, this function retrieves optional metadata from Wikipedia (taxonomic summary, taxonomy, image, and a color palette) and joins relevant plant/trait data from a TRY-style or user-provided trait table. Fuzzy matching is used for both TRY and local tables to handle minor spelling or naming mismatches.
get_trait_data( species, remove_bg = FALSE, do_palette = TRUE, do_taxonomy = TRUE, do_summary = TRUE, do_image = TRUE, bg_thresh = 80, green_delta = 20, n_palette = 5, preview = FALSE, save_folder = NULL, use_try = FALSE, try_data = NULL, trait_species_col = "AccSpeciesName", local_trait_df = NULL, local_species_col = "species", max_dist = 1 )get_trait_data( species, remove_bg = FALSE, do_palette = TRUE, do_taxonomy = TRUE, do_summary = TRUE, do_image = TRUE, bg_thresh = 80, green_delta = 20, n_palette = 5, preview = FALSE, save_folder = NULL, use_try = FALSE, try_data = NULL, trait_species_col = "AccSpeciesName", local_trait_df = NULL, local_species_col = "species", max_dist = 1 )
species |
Character. Species name (binomial, e.g. |
remove_bg |
Logical. If |
do_palette, do_taxonomy, do_summary, do_image
|
Logical. Control which
metadata to scrape. All default to |
bg_thresh |
Integer. Deprecated/ignored when |
green_delta |
Integer. Deprecated/ignored when |
n_palette |
Integer. Number of colors to extract for the palette.
Default: |
preview |
Logical. Print the processed image (magick) in the console.
Default: |
save_folder |
Character or |
use_try |
Logical. If |
try_data |
Character (path) or |
trait_species_col |
Name of the species column in the TRY trait table.
Default: |
local_trait_df |
Optional. |
local_species_col |
Name of the species column in the local trait table.
Default: |
max_dist |
Numeric. Maximum distance for fuzzy matching (Jaro–Winkler
via |
When remove_bg = TRUE, the infobox image background is removed using the
remove.bg API via an internal helper (remove_bg_and_save()), the resulting
PNG is re-read with magick, and the palette is extracted from that
processed image. Set the environment variable REMOVE_BG_API_KEY to a valid
remove.bg API key before calling.
Wikipedia: summary via REST API; taxonomy parsed from the infobox.
Image: first infobox image is used; when remove_bg = TRUE
the function calls the remove.bg API. Set Sys.setenv(REMOVE_BG_API_KEY = "…").
Palette: simple k-means on non-transparent pixels of the (processed) PNG.
Traits (TRY): wide table produced from TraitName and numeric OrigValueStr.
Traits (local): returns all columns except the species column.
Dependencies: dplyr, purrr, tibble, fuzzyjoin, rvest,
httr, stringr, jsonlite, magick, abind.
A tibble (one row) with columns: species, optional metadata
(summary, taxonomy ranks, img_url, palette, image_file), and all
available trait columns found via TRY/local joins. image_file contains
the normalized path to the PNG used for palette/preview (or NA if none).
## Not run: # Using TRY table get_trait_data("Acacia karroo", use_try = TRUE, try_data = try_traits, trait_species_col = "SpeciesName") # Using a local trait table get_trait_data("Acraea horta", local_trait_df = traits, local_species_col = "species") # Metadata only, with background removal and saving to a folder Sys.setenv(REMOVE_BG_API_KEY = "<your-removebg-key>") get_trait_data("Acacia karroo", use_try = FALSE, remove_bg = TRUE, save_folder = "out/") ## End(Not run)## Not run: # Using TRY table get_trait_data("Acacia karroo", use_try = TRUE, try_data = try_traits, trait_species_col = "SpeciesName") # Using a local trait table get_trait_data("Acraea horta", local_trait_df = traits, local_species_col = "species") # Metadata only, with background removal and saving to a folder Sys.setenv(REMOVE_BG_API_KEY = "<your-removebg-key>") get_trait_data("Acacia karroo", use_try = FALSE, remove_bg = TRUE, save_folder = "out/") ## End(Not run)
Fits an auxiliary GLMM on resident data to estimate invader-level sensitivities
to crowding and saturation (, ), and abiotic conversion
slopes ( or ). When supported by the data, optional
site-level random slopes yield site-varying adjustments
(, ).
Results are written into the fit$sensitivities slot of an
new_invasimapr_fit object for downstream invasion-fitness and
establishment calculations.
learn_sensitivities(fit, use_site_random_slopes = TRUE, lrt = TRUE)learn_sensitivities(fit, use_site_random_slopes = TRUE, lrt = TRUE)
fit |
An object produced by prepare_inputs and
assemble_matrices, containing resident predictor matrices
( |
use_site_random_slopes |
Logical; if |
lrt |
Logical; if |
Fits an auxiliary GLMM on resident data to estimate invader-level
sensitivities to crowding and saturation (alpha_i, beta_i), and abiotic conversion
slopes (theta_i or gamma_i), with optional site-varying random slopes that yield
per-site adjustments (alpha_is, Gamma_is). Results are written into the
fit$sensitivities slot of an invasimapr_fit object for downstream
invasion-fitness and establishment steps.
Workflow
Fit an auxiliary GLMM on resident responses using
fit_auxiliary_residents_glmm, optionally including site-level
random slopes for and .
Convert GLMM coefficients to sensitivities
(, , or ) using
derive_sensitivities, returning signed and unsigned variants
plus inference summaries.
When supported, extract site-varying effects
(, ) via
site_varying_alpha_beta_gamma.
The resulting components are stored in fit$sensitivities, including:
global and trait-varying sensitivities;
inference diagnostics and clamping summaries;
optional site-varying matrices and compact decomposition tables.
The input fit object (invisibly), with an updated
fit$sensitivities list.
prepare_inputs, assemble_matrices, fit_auxiliary_residents_glmm, derive_sensitivities, site_varying_alpha_beta_gamma
## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df ) fit <- learn_sensitivities(fit, use_site_random_slopes = TRUE) names(fit$sensitivities) ## End(Not run)## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df ) fit <- learn_sensitivities(fit, use_site_random_slopes = TRUE) names(fit$sensitivities) ## End(Not run)
model_residents() fits a residents-only generalized linear mixed model
and constructs site-standardized predictors used for learning sensitivities
and projecting invaders. Specifically, the function:
builds a fixed-effects design from environment and resident traits,
optionally expands factors and applies PCA compression with stored centering and scaling metadata,
fits a glmmTMB model to resident abundance,
predicts resident linear predictors and applies
row-wise z-standardisation to obtain ,
standardises resident crowding and computes site
saturation summaries , , and
.
model_residents( fit, family = glmmTMB::tweedie(link = "log"), include_env_trait_interactions = TRUE, saturation_mode = c("evenness_deficit", "opportunity_penalty", "modelled_dominance"), robust_r = TRUE, robust_c = TRUE, fit_model = TRUE, max_dense_gb = 8, reduce_strategy = c("auto", "none", "no_interactions", "pca"), pca_env_k = 5, pca_trait_k = 5, verbose = TRUE )model_residents( fit, family = glmmTMB::tweedie(link = "log"), include_env_trait_interactions = TRUE, saturation_mode = c("evenness_deficit", "opportunity_penalty", "modelled_dominance"), robust_r = TRUE, robust_c = TRUE, fit_model = TRUE, max_dense_gb = 8, reduce_strategy = c("auto", "none", "no_interactions", "pca"), pca_env_k = 5, pca_trait_k = 5, verbose = TRUE )
fit |
An object of class |
family |
A |
include_env_trait_interactions |
Logical; whether to include environment by trait interactions when not reduced. |
saturation_mode |
Character string controlling site saturation
computation. Passed to |
robust_r, robust_c
|
Logical; whether to use robust row-wise
z-standardisation for |
fit_model |
Logical; if |
max_dense_gb |
Numeric; maximum allowed dense fixed-effects size (GB) in automatic reduction mode. |
reduce_strategy |
One of |
pca_env_k, pca_trait_k
|
Integers giving the number of retained principal components for environment and trait blocks. |
verbose |
Logical; if |
Model residents and build standardized predictors
Design construction.
Environment and trait tables are standardised prior to model fitting.
Character variables are coerced to factors to ensure stable dummy expansion.
Under PCA reduction, each block is one-hot encoded, column-standardised, and
passed to prcomp(). All centering, scaling, and rotation metadata are
stored to enable reproducible projection of invaders.
Preflight memory checks.
In automatic reduction mode, the dense fixed-effects design size is estimated
before fitting. If the design exceeds max_dense_gb, interactions are
removed and/or PCA compression is applied until the constraint is satisfied.
Z-standardisation. Row-wise z-standardisation stores site-level means and standard deviations for later use in invasion fitness and probability mapping.
The input fit object with updated components:
modelStandardised inputs, PCA objects, and metadata required for invader projection.
residentsResident GLMM fit, prediction grid, raw and standardised predictor matrices, and site saturation summaries.
model$pca_usedFlags indicating whether PCA was applied and the retained dimensionality.
The argument reduce_strategy controls fixed-effects complexity.
"auto": estimate dense design size and iteratively reduce
complexity until the memory budget is met.
"no_interactions": drop all environment by trait interactions.
"pca": dummy-expand, standardise, and apply prcomp() to
environment and trait blocks, retaining the first components.
"none": retain the full requested fixed-effects structure.
build_model_formula()
prep_resident_glmm()
compute_site_saturation()
prepare_trait_space()
predict_invaders()
## Not run: fit <- prepare_inputs(sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df) fit <- prepare_trait_space(fit, traits_inv) fit <- model_residents(fit, reduce_strategy = "auto", verbose = TRUE) # Inspect z-standardised predictors for residents dim(fit$residents$r_js_z); dim(fit$residents$C_js_z) fit$residents$messages ## End(Not run)## Not run: fit <- prepare_inputs(sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df) fit <- prepare_trait_space(fit, traits_inv) fit <- model_residents(fit, reduce_strategy = "auto", verbose = TRUE) # Inspect z-standardised predictors for residents dim(fit$residents$r_js_z); dim(fit$residents$C_js_z) fit$residents$messages ## End(Not run)
) and optional establishment probability ()Wraps
compute_invasion_fitness()
to generate invader-by-site invasion fitness under model
Options A-E, and (optionally) maps to probabilities via
compute_establishment_probability().
Supports calibration of so that resident mean
is approximately zero, and can overlay an sf boundary on map-like plots.
predict_establishment( fit, option = c("A", "B", "C", "D", "E"), calibrate_kappa = FALSE, prob_method = c(NULL, "probit", "logit", "hard"), prob_args = list(), method = NULL, prob_scale = NULL, boundary_sf = NULL, boundary_params = list(inherit.aes = FALSE, fill = NA, color = "black", size = 0.3) )predict_establishment( fit, option = c("A", "B", "C", "D", "E"), calibrate_kappa = FALSE, prob_method = c(NULL, "probit", "logit", "hard"), prob_args = list(), method = NULL, prob_scale = NULL, boundary_sf = NULL, boundary_params = list(inherit.aes = FALSE, fill = NA, color = "black", size = 0.3) )
fit |
An |
option |
Character, one of
|
calibrate_kappa |
Logical; if |
prob_method |
(legacy) |
prob_args |
(legacy) List of arguments passed to
|
method |
Alias of |
prob_scale |
Alias of |
boundary_sf |
Optional sf object to overlay on map-like probability plots. |
boundary_params |
Named list for |
What this wrapper does
Chooses the appropriate sensitivity inputs for the requested option
(e.g., theta_i for C, Gamma_is/alpha_is for D).
Calls ... with site-standardised inputs held in fit.
(Optional) Converts to probability via probit, logit,
or a hard threshold, forwarding prob_* settings.
(Optional) If boundary_sf is supplied and plots are available, overlays
the boundary on map-like plots using geom_sf().
Calibration (calibrate_kappa = TRUE)
Aligns invader-resident scales by shifting so that the resident
mean is ~0, using resident moments and trait-plane slopes stored in fit.
The input fit with:
$fitnessList returned by compute_invasion_fitness() (including
lambda_is, lambda_long, option, and any plots/maps).
$probIf requested, list returned by compute_establishment_probability() with
probability tables and plots (with boundary overlay applied when provided).
Fitness: compute_invasion_fitness()
Probabilities: compute_establishment_probability()
Sensitivities: learn_sensitivities()
Invader predictors: predict_invaders()
## Not run: fit = fit |> predict_invaders(traits_inv) |> predict_establishment(option = "C", calibrate_kappa = TRUE, method = "probit", prob_scale = list(sigma = 1)) # Overlay a boundary on probability maps fit = predict_establishment(fit, option = "D", method = "logit", prob_scale = list(tau = 1), boundary_sf = rsa_boundary) ## End(Not run)## Not run: fit = fit |> predict_invaders(traits_inv) |> predict_establishment(option = "C", calibrate_kappa = TRUE, method = "probit", prob_scale = list(sigma = 1)) # Overlay a boundary on probability maps fit = predict_establishment(fit, option = "D", method = "logit", prob_scale = list(tau = 1), boundary_sf = rsa_boundary) ## End(Not run)
Uses resident-side moments and the residents-only model to construct
invader-level predictors on the site-standardised scale:
, , .
Handles PCA projection with “dummy” factor expansion so that invader
covariates exactly match the training design used in
model_residents() / the stored resident GLMM.
predict_invaders(fit, traits_inv)predict_invaders(fit, traits_inv)
fit |
|
traits_inv |
data.frame of raw invader trait values; columns must match the resident trait names used at training (numerics and/or factors). |
Pipeline
Environment: if an environment PCA was used at training, rebuild the
environment design matrix with the saved factor map and project using the
stored rotation; otherwise reuse the stored env_df_z.
Traits: coerce invader raw trait factors to training levels (redirect
unseen levels to _other_ if present); rebuild the dummy matrix and
standardise with stored means/SDs; project via trait PCA if present; then
append any raw factor terms that remained in the fixed-effects formula.
Predictors: call
build_invader_predictors()
to compute , , using
resident moments, crowding kernels, and similarity structures stored in fit.
Assumptions & safeguards
Requires a trained resident model stored in fit$residents$fit_r and PCA
metadata in fit$model$*_pca* if PCA was used.
Uses resident moments (r_mu_s, r_sd_s, C_mu_s, C_sd_s) and
similarity/crowding information (W_site, gower).
Site ordering is taken from fit$meta$sites; inputs are conformed to it.
The input fit with an updated fit$invaders list containing:
traits_inv_rawRaw (unmodified) invader trait table (as supplied).
traits_inv_glmmDesign-scale trait data passed to the resident model (post dummy-expansion, standardisation, PCA, and factor alignment).
r_is_z, C_is_z, S_is_z
Site-standardised matrices for invaders.
dfTidy table used/returned by build_invader_predictors() for joins.
Many downstream steps (e.g. invasion fitness, establishment probability)
assume invader predictors live on the same centred/scaled basis as the
resident training data. This function guarantees alignment by: (i) coercing
raw trait factor levels to training levels (with _other_ fallbacks), (ii)
rebuilding the design matrix with the training dummy map, (iii) applying
the stored centring/scaling, and (iv) projecting through the stored PCA
rotations before calling build_invader_predictors().
Input assembly: assemble_matrices()
Predictor builder: build_invader_predictors()
Residents GLMM: fit_auxiliary_residents_glmm()
Sensitivities: learn_sensitivities()
## Not run: fit <- prepare_inputs(sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df) fit <- learn_sensitivities(fit) # New invader trait table (raw scale, same columns as residents' traits) inv_traits <- data.frame(height = c(1.3, 0.9), SLA = c(12, 18), life_form = c("shrub","forb"), row.names = c("spA","spB")) fit <- predict_invaders(fit, inv_traits) str(fit$invaders, 1) dim(fit$invaders$r_is_z) # |sites| × |invaders| ## End(Not run)## Not run: fit <- prepare_inputs(sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df) fit <- learn_sensitivities(fit) # New invader trait table (raw scale, same columns as residents' traits) inv_traits <- data.frame(height = c(1.3, 0.9), SLA = c(12, 18), life_form = c("shrub","forb"), row.names = c("spA","spB")) fit <- predict_invaders(fit, inv_traits) str(fit$invaders, 1) dim(fit$invaders$r_is_z) # |sites| × |invaders| ## End(Not run)
Creates a long data set with one row per (site, resident) pair, attaches standardized environment and resident traits, then fits a GLMM for residents. Also returns fixed-effect predictions on the link scale as a site×resident matrix (r_js) and the response-scale mean (mu_js).
prep_resident_glmm( comm_res, env_df_z, traits_res_glmm, fml, family = glmmTMB::tweedie(link = "log"), seed = 123, fit_model = TRUE )prep_resident_glmm( comm_res, env_df_z, traits_res_glmm, fml, family = glmmTMB::tweedie(link = "log"), seed = 123, fit_model = TRUE )
comm_res |
Numeric matrix (sites × residents) of observed counts or abundance. Row names are site IDs; column names are resident species IDs. |
env_df_z |
Data frame of standardized site-level environment.
Row names are site IDs and must match |
traits_res_glmm |
Data frame of standardized resident traits used in the GLMM.
Row names are resident IDs and must match |
fml |
A model formula for |
family |
A |
seed |
Integer. Random seed. |
fit_model |
logical; if FALSE, return dat_r/fml without fitting. |
list(dat_r, fit_r, grid_res, r_js, mu_js, sites, res_ids)
Thin wrapper around assemble_matrices that standardises and stores assembled inputs in a new_invasimapr_fit object for downstream modelling. Use this function to run the input-assembly step once and pass a single structured container through subsequent pipelines.
prepare_inputs(..., make_plots = FALSE)prepare_inputs(..., make_plots = FALSE)
... |
Arguments passed on to
|
make_plots |
Logical; if |
What this does
Calls assemble_matrices to build core site-by-species and site-by-trait matrices, perform consistency checks, and harmonise keys and factor levels.
Wraps the returned list into a lightweight S3 container of class
invasimapr_fit, with a dedicated inputs slot and a small meta
block containing basic counts used by later steps.
Object layout
invasimapr_fit inputs : list (output from assemble_matrices) meta : list (n_sites, n_residents, n_traits)
Notes
No mutation of the assembled inputs occurs here; the function only packages and annotates them.
To inspect the assembled data, use fit$inputs on the returned object.
An object of class invasimapr_fit with components:
The full list returned by assemble_matrices.
A list with elements n_sites, n_residents, and n_traits.
assemble_matrices, new_invasimapr_fit
## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df, make_plots = TRUE ) str(fit, 1) str(fit$inputs, 1) fit$meta ## End(Not run)## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df, make_plots = TRUE ) str(fit, 1) str(fit$inputs, 1) fit$meta ## End(Not run)
Orchestrates the traits → space → centrality/hull → crowding pipeline for
residents and invaders. Optionally standardises trait inputs, constructs a
joint trait space, computes convex hull / centroid / centrality, and derives
raw resident crowding using a Gaussian kernel (no per-site
standardisation here). Row-wise z-scoring is deferred to
model_residents().
Plot objects are always created and returned; they are only displayed when
show_plots = TRUE.
prepare_trait_space( fit, traits_inv, crowding_sigma = NULL, show_plots = TRUE, do_standardise = TRUE, row_z = FALSE )prepare_trait_space( fit, traits_inv, crowding_sigma = NULL, show_plots = TRUE, do_standardise = TRUE, row_z = FALSE )
fit |
An |
traits_inv |
Data frame (or matrix) of raw invader traits (rows = invaders; columns aligned to resident trait columns). |
crowding_sigma |
Optional numeric bandwidth for the resident crowding
kernel. If |
show_plots |
Logical. If |
do_standardise |
Logical. If |
row_z |
Logical. Whether to perform row-wise (site) z-scoring inside
|
Pipeline
(Optional) Standardise traits_res/traits_inv via
standardise_model_inputs()
when available; otherwise pass through raw traits.
Build a joint trait space with
compute_trait_space()
(returns Gower distances, ordination scores for residents Q_res and invaders
Q_inv, a density surface, dendrograms, etc.). Plots are requested but may
be hidden depending on show_plots.
Compute centrality & hull (resident convex hull, centroid, centrality
scores) using
compute_centrality_hull().
Compute resident crowding with
compute_resident_crowding(),
returning kernel weights K_res_res, distances D_res, site kernels W_site,
chosen sigma_alpha, and raw C_js (unless row_z = TRUE is requested).
Display control
To ensure reproducible plot objects without cluttering the console,
the function temporarily opens a null graphics device when show_plots = FALSE.
Assumptions & safeguards
Requires resident community and trait tables in fit$inputs.
If standardisation fails or is unavailable, the function proceeds with raw traits.
Site-wise z-scoring is intentionally not applied here by default.
The input invasimapr_fit with updated components:
$traitsList with gower, Q_res, Q_inv, hull, centroid,
density, stored plots (plots_ts, plots_ch), centrality table, and
hull vertices.
$crowdingList with W_site, D_res, sigma_alpha, K_res_res,
and raw C_js (unless row_z = TRUE).
$metaLightweight cache of residents, sites, invaders,
and n_invaders.
Trait space: compute_trait_space()
Centrality & hull: compute_centrality_hull()
Resident crowding: compute_resident_crowding()
Standardisation: standardise_model_inputs()
Downstream modelling: model_residents()
## Not run: # Compute all plots but do not display them fit2 <- prepare_trait_space(fit, traits_inv, show_plots = FALSE) # Display plots during construction fit3 <- prepare_trait_space(fit, traits_inv, show_plots = TRUE) # Use a fixed kernel bandwidth and request row-z inside the crowding step fit4 <- prepare_trait_space(fit, traits_inv, crowding_sigma = 0.35, row_z = TRUE) ## End(Not run)## Not run: # Compute all plots but do not display them fit2 <- prepare_trait_space(fit, traits_inv, show_plots = FALSE) # Display plots during construction fit3 <- prepare_trait_space(fit, traits_inv, show_plots = TRUE) # Use a fixed kernel bandwidth and request row-z inside the crowding step fit4 <- prepare_trait_space(fit, traits_inv, crowding_sigma = 0.35, row_z = TRUE) ## End(Not run)
Generate rows of trait values for hypothetical invaders
by resampling the empirical distribution of resident traits. By default,
traits are sampled independently by column, creating novel trait
combinations. Alternatively, entire rows can be resampled to preserve the
covariance structure of the resident trait space.
Row names of the returned object are set to the invader identifiers.
simulate_invaders( resident_traits, n_inv = 10, species_col = "species", trait_cols = NULL, mode = c("columnwise", "rowwise"), numeric_method = c("bootstrap", "normal", "uniform"), keep_bounds = TRUE, inv_prefix = "inv", keep_species_column = TRUE, seed = NULL )simulate_invaders( resident_traits, n_inv = 10, species_col = "species", trait_cols = NULL, mode = c("columnwise", "rowwise"), numeric_method = c("bootstrap", "normal", "uniform"), keep_bounds = TRUE, inv_prefix = "inv", keep_species_column = TRUE, seed = NULL )
resident_traits |
A data frame containing either a species identifier
column (specified by |
n_inv |
Integer; number of invaders to simulate (default |
species_col |
|
trait_cols |
|
mode |
Character; either |
numeric_method |
Character; for numeric traits in columnwise mode,
one of |
keep_bounds |
Logical; if |
inv_prefix |
Character; prefix used to construct invader identifiers. |
keep_species_column |
Logical; if |
seed |
|
Species identifiers
Species identifiers can be supplied via a dedicated column specified by
species_col, or taken from the row names of resident_traits when
species_col = NULL. Newly simulated invaders receive fresh, unique
identifiers constructed from inv_prefix, which become the row names of
the returned data frame. When species_col is not NULL, the same identifiers
are also stored in that column unless keep_species_column = FALSE.
Trait selection
If trait_cols is NULL, all columns except species_col (when present)
are treated as traits. Otherwise, only the intersection of trait_cols and
existing column names is used.
Sampling modes
mode = "columnwise": Each trait is generated independently.
Numeric traits can be drawn by bootstrap sampling, from a normal
distribution parameterised by the empirical mean and standard
deviation, or from a uniform distribution on the observed range.
Factor and character traits are sampled from observed values.
mode = "rowwise": Entire rows are resampled with replacement
from the resident trait table, preserving joint structure across
traits.
Identifier collisions If proposed invader identifiers collide with existing resident identifiers, they are made unique using make.unique.
A data frame of simulated invader traits. Row names correspond to invader
identifiers. If species_col is not NULL and keep_species_column = TRUE,
the species identifier column contains the same identifiers.
sample, make.unique, rnorm, runif, sd
## Not run: set.seed(1) residents <- data.frame( species = paste0("sp", 1:5), height = c(10.2, 9.8, 11.1, 10.5, 9.9), SLA = c(15.0, 15.2, 14.7, 15.5, 15.1), lifeform = factor(c("tree", "shrub", "shrub", "tree", "herb")) ) inv <- simulate_invaders( residents, n_inv = 4, species_col = "species", mode = "columnwise", numeric_method = "bootstrap" ) head(inv) ## End(Not run)## Not run: set.seed(1) residents <- data.frame( species = paste0("sp", 1:5), height = c(10.2, 9.8, 11.1, 10.5, 9.9), SLA = c(15.0, 15.2, 14.7, 15.5, 15.1), lifeform = factor(c("tree", "shrub", "shrub", "tree", "herb")) ) inv <- simulate_invaders( residents, n_inv = 4, species_col = "species", mode = "columnwise", numeric_method = "bootstrap" ) head(inv) ## End(Not run)
Constructs site-by-invader crowding penalties and density-dependence scalars by combining trait-derived slopes with site-level random effects.
Site-level random slopes for and are added to the
trait-dependent systems to produce site-varying crowding penalties
and density-dependence scalars .
Saturation effects are trait-varying only; no site-varying
is constructed because is site-only.
The function also exposes signed slopes and clamping diagnostics returned by
derive_sensitivities().
site_varying_alpha_beta_gamma( fit_coeffs, Q_inv, sites, inv_ids, lrt = TRUE, quiet = FALSE )site_varying_alpha_beta_gamma( fit_coeffs, Q_inv, sites, inv_ids, lrt = TRUE, quiet = FALSE )
fit_coeffs |
Fitted residents-only GLMM (e.g. |
Q_inv |
Data frame of invader trait scores with columns |
sites |
Character vector of site identifiers (rows of the output matrices). |
inv_ids |
Character vector of invader identifiers (columns of the output matrices). |
lrt |
Logical; passed to |
quiet |
Logical; if |
A list with the following elements:
alpha_is: matrix of site-by-invader crowding penalties
Gamma_is: matrix of site-by-invader density-dependence scalars
beta_i, beta_signed_i: invader-level saturation effects
alpha_signed_i: signed crowding slopes prior to clamping
theta_i, gamma_i: invader-level abiotic sensitivities
slope_C_i: trait-derived crowding slopes
delta_C_s, delta_r_s: site-level random effects
notes: character vector of diagnostics
df: tidy long-format data frame
For each site (row), subtract the row-mean and divide by a row-scale. The scale can be classical SD or a robust max(MAD, SD) with a small floor.
standardise_by_site(X, robust = FALSE)standardise_by_site(X, robust = FALSE)
X |
Numeric matrix (sites × species). |
robust |
Logical. If TRUE, uses max(MAD, SD, 1e-8) per row; else classical SD with a safe 1 fallback. |
List with z (z-scored matrix), mu (row means), sd (row scales).
## Not run: std = standardise_by_site(C_js, robust = TRUE) C_js_z = std$z ## End(Not run)## Not run: std = standardise_by_site(C_js, robust = TRUE) C_js_z = std$z ## End(Not run)
Column-wise z-scores environment and resident trait numerics, then
scales invader trait numerics using resident moments only (to avoid
information leakage). Invader factor/character columns are coerced to the
resident levels; unseen levels become NA. Optionally drops invader-only
columns so the resident/invader trait schemas match.
standardise_model_inputs( env_df = NULL, traits_res, traits_inv = NULL, drop_extra_invader_cols = FALSE, verbose = TRUE )standardise_model_inputs( env_df = NULL, traits_res, traits_inv = NULL, drop_extra_invader_cols = FALSE, verbose = TRUE )
env_df |
Optional |
traits_res |
|
traits_inv |
Optional |
drop_extra_invader_cols |
Logical; if |
verbose |
Logical; print messages about what was standardised/coerced. |
What gets standardised and how
Environment (env_df): numeric columns are z-scored (mean 0, sd 1);
non-numeric columns are kept as-is. Zero variance is guarded by setting sd=1.
Resident traits (traits_res): numeric columns are z-scored; mixed
types allowed—non-numeric columns are kept.
Invader traits (traits_inv): numeric columns are scaled using the
resident trait means/SDs only (never computed from invaders). Factor/
character columns are coerced to resident levels; unseen levels become NA.
Extra invader columns are dropped (with a note).
Returned objects
env_df_z: environment table with numeric columns standardised (or NULL)
traits_res_glmm: resident traits with numeric columns standardised
traits_inv_glmm: invader traits, scaled like residents + factor levels matched (or NULL)
moments: resident/reference moments used for scaling (env_*, trait_*)
info$notes: human-readable notes on coercions/drops
Where this is used in the workflow
Called explicitly prior to GLMM fitting and when harmonising invaders,
and implicitly by wrappers such as prepare_trait_space() (if available).
A named list with components:
Environment table with numeric columns z-scored (or NULL).
Resident trait table with numeric columns z-scored.
Invader trait table scaled to resident moments and factor levels matched (or NULL).
list(env_means, env_sds, trait_means_res, trait_sds_res) used for scaling.
list(notes=character()) with human-readable notes.
Column names and row names are preserved.
Zero-variance numeric columns use sd=1 so z-scores stay defined.
Invader trait numerics are always scaled by resident moments (no leakage).
Invader extra columns are dropped for alignment; missing required columns error.
prepare_inputs(), prepare_trait_space(), build_invader_predictors(),
compute_trait_space(), compute_resident_crowding()
# Minimal reproducible example ---------------------------------------------- set.seed(1) env_df = data.frame(elev = rnorm(5), temp = rnorm(5), zone = factor(sample(c("A","B"), 5, TRUE))) rownames(env_df) = paste0("s", 1:5) traits_res = data.frame( size = rlnorm(4), leaf = factor(c("broad","needle","broad","needle")), stringsAsFactors = FALSE ) rownames(traits_res) = paste0("sp", 1:4) traits_inv = data.frame( size = c(10, 1), leaf = factor(c("broad","unknown")) # 'unknown' -> NA after coercion ) rownames(traits_inv) = c("inv1","inv2") std = standardise_model_inputs(env_df, traits_res, traits_inv, verbose = FALSE) str(std, 1) head(std$traits_res_glmm) head(std$traits_inv_glmm) # note: 'leaf' for 'unknown' becomes NA # Workflow sketch ------------------------------------------------------------ # fit = prepare_inputs(long_df = longDF, ...) # gives fit$inputs$env_df, $traits_res # inv = simulate_invaders(fit$inputs$traits_res, n_inv = 10) # std = standardise_model_inputs(fit$inputs$env_df, fit$inputs$traits_res, inv) # std$traits_res_glmm; std$traits_inv_glmm # pass to GLMM / trait-space steps# Minimal reproducible example ---------------------------------------------- set.seed(1) env_df = data.frame(elev = rnorm(5), temp = rnorm(5), zone = factor(sample(c("A","B"), 5, TRUE))) rownames(env_df) = paste0("s", 1:5) traits_res = data.frame( size = rlnorm(4), leaf = factor(c("broad","needle","broad","needle")), stringsAsFactors = FALSE ) rownames(traits_res) = paste0("sp", 1:4) traits_inv = data.frame( size = c(10, 1), leaf = factor(c("broad","unknown")) # 'unknown' -> NA after coercion ) rownames(traits_inv) = c("inv1","inv2") std = standardise_model_inputs(env_df, traits_res, traits_inv, verbose = FALSE) str(std, 1) head(std$traits_res_glmm) head(std$traits_inv_glmm) # note: 'leaf' for 'unknown' becomes NA # Workflow sketch ------------------------------------------------------------ # fit = prepare_inputs(long_df = longDF, ...) # gives fit$inputs$env_df, $traits_res # inv = simulate_invaders(fit$inputs$traits_res, n_inv = 10) # std = standardise_model_inputs(fit$inputs$env_df, fit$inputs$traits_res, inv) # std$traits_res_glmm; std$traits_inv_glmm # pass to GLMM / trait-space steps
Invasion fitness integrates structure in trait space
(distances, overlaps, convex hull, cloud centroid) with abiotic suitability
(alignment to environment), niche crowding (overlap with residents weighted by
composition), and resident competition (site saturation).
summarise_invasiveness_invasibility() collapses the site-by-invader surface
into species-, trait-, and site-level summaries using either a probabilistic
measure or a hard rule .
Species invasiveness (per invader) summarises the breadth of establishment across sites:
or
Site invasibility (per site) quantifies openness to newcomers:
or
Trait invasiveness scores which invader traits explain variation in invasiveness,
using standardized slopes for continuous traits and ANOVA for
categorical traits.
summarise_invasiveness_invasibility( lambda_is = NULL, p_is = NULL, use_probabilistic = FALSE, prob_threshold = 0.5, site_df = NULL, traits_inv = NULL, comm_res = NULL, return_long = TRUE, make_plots = TRUE, label = NULL )summarise_invasiveness_invasibility( lambda_is = NULL, p_is = NULL, use_probabilistic = FALSE, prob_threshold = 0.5, site_df = NULL, traits_inv = NULL, comm_res = NULL, return_long = TRUE, make_plots = TRUE, label = NULL )
lambda_is |
Matrix of invasion fitness with dimensions sites by invaders. Row and column names are required. |
p_is |
Optional matrix of establishment probabilities with the same
dimensions as |
use_probabilistic |
Logical. If |
prob_threshold |
Numeric between 0 and 1. If probabilistic summaries are
used and a binary view is requested, values with
|
site_df |
Optional data frame with columns |
traits_inv |
Optional data frame of invader traits; row names must match invader IDs. Numeric and factor traits are supported. |
comm_res |
Optional site-by-resident matrix used to compute resident richness per site. |
return_long |
Logical. If |
make_plots |
Logical. If |
label |
Optional character label added to plot titles and metadata. |
Summarize invasion fitness into species, trait, and site metrics (with plots)
A list with the following components:
species: data frame of invader-level invasiveness metrics
site: data frame of site-level invasibility metrics
trait_effects: data frame of trait effect sizes
establish_long: tidy long-format table of the working surface
plots: list of ggplot objects (or NULL)
meta: list describing the summary mode and threshold
set.seed(42) S = 8; I = 5 sites = paste0("s", 1:S) inv = paste0("i", 1:I) # Fake fitness and probabilities lambda_is = matrix(rnorm(S*I, sd=1), S, I, dimnames=list(sites, inv)) p_is = pnorm(lambda_is) # crude probit just for the example # Minimal site coordinates for plotting (optional) site_df = data.frame(site = sites, x = rep(1:4, each=2)[1:S], y = rep(1:2, times=4)[1:S]) # Minimal trait table for invaders (one numeric, one factor) traits_inv = data.frame(trait_size = runif(I, 0, 1), trait_type = factor(sample(c("A","B"), I, TRUE)), row.names = inv) # 1) Probabilistic summaries (use p_is) outP = summarise_invasiveness_invasibility( lambda_is = lambda_is, p_is = p_is, use_probabilistic = TRUE, site_df = site_df, traits_inv = traits_inv, make_plots = FALSE ) names(outP) head(outP$species) # 2) Hard rule summaries (use I{lambda>0}) outH = summarise_invasiveness_invasibility( lambda_is = lambda_is, use_probabilistic = FALSE, site_df = site_df, traits_inv = traits_inv, make_plots = FALSE ) head(outH$site)set.seed(42) S = 8; I = 5 sites = paste0("s", 1:S) inv = paste0("i", 1:I) # Fake fitness and probabilities lambda_is = matrix(rnorm(S*I, sd=1), S, I, dimnames=list(sites, inv)) p_is = pnorm(lambda_is) # crude probit just for the example # Minimal site coordinates for plotting (optional) site_df = data.frame(site = sites, x = rep(1:4, each=2)[1:S], y = rep(1:2, times=4)[1:S]) # Minimal trait table for invaders (one numeric, one factor) traits_inv = data.frame(trait_size = runif(I, 0, 1), trait_type = factor(sample(c("A","B"), I, TRUE)), row.names = inv) # 1) Probabilistic summaries (use p_is) outP = summarise_invasiveness_invasibility( lambda_is = lambda_is, p_is = p_is, use_probabilistic = TRUE, site_df = site_df, traits_inv = traits_inv, make_plots = FALSE ) names(outP) head(outP$species) # 2) Hard rule summaries (use I{lambda>0}) outH = summarise_invasiveness_invasibility( lambda_is = lambda_is, use_probabilistic = FALSE, site_df = site_df, traits_inv = traits_inv, make_plots = FALSE ) head(outH$site)
Thin wrapper around summarise_invasiveness_invasibility that extracts inputs from a new_invasimapr_fit container, forwards optional arguments, and optionally overlays a boundary layer on the site map. This function is intended as a reporting step after computing invasion fitness and/or establishment probabilities.
summarise_results( fit, boundary_sf = NULL, boundary_params = list(inherit.aes = FALSE, fill = NA, color = "black", size = 0.3), traits_inv = NULL, ... )summarise_results( fit, boundary_sf = NULL, boundary_params = list(inherit.aes = FALSE, fill = NA, color = "black", size = 0.3), traits_inv = NULL, ... )
fit |
An object produced by the invasimapr workflow, containing
|
boundary_sf |
Optional object of class |
boundary_params |
Named list of aesthetics passed to
|
traits_inv |
Optional override of the invader trait table used for
invader-ranked summaries. If |
... |
Additional arguments forwarded to summarise_invasiveness_invasibility. |
Inputs used from the container
fit$fitness$lambda_is: site-by-invader invasion fitness matrix (optional).
fit$prob$p_is: site-by-invader establishment probability matrix (optional).
fit$inputs$site_df: site metadata with coordinates x and y (optional).
fit$inputs$comm_res: resident community matrix for context summaries (optional).
invader trait tables stored in the container, used for invader rankings.
At least one of lambda_is or p_is must be present; otherwise an error
is thrown.
Behaviour
Extract invasion fitness and/or establishment probability matrices from the container.
Select site metadata and an effective invader trait table when available.
Call summarise_invasiveness_invasibility to construct tidy summary tables and plots (site maps, rankings, heatmaps).
If a boundary layer is supplied and plots are available, overlay it
on the site map using geom_sf().
Output layout
The returned container gains a summary element mirroring the structure
returned by summarise_invasiveness_invasibility, typically containing
summary tables, plots, and the arguments used.
The input fit object with an added or updated fit$summary list.
Invisibly returns fit to support piping.
compute_invasion_fitness, compute_establishment_probability, summarise_invasiveness_invasibility, assemble_matrices
## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df ) fit <- learn_sensitivities(fit) fit <- predict_invaders(fit, traits_inv = invader_traits) fit <- summarise_results(fit) fit$summary$plots$site_map ## End(Not run)## Not run: fit <- prepare_inputs( sites = site_df, residents = resident_df, invaders = invader_df, traits = trait_df ) fit <- learn_sensitivities(fit) fit <- predict_invaders(fit, traits_inv = invader_traits) fit <- summarise_results(fit) fit$summary$plots$site_map ## End(Not run)