---
title: "Invasion fitness synthesis"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Invasion fitness synthesis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  fig.path = "figures/4-",
  out.width = "100%",
  echo     = TRUE,
  message  = FALSE,
  warning  = FALSE
)

set.seed(42)

# Set BCUBED_DOCS_BUILD=true in CI / docs conversion to skip heavy chunks if needed.
BCUBED_DOCS_BUILD = identical(Sys.getenv("BCUBED_DOCS_BUILD"), "true")
```

```{r read_files, include=FALSE}
# At the start of the next vignette, explicitly load files from previous vignette:
inputs <- readRDS(
  system.file("extdata", "inputs_vignettes.rds", package = "invasimapr")
)

fit = inputs$fit
site_df = inputs$site_df
fitness_df = inputs$fitness_df
lambda_long = inputs$lambda_long
df_inv = inputs$df_inv
traits_inv = inputs$traits_inv
```

---

## Install and load `invasimapr` and other core packages

Install from GitHub (recommended for development) or CRAN (if available) as before. 

```{r install-invasimapr, include=FALSE}
# install.packages("remotes")
# remotes::install_github("b-cubed-eu/invasimapr")

# If invasimapr is on CRAN:
# install.packages("invasimapr")

pkgs = c("invasimapr", "ggplot2", "terra", "dplyr", "tidyr")

if (!all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  knitr::knit_exit()
}

deps = c(
  # Core workflow
  "dplyr","tidyr","purrr","tibble","stringr",
  "ggplot2",
  # Modelling / stats
  "glmmTMB","performance","Matrix","glmnet","MASS","vegan","cluster","matrixStats",
  # Spatial
  "sf",
  # Viz helpers used in later steps
  "viridis","pheatmap","factoextra"
)

is_installed = vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing = deps[!is_installed]

cat("### Dependency check\n\n")
if (length(missing) == 0) {
  cat("- All recommended packages are available.\n")
} else {
  cat("- Missing packages (install before running the full workflow):\n")
  cat(paste0("  - ", missing, collapse = "\n"), "\n")
}
```

---

## Synthesising invasion-fitness insights

This section condenses the high-dimensional output into **system-level distributions**, **top/bottom ranks**, **trait correlates**, and **key-invader maps**. The goal is a compact dashboard for reporting and decision-support.

:hourglass_flowing_sand: **Step-by-step** in this section:

1.  Summarise the distribution of $\lambda_{is}$,
2.  Identify top/bottom invaders and sites by mean fitness,
3.  Link invader traits to fitness via correlations/means, and
4.  Visualise spatial patterns for the most and least concerning invaders.

---

### Distribution of invasion fitness values

The histogram of all $\lambda_{is}$ values gives a system-wide view of how often establishment conditions are favourable vs. exclusionary.

:bulb: **Interpretation**: Right-skew suggests many marginal combinations but a meaningful tail of high-opportunity cases; left-skew suggests strong overall resistance.

:warning: **Checks/tips**: Inspect tails; compare across specification options (A-E) to test robustness.

```{r fitness-dist, fig.alt="Distribution of invasion fitness values"}
ggplot2::ggplot(fitness_df, ggplot2::aes(x = lambda_mean, fill = ggplot2::after_stat(x))) +
  ggplot2::geom_histogram(bins = 40, color = "grey30") +
  ggplot2::scale_fill_viridis_c(option = "magma", guide = "none") +
  ggplot2::labs(title = "Distribution of invasion fitness values (all invader × site)",
                x = expression("Invasion fitness " * lambda[i*s]),
                y = "Frequency") +
  ggplot2::theme_minimal(base_size = 12)
```

> :chart_with_upwards_trend: **Figure 29**: The histogram shows the **distribution of invasion fitness values (**$\lambda$ᵢₛ) across all invader--site combinations. Most $\lambda$ values fall between −3 and 0, with a strong concentration around −2, indicating that the majority of potential invasions are constrained by either abiotic conditions or biotic resistance. Only a small fraction of cases approach or exceed $\lambda$ \> 0 (to the right of the origin), highlighting that successful establishment opportunities are relatively rare. The long left tail (\< −4) represents invader--site pairings where conditions are strongly unfavorable.

:bar_chart: Overall, the figure emphasizes that the **invasion landscape is dominated by resistance** rather than opportunity, but with occasional site--invader matches that may permit establishment. **This pattern is consistent with theoretical expectations of community assembly, where most introductions fail but a minority may find niches that allow them to persist.**

---

### Top and bottom invaders and sites

Ranking by mean $\lambda_{is}$ highlights consistently permissive sites and consistently risky invaders.

:bulb: **Interpretation**: Top invaders are candidates for vigilant monitoring; top sites are **hotspots** for surveillance or preventative management.

:warning: **Checks/tips**: Report uncertainty (e.g., bootstrapped means) when used for policy.

```{r top-bottom, echo=FALSE}
# Top/bottom invaders
invader_means = lambda_long |>
  dplyr::group_by(invader) |>
  dplyr::summarise(mean_lambda = mean(lambda, na.rm = TRUE), .groups = "drop") |>
  dplyr::arrange(dplyr::desc(mean_lambda))
top3_inv    = head(invader_means, 3)
bottom3_inv = tail(invader_means, 3)

# Top/bottom sites
site_means = df_inv |>
  dplyr::group_by(site) |>
  dplyr::summarise(mean_lambda = mean(lambda, na.rm = TRUE), .groups = "drop") |>
  dplyr::arrange(dplyr::desc(mean_lambda))
top3_site    = head(site_means, 3)
bottom3_site = tail(site_means, 3)

cat("==== Top 3 Invaders by Mean Invasion Fitness ====\n",
    paste0(seq_len(nrow(top3_inv)), ". ", top3_inv$invader, ": ",
           round(top3_inv$mean_lambda, 3), collapse = "\n"), "\n\n",
    "==== Bottom 3 Invaders by Mean Invasion Fitness ====\n",
    paste0(seq_len(nrow(bottom3_inv)), ". ", bottom3_inv$invader, ": ",
           round(bottom3_inv$mean_lambda, 3), collapse = "\n"), "\n\n",
    "==== Top 3 Sites by Mean Invasion Fitness ====\n",
    paste0(seq_len(nrow(top3_site)), ". ", top3_site$site, ": ",
           round(top3_site$mean_lambda, 3), collapse = "\n"), "\n\n",
    "==== Bottom 3 Sites by Mean Invasion Fitness ====\n",
    paste0(seq_len(nrow(bottom3_site)), ". ", bottom3_site$site, ": ",
           round(bottom3_site$mean_lambda, 3), collapse = "\n"), "\n", sep = "")
```

---

### Functional correlates of invasion success

We relate invader traits to mean $\lambda_{is}$ to identify functional profiles linked to high or low establishment.

:bulb: **Interpretation**: Strong positive (or negative) correlations for continuous traits, and high/low trait-level means for categorical traits, suggest **mechanistic drivers** worth testing experimentally.

:warning: **Checks/tips**: Standardise continuous traits; ensure categorical levels are well populated; treat results as exploratory unless validated.

```{r top-traits, echo=FALSE}
cont_cols = names(traits_inv)[grepl("^trait_cont", names(traits_inv))]
cat_cols  = names(traits_inv)[grepl("^trait_cat",  names(traits_inv))]
trait_cols = c(cont_cols, cat_cols)

inv_mean = df_inv |>
  dplyr::group_by(invader) |>
  dplyr::summarise(mean_lambda = mean(lambda, na.rm = TRUE), .groups = "drop") |>
  dplyr::rename(species = invader)

inv_trait_lambda = traits_inv |>
  tibble::rownames_to_column("species") |>
  dplyr::semi_join(inv_mean, by = "species") |>
  dplyr::left_join(inv_mean, by = "species") |>
  dplyr::mutate(
    dplyr::across(dplyr::matches("^trait_cat"),  ~ as.factor(.)),
    dplyr::across(dplyr::matches("^trait_cont"), ~ as.numeric(.))
  )

trait_cor = sapply(cont_cols, function(tr) {
  stats::cor(inv_trait_lambda[[tr]], inv_trait_lambda$mean_lambda, use = "pairwise.complete.obs")
})
trait_cor = sort(trait_cor, decreasing = TRUE)

cat("==== Top 3 Continuous Traits by Correlation with Mean Invasion Fitness ====\n")
for (nm in names(head(trait_cor, 3))) cat(sprintf("%s: %.3f\n", nm, trait_cor[nm]))
cat("\n==== Bottom 3 Continuous Traits by Correlation with Mean Invasion Fitness ====\n")
for (nm in names(tail(trait_cor, 3))) cat(sprintf("%s: %.3f\n", nm, trait_cor[nm])); cat("\n")

trait_cat_means = lapply(cat_cols, function(tr) tapply(inv_trait_lambda$mean_lambda,
                                                       inv_trait_lambda[[tr]], mean))
names(trait_cat_means) = cat_cols

cat("==== Categorical Traits: Top Value per Trait by Mean Invasion Fitness ====\n")
for (tr in cat_cols) {
  levs = trait_cat_means[[tr]]
  cat(sprintf("%s: %s (%.2f)\n", tr, names(levs)[which.max(levs)], max(levs)))
}
cat("\n")
```

---

### Faceted maps for key invaders

We visualise spatial patterns for the **top** and **bottom** invaders identified above to link traits, environments, and community context.

:bulb: **Interpretation**: Hotspots for top invaders mark priority regions; uniformly low values for bottom invaders highlight robust sites or mismatched niches.

:warning: **Checks/tips**: Use a centred colour scale; show boundaries for orientation; keep facet order consistent with ranks.

---

:hourglass_flowing_sand: **Step-by-step** in this section: 1. select invaders &#x2794; 2. compute z-MAD standardized fitness &#x2794; 3. build facet labels ordered by mean $\lambda$ &#x2794; 4. map with a centered palette &#x2794; 5. show relative advantage vs site mean &#x2794; 6. optionally overlay contours ($\lambda>0$); top decile).

#### Select key invaders (top & bottom)

```{r invader-key}
# Objects assumed: df_inv (site, invader, x, y, lambda), rsa (sf boundary)
# Also assumed: top3_inv, bottom3_inv with columns 'invader'

# Select key invaders
key_invaders = c(top3_inv$invader, bottom3_inv$invader) |> unique()

lambda_key = df_inv |>
  dplyr::filter(invader %in% key_invaders) |>
  dplyr::mutate(invader = factor(invader, levels = key_invaders))
dim(lambda_key)
```


#### Compute z-MAD standardization of ($\lambda$)

```{r invader-zMAD}
# z-MAD for lambda (avoid divide-by-zero)
med = median(lambda_key$lambda, na.rm = TRUE)
mad_ = mad(lambda_key$lambda, na.rm = TRUE)
if (!is.finite(mad_) || mad_ == 0) mad_ = sd(lambda_key$lambda, na.rm = TRUE)

lambda_key = lambda_key |>
  dplyr::mutate(lambda_zmad = (lambda - med) / mad_)
dim(lambda_key)
```


#### Order facets by mean ($\lambda$) and build labels

```{r invader-facet_mean}
# Order facets by mean lambda and create labels
lab_means = lambda_key |>
  dplyr::group_by(.data$invader) |>
  dplyr::summarise(mu = mean(.data$lambda, na.rm = TRUE), .groups = "drop") |>
  dplyr::arrange(dplyr::desc(.data$mu)) |>
  dplyr::mutate(label = sprintf("%s (mean \u03BB = %.2f)", .data$invader, .data$mu))

lab_levels = lab_means$label

lambda_key2 <- lambda_key |>
  dplyr::left_join(lab_means |> dplyr::select(invader, label), by = "invader", suffix = c(".x", ".y")) |>
  dplyr::mutate(label = dplyr::coalesce(label.y, label.x)) |>   # prefer joined label
  dplyr::select(-label.x, -label.y) |>
  dplyr::mutate(label = factor(label, levels = lab_levels))
names(lambda_key2)

```


#### Ordered faceted maps (centered palette)

```{r invader-facet_order, fig.alt="Spatial distribution of invasion fitness"}
# Load the national boundary
library(ggplot2)
rsa = sf::st_read(system.file("extdata", "rsa.shp", package = "invasimapr"))

# Diverging palette + clearer midpoint
ggplot2::ggplot(lambda_key2, ggplot2::aes(x, y, fill = lambda_zmad)) +
  ggplot2::geom_tile() +
  ggplot2::geom_sf(data = rsa, inherit.aes = FALSE, fill = NA, color = "black", linewidth = 0.5) +
  ggplot2::facet_wrap(~ label, ncol = 3) +
  scico::scale_fill_scico(
    palette = "vik", limits = c(-4, 4), oob = scales::squish,
    name = "Lambda (z-MAD)", midpoint = 0
  ) +
  ggplot2::labs(title = "Spatial invasion fitness (facets ordered by mean \u03BB)",
       x = "Longitude", y = "Latitude") +
  theme_minimal(12) +
  theme(panel.grid = ggplot2::element_blank())
```

> :chart_with_upwards_trend: **Figure 30**: This figure compares the **spatial distribution of invasion fitness** ($\lambda$, expressed as `z-MAD`) for the three highest- and three lowest-ranked invaders, ordered by their mean $\lambda$ values. Across invaders, spatial variation is structured by a common underlying pattern: inland and northeastern regions generally exhibit higher-than-expected invasion fitness (warm tones), whereas coastal and southwestern areas are consistently characterized by lower fitness (cool tones). The top-ranked invaders (inv5, inv9, inv10) show broader and more contiguous zones of positive $\lambda$, indicating relatively favourable conditions across multiple regions. In contrast, the bottom-ranked invaders (inv8, inv2, inv1) display more extensive and spatially consistent negative $\lambda$ values, with only small and fragmented pockets of opportunity.


#### Alternative gradient (viridis/magma)

```{r invader-facet_gradient, fig.alt="Spatial distribution of invasion fitness (magma colour)"}
ggplot2::ggplot(lambda_key, ggplot2::aes(x = x, y = y, fill = lambda_zmad)) +
  ggplot2::geom_tile() +
  ggplot2::geom_sf(data = rsa, inherit.aes = FALSE, fill = NA, color = "black", linewidth = 0.5) +
  ggplot2::facet_wrap(~ invader, ncol = 3) +
  ggplot2::scale_fill_gradientn(
    colours  = viridisLite::magma(256, direction = -1),
    rescaler = function(x, ...) scales::rescale_mid(x, mid = 0),
    limits   = c(-4, 4), oob = scales::squish,
    name     = "Lambda (z-MAD)"
  ) +
  labs(title = "Spatial invasion fitness for top- and bottom-ranked invaders",
       x = "Longitude", y = "Latitude") +
  theme_minimal(12) +
  theme(panel.grid = ggplot2::element_blank())
```

> :chart_with_upwards_trend: **Figure 31**: Same maps with an alternative diverging look using `magma` centered at 0.


#### Invader advantage relative to **site** mean

```{r invader-facet_rel, fig.alt="Deviations from site mean"}
# Emphasize invader-specific advantage relative to the same sites
site_means = df_inv |>
  dplyr::group_by(site) |>
  dplyr::summarise(site_mean = mean(lambda, na.rm = TRUE), .groups = "drop")

lambda_rel = lambda_key |>
  dplyr::left_join(site_means, by = "site") |>
  dplyr::mutate(delta_lambda = lambda - site_mean)

ggplot2::ggplot(lambda_rel, aes(x, y, fill = delta_lambda)) +
  ggplot2::geom_tile() +
  ggplot2::geom_sf(data = rsa, inherit.aes = FALSE, fill = NA, color = "black", linewidth = 0.5) +
  ggplot2::facet_wrap(~ invader, ncol = 3) +
  ggplot2::scale_fill_gradient2(
    name = expression(Delta * lambda),
    low = "#3b4cc0", mid = "white", high = "#b40426", midpoint = 0
  ) +
  labs(title = "Invader advantage relative to site mean (\u0394\u03BB)",
       x = "Longitude", y = "Latitude") +
  theme_minimal(12) +
  theme(panel.grid = ggplot2::element_blank())
```

> :chart_with_upwards_trend: **Figure 32**: Deviations from each site’s mean ($\lambda$) (positive = invader outperforms the typical invader at that site).


:bar_chart: Together, these contrasts highlight that **invader performance is shaped by both shared environmental templates and invader-specific deviations**, with higher-ranked invaders better aligned with spatial hotspots of opportunity.

:sparkles: **Overall importance**: This compact set---distribution, ranks, trait links, and maps---forms a practical **reporting dashboard** for stakeholders.

:bulb: **Summary.** You now have system-level summaries, interpretable ranks, functional signals, and spatial views---ready for communication and action.

:warning: **Checks/tips**: 
* Use a **centered** color scale so zero is visually neutral.
* Keep facet **order** consistent (use the `label` factor).
* If thresholds should be **invader-specific**, compute per-invader `thr_i` and draw contours per group (loop or `group_split()`), since `breaks` can’t vary within a single `geom_contour()` call.
* For publication, export figures with fixed dimensions and fonts (e.g., `ggsave(..., width=7, height=6, dpi=300)`).


```{r save-state, include=FALSE}
# # Save state for downstream vignettes
# # Each vignette runs in a clean R session. We therefore **persist state explicitly**.
# dir.create("_data", showWarnings = FALSE, recursive = TRUE)
# 
# saveRDS(
#   list(
#     fit = fit,
#     site_df = site_df,
#     fitness_df = fitness_df
#   ),
#   # file = "_data/fit_synthesis_inputs.rds",
#   file = "_data/inputs_4.rds",
#   compress = "xz"
# )
```

---

## Session information

```{r session-info}
sessionInfo()
```