--- title: "Stratification" output: rmarkdown::html_vignette description: > Learn how to use strat* functions. vignette: > %\VignetteIndexEntry{Stratification} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{css,echo=FALSE} .infobox { padding: 1em 1em 1em 4em; margin-bottom: 10px; border: 2px solid #43602d; border-radius: 10px; background: #738e41 5px center/3em no-repeat; } .caution { background-image: url("https://tgoodbody.github.io/sgsR/logo.png"); } ``` ```{r,warning=F,message=F,echo=FALSE,results=FALSE} library(sgsR) library(terra) library(sf) #--- Load mraster and access files ---# r <- system.file("extdata", "mraster.tif", package = "sgsR") #--- load the mraster using the terra package ---# mraster <- terra::rast(r) a <- system.file("extdata", "access.shp", package = "sgsR") #--- load the access vector using the sf package ---# access <- sf::st_read(a, quiet = TRUE) #--- apply quantiles algorithm to metrics raster ---# sraster <- strat_quantiles( mraster = mraster$zq90, # use mraster as input for sampling nStrata = 4 ) # algorithm will produce 4 strata #--- apply stratified sampling algorithm ---# existing <- sample_strat( sraster = sraster, # use mraster as input for sampling nSamp = 200, # request 200 samples be taken mindist = 100 ) # define that samples must be 100 m apart #--- algorithm table ---# a <- c("`strat_kmeans()`", "`strat_quantiles()`", "`strat_breaks()`", "`strat_poly()`", "`strat_map()`") d <- c("kmeans", "Quantiles", "User-defined breaks", "Polygons", "Maps (combines) `srasters`") s <- c("Unsupervised", "Either", "Supervised", "Supervised", "Unsupervised") urls <- c("#kmeans", "#quantiles", "#breaks", "#poly", "#map") df <- data.frame(Algorithm = a, Description = d, Approach = s) df$Algorithm <- paste0("[", df$Algorithm, "](", urls, ")") ``` Fundamental to many structurally guided sampling approaches is the use of stratification methods that allow for more effective and representative sampling protocols. It is important to note that the data sets being used as inputs are considered to be populations. Currently, there are 5 functions associated with the `strat` verb in the `sgsR` package: ```{r echo=FALSE} knitr::kable(df, align = "c") ``` ## `strat_kmeans` {#kmeans .unnumbered} `strat_kmeans()` uses kmeans clustering to produce an `sraster` output. ```{r,warning=F,message=F} #--- perform stratification using k-means ---# strat_kmeans( mraster = mraster, # input nStrata = 5 ) # algorithm will produce 4 strata ``` ::: {.infobox .caution data-latex="{caution}"} **TIP!** `plot = FALSE` is the default for all functions. `plot = TRUE` will visualize raster and vector ouputs. ::: ```{r,warning=F,message=F} strat_kmeans( mraster = mraster, # input nStrata = 10, # algorithm will produce 10 strata iter = 1000, # set minimum number of iterations to determine kmeans centers algorithm = "MacQueen", # use MacQueen algorithm plot = TRUE ) # plot output ``` ## `strat_quantiles` {#quantiles .unnumbered} The `strat_quantiles()` algorithm divides data into equally sized strata (`nStrata`). Similar to `strat_breaks()`, this function is vectorized to allow users to input any number of metrics to stratify (`mraster`) so long as `nStrata` is a list containing a matching number of numeric objects. `nStrata` can be either `nStrata` can be either a scalar integer representing the number of desired output strata, or a numeric vector of probabilities between 0-1 demarcating quantile break points. The `nStrata` list can be a mix of these (e.g. `nStrata = list(c(0.1,0.8,1), 4, 9)` where `mraster` would have 3 layers) to allow users to define both explicit quantile breaks or a desired strata number that is converted to quantiles breaks internally. Specifying `map = TRUE` will combine (map) stratifications of all input `mraster` layers to produce a combined stratified output. ```{r,warning=F,message=F} #--- perform quantiles stratification ---# strat_quantiles( mraster = mraster$zq90, nStrata = 6, plot = TRUE ) #--- vectorized ---# strat_quantiles( mraster = mraster[[1:2]], # two metric layers nStrata = list(c(0.2, 0.4, 0.8), 3), # list with two objects - 1 probability breaks, 1 scalar integer plot = TRUE, # plot output srasters map = TRUE ) # combine stratifications to a mapped output ``` ## `strat_breaks` {#breaks .unnumbered} `strat_breaks()` stratifies data based on user-defined breaks in `mraster`. This algorithm is vectorized. The user can provide an `mraster` with as many layers as they wish as long as the `breaks` parameters is a list of equal length comprised of numeric vectors. Like `strat_quantiles()` this function has the `map` parameter to combine input stratifications to generate a mapped output. ```{r,warning=F,message=F} #--- perform stratification using user-defined breaks ---# #--- define breaks for metric ---# br.pz2 <- c(20, 40, 60, 80) br.pz2 #--- perform stratification using user-defined breaks ---# #--- define breaks for metric ---# br.zq90 <- c(3, 5, 11, 18) br.zq90 ``` Once the breaks are created, we can use them as input into the `strat_breaks()` function using the `breaks` parameter. ```{r,warning=F,message=F} #--- stratify on 1 metric only ---# strat_breaks( mraster = mraster$pzabove2, # single raster breaks = br.pz2, # single set of breaks plot = TRUE ) # plot output ``` ```{r,warning=F,message=F} #--- vectorized ---# strat_breaks( mraster = mraster[[1:2]], # two metrics breaks = list(br.zq90, br.pz2), # list of two breaks vectors map = TRUE, # map final output plot = TRUE ) # plot outputs ``` ## `strat_poly` {#poly .unnumbered} Forest inventories with polygon coverages summarizing forest attributes such as species, management type, or photo-interpreted estimates of volume can be stratified using `strat_poly()`. ::: {.infobox .caution data-latex="{caution}"} **TIP!** Users may wish to stratify based on categorical or empirical variables that are not available through raster data (e.g. species from forest inventory polygons). ::: Users define the input `poly` and its associated `attribute`. A `raster` layer must be defined to guide the spatial extent and resolution for the output stratification polygon. Based on the vector or list of `features`, stratification is applied and the polygon is rasterized into its appropriate strata. ```{r} #--- load in polygon coverage ---# poly <- system.file("extdata", "inventory_polygons.shp", package = "sgsR") fri <- sf::st_read(poly) #--- specify polygon attribute to stratify ---# attribute <- "NUTRIENTS" #--- specify features within attribute & how they should be grouped ---# #--- as a single vector ---# features <- c("poor", "rich", "medium") ``` In our example, `attribute = "NUTRIENTS"` and features within, `c("poor", "rich", "medium")`, define the 3 desired strata. ```{r} #--- stratify polygon coverage ---# srasterpoly <- strat_poly( poly = fri, # input polygon attribute = attribute, # attribute to stratify by features = features, # features within attribute raster = sraster, # raster to define extent and resolution for output plot = TRUE ) # plot output ``` `features` can be grouped. In our example below, `rich` and `medium` features are combined into a single strata, while `low` is left in isolation. The 2 vectors are specified into a list, which will result in the output of 2 strata (low & rich/medium). ```{r} #--- or as multiple lists ---# g1 <- "poor" g2 <- c("rich", "medium") features <- list(g1, g2) strat_poly( poly = fri, attribute = attribute, features = features, raster = sraster, plot = TRUE, details = TRUE ) ``` ::: {.infobox .caution data-latex="{caution}"} **`details`** `details` returns the output `outRaster`, the stratification `$lookUp` table, and the polygon (`$poly`) used to drive the stratification based on attributes and features specified by the users. ::: ## `strat_map` {#map .unnumbered} Users may wish to pair stratifications. `strat_map()` facilitates vectorized mapping of `sraster` layers to generate a unique mapped strata output based on stratum pairings. The `stack` parameter will output a multilayer `sraster` with the inputs (`strata_1, strata_2 ...`) and mapped output (`strata`). This facilitates the user to generate stratifications detailing quantitative and qualitative measures such as structure by species, or multiple qualitative measures such as species by management type. ```{r} #--- stack srasters together ---# srasters <- c(srasterpoly, sraster) plot(srasters) ``` ```{r} #--- map srasters ---# strat_map( sraster = srasters, # two layer sraster plot = TRUE ) ``` The convention for the numeric value of the output strata is the concatenation (merging) of `sraster` layers. Check `$lookUP` for a clear depiction of this step. ```{r} strat_map( sraster = srasters, # input with 2 sraster layers stack = TRUE, # output stacked input (strata_1, strata_2) and output (strata) layers details = TRUE, # provide additional details plot = TRUE ) # plot output ```