Sample occurrence data from a virtual prediction surface
Source: R/sample_virtual_data.R
Samples n_occ virtual occurrence points from a prediction surface
generated by predict() on a nicheR_ellipsoid object. Unlike
sample_data(), this function only accepts data frame input and is
designed for purely virtual (non-spatial) workflows where no raster or
geographic coordinates are involved.
Usage
sample_virtual_data(
n_occ,
object,
virtual_prediction = NULL,
prediction_layer = NULL,
sampling = "centroid",
method = "suitability",
seed = 1,
verbose = TRUE,
strict = NULL
)
Arguments
- n_occ
Integer. Number of occurrence points to sample.
- object
A nicheR_ellipsoid object. Used for context and validation but not directly for sampling; prediction values are taken from virtual_prediction.
- virtual_prediction
A data frame containing the prediction surface to sample from, typically the output of predict() on a nicheR_ellipsoid object.
- prediction_layer
Character. Name of the column to use as prediction values. Required when virtual_prediction contains multiple prediction columns.
- sampling
Character. Sampling strategy. One of "centroid" (default), "edge", or "random". Controls where within the niche points are preferentially drawn from.
- method
Character. Weighting method. One of "suitability" (default) or "mahalanobis". Must match the type of values in prediction_layer: suitability values must be in [0, 1], Mahalanobis values must be non-negative.
- seed
Integer. Random seed for reproducibility. Default is 1.
- verbose
Logical. If TRUE (default), prints progress messages.
- strict
Logical or NULL. If TRUE, removes NA and zero-valued rows before sampling (recommended with truncated prediction layers). If NULL (default), auto-detected from the layer name and the proportion of zeros and NAs in the prediction values.
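The constraint on method (suitability values in [0, 1], Mahalanobis values non-negative) can be illustrated with a small base R check. This is only a sketch: check_prediction_values() is a hypothetical helper, not a nicheR function, and the package's own validation may differ, for example by raising an error instead of returning a logical.

```r
# Hypothetical helper (not part of nicheR): checks that prediction values
# are consistent with the chosen weighting method.
check_prediction_values <- function(values,
                                    method = c("suitability", "mahalanobis")) {
  method <- match.arg(method)
  v <- values[!is.na(values)]  # NAs are skipped here; `strict` governs them
  if (method == "suitability") {
    all(v >= 0 & v <= 1)       # suitability must lie in [0, 1]
  } else {
    all(v >= 0)                # Mahalanobis distances must be non-negative
  }
}

check_prediction_values(c(0.1, 0.8, NA, 1), "suitability")  # TRUE
check_prediction_values(c(0.5, 3.2, 7.9), "suitability")    # FALSE
check_prediction_values(c(0.5, 3.2, 7.9), "mahalanobis")    # TRUE
```

A check of this kind helps catch the case where a Mahalanobis column is passed with method = "suitability" by mistake.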
Value
A data frame of sampled occurrence points with the same columns as
virtual_prediction, minus the internal pred column.
Details
The sampling and method arguments interact to define sampling
weights in the same way as sample_data():
- sampling = "centroid", method = "suitability": weights proportional to suitability, so higher near the niche center.
- sampling = "edge", method = "suitability": weights proportional to \(1 - \text{suitability}\), so higher near the niche boundary.
- sampling = "centroid", method = "mahalanobis": weights inversely proportional to Mahalanobis distance, so higher near the centroid.
- sampling = "edge", method = "mahalanobis": weights proportional to Mahalanobis distance, so higher near the boundary.
- sampling = "random": equal weights regardless of method.
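The weighting rules above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: sampling_weights() is a hypothetical helper, the small eps guard against division by zero is an assumption, the column names bio1 and bio12 are made up, and drawing without replacement is also an assumption.

```r
# Hypothetical helper (not part of nicheR): converts prediction values into
# sampling weights for each sampling/method combination.
sampling_weights <- function(pred,
                             sampling = c("centroid", "edge", "random"),
                             method = c("suitability", "mahalanobis")) {
  sampling <- match.arg(sampling)
  method <- match.arg(method)
  if (sampling == "random") {
    return(rep(1, length(pred)))  # equal weights, method is ignored
  }
  if (method == "suitability") {
    if (sampling == "centroid") pred else 1 - pred
  } else {
    eps <- 1e-9                   # assumption: avoids 1 / 0 at the centroid
    if (sampling == "centroid") 1 / (pred + eps) else pred
  }
}

# Toy prediction surface: two made-up environmental variables plus suitability
virtual_prediction <- data.frame(
  bio1 = c(10, 12, 14, 16, 18),
  bio12 = c(800, 900, 1000, 1100, 1200),
  suitability = c(0.1, 0.5, 1.0, 0.5, 0.1)
)

w <- sampling_weights(virtual_prediction$suitability, "centroid", "suitability")

set.seed(1)                       # fixed seed, mirroring the `seed` argument
idx <- sample(nrow(virtual_prediction), size = 3, replace = FALSE, prob = w)
virtual_prediction[idx, ]
```

Switching to sampling = "edge" only changes the weight vector; the draw itself stays the same.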
Auto-detection of strict follows the same logic as
sample_data(): it is set to TRUE if the layer name contains
"trunc" or if the proportion of zeros or NAs exceeds 25%.
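The auto-detection rule can be written out as a short base R sketch. Here detect_strict() is a hypothetical helper, not part of nicheR. Two details are assumptions: zeros and NAs are counted together as a single proportion, and the match on "trunc" is case-sensitive.

```r
# Hypothetical helper (not part of nicheR): mimics the documented rule for
# auto-detecting `strict`.
detect_strict <- function(layer_name, values, threshold = 0.25) {
  has_trunc <- grepl("trunc", layer_name, fixed = TRUE)
  # Assumption: zeros and NAs are pooled into one proportion
  prop_empty <- mean(is.na(values) | values == 0)
  has_trunc || isTRUE(prop_empty > threshold)
}

pred <- c(0.9, 0, NA, 0.4, 0, 0.7, 0.2, 0.6)

detect_strict("suitability", pred)                 # TRUE: 3 of 8 are 0 or NA
detect_strict("suitability_trunc", c(0.9, 0.4))    # TRUE: name contains "trunc"
detect_strict("suitability", c(0.9, 0.4, 0.7, 0))  # FALSE: 0.25 is not > 0.25

# With strict = TRUE, NA and zero-valued rows are dropped before sampling
keep <- !is.na(pred) & pred != 0
pred[keep]
```

Passing strict = TRUE or strict = FALSE explicitly bypasses this detection.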
See also
sample_data() for spatial sampling from raster or
data frame prediction surfaces; sample_biased_data() for
bias-weighted sampling.