sim_eDNA_lmer.Rd
Simulate eDNA data
sim_eDNA_lm(
formula,
variable_list,
betas,
sigma_ln_eDNA,
std_curve_alpha,
std_curve_beta,
n_sim = 1L,
upper_Cq = 40,
prob_zero = 0.08,
X = expand.grid(variable_list),
verbose = FALSE,
cache_dir = tools::R_user_dir("artemis", "cache")
)
sim_eDNA_lmer(
formula,
variable_list,
betas,
sigma_ln_eDNA,
sigma_rand,
std_curve_alpha,
std_curve_beta,
n_sim = 1L,
upper_Cq = 40,
prob_zero = 0.08,
X = expand.grid(variable_list),
verbose = FALSE,
cache_dir = tools::R_user_dir("artemis", "cache")
)
a model formula, e.g. y ~ x1 + x2. For sim_eDNA_lmer, random intercepts can also be provided, e.g. ( 1 | rep ).
a named list, with the levels that each variable can take. Please note that the variables listed in the formula, including the response variable, must be present in the variable_list or in the X design matrix. Extra variables, i.e. variables which do not occur in the formula, are ignored.
numeric vector, the beta for each variable in the design matrix
numeric, the measurement error on ln[eDNA].
the alpha value for the formula for converting between log(eDNA concentration) and CQ value
the beta value for the formula for converting between log(eDNA concentration) and CQ value
integer, the number of cases to simulate
numeric, the upper limit on CQ detection. Any value of log(concentration) which would result in a value greater than this limit is instead recorded as the limit.
numeric, between 0 and 1. The probability of seeing a non-detection (i.e., a "zero") via the zero-inflated mechanism. Defaults to 0.08.
optional, a design matrix. By default, this is created from the variable_list using expand.grid(), which creates a balanced design matrix. However, the user can provide their own X as well, in which case the variable_list is ignored. This allows users to provide an unbalanced design matrix.
logical; when TRUE, output from rstan::sampling is written to the console.
the cache directory where pre-compiled models are stored. Defaults to the output of tools::R_user_dir("artemis", "cache").
numeric vector, the standard deviations of the random effects. There must be one sigma per random effect specified in the formula.
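The difference between the default balanced design and a user-supplied unbalanced X, as described for the X argument above, can be sketched in base R. This is a minimal illustration only: the variable levels are hypothetical, and the dropped combination is an arbitrary choice made to show an unbalanced design.

```r
# Hypothetical variable levels, chosen only for illustration.
# The response (Cq) is included because the formula's response must be
# present in variable_list or in X.
variable_list <- list(distance = c(0, 15, 50),
                      volume = c(25, 50),
                      rep = 1:3,
                      Cq = 1)

# Balanced design: every combination of levels appears exactly once.
X_balanced <- expand.grid(variable_list)

# Unbalanced design: drop all rows with distance 50 and volume 25
# (an arbitrary choice for this sketch).
X_unbalanced <- subset(X_balanced, !(distance == 50 & volume == 25))

# Cross-tabulate to see which cells are filled.
table(X_unbalanced$distance, X_unbalanced$volume)
```

A data.frame such as X_unbalanced is the kind of object that could be passed as X, in which case variable_list is ignored.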
S4 object of class "eDNA_simulation_lm/lmer" with the following slots:
the simulated log(concentration)
the simulated CQ values, including the measurement error
the formula for the simulation
named list, the variable levels used for the simulation
numeric vector, the betas for the simulation
data.frame, the design matrix
the alpha for the std curve conversion
the beta for the std curve conversion
the upper limit for CQ
These functions allow for computationally efficient simulation of Cq values from a hypothetical eDNA sampling experiment via a series of effect sizes (betas) on a number of predictor or variable levels (variable_levels). The mechanism for this model is described in detail in the artemis "Getting Started" vignette.
The simulation functions call specialized functions which are written in Stan and compiled for speed. This also allows the simulation functions and the modeling functions to reflect the same process at the code level.
Users will sometimes find that the simulated response (i.e. Cq values) produced by these functions is not similar to the data expected from a sampling experiment. This suggests a mismatch between the assumptions of the model and the data-generating process in the field. In these circumstances, we suggest:
Check that the betas provided are the effect sizes of the predictors on log[eDNA concentration], and not on the Cq values.
Check that the variable levels provided are representative of real-world circumstances. For example, a sample volume of 0 ml is not possible.
Verify the values for the standard curve alpha and beta. These are specific to each calibration for the lab, so it is important that you use the same conversion between Cq values and log[eDNA concentration] as the comparison data.
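The checks above can be made concrete with a rough base-R sketch of the conversion between log[eDNA concentration] and Cq. This is not the compiled Stan code used by the package: it assumes only the standard curve form Cq = alpha + beta * log(concentration) shown in the printed output, treats the measurement error as normal noise on the log scale (an assumption), and leaves out the zero-inflation step. All input values are hypothetical.

```r
set.seed(1)

# Hypothetical design and parameters, mirroring the example values.
X <- expand.grid(distance = c(0, 15, 50), volume = c(25, 50))
betas <- c(intercept = -10.6, distance = -0.05, volume = 0.1)
std_curve_alpha <- 21.2
std_curve_beta <- -1.5
upper_Cq <- 40
sigma_ln_eDNA <- 1

# Linear predictor on the ln[eDNA] scale: the betas act here, not on Cq.
mm <- model.matrix(~ distance + volume, data = X)
ln_conc <- as.vector(mm %*% betas)

# Assumption for this sketch: measurement error is normal on the ln scale.
ln_conc_obs <- ln_conc + rnorm(length(ln_conc), sd = sigma_ln_eDNA)

# Standard curve: Cq = alpha + beta * ln(concentration).
Cq <- std_curve_alpha + std_curve_beta * ln_conc_obs

# Values past the detection limit are recorded as the limit.
Cq <- pmin(Cq, upper_Cq)
```

Under the same curve, an intercept of -15 gives 21.2 + (-1.5)(-15) = 43.7, which exceeds upper_Cq = 40 and is therefore recorded as 40, consistent with the intercept-only example. Running the arithmetic by hand in this way is a quick check that betas are on the log-concentration scale and that the alpha and beta values match the lab's calibration.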
# \donttest{
## Includes extra variables
vars = list(Intercept = -10.6,
distance = c(0, 15, 50),
volume = c(25, 50),
biomass = 100,
alive = 1,
tech_rep = 1:10,
rep = 1:3, Cq = 1)
## Intercept only
ans = sim_eDNA_lm(Cq ~ 1, vars,
betas = c(intercept = -15),
sigma_ln_eDNA = 1e-5,
std_curve_alpha = 21.2, std_curve_beta = -1.5)
#> Model executable is up to date!
print(ans)
#>
#> formula: Cq ~ 1
#> <environment: 0x55743d947208>
#>
#> Standard curve parameters: Cq = alpha + beta * log(concentration)
#> Standard curve alpha = 21.2
#> Standard curve beta = -1.5
#>
#> ln concentration:
#> variable level 2.5% 50% 97.5% mean p_detect
#> V1 1 -15 -15 -15 -15 NA
#>
#> simulated Cq:
#> variable level 2.5% 50% 97.5% mean p_detect
#> V1 1 40 40 40 40 0
ans = sim_eDNA_lm(Cq ~ distance + volume, vars,
betas = c(intercept = -10.6, distance = -0.05, volume = 0.1),
sigma_ln_eDNA = 1, std_curve_alpha = 21.2, std_curve_beta = -1.5)
#> Model executable is up to date!
# }