R: Empirical Variance of H-T Density Estimate

empirical.varD {secr}

R Documentation

Empirical Variance of H-T Density Estimate

Description

Compute Horvitz-Thompson-like estimate of population density from a previously fitted spatial detection model, and estimate its sampling variance using the empirical spatial variance of the number observed in replicate sampling units. Wrapper functions are provided for several different scenarios, but all ultimately call derivednj. The function derived also computes Horvitz-Thompson-like estimates, but it assumes a Poisson or binomial distribution of total number when computing the sampling variance.

Usage


derivednj ( nj, esa, se.esa = NULL, method = c("SRS", "R2", "R3", "local",
    "poisson", "binomial"), xy = NULL, alpha = 0.05, loginterval = TRUE, 
    area = NULL, independent.esa = FALSE )

derivedMash ( object, sessnum = NULL,  method = c("SRS", "local"),
    alpha = 0.05, loginterval = TRUE)

derivedCluster ( object, method = c("SRS", "R2", "R3", "local", "poisson", "binomial"),
    alpha = 0.05, loginterval = TRUE)

derivedSession ( object,  method = c("SRS", "R2", "R3", "local", "poisson", "binomial"), 
    xy = NULL, alpha = 0.05, loginterval = TRUE, area = NULL, independent.esa = FALSE )

derivedExternal ( object, sessnum = NULL, nj, cluster, buffer = 100,
    mask = NULL, noccasions = NULL,  method = c("SRS", "local"), xy = NULL,
    alpha = 0.05, loginterval = TRUE)

Arguments

`object`	fitted secr model
`nj`	vector of number observed in each sampling unit (cluster)
`esa`	estimate of effective sampling area ( $\hat{a}$ )
`se.esa`	estimated standard error of effective sampling area ( $\widehat{SE}(\hat{a})$ )
`method`	character string ‘SRS’ or ‘local’
`xy`	dataframe of x- and y- coordinates (`method = "local"` only)
`alpha`	alpha level for confidence intervals
`loginterval`	logical for whether to base interval on log(N)
`area`	area of region for method = "binomial" (hectares)
`independent.esa`	logical; controls variance contribution from esa (see Details)
`sessnum`	index of session in object$capthist for which output required
`cluster`	‘traps’ object for a single cluster
`buffer`	width of buffer in metres (ignored if `mask` provided)
`mask`	mask object for a single cluster of detectors
`noccasions`	number of occasions (for `nj`)

Details

derivednj accepts a vector of counts (nj), along with $\hat{a}$ and $\widehat{SE}(\hat{a})$ . The argument esa may be a scalar or (if se.esa is NULL) a 2-column matrix with $\hat{a_j}$ and $\widehat{SE}(\hat{a_j})$ for each replicate $j$ (row). In the special case that nj is of length 1, or method takes the values ‘poisson’ or ‘binomial’, the variance is computed using a theoretical variance rather than an empirical estimate. The value of method corresponds to ‘distribution’ in derived, and defaults to ‘poisson’. For method = 'binomial' you must specify area (see Examples).

If independent.esa is TRUE then independence is assumed among cluster-specific estimates of esa, and esa variances are summed. The default is a weighted sum leading to higher overall variance.

derivedCluster accepts a model fitted to data from clustered detectors; each cluster is interpreted as a replicate sample. It is assumed that the sets of individuals sampled by different clusters do not intersect, and that all clusters have the same geometry (spacing, detector number etc.).

derivedMash accepts a model fitted to clustered data that have been ‘mashed’ for fast processing (see mash); each cluster is a replicate sample: the function uses the vector of cluster frequencies ( $n_j$ ) stored as an attribute of the mashed capthist by mash.

derivedExternal combines detection parameter estimates from a fitted model with a vector of frequencies nj from replicate sampling units configured as in cluster. Detectors in cluster are assumed to match those in the fitted model with respect to type and efficiency, but sampling duration (noccasions), spacing etc. may differ. The mask should match cluster; if mask is missing, one will be constructed using the buffer argument and defaults from make.mask.

derivedSession accepts a single fitted model that must span multiple sessions; each session is interpreted as a replicate sample.

Spatial variance is calculated by one of these methods

Method	Description
`"SRS"`	simple random sampling with identical clusters
`"R2"`	variable cluster size cf Thompson (2002:70) estimator for line transects
`"R3"`	variable cluster size cf Buckland et al. (2001)
`"local"`	neighbourhood variance estimator (Stevens and Olsen 2003) SUSPENDED in 4.4.7
`"poisson"`	theoretical (model-based) variance
`"binomial"`	theoretical (model-based) variance in given `area`

The weighted options R2 and R3 substitute $\hat{a_j}$ for line length $l_k$ in the corresponding formulae of Fewster et al. (2009, Eq 3,5). Density is estimated by $D = n/A$ where $A = \sum a_j$ . The variance of $A$ is estimated as the sum of the cluster-specific variances, assuming independence among clusters. Fewster et al. (2009) found that an alternative estimator for line transects derived by Thompson (2002) performed better when there were strong density gradients correlated with line length (R2 in Fewster et al. 2009, Eq 3).

The neighborhood variance estimator is implemented in package spsurvey and was originally proposed for generalized random tessellation stratified (GRTS) samples. For ‘local’ variance estimates, the centre of each replicate must be provided in xy, except where centres may be inferred from the data. It is unclear whether ‘local’ can or should be used when clusters vary in size.

derivedSystematic, now defunct, was an experimental function in earlier versions of secr.

Value

Dataframe with one row for each derived parameter (‘esa’, ‘D’) and columns as below

estimate	estimate of derived parameter
SE.estimate	standard error of the estimate
lcl	lower 100(1--alpha)% confidence limit
ucl	upper 100(1--alpha)% confidence limit
CVn	relative SE of number observed (across sampling units)
CVa	relative SE of effective sampling area
CVD	relative SE of density estimate

Note

The variance of a Horvitz-Thompson-like estimate of density may be estimated as the sum of two components, one due to uncertainty in the estimate of effective sampling area ( $\hat{a}$ ) and the other due to spatial variance in the total number of animals $n$ observed on $J$ replicate sampling units ( $n = \sum_{j=1}^{J}{n_j}$ ). We use a delta-method approximation that assumes independence of the components:

$\widehat{\mathrm{var}}(\hat{D}) = \hat{D}^2 \{\frac{\widehat{\mathrm{var}}(n)}{n^2} + \frac{\widehat{\mathrm{var}}(\hat{a})}{\hat{a}^2}\}$

where $\widehat{\mathrm{var}}(n) = \frac{J}{J-1} \sum_{j=1}^{J}(n_j-n/J)^2$ . The estimate of $\mathrm{var}(\hat{a})$ is model-based while that of $\mathrm{var}(n)$ is design-based. This formulation follows that of Buckland et al. (2001, p. 78) for conventional distance sampling. Given sufficient independent replicates, it is a robust way to allow for unmodelled spatial overdispersion.

There is a complication in SECR owing to the fact that $\hat{a}$ is a derived quantity (actually an integral) rather than a model parameter. Its sampling variance $\mathrm{var}(\hat{a})$ is estimated indirectly in secr by combining the asymptotic estimate of the covariance matrix of the fitted detection parameters $\theta$ with a numerical estimate of the gradient of $a(\theta)$ with respect to $\theta$ . This calculation is performed in derived.

References

Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L. and Thomas, L. (2001) Introduction to Distance Sampling: Estimating Abundance of Biological Populations. Oxford University Press, Oxford.

Fewster, R. M. (2011) Variance estimation for systematic designs in spatial surveys. Biometrics 67, 1518–1531.

Fewster, R. M., Buckland, S. T., Burnham, K. P., Borchers, D. L., Jupp, P. E., Laake, J. L. and Thomas, L. (2009) Estimating the encounter rate variance in distance sampling. Biometrics 65, 225–236.

Stevens, D. L. Jr and Olsen, A. R. (2003) Variance estimation for spatially balanced samples of environmental resources. Environmetrics 14, 593–610.

Thompson, S. K. (2002) Sampling. 2nd edition. Wiley, New York.

Examples


## The `ovensong' data are pooled from 75 replicate positions of a
## 4-microphone array. The array positions are coded as the first 4
## digits of each sound identifier. The sound data are initially in the
## object `signalCH'. We first impose a 52.5 dB signal threshold as in
## Dawson & Efford (2009, J. Appl. Ecol. 46:1201--1209). The vector nj
## includes 33 positions at which no ovenbird was heard. The first and
## second columns of `temp' hold the estimated effective sampling area
## and its standard error.

## Not run: 

signalCH.525 <- subset(signalCH, cutval = 52.5)
nonzero.counts <- table(substring(rownames(signalCH.525),1,4))
nj <- c(nonzero.counts, rep(0, 75 - length(nonzero.counts)))
temp <- derived(ovensong.model.1, se.esa = TRUE)
derivednj(nj, temp["esa",1:2])

## The result is very close to that reported by Dawson & Efford
## from a 2-D Poisson model fitted by maximizing the full likelihood.

## If nj vector has length 1, a theoretical variance is used...
msk <- ovensong.model.1$mask
A <- nrow(msk) * attr(msk, "area")
derivednj (sum(nj), temp["esa",1:2], method = "poisson")
derivednj (sum(nj), temp["esa",1:2], method = "binomial", area = A)

## Set up an array of small (4 x 4) grids,
## simulate a Poisson-distributed population,
## sample from it, plot, and fit a model.
## mash() condenses clusters to a single cluster

testregion <- data.frame(x = c(0,2000,2000,0),
    y = c(0,0,2000,2000))
t4 <- make.grid(nx = 4, ny = 4, spacing = 40)
t4.16 <- make.systematic (n = 16, cluster = t4,
    region = testregion)
popn1 <- sim.popn (D = 5, core = testregion,
    buffer = 0)
capt1 <- sim.capthist(t4.16, popn = popn1)
fit1 <- secr.fit(mash(capt1), CL = TRUE, trace = FALSE)

## Visualize sampling
tempmask <- make.mask(t4.16, spacing = 10, type =
    "clusterbuffer")
plot(tempmask)
plot(t4.16, add = TRUE)
plot(capt1, add = TRUE)

## Compare model-based and empirical variances.
## Here the answers are similar because the data
## were simulated from a Poisson distribution,
## as assumed by \code{derived}

derived(fit1)
derivedMash(fit1)

## Now simulate a patchy distribution; note the
## larger (and more credible) SE from derivedMash().

popn2 <- sim.popn (D = 5, core = testregion, buffer = 0,
    model2D = "hills", details = list(hills = c(-2,3)))
capt2 <- sim.capthist(t4.16, popn = popn2)
fit2 <- secr.fit(mash(capt2), CL = TRUE, trace = FALSE)
derived(fit2)
derivedMash(fit2)

## The detection model we have fitted may be extrapolated to
## a more fine-grained systematic sample of points, with
## detectors operated on a single occasion at each...
## Total effort 400 x 1 = 400 detector-occasions, compared
## to 256 x 5 = 1280 detector-occasions for initial survey.

t1 <- make.grid(nx = 1, ny = 1)
t1.100 <- make.systematic (cluster = t1, spacing = 100,
    region = testregion)
capt2a <- sim.capthist(t1.100, popn = popn2, noccasions = 1)
## one way to get number of animals per point
nj <- attr(mash(capt2a), "n.mash")
derivedExternal (fit2, nj = nj, cluster = t1, buffer = 100,
    noccasions = 1)

## Review plots
base.plot <- function() {
    MASS::eqscplot( testregion, axes = FALSE, xlab = "",
        ylab = "", type = "n")
    polygon(testregion)
}
par(mfrow = c(1,3), xpd = TRUE, xaxs = "i", yaxs = "i")
base.plot()
plot(popn2, add = TRUE, col = "blue")
mtext(side=3, line=0.5, "Population", cex=0.8, col="black")
base.plot()
plot (capt2a, add = TRUE,title = "Extensive survey")
base.plot()
plot(capt2, add = TRUE, title = "Intensive survey")
par(mfrow = c(1,1), xpd = FALSE, xaxs = "r", yaxs = "r")  ## defaults


## Weighted variance

derivedSession(ovenbird.model.1, method = "R2")


## End(Not run)

[Package secr version 5.2.1 Index]