Two-level sample selection
selectMatch.Rd
Carries out a two-level sample selection that anticipates the possibility that an initially selected site (unit) declines to participate, in which case the site is replaced by its optimal match. The procedure aims to reduce bias and/or loss of generalizability with respect to the target population.
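The replacement idea can be illustrated with a minimal base R sketch. It is not the function's implementation: the toy data, the column names (`size`, `x1`, `x2`), and the plain nearest-neighbor search with `stats::mahalanobis()` are assumptions made for illustration, and the second level (sampling sub-units within units) is left out.

```r
# Illustrative sketch only; selectMatch() itself may work differently.
set.seed(1)

# Hypothetical unit-level table: 20 sites with a size and two covariates.
units <- data.frame(
  unitID = sprintf("U%02d", 1:20),
  size   = sample(50:500, 20),
  x1     = rnorm(20),
  x2     = rnorm(20)
)

# Draw 4 units with probability proportional to size.
selected <- sample(units$unitID, size = 4, prob = units$size)

# Covariate matrix used for matching, with unit IDs as row names.
covs <- as.matrix(units[, c("x1", "x2")])
rownames(covs) <- units$unitID
S <- cov(covs)

# Candidate replacements: every unit that was not selected.
pool <- setdiff(units$unitID, selected)

# For each selected unit, pick the closest candidate.
# mahalanobis() returns squared distances, which preserve the ordering.
replacement <- vapply(selected, function(id) {
  d <- mahalanobis(covs[pool, , drop = FALSE], center = covs[id, ], cov = S)
  pool[which.min(d)]
}, character(1))

data.frame(unitID = selected, replacement = unname(replacement))
```

In this sketch the same candidate may be chosen as the replacement for more than one selected unit.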
Usage
selectMatch(
  df,
  unitID,
  subunitID,
  subunitSampVars,
  unitVars,
  nUnitSamp,
  nRepUnits,
  nsubUnits,
  exactMatchVars = NULL,
  calipMatchVars = NULL,
  calipValue = 0.2,
  seedN = NA,
  matchDistance = "mahalanobis",
  sizeFlag = TRUE,
  repFlag = TRUE,
  writeOut = TRUE,
  replacementUnitsFilename = "replacementUnits.csv",
  subUnitTableFilename = "subUnitTable.csv"
)
Arguments
- df
dataframe; sub-unit level dataframe with both sub-unit and unit level variables
- unitID
character; name of unit ID column
- subunitID
character; name of sub-unit ID column
- subunitSampVars
vector; column names of unit level variables to sample units on
- unitVars
vector; column names of unit level variables to match units on
- nUnitSamp
numeric; number of units to be initially randomly selected
- nRepUnits
numeric; number of replacement units to find for each selected unit
- nsubUnits
numeric; number of sub-units to be randomly selected for each unit
- exactMatchVars
vector; column names of categorical variables on which units must be matched exactly. Must be present in 'unitVars'; default = NULL
- calipMatchVars
vector; column names of continuous variables on which units must be matched within a specified caliper. Must be present in 'unitVars'; default = NULL
- calipValue
numeric; number of standard deviations to be used as caliper for matching units on calipMatchVars; default = 0.2
- seedN
numeric; seed number to be used for sampling. If NA, calls set.seed(); default = NA
- matchDistance
character; MatchIt distance parameter used to obtain optimal matches (nearest neighbors); default = "mahalanobis"
- sizeFlag
logical; if TRUE, units are sampled with probability proportional to unit size; default = TRUE
- repFlag
logical; if TRUE, unit matches are picked with repetition; if FALSE, without repetition; default = TRUE
- writeOut
logical; if TRUE, writes a .csv file for each output table; default = TRUE
- replacementUnitsFilename
character; csv filename for saving unit:replacement directory when writeOut == TRUE; default = "replacementUnits.csv"
- subUnitTableFilename
character; csv filename for saving the sub-unit table when writeOut == TRUE; default = "subUnitTable.csv"
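Examples
The following sketch shows how the arguments above might be put together. The toy data (schools as units, students as sub-units), all column names, and the chosen argument values are hypothetical. Whether character columns such as `region` are accepted in `unitVars` is an assumption, and the return value is not described on this page, so it is only stored here. The call is guarded so that the block runs even when `selectMatch()` is not available in the session.

```r
# Hypothetical toy data: 30 schools (units), 40 students (sub-units) each.
set.seed(42)
nUnits  <- 30
perUnit <- 40

# Unit-level variables, one row per school.
schools <- data.frame(
  schoolID  = seq_len(nUnits),
  region    = sample(c("north", "south"), nUnits, replace = TRUE),
  pctLowInc = runif(nUnits, 0, 100),
  meanScore = rnorm(nUnits, 500, 50)
)

# Sub-unit level data frame: one row per student, with the
# school-level variables repeated on each row.
df <- merge(
  data.frame(
    schoolID  = rep(schools$schoolID, each = perUnit),
    studentID = seq_len(nUnits * perUnit)
  ),
  schools,
  by = "schoolID"
)

# Call only if the function is available in the session.
if (exists("selectMatch")) {
  res <- selectMatch(
    df              = df,
    unitID          = "schoolID",
    subunitID       = "studentID",
    subunitSampVars = "meanScore",
    unitVars        = c("region", "pctLowInc", "meanScore"),
    nUnitSamp       = 5,    # schools selected initially
    nRepUnits       = 2,    # replacements sought per selected school
    nsubUnits       = 10,   # students sampled per school
    exactMatchVars  = "region",
    calipMatchVars  = "pctLowInc",
    calipValue      = 0.2,
    seedN           = 2024,
    writeOut        = FALSE  # skip writing csv files in this sketch
  )
}
```

Both `exactMatchVars` and `calipMatchVars` are listed in `unitVars` here, as the argument descriptions require.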