SSVSquantreg {MCMCpack} | R Documentation |
This function uses stochastic search to select promising regression models at a fixed quantile tau. Indicator variables gamma are used to represent whether a predictor is included in the model or not. The user supplies the data and the prior distribution on the model size. A list is returned containing the posterior sample of gamma and the associated regression parameters beta.
SSVSquantreg(formula, data = NULL, tau = 0.5, include=NULL, burnin = 1000, mcmc = 10000, thin = 1, verbose = 0, seed = sample(1:1000000,1), pi0a0 = 1, pi0b0 = 1, ...)
formula |
Model formula. |
data |
Data frame. |
tau |
The quantile of interest. Must be between 0 and 1. The default value of 0.5 corresponds to median regression model selection. |
include |
The predictor(s) that should definitely appear in the model. Can be specified by name, or their position in the formula (taking into account the intercept). |
burnin |
The number of burn-in iterations for the sampler. |
mcmc |
The number of MCMC iterations after burnin. |
thin |
The thinning interval used in the simulation. The number of MCMC iterations must be divisible by this value. |
verbose |
A switch which determines whether or not the progress of
the sampler is printed to the screen. If |
seed |
The seed for the random number generator. If NA, the Mersenne
Twister generator is used with default seed 12345; if an integer is
passed it is used to seed the Mersenne twister. The default value for this argument is a random integer between 1 and 1,000,000.
This default value ensures that if the function is used again with a different value of tau, it is extremely unlikely that the seed will be identical. The user can also
pass a list of length two to use the L'Ecuyer random number generator,
which is suitable for parallel computation. The first element of the
list is the L'Ecuyer seed, which is a vector of length six or NA (if NA
a default seed of |
pi0a0, pi0b0 |
Hyperparameters of the beta prior on pi_0, the prior probability of including a predictor. Default values of (1,1) are equivalent to a uniform distribution. |
... |
Further arguments |
SSVSquantreg
implements stochastic search variable selection over a set of potential predictors to obtain promising models.
The models considered take the following form:
Q_tau(y_i|x_igamma) = x_igamma'beta_gamma,
where Q_tau(y_i|x_igamma) denotes the conditional tauth quantile of y_i given x_igamma, x_igamma denotes x_i with those predictors x_ij for which gamma_j=0 removed and beta_gamma denotes the model specific regression parameters.
The likelihood is formed based on the assumption of independent asymmetric Laplace distributions on the y_i with skewness parameter tau and location parameters x_igamma'beta_gamma. This assumption ensures that the likelihood function is maximised by the tauth conditional quantile of the response variable.
The prior on each beta_j is
(1-gamma_j)delta_0+gamma_jCauchy(0,1),
where delta_0 denotes a degenerate distribution with all mass at 0. A standard Cauchy distribution is chosen conditional on gamma_j=1. This allows for a wider range of nonzero values of beta_j than a standard Normal distribution, improving the robustness of the method. Each of the indicator variables gamma_j is independently assigned a Bernoulli prior, with prior probability of inclusion pi_0. This in turn is assigned a beta distribution, resulting in a beta-binomial prior on the model size. The user can supply the hyperparameters for the beta distribution. Starting values are randomly generated from the prior distribution.
It is recommended to standardise any non-binary predictors in order to compare these predictors on the same scale.
This can be achieved using the scale
function.
If it is certain that a predictor should be included, all predictors specified are brought to the first positions for computational convenience. The regression parameters associated with these predictors are given independent improper priors. Users may notice a small speed advantage if they specify the predictors that they feel certain should appear in the model, particularly for large models with a large number of observations.
A list containing:
gamma |
The posterior sample of gamma. This has associated summary and plot methods. |
beta |
The posterior sample of the associated regression parameters beta. This can be analysed with functions from the coda package. |
Craig Reed
Craig Reed, David B. Dunson and Keming Yu. 2010. "Bayesian Variable Selection for Quantile Regression" Technical Report.
Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007. Scythe Statistical Library 1.2. http://scythe.wustl.edu.
Keming Yu and Jin Zhang. 2005. "A Three Parameter Asymmetric Laplace Distribution and it's extensions." Communications in Statistics - Theory and Methods, 34, 1867-1879.
Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2002. Output Analysis and Diagnostics for MCMC (CODA). http://www-fis.iarc.fr/coda/.
MCMCquantreg
,
summary.qrssvs
,
plot.qrssvs
,
mptable
,
topmodels
,
scale
,
rq
## Not run: set.seed(1) epsilon<-rnorm(100) set.seed(2) x<-matrix(rnorm(1000),100,10) y<-x[,1]+x[,10]+epsilon qrssvs<-SSVSquantreg(y~x) model.50pc<-SSVSquantreg(y~x) model.90pc<-SSVSquantreg(y~x,tau=0.9) summary(model.50pc) ## Intercept not in median probability model summary(model.90pc) ## Intercept appears in median probability model ## End(Not run)