
Simulate Model Matrix and Response Vector
simulXy.Rd
Generates synthetic covariates and response vector from a specified distribution for simulation studies or method validation.
Arguments
- n
Integer. Number of observations.
- p
Integer. Total number of covariates in the model matrix.
- interc
Numeric. Intercept to include in the linear predictor. Default is 0.
- beta
Numeric vector of length p. Regression coefficients in the linear predictor.
- family
Distribution and link function. Allowed: gaussian(), binomial(), poisson(), and Gamma(). Can be a string, function, or family object.
- prop
Numeric in [0,1]. Used only if beta is missing; proportion of the p coefficients that are non-zero. Default is 0.1.
- lim.b
Numeric vector of length 2. Range for coefficients if beta is missing. Default: c(-3, 3).
- sigma
Standard deviation of Gaussian response. Default is 1.
- size
Integer. Number of trials for binomial response. Default is 1.
- rho
Numeric. Correlation coefficient for generating covariates. Used to create an AR(1)-type covariance: rho^|i-j|. Default is 0.
- scale.data
Logical. Whether to scale columns of the model matrix. Default is TRUE.
- seed
Optional. Integer seed for reproducibility.
- X
Optional. Custom model matrix. If supplied, it overrides the internally generated X.
- dispersion
Dispersion parameter of Gamma response. Default is 0.1.
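To make the roles of rho, scale.data, prop and lim.b concrete, the following base-R sketch draws covariates with correlation rho^|i-j| and a sparse coefficient vector. It is an illustration only, not the package's implementation: the helper names sketch_ar1_X and sketch_sparse_beta are hypothetical, and placing the non-zero coefficients in the first positions is an assumption.

```r
# Hypothetical helper (not part of islasso): n x p matrix whose columns
# have correlation rho^|i-j|.
sketch_ar1_X <- function(n, p, rho = 0, scale.data = TRUE) {
  idx <- seq_len(p)
  Sigma <- rho^abs(outer(idx, idx, "-"))         # AR(1)-type correlation matrix
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)  # independent N(0, 1) draws
  X <- Z %*% chol(Sigma)                         # rows now have covariance Sigma
  if (scale.data) X <- scale(X)                  # centre and scale each column
  X
}

# Hypothetical helper: sparse coefficients controlled by prop and lim.b.
# Assumption: the non-zero entries occupy the first positions.
sketch_sparse_beta <- function(p, prop = 0.1, lim.b = c(-3, 3)) {
  beta <- numeric(p)
  k <- round(prop * p)                           # number of non-zero coefficients
  if (k > 0) beta[seq_len(k)] <- runif(k, lim.b[1], lim.b[2])
  beta
}

set.seed(1234)
X <- sketch_ar1_X(n = 200, p = 5, rho = 0.5)
beta <- sketch_sparse_beta(p = 20, prop = 0.1)
cor(X)[1, 2]     # sample correlation of adjacent columns, close to rho
sum(beta != 0)   # number of non-zero coefficients
```

With rho = 0 the correlation matrix reduces to the identity, so the columns are generated independently.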
Value
A list with components:
- X
Model matrix of dimension
n x p
- y
Simulated response vector
- beta
True regression coefficients used
- eta
Linear predictor
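The relationship between the returned components can be sketched in base R: eta is the intercept plus X times beta, and y is drawn from the chosen family with mean given by the inverse link of eta. This is a minimal sketch under assumptions, not the package's code: sketch_response is a hypothetical name, it accepts only a family object (not a string or function), and the Gamma parameterisation (shape = 1/dispersion) is one common convention.

```r
# Hypothetical helper (not part of islasso): build eta and simulate y.
sketch_response <- function(X, beta, interc = 0, family = gaussian(),
                            sigma = 1, size = 1, dispersion = 0.1) {
  eta <- drop(interc + X %*% beta)   # linear predictor
  mu  <- family$linkinv(eta)         # mean on the response scale
  n   <- length(eta)
  y <- switch(family$family,
    gaussian = rnorm(n, mean = mu, sd = sigma),
    binomial = rbinom(n, size = size, prob = mu),
    poisson  = rpois(n, lambda = mu),
    # Gamma with mean mu and variance dispersion * mu^2
    Gamma    = rgamma(n, shape = 1 / dispersion, scale = mu * dispersion),
    stop("unsupported family"))
  list(X = X, y = y, beta = beta, eta = eta)
}

set.seed(1)
X <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
sim <- sketch_response(X, beta = c(1, 0, -1), family = poisson())
str(sim)
```

For the Gamma family, a log link (Gamma(link = "log")) keeps the mean positive for any eta; with the default inverse link, negative values of eta would give an invalid mean.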
Examples
n <- 100; p <- 100
beta <- c(runif(10, -3, 3), rep(0, p - 10))
sim <- simulXy(n = n, p = p, beta = beta, seed = 1234)
o <- islasso(y ~ ., data = sim$data, family = gaussian())
summary(o, pval = 0.05)
#>
#> Call:
#> islasso(formula = y ~ ., family = gaussian(), data = sim$data)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -1.9777 -0.5928 -0.0602  0.5340  2.2449
#>
#>             Estimate Std. Error    Df z value Pr(>|z|)
#> (Intercept)  -0.1159     0.1258 1.000  -0.921  0.35708
#> X1            0.3477     0.1402 0.976   2.480  0.01315 *
#> X2           -2.3120     0.1492 1.000 -15.497  < 2e-16 ***
#> X3           -0.4404     0.1437 0.995  -3.064  0.00218 **
#> X4            1.3065     0.1457 1.000   8.967  < 2e-16 ***
#> X5           -2.0609     0.1443 1.000 -14.280  < 2e-16 ***
#> X6           -0.5635     0.1448 1.000  -3.892 9.96e-05 ***
#> X7            1.2906     0.1399 1.000   9.225  < 2e-16 ***
#> X8            1.8948     0.1490 1.000  12.719  < 2e-16 ***
#> X9           -1.0072     0.1515 1.000  -6.649 2.96e-11 ***
#> X10          -2.1626     0.1500 1.000 -14.418  < 2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for gaussian family taken to be 1.583274)
#>
#> Null deviance: 2492.536 on 99.00 degrees of freedom
#> Residual deviance: 92.352 on 58.33 degrees of freedom
#> AIC: 175.69
#> Lambda: 11.472
#>
#> Number of Newton-Raphson iterations: 102
#>