
Breast Cancer microarray experiment
This data set describes a microarray experiment on 52 breast cancer patients. The binary variable status indicates whether or not the patient died of breast cancer (status = 0: did not die, status = 1: died). The other variables represent amplification or deletion of specific genes.
Format
A data frame with 52 rows and 288 variables: the binary variable status and 287 gene-level measurements.
Source
Dr. John Bartlett and Dr. Caroline Witton, Division of Cancer Sciences and Molecular Pathology, University of Glasgow, Glasgow Royal Infirmary.
Details
Unlike gene expression studies, this experiment measures gene amplification or deletion, that is, the number of DNA copies of a given genomic sequence. The goal is to identify key genomic markers that distinguish aggressive from non-aggressive breast cancer.
The experiment was conducted by Dr. John Bartlett and Dr. Caroline Witton in the Division of Cancer Sciences and Molecular Pathology, University of Glasgow, Glasgow Royal Infirmary.
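Because the data set has far more gene-level variables than patients, it is often convenient to separate the response from the marker matrix before modelling. The following is a minimal sketch only: split_response is a hypothetical helper that is not part of any package, and loading breast from the islasso package is an assumption based on the functions used in the Examples.

```r
# Hypothetical helper (not provided by any package): split a data frame
# into a response vector and a numeric predictor matrix.
split_response <- function(df, response = "status") {
  stopifnot(is.data.frame(df), response %in% names(df))
  predictors <- setdiff(names(df), response)
  list(
    y = df[[response]],
    X = as.matrix(df[, predictors, drop = FALSE])
  )
}

# Assumption: the data set is shipped with the islasso package.
if (requireNamespace("islasso", quietly = TRUE)) {
  data("breast", package = "islasso", envir = environment())
  parts <- split_response(breast)
  dim(parts$X)                    # patients x genomic markers
  ncol(parts$X) > nrow(parts$X)   # TRUE indicates more markers than patients
  prop.table(table(parts$y))      # share of patients in each status group
}
```

When the number of markers exceeds the number of patients, an unpenalized logistic regression is not identifiable, which is one reason penalized fits such as those in the Examples are used for this kind of data.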
References
Augugliaro L., Mineo A.M. and Wit E.C. (2013). dgLARS: a differential geometric approach to sparse generalized linear models, Journal of the Royal Statistical Society. Series B, Vol 75(3), 471-498.
Wit E.C. and McClure J. (2004). Statistics for Microarrays: Design, Analysis and Inference, Chichester: Wiley.
Examples
data(breast)
str(breast)
#> 'data.frame': 52 obs. of 288 variables:
#> $ status : int 1 0 1 1 1 0 1 1 1 0 ...
#> $ CEB108.T7 : num 0.1142 -0.1098 -0.0141 -0.0899 0.1124 ...
#> $ X1PTEL06 : num 0.2439 -0.137 -0.1602 -0.0823 0.0139 ...
#> $ CDC2L1.p58. : num -0.0276 -0.0661 -0.1132 -0.1625 -0.121 ...
#> $ PRKCZ : num -0.2692 0.179 -0.0151 -0.0481 0.131 ...
#> $ TP73 : num -0.1972 -0.078 -0.0566 -0.064 -0.003 ...
#> $ D1S2660 : num -0.02225 -0.00602 -0.04291 -0.2269 -0.00501 ...
#> $ D1S214 : num -0.0987 0.0797 0.1579 0.0564 0.2708 ...
#> $ D1S1635 : num -0.2231 -0.2692 -0.207 -0.0387 -0.2219 ...
#> $ D1S199 : num -0.1791 -0.1696 -0.0943 0.0411 -0.001 ...
#> $ FGR.SRC2. : num -0.159 -0.2169 -0.0987 -0.2705 -0.1602 ...
#> $ MYCL1.LMYC. : num -0.219 -0.475 -0.25 -0.439 -0.489 ...
#> $ D1S427.FAF1 : num 0.29639 0.16551 -0.00702 -0.01309 0.06672 ...
#> $ D1S500 : num -0.0534 -0.0305 0.0989 0.0834 0.0237 ...
#> $ D1S418 : num -0.2679 0.1458 0.0247 -0.0233 -0.1143 ...
#> $ NRAS : num 0.0788 -0.3106 -0.0812 -0.1358 -0.3133 ...
#> $ D1S2465.D1S3402 : num 0.823 0.207 0.251 0.268 0.273 ...
#> $ WI.5663.WI.13414 : num -0.3271 -0.4323 -0.2157 -0.0492 -0.2332 ...
#> $ LAMC2 : num -0.14 -0.342 -0.109 -0.364 -0.526 ...
#> $ PTGS2.COX2. : num 0.167 0.18 0.14 0.12 0.17 ...
#> $ TGFB2 : num 0.9501 0.0325 0.1266 -0.064 -0.0576 ...
#> $ AKT3 : num 0.3008 0.0751 0.1544 -0.1347 -0.0243 ...
#> $ SHGC.18290 : num -0.294 -0.524 -0.145 -0.26 -0.064 ...
#> $ X1QTEL10 : num 0.0212 -0.312 -0.1065 0.0649 -0.0619 ...
#> $ U32389 : num -0.7487 0.0834 0.1527 0.0554 0.1258 ...
#> $ X2PTEL27 : num 0.2167 0.0431 -0.0619 -0.1244 -0.0608 ...
#> $ MYCN.N.myc. : num -0.146 -0.393 -0.226 -0.198 -0.423 ...
#> $ MSH2.KCNK12 : num -0.1555 0.0237 0.0266 0.1007 0.0971 ...
#> $ REL : num 1.042 -0.691 -0.32 -0.223 -0.456 ...
#> $ GNLY : num -0.1851 0.002 0.0602 0.0469 0.0354 ...
#> $ SGC34236 : num 0.1947 0.0573 0.1319 -0.045 0.0305 ...
#> $ BIN1 : num -0.5621 -0.3567 -0.3202 -0.0943 -0.4095 ...
#> $ LRP1B : num 0.0276 -0.0336 0.0286 -0.1233 0.0218 ...
#> $ TBR1 : num 0.65285 0.00995 -0.13353 -0.08121 -0.21072 ...
#> $ ITGA4 : num 0.1519 0.0535 0.1231 0.1249 0.1956 ...
#> $ CASP8 : num -0.269 -0.494 -0.183 -0.167 -0.24 ...
#> $ ERBB4.HER.4. : num 0.0344 0.1773 -0.2132 -0.2408 -0.1755 ...
#> $ WI.6310 : num 0.045 0.4337 0.0853 0.063 0.0971 ...
#> $ D2S447 : num 0.1501 0.2247 0.1115 0.0705 0.145 ...
#> $ X3PTEL25 : num -0.1543 0.0898 0.1133 0.2539 0.2523 ...
#> $ X3PTEL01.CHL1 : num 0.0194 0.3605 0.0402 0.1053 0.0469 ...
#> $ VHL : num 0.55331 -0.10536 -0.08665 0.001 0.00797 ...
#> $ RAF1 : num 0.3729 0.2889 0.207 0.0714 0.1249 ...
#> $ THRB : num 0.124 0.2231 -0.0171 0.1621 0.0733 ...
#> $ MLH1 : num -0.17198 0.06953 -0.17316 -0.00501 -0.16487 ...
#> $ RASSF1 : num -0.6463 -0.172 -0.0315 0.0507 -0.0182 ...
#> $ FHIT : num 0.0989 0.1638 -0.3243 -0.1851 -0.2705 ...
#> $ p44S10 : num -0.0877 0.0705 0.0402 0.0686 -0.1661 ...
#> $ D3S1274..ROBO1 : num -0.0346 -0.1065 -0.0888 -0.0419 -0.091 ...
#> $ RBP1.RBP2 : num -0.0111 -0.0651 0.2207 0.0488 0.1071 ...
#> $ TERC : num 0.4259 0.0723 -0.0899 -0.0161 -0.2549 ...
#> $ EIF5A2 : num 0.459 0.16 -0.338 -0.12 -0.177 ...
#> $ PIK3CA : num -0.0278 -0.089 -0.2029 -0.5361 -0.7744 ...
#> $ TP63 : num 0.0478 -0.2653 -0.0356 -0.0161 -0.0387 ...
#> $ MFI2 : num -0.2614 0.1098 0.1089 0.0431 0.1441 ...
#> $ X3QTEL05 : num 0.2723 0.3674 0.0119 0.0109 0.0554 ...
#> $ GS10K2.T7 : num -0.0747 -0.2058 -0.1815 -0.1924 -0.091 ...
#> $ SHGC4.207 : num -0.25877 0.08066 0.10346 -0.11429 -0.00803 ...
#> $ D4S114 : num 0.00698 -0.23699 -0.10536 0.05543 -0.00602 ...
#> $ WHSC1 : num 0.50259 -0.40347 -0.16487 0.00499 -0.09541 ...
#> $ DDX15 : num -0.2083 -0.1684 0.0714 -0.0398 -0.0429 ...
#> $ KIT : num 0.3199 -0.0694 0.1124 0.1089 0.0944 ...
#> $ PDGFRA : num 0.0235 -0.1165 -0.1065 -0.2408 -0.2446 ...
#> $ EIF4E : num 0.00787 -0.23572 -0.18995 0.04784 -0.05869 ...
#> $ PGRMC2 : num -0.01106 0.12487 0.11422 -0.001 0.00598 ...
#> $ PDZ.GEF1 : num 0.1689 0.1638 0.0526 0.077 0.0188 ...
#> $ X4QTEL11 : num -0.1497 0.1997 0.2562 -0.2269 0.0807 ...
#> $ D4S2930 : num -0.4155 -0.2357 -0.2395 -0.172 -0.0877 ...
#> $ C84C11.T3 : num 0.0998 0.2577 0.39 0.3743 0.3695 ...
#> $ D5S23 : num 0.329 0.641 0.449 0.313 0.5 ...
#> $ D5S2064 : num -0.00602 0.01292 0.01587 -0.03459 -0.01309 ...
#> $ DAB2 : num -0.1567 -0.2182 0.131 0.2239 0.0188 ...
#> $ DHFR.MSH3 : num 0.833 0.286 0.188 0.179 0.194 ...
#> $ APC : num -0.414 -0.3813 -0.091 -0.0747 -0.2771 ...
#> $ EGR1 : num 0.1284 -0.0274 -0.0877 -0.091 -0.0598 ...
#> $ CSF1R : num 0.4001 -0.478 -0.1054 -0.0243 -0.0856 ...
#> $ NIB1408 : num -0.0715 0.4434 0.2585 0.0218 0.3038 ...
#> $ X5QTEL70 : num -0.0965 0.2231 0.1284 -0.1165 0.0723 ...
#> $ X9PTEL48 : num 0.255 0.187 0.21 0.141 0.166 ...
#> $ PIM1 : num 0.0788 -0.0263 0.0459 0.1089 0.0779 ...
#> $ CCND3 : num -0.00451 -0.17198 -0.02634 -0.11766 -0.10203 ...
#> $ D6S414 : num 0.0554 0.1947 0.0198 0.0178 0.0723 ...
#> $ HTR1B : num -0.0943 -0.0356 -0.002 -0.0284 0.0109 ...
#> $ D6S434 : num 0.7154 0.044 -0.0987 0.1071 0.0695 ...
#> $ D6S268 : num -0.0608 -0.2679 -0.212 -0.1863 -0.045 ...
#> $ MYB : num 0.108 0.0119 -0.0336 -0.0121 0.0459 ...
#> $ D6S311 : num 0.45 0.23 0.127 0.201 0.273 ...
#> $ ESR1 : num 0.7575 -0.2877 -0.2957 -0.0182 -0.0769 ...
#> $ X6QTEL54 : num -0.609 -0.819 -0.56 -0.562 -0.37 ...
#> $ G31341 : num -0.3025 0.0373 -0.1267 0.1432 -0.1381 ...
#> $ IL6 : num 0.0237 0.1672 0.1484 0.1638 0.1007 ...
#> $ EGFR : num -0.405 -0.205 -0.136 -0.201 -0.273 ...
#> $ ELN : num -0.1567 -0.2206 -0.0932 -0.1997 -0.0812 ...
#> $ RFC2.CYLN2 : num -0.0367 0.198 0.1035 0.0383 0.1151 ...
#> $ ABCB1.MDR1. : num 0.168 -0.154 -0.217 -0.13 -0.263 ...
#> $ CDK6 : num -0.091 -0.2666 0.0611 0.1089 0.0816 ...
#> $ SERPINE1 : num -0.0758 0.03922 -0.00702 0.0373 0.01587 ...
#> $ MET : num 0.459 0.128 0.134 -0.002 -0.079 ...
#> $ TIF1 : num -0.2408 -0.0726 -0.0492 0.2677 -0.0683 ...
#> [list output truncated]
table(breast$status)
#>
#> 0 1
#> 23 29
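if (FALSE) { # \dontrun{
# Fit the whole regularization path for the logistic model
fit <- islasso.path(status ~ ., data = breast, family = binomial(),
                    alpha = 0, control = is.control(trace = 2L))
# Goodness-of-fit measures along the path; keep the lambda that minimizes AIC
temp <- GoF.islasso.path(fit)
lambda.aic <- temp$lambda.min["AIC"]
# Refit at the selected lambda and summarize the fitted model
fit.aic <- islasso(status ~ ., data = breast, family = binomial(),
                   alpha = 0, lambda = lambda.aic)
summary(fit.aic, pval = 0.05)
} # }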