Building the Regression Model I: Model Selection and Validation

Getting data

surgicalData <- read.table("http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%20%209%20Data%20Sets/CH09TA01.txt", sep ="" , header = FALSE)


#Look at the first 6 entries
head(surgicalData)
   V1 V2  V3   V4 V5 V6 V7 V8   V9   V10
1 6.7 62  81 2.59 50  0  1  0  695 6.544
2 5.1 59  66 1.70 39  0  0  0  403 5.999
3 7.4 57  83 2.16 55  0  0  0  710 6.565
4 6.5 73  41 2.01 48  0  0  0  349 5.854
5 7.8 65 115 4.30 45  0  0  1 2343 7.759
6 5.8 38  72 1.42 65  1  1  0  348 5.852
names(surgicalData) <- c(paste("X", 1:8, sep = ""), "Y" , "lnY")
head(surgicalData)
   X1 X2  X3   X4 X5 X6 X7 X8    Y   lnY
1 6.7 62  81 2.59 50  0  1  0  695 6.544
2 5.1 59  66 1.70 39  0  0  0  403 5.999
3 7.4 57  83 2.16 55  0  0  0  710 6.565
4 6.5 73  41 2.01 48  0  0  0  349 5.854
5 7.8 65 115 4.30 45  0  0  1 2343 7.759
6 5.8 38  72 1.42 65  1  1  0  348 5.852

For simplicity we use only the first four explanatory variables and the first 54 of the 108 patients.
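This restriction can be made explicit with a short sketch; the column names X1--X4, Y, and lnY match those assigned above, and the file read here already contains only the 54 model-building patients (the name `surgicalData4` is hypothetical):

```r
# keep only the first four predictors plus the response and its log transform;
# the data file read above already holds the 54 model-building cases
surgicalData4 <- surgicalData[, c(paste0("X", 1:4), "Y", "lnY")]
dim(surgicalData4)
```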

Stem and Leaf plots for individual X variables
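The displays themselves are not reproduced here; a minimal sketch using base R's `stem()`, one display per predictor:

```r
# stem-and-leaf display for each of the four predictors
for (v in paste0("X", 1:4)) {
  cat("\nStem-and-leaf of", v, "\n")
  stem(surgicalData[[v]])
}
```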

Box plots for individual X variables
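Again the plots are not shown; a sketch with one panel per predictor, since X1--X4 are on different scales:

```r
# side-by-side box plots, one panel per predictor
par(mfrow = c(1, 4))
for (v in paste0("X", 1:4)) boxplot(surgicalData[[v]], main = v)
par(mfrow = c(1, 1))  # restore the default layout
```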

Correlation Matrix
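A sketch of the correlation matrix for the four predictors and lnY; the pairwise values should agree with the matrix shown later via `cor(model.frame(surgicalModel2))`:

```r
# pairwise correlations among the four predictors and the log response
round(cor(surgicalData[, c(paste0("X", 1:4), "lnY")]), 4)
```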

First order regression model
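The fitted object behind the diagnostics discussed next is not shown; a minimal sketch, assuming a first-order model in X1--X4 with the untransformed response (the name `surgicalModel1` is hypothetical):

```r
# first-order model in the four predictors, untransformed Y
surgicalModel1 <- lm(Y ~ X1 + X2 + X3 + X4, data = surgicalData)

# residuals vs. fitted values: look for curvature and nonconstant variance
plot(fitted(surgicalModel1), resid(surgicalModel1),
     xlab = "Fitted values", ylab = "Residuals")

# normal probability plot of residuals
qqnorm(resid(surgicalModel1)); qqline(resid(surgicalModel1))
```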

The plot suggests that both curvature and nonconstant error variance are apparent. In addition, some departure from normality is suggested by the normal probability plot of residuals.

To make the distribution of the error terms more nearly normal, and to see whether the same transformation would also reduce the apparent curvature, the investigator examined the logarithmic transformation \(Y' = \ln Y\).

Regression model with transformed Y
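The object `surgicalModel2` used below is presumably this fit; a sketch, assuming the same four predictors with lnY as the response:

```r
# same four predictors, log-transformed response
surgicalModel2 <- lm(lnY ~ X1 + X2 + X3 + X4, data = surgicalData)

# the same two diagnostic plots as for the untransformed model
plot(fitted(surgicalModel2), resid(surgicalModel2),
     xlab = "Fitted values", ylab = "Residuals")
qqnorm(resid(surgicalModel2)); qqline(resid(surgicalModel2))
```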

Now the plots look better!

Scatter Plot Matrix and Correlation Matrix when Response Variable is lnY

cor(model.frame(surgicalModel2))
          lnY          X1          X2          X3        X4
lnY 1.0000000  0.24618787  0.46994325  0.65388548 0.6492627
X1  0.2461879  1.00000000  0.09011973 -0.14963411 0.5024157
X2  0.4699432  0.09011973  1.00000000 -0.02360544 0.3690256
X3  0.6538855 -0.14963411 -0.02360544  1.00000000 0.4164245
X4  0.6492627  0.50241567  0.36902563  0.41642451 1.0000000
#cor(surgicalData[c(10, 1:4)])

pairs(model.frame(surgicalModel2), pch = 20, col = "darkgreen")

All of these plots indicate that each of the predictor variables is linearly associated with lnY, with X3 and X4 showing the highest degrees of association and X1 the lowest.

The scatter plot matrix and the correlation matrix further show intercorrelations among the potential predictor variables. In particular, X4 has moderately high pairwise correlations with X1, X2, and X3.

Plot with \(r^2\)

library(leaps)
library(dplyr)
library(ggplot2) # needed for the ggplot() calls below

?regsubsets #from leaps library

surgicalData
     X1 X2  X3   X4 X5 X6 X7 X8    Y   lnY
1   6.7 62  81 2.59 50  0  1  0  695 6.544
2   5.1 59  66 1.70 39  0  0  0  403 5.999
3   7.4 57  83 2.16 55  0  0  0  710 6.565
4   6.5 73  41 2.01 48  0  0  0  349 5.854
5   7.8 65 115 4.30 45  0  0  1 2343 7.759
6   5.8 38  72 1.42 65  1  1  0  348 5.852
7   5.7 46  63 1.91 49  1  0  1  518 6.250
8   3.7 68  81 2.57 69  1  1  0  749 6.619
9   6.0 67  93 2.50 58  0  1  0 1056 6.962
10  3.7 76  94 2.40 48  0  1  0  968 6.875
11  6.3 84  83 4.13 37  0  1  0  745 6.613
12  6.7 51  43 1.86 57  0  1  0  257 5.549
13  5.8 96 114 3.95 63  1  0  0 1573 7.361
14  5.8 83  88 3.95 52  1  0  0  858 6.754
15  7.7 62  67 3.40 58  0  0  1  702 6.554
16  7.4 74  68 2.40 64  1  1  0  809 6.695
17  6.0 85  28 2.98 36  1  1  0  682 6.526
18  3.7 51  41 1.55 39  0  0  0  205 5.321
19  7.3 68  74 3.56 59  1  0  0  550 6.309
20  5.6 57  87 3.02 63  0  0  1  838 6.731
21  5.2 52  76 2.85 39  0  0  0  359 5.883
22  3.4 83  53 1.12 67  1  1  0  353 5.866
23  6.7 26  68 2.10 30  0  0  1  599 6.395
24  5.8 67  86 3.40 49  1  1  0  562 6.332
25  6.3 59 100 2.95 36  1  1  0  651 6.478
26  5.8 61  73 3.50 62  1  1  0  751 6.621
27  5.2 52  86 2.45 70  0  1  0  545 6.302
28 11.2 76  90 5.59 58  1  0  1 1965 7.583
29  5.2 54  56 2.71 44  1  0  0  477 6.167
30  5.8 76  59 2.58 61  1  1  0  600 6.396
31  3.2 64  65 0.74 53  0  1  0  443 6.094
32  8.7 45  23 2.52 68  0  1  0  181 5.198
33  5.0 59  73 3.50 57  0  1  0  411 6.019
34  5.8 72  93 3.30 39  1  0  1 1037 6.944
35  5.4 58  70 2.64 31  1  1  0  482 6.179
36  5.3 51  99 2.60 48  0  1  0  634 6.453
37  2.6 74  86 2.05 45  0  0  0  678 6.519
38  4.3  8 119 2.85 65  1  0  0  362 5.893
39  4.8 61  76 2.45 51  1  1  0  637 6.457
40  5.4 52  88 1.81 40  1  0  0  705 6.558
41  5.2 49  72 1.84 46  0  0  0  536 6.283
42  3.6 28  99 1.30 55  0  0  1  582 6.366
43  8.8 86  88 6.40 30  1  1  0 1270 7.147
44  6.5 56  77 2.85 41  0  1  0  538 6.288
45  3.4 77  93 1.48 69  0  1  0  482 6.178
46  6.5 40  84 3.00 54  1  1  0  611 6.416
47  4.5 73 106 3.05 47  1  1  0  960 6.867
48  4.8 86 101 4.10 35  1  0  1 1300 7.170
49  5.1 67  77 2.86 66  1  0  0  581 6.365
50  3.9 82 103 4.55 50  0  1  0 1078 6.983
51  6.6 77  46 1.95 50  0  1  0  405 6.005
52  6.4 85  40 1.21 58  0  0  1  579 6.361
53  6.4 59  85 2.33 63  0  1  0  550 6.310
54  8.8 78  72 3.20 56  0  0  0  651 6.478
subsets <- regsubsets(x = surgicalData[1:4], y = surgicalData[, 10], nbest = 4) # nbest: number of subsets of each size to record; the criterion is chosen later via plot(subsets, scale = "r2")

summary(subsets)
Subset selection object
4 Variables  (and intercept)
   Forced in Forced out
X1     FALSE      FALSE
X2     FALSE      FALSE
X3     FALSE      FALSE
X4     FALSE      FALSE
4 subsets of each size up to 4
Selection Algorithm: exhaustive
         X1  X2  X3  X4 
1  ( 1 ) " " " " "*" " "
1  ( 2 ) " " " " " " "*"
1  ( 3 ) " " "*" " " " "
1  ( 4 ) "*" " " " " " "
2  ( 1 ) " " "*" "*" " "
2  ( 2 ) " " " " "*" "*"
2  ( 3 ) "*" " " "*" " "
2  ( 4 ) " " "*" " " "*"
3  ( 1 ) "*" "*" "*" " "
3  ( 2 ) " " "*" "*" "*"
3  ( 3 ) "*" " " "*" "*"
3  ( 4 ) "*" "*" " " "*"
4  ( 1 ) "*" "*" "*" "*"
plot(subsets, scale = "r2")

### Using leaps

obj <- leaps(x = surgicalData[1:4], y = surgicalData[,10], method = c( "r2"), nbest = 4) #nbest: number of subsets of each size to record

obj
$which
      1     2     3     4
1 FALSE FALSE  TRUE FALSE
1 FALSE FALSE FALSE  TRUE
1 FALSE  TRUE FALSE FALSE
1  TRUE FALSE FALSE FALSE
2 FALSE  TRUE  TRUE FALSE
2 FALSE FALSE  TRUE  TRUE
2  TRUE FALSE  TRUE FALSE
2 FALSE  TRUE FALSE  TRUE
3  TRUE  TRUE  TRUE FALSE
3 FALSE  TRUE  TRUE  TRUE
3  TRUE FALSE  TRUE  TRUE
3  TRUE  TRUE FALSE  TRUE
4  TRUE  TRUE  TRUE  TRUE

$label
[1] "(Intercept)" "1"           "2"           "3"           "4"          

$size
 [1] 2 2 2 2 3 3 3 3 4 4 4 4 5

$r2
 [1] 0.42756622 0.42154199 0.22084666 0.06060847 0.66328986 0.59948374
 [7] 0.54863462 0.48296742 0.75729185 0.71781636 0.61212320 0.48701249
[13] 0.75921083
tab <- cbind(obj$size, obj$r2)
tab
      [,1]       [,2]
 [1,]    2 0.42756622
 [2,]    2 0.42154199
 [3,]    2 0.22084666
 [4,]    2 0.06060847
 [5,]    3 0.66328986
 [6,]    3 0.59948374
 [7,]    3 0.54863462
 [8,]    3 0.48296742
 [9,]    4 0.75729185
[10,]    4 0.71781636
[11,]    4 0.61212320
[12,]    4 0.48701249
[13,]    5 0.75921083
tab <- data.frame(tab)
tab
   X1         X2
1   2 0.42756622
2   2 0.42154199
3   2 0.22084666
4   2 0.06060847
5   3 0.66328986
6   3 0.59948374
7   3 0.54863462
8   3 0.48296742
9   4 0.75729185
10  4 0.71781636
11  4 0.61212320
12  4 0.48701249
13  5 0.75921083
names(tab) <- c("p", "r2")
tab 
   p         r2
1  2 0.42756622
2  2 0.42154199
3  2 0.22084666
4  2 0.06060847
5  3 0.66328986
6  3 0.59948374
7  3 0.54863462
8  3 0.48296742
9  4 0.75729185
10 4 0.71781636
11 4 0.61212320
12 4 0.48701249
13 5 0.75921083
tabmax <- tab %>%
  group_by(p) %>%
  summarize(mr2 = max(r2))
tabmax
# A tibble: 4 × 2
      p   mr2
  <dbl> <dbl>
1     2 0.428
2     3 0.663
3     4 0.757
4     5 0.759
ggplot(data = tab, aes(x=p, y=r2)) + geom_point(color = "red") + 
   geom_line(data = tabmax, aes(x=p, y=mr2), color = "blue") + theme_bw()

tab2 <- cbind(obj$size, obj$r2, obj$which)
tab2
               1 2 3 4
1 2 0.42756622 0 0 1 0
1 2 0.42154199 0 0 0 1
1 2 0.22084666 0 1 0 0
1 2 0.06060847 1 0 0 0
2 3 0.66328986 0 1 1 0
2 3 0.59948374 0 0 1 1
2 3 0.54863462 1 0 1 0
2 3 0.48296742 0 1 0 1
3 4 0.75729185 1 1 1 0
3 4 0.71781636 0 1 1 1
3 4 0.61212320 1 0 1 1
3 4 0.48701249 1 1 0 1
4 5 0.75921083 1 1 1 1

Plot with \(R^2_{adj}\)

subsets <- regsubsets(x = surgicalData[1:4], y = surgicalData[, 10], nbest = 4) # nbest: number of subsets of each size to record; the criterion is chosen later via plot(subsets, scale = "adjr2")

summary(subsets)
## Subset selection object
## 4 Variables  (and intercept)
##    Forced in Forced out
## X1     FALSE      FALSE
## X2     FALSE      FALSE
## X3     FALSE      FALSE
## X4     FALSE      FALSE
## 4 subsets of each size up to 4
## Selection Algorithm: exhaustive
##          X1  X2  X3  X4 
## 1  ( 1 ) " " " " "*" " "
## 1  ( 2 ) " " " " " " "*"
## 1  ( 3 ) " " "*" " " " "
## 1  ( 4 ) "*" " " " " " "
## 2  ( 1 ) " " "*" "*" " "
## 2  ( 2 ) " " " " "*" "*"
## 2  ( 3 ) "*" " " "*" " "
## 2  ( 4 ) " " "*" " " "*"
## 3  ( 1 ) "*" "*" "*" " "
## 3  ( 2 ) " " "*" "*" "*"
## 3  ( 3 ) "*" " " "*" "*"
## 3  ( 4 ) "*" "*" " " "*"
## 4  ( 1 ) "*" "*" "*" "*"
plot(subsets, scale = "adjr2")

### Using leaps

obj <- leaps(x = surgicalData[1:4], y = surgicalData[,10], method = c("adjr2"), nbest = 4) # only change the method from previous criterion

obj
## $which
##       1     2     3     4
## 1 FALSE FALSE  TRUE FALSE
## 1 FALSE FALSE FALSE  TRUE
## 1 FALSE  TRUE FALSE FALSE
## 1  TRUE FALSE FALSE FALSE
## 2 FALSE  TRUE  TRUE FALSE
## 2 FALSE FALSE  TRUE  TRUE
## 2  TRUE FALSE  TRUE FALSE
## 2 FALSE  TRUE FALSE  TRUE
## 3  TRUE  TRUE  TRUE FALSE
## 3 FALSE  TRUE  TRUE  TRUE
## 3  TRUE FALSE  TRUE  TRUE
## 3  TRUE  TRUE FALSE  TRUE
## 4  TRUE  TRUE  TRUE  TRUE
## 
## $label
## [1] "(Intercept)" "1"           "2"           "3"           "4"          
## 
## $size
##  [1] 2 2 2 2 3 3 3 3 4 4 4 4 5
## 
## $adjr2
##  [1] 0.41655788 0.41041780 0.20586294 0.04254324 0.65008554 0.58377722
##  [7] 0.53093401 0.46269163 0.74272936 0.70088534 0.58885059 0.45623323
## [13] 0.73955457
tab <- cbind(obj$size, obj$adjr2)
tab <- data.frame(tab)
tab
##    X1         X2
## 1   2 0.41655788
## 2   2 0.41041780
## 3   2 0.20586294
## 4   2 0.04254324
## 5   3 0.65008554
## 6   3 0.58377722
## 7   3 0.53093401
## 8   3 0.46269163
## 9   4 0.74272936
## 10  4 0.70088534
## 11  4 0.58885059
## 12  4 0.45623323
## 13  5 0.73955457
names(tab) <- c("p", "adjr2")
tab 
##    p      adjr2
## 1  2 0.41655788
## 2  2 0.41041780
## 3  2 0.20586294
## 4  2 0.04254324
## 5  3 0.65008554
## 6  3 0.58377722
## 7  3 0.53093401
## 8  3 0.46269163
## 9  4 0.74272936
## 10 4 0.70088534
## 11 4 0.58885059
## 12 4 0.45623323
## 13 5 0.73955457
tabmax <- tab %>%
  group_by(p) %>%
  summarize(madjr2 = max(adjr2))
tabmax
## # A tibble: 4 × 2
##       p madjr2
##   <dbl>  <dbl>
## 1     2  0.417
## 2     3  0.650
## 3     4  0.743
## 4     5  0.740
ggplot(data = tab, aes(x=p, y=adjr2)) + geom_point(color = "red") + 
   geom_line(data = tabmax, aes(x=p, y=madjr2), color = "blue") + theme_bw()

tab2 <- cbind(obj$size, obj$adjr2, obj$which)
tab2
##                1 2 3 4
## 1 2 0.41655788 0 0 1 0
## 1 2 0.41041780 0 0 0 1
## 1 2 0.20586294 0 1 0 0
## 1 2 0.04254324 1 0 0 0
## 2 3 0.65008554 0 1 1 0
## 2 3 0.58377722 0 0 1 1
## 2 3 0.53093401 1 0 1 0
## 2 3 0.46269163 0 1 0 1
## 3 4 0.74272936 1 1 1 0
## 3 4 0.70088534 0 1 1 1
## 3 4 0.58885059 1 0 1 1
## 3 4 0.45623323 1 1 0 1
## 4 5 0.73955457 1 1 1 1
tab2[which.max(tab2[, 2]), ] # another way to find the best-subset model under the adjusted R^2 criterion
##                             1         2         3         4 
## 4.0000000 0.7427294 1.0000000 1.0000000 1.0000000 0.0000000

Plot with \(C_p\)

subsets <- regsubsets(x = surgicalData[1:4], y = surgicalData[, 10], nbest = 4) # nbest: number of subsets of each size to record; the criterion is chosen later via plot(subsets, scale = "Cp")

summary(subsets)
## Subset selection object
## 4 Variables  (and intercept)
##    Forced in Forced out
## X1     FALSE      FALSE
## X2     FALSE      FALSE
## X3     FALSE      FALSE
## X4     FALSE      FALSE
## 4 subsets of each size up to 4
## Selection Algorithm: exhaustive
##          X1  X2  X3  X4 
## 1  ( 1 ) " " " " "*" " "
## 1  ( 2 ) " " " " " " "*"
## 1  ( 3 ) " " "*" " " " "
## 1  ( 4 ) "*" " " " " " "
## 2  ( 1 ) " " "*" "*" " "
## 2  ( 2 ) " " " " "*" "*"
## 2  ( 3 ) "*" " " "*" " "
## 2  ( 4 ) " " "*" " " "*"
## 3  ( 1 ) "*" "*" "*" " "
## 3  ( 2 ) " " "*" "*" "*"
## 3  ( 3 ) "*" " " "*" "*"
## 3  ( 4 ) "*" "*" " " "*"
## 4  ( 1 ) "*" "*" "*" "*"
plot(subsets, scale = "Cp")

### Using leaps

obj <- leaps(x = surgicalData[1:4], y = surgicalData[,10], method = c( "Cp"), nbest = 4)

obj
## $which
##       1     2     3     4
## 1 FALSE FALSE  TRUE FALSE
## 1 FALSE FALSE FALSE  TRUE
## 1 FALSE  TRUE FALSE FALSE
## 1  TRUE FALSE FALSE FALSE
## 2 FALSE  TRUE  TRUE FALSE
## 2 FALSE FALSE  TRUE  TRUE
## 2  TRUE FALSE  TRUE FALSE
## 2 FALSE  TRUE FALSE  TRUE
## 3  TRUE  TRUE  TRUE FALSE
## 3 FALSE  TRUE  TRUE  TRUE
## 3  TRUE FALSE  TRUE  TRUE
## 3  TRUE  TRUE FALSE  TRUE
## 4  TRUE  TRUE  TRUE  TRUE
## 
## $label
## [1] "(Intercept)" "1"           "2"           "3"           "4"          
## 
## $size
##  [1] 2 2 2 2 3 3 3 3 4 4 4 4 5
## 
## $Cp
##  [1]  66.488856  67.714773 108.555776 141.163851  20.519679  33.504067
##  [7]  43.851738  57.214850   3.390508  11.423673  32.931969  58.391689
## [13]   5.000000
tab <- cbind(obj$size, obj$Cp)

tab <- data.frame(tab)
tab
##    X1         X2
## 1   2  66.488856
## 2   2  67.714773
## 3   2 108.555776
## 4   2 141.163851
## 5   3  20.519679
## 6   3  33.504067
## 7   3  43.851738
## 8   3  57.214850
## 9   4   3.390508
## 10  4  11.423673
## 11  4  32.931969
## 12  4  58.391689
## 13  5   5.000000
names(tab) <- c("p", "Cp")
tab 
##    p         Cp
## 1  2  66.488856
## 2  2  67.714773
## 3  2 108.555776
## 4  2 141.163851
## 5  3  20.519679
## 6  3  33.504067
## 7  3  43.851738
## 8  3  57.214850
## 9  4   3.390508
## 10 4  11.423673
## 11 4  32.931969
## 12 4  58.391689
## 13 5   5.000000
tabmax <- tab %>%
  group_by(p) %>%
  summarize(mCp = min(Cp))
tabmax
## # A tibble: 4 × 2
##       p   mCp
##   <dbl> <dbl>
## 1     2 66.5 
## 2     3 20.5 
## 3     4  3.39
## 4     5  5.00
ggplot(data = tab, aes(x=p, y=Cp)) + geom_point(color = "red") + 
   geom_line(data = tabmax, aes(x=p, y=mCp), color = "blue") + theme_bw()

tab2 <- cbind(obj$size, obj$Cp, obj$which)
tab2
##                1 2 3 4
## 1 2  66.488856 0 0 1 0
## 1 2  67.714773 0 0 0 1
## 1 2 108.555776 0 1 0 0
## 1 2 141.163851 1 0 0 0
## 2 3  20.519679 0 1 1 0
## 2 3  33.504067 0 0 1 1
## 2 3  43.851738 1 0 1 0
## 2 3  57.214850 0 1 0 1
## 3 4   3.390508 1 1 1 0
## 3 4  11.423673 0 1 1 1
## 3 4  32.931969 1 0 1 1
## 3 4  58.391689 1 1 0 1
## 4 5   5.000000 1 1 1 1
tab2[which.min(tab2[,2]),]
##                          1        2        3        4 
## 4.000000 3.390508 1.000000 1.000000 1.000000 0.000000

Multiple criteria at once

library(olsrr)
## 
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
## 
##     rivers
obj2 <- ols_step_all_possible(surgicalModel2)
obj2 # Default table
##    Index N  Predictors   R-Square Adj. R-Square Mallow's Cp
## 3      1 1          X3 0.42756622    0.41655788   66.488856
## 4      2 1          X4 0.42154199    0.41041780   67.714773
## 2      3 1          X2 0.22084666    0.20586294  108.555776
## 1      4 1          X1 0.06060847    0.04254324  141.163851
## 8      5 2       X2 X3 0.66328986    0.65008554   20.519679
## 10     6 2       X3 X4 0.59948374    0.58377722   33.504067
## 6      7 2       X1 X3 0.54863462    0.53093401   43.851738
## 9      8 2       X2 X4 0.48296742    0.46269163   57.214850
## 7      9 2       X1 X4 0.43010550    0.40775670   67.972119
## 5     10 2       X1 X2 0.26273627    0.23382397  102.031343
## 11    11 3    X1 X2 X3 0.75729185    0.74272936    3.390508
## 14    12 3    X2 X3 X4 0.71781636    0.70088534   11.423673
## 13    13 3    X1 X3 X4 0.61212320    0.58885059   32.931969
## 12    14 3    X1 X2 X4 0.48701249    0.45623323   58.391689
## 15    15 4 X1 X2 X3 X4 0.75921083    0.73955457    5.000000
plot(obj2) # Plots

Best subsets algorithms

library(olsrr)
# to obtain the set of plots on page 362
surgicalModelAll <- lm(lnY ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8, data = surgicalData)
summary(surgicalModelAll)
## 
## Call:
## lm(formula = lnY ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8, data = surgicalData)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35562 -0.13833 -0.05158  0.14949  0.46472 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.050515   0.251756  16.089  < 2e-16 ***
## X1           0.068512   0.025422   2.695  0.00986 ** 
## X2           0.013452   0.001947   6.909 1.39e-08 ***
## X3           0.014954   0.001809   8.264 1.43e-10 ***
## X4           0.008016   0.046708   0.172  0.86450    
## X5          -0.003566   0.002752  -1.296  0.20163    
## X6           0.084208   0.060750   1.386  0.17253    
## X7           0.057864   0.067483   0.857  0.39574    
## X8           0.388383   0.088380   4.394 6.69e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2093 on 45 degrees of freedom
## Multiple R-squared:  0.8461, Adjusted R-squared:  0.8188 
## F-statistic: 30.93 on 8 and 45 DF,  p-value: 7.8e-16
obj3 <- ols_step_all_possible(surgicalModelAll)
obj3 # Default table
##     Index N              Predictors   R-Square Adj. R-Square Mallow's Cp
## 3       1 1                      X3 0.42756622   0.416557881  117.409441
## 4       2 1                      X4 0.42154199   0.410417798  119.171240
## 2       3 1                      X2 0.22084666   0.205862939  177.865004
## 8       4 1                      X8 0.13896492   0.122406556  201.811487
## 1       5 1                      X1 0.06060847   0.042543244  224.726995
## 6       6 1                      X6 0.05384893   0.035653717  226.703835
## 5       7 1                      X5 0.02101436   0.002187716  236.306372
## 7       8 1                      X7 0.01602435  -0.002898254  237.765710
## 16      9 2                   X2 X3 0.66328986   0.650085544   50.471575
## 22     10 2                   X3 X4 0.59948374   0.583777218   69.131808
## 10     11 2                   X1 X3 0.54863462   0.530934014   84.002738
## 26     12 2                   X3 X8 0.51638460   0.497419295   93.434321
## 30     13 2                   X4 X8 0.50782098   0.488519841   95.938771
## 17     14 2                   X2 X4 0.48296742   0.462691631  103.207247
## 24     15 2                   X3 X6 0.44775235   0.426095576  113.505967
## 23     16 2                   X3 X5 0.44620869   0.424491387  113.957411
## 29     17 2                   X4 X7 0.43393030   0.411731492  117.548252
## 25     18 2                   X3 X7 0.43263981   0.410390386  117.925661
## 11     19 2                   X1 X4 0.43010550   0.407756700  118.666823
## 28     20 2                   X4 X6 0.42329987   0.400684181  120.657144
## 27     21 2                   X4 X5 0.42165329   0.398973031  121.138689
## 21     22 2                   X2 X8 0.39189208   0.368044713  129.842416
## 9      23 2                   X1 X2 0.26273627   0.233823967  167.614301
## 20     24 2                   X2 X7 0.25601115   0.226835112  169.581077
## 19     25 2                   X2 X6 0.25149268   0.222139448  170.902513
## 18     26 2                   X2 X5 0.23590259   0.205937984  175.461866
## 35     27 2                   X6 X8 0.20396789   0.172750941  184.801236
## 15     28 2                   X1 X8 0.16681363   0.134139660  195.667073
## 33     29 2                   X5 X8 0.14968015   0.116334270  200.677798
## 36     30 2                   X7 X8 0.14467058   0.111128255  202.142854
## 13     31 2                   X1 X6 0.10980148   0.074891738  212.340395
## 12     32 2                   X1 X5 0.08018051   0.044109155  221.003110
## 31     33 2                   X5 X6 0.07525635   0.038991895  222.443190
## 34     34 2                   X6 X7 0.07251794   0.036146098  223.244043
## 14     35 2                   X1 X7 0.07129317   0.034873293  223.602231
## 32     36 2                   X5 X7 0.03228060  -0.005669181  235.011538
## 62     37 3                X2 X3 X8 0.77803373   0.764715749   18.914496
## 37     38 3                X1 X2 X3 0.75729185   0.742729360   24.980500
## 58     39 3                X2 X3 X4 0.71781636   0.700885337   36.525190
## 61     40 3                X2 X3 X7 0.68096045   0.661818074   47.303776
## 59     41 3                X2 X3 X5 0.67614248   0.656711027   48.712801
## 60     42 3                X2 X3 X6 0.66972707   0.649910695   50.589000
## 76     43 3                X3 X4 X8 0.66912072   0.649267958   50.766330
## 43     44 3                X1 X3 X4 0.61212320   0.588850588   67.435372
## 75     45 3                X3 X4 X7 0.60535901   0.581680554   69.413572
## 73     46 3                X3 X4 X5 0.60155797   0.577651443   70.525196
## 74     47 3                X3 X4 X6 0.60063138   0.576669264   70.796177
## 47     48 3                X1 X3 X8 0.59664418   0.572442836   71.962241
## 66     49 3                X2 X4 X8 0.59317780   0.568768468   72.975993
## 44     50 3                X1 X3 X5 0.56517531   0.539085831   81.165378
## 45     51 3                X1 X3 X6 0.56298658   0.536765778   81.805477
## 46     52 3                X1 X3 X7 0.54971929   0.522702446   85.685524
## 81     53 3                X3 X6 X8 0.54387486   0.516507349   87.394740
## 51     54 3                X1 X4 X8 0.53044387   0.502270499   91.322661
## 79     55 3                X3 X5 X8 0.52714735   0.498776188   92.286735
## 82     56 3                X3 X7 X8 0.52510928   0.496615835   92.882773
## 87     57 3                X4 X6 X8 0.51307031   0.483854533   96.403593
## 88     58 3                X4 X7 X8 0.50998468   0.480583760   97.305993
## 85     59 3                X4 X5 X8 0.50807180   0.478556112   97.865417
## 65     60 3                X2 X4 X7 0.50515075   0.475459798   98.719685
## 38     61 3                X1 X2 X4 0.48701249   0.456233235  104.024258
## 64     62 3                X2 X4 X6 0.48439147   0.453454963  104.790778
## 63     63 3                X2 X4 X5 0.48330696   0.452305382  105.107946
## 77     64 3                X3 X5 X6 0.46669524   0.434696949  109.966080
## 80     65 3                X3 X6 X7 0.45402812   0.421269809  113.670604
## 78     66 3                X3 X5 X7 0.44886096   0.415792619  115.181750
## 50     67 3                X1 X4 X7 0.44472680   0.411410409  116.390794
## 86     68 3                X4 X6 X7 0.43621403   0.402386869  118.880372
## 84     69 3                X4 X5 X7 0.43396561   0.400003551  119.537925
## 49     70 3                X1 X4 X6 0.43102308   0.396884467  120.398475
## 71     71 3                X2 X6 X8 0.43036308   0.396184860  120.591495
## 48     72 3                X1 X4 X5 0.43010752   0.395913975  120.666232
## 83     73 3                X4 X5 X6 0.42348475   0.388893836  122.603075
## 42     74 3                X1 X2 X8 0.40425587   0.368511225  128.226601
## 69     75 3                X2 X5 X8 0.39746072   0.361308360  130.213858
## 72     76 3                X2 X7 X8 0.39259800   0.356153882  131.635969
## 40     77 3                X1 X2 X6 0.29118358   0.248654593  161.294827
## 41     78 3                X1 X2 X7 0.29036093   0.247782585  161.535412
## 70     79 3                X2 X6 X7 0.28851309   0.245823876  162.075816
## 39     80 3                X1 X2 X5 0.27697980   0.233598586  165.448752
## 67     81 3                X2 X5 X6 0.26706319   0.223086986  168.348883
## 68     82 3                X2 X5 X7 0.26493872   0.220835044  168.970189
## 56     83 3                X1 X6 X8 0.22721230   0.180845035  180.003360
## 90     84 3                X5 X6 X8 0.21463019   0.167508001  183.683023
## 92     85 3                X6 X7 X8 0.20915415   0.161703395  185.284503
## 54     86 3                X1 X5 X8 0.17769501   0.128356714  194.484792
## 57     87 3                X1 X7 X8 0.17196881   0.122286939  196.159432
## 91     88 3                X5 X7 X8 0.15726864   0.106704756  200.458528
## 52     89 3                X1 X5 X6 0.12979185   0.077579363  208.494170
## 55     90 3                X1 X6 X7 0.12278427   0.070151328  210.543552
## 89     91 3                X5 X6 X7 0.08873885   0.034063177  220.500207
## 53     92 3                X1 X5 X7 0.08716732   0.032397357  220.959803
## 97     93 4             X1 X2 X3 X8 0.82988400   0.815996984    5.750774
## 131    94 4             X2 X3 X4 X8 0.81444134   0.799293694   10.267014
## 136    95 4             X2 X3 X6 X8 0.78876213   0.771518218   17.776952
## 134    96 4             X2 X3 X5 X8 0.78356252   0.765894154   19.297588
## 137    97 4             X2 X3 X7 X8 0.77995759   0.761994941   20.351858
## 94     98 4             X1 X2 X3 X5 0.76887693   0.750009741   23.592419
## 96     99 4             X1 X2 X3 X7 0.76649685   0.747435371   24.288478
## 95    100 4             X1 X2 X3 X6 0.76135339   0.741872034   25.792694
## 93    101 4             X1 X2 X3 X4 0.75921083   0.739554570   26.419290
## 130   102 4             X2 X3 X4 X7 0.73293141   0.711129897   34.104759
## 128   103 4             X2 X3 X4 X5 0.72177867   0.699066722   37.366403
## 129   104 4             X2 X3 X4 X6 0.71851944   0.695541433   38.319572
## 133   105 4             X2 X3 X5 X7 0.68975331   0.664427047   46.732286
## 135   106 4             X2 X3 X6 X7 0.68828952   0.662843767   47.160374
## 132   107 4             X2 X3 X5 X6 0.68283811   0.656947338   48.754651
## 153   108 4             X3 X4 X7 X8 0.67374296   0.647109728   51.414546
## 152   109 4             X3 X4 X6 X8 0.67293201   0.646232581   51.651710
## 111   110 4             X1 X3 X4 X8 0.67144798   0.644627405   52.085717
## 150   111 4             X3 X4 X5 X8 0.66953109   0.642554036   52.646315
## 116   112 4             X1 X3 X6 X8 0.61673165   0.585444440   68.087620
## 108   113 4             X1 X3 X4 X5 0.61616796   0.584834732   68.252473
## 110   114 4             X1 X3 X4 X7 0.61574513   0.584377382   68.376131
## 109   115 4             X1 X3 X4 X6 0.61456936   0.583105630   68.719988
## 101   116 4             X1 X2 X4 X8 0.60837935   0.576410314   70.530269
## 114   117 4             X1 X3 X5 X8 0.60769916   0.575674602   70.729191
## 151   118 4             X3 X4 X6 X7 0.60680757   0.574710232   70.989937
## 149   119 4             X3 X4 X5 X7 0.60651769   0.574396688   71.074714
## 117   120 4             X1 X3 X7 X8 0.60446866   0.572180384   71.673958
## 148   121 4             X3 X4 X5 X6 0.60294038   0.570527347   72.120907
## 142   122 4             X2 X4 X6 X8 0.59832171   0.565531642   73.471646
## 143   123 4             X2 X4 X7 X8 0.59363981   0.560467552   74.840876
## 140   124 4             X2 X4 X5 X8 0.59328136   0.560079837   74.945707
## 112   125 4             X1 X3 X5 X6 0.57981300   0.545512016   78.884557
## 113   126 4             X1 X3 X5 X7 0.56537611   0.529896606   83.106655
## 115   127 4             X1 X3 X6 X7 0.56463256   0.529092365   83.324106
## 155   128 4             X3 X5 X6 X8 0.55460080   0.518241677   86.257917
## 157   129 4             X3 X6 X7 X8 0.55204340   0.515475516   87.005832
## 156   130 4             X3 X5 X7 X8 0.53819681   0.500498585   91.055298
## 122   131 4             X1 X4 X6 X8 0.53357952   0.495504377   92.405633
## 123   132 4             X1 X4 X7 X8 0.53259174   0.494435968   92.694510
## 120   133 4             X1 X4 X5 X8 0.53158918   0.493351559   92.987713
## 161   134 4             X4 X6 X7 X8 0.51523987   0.475667618   97.769100
## 159   135 4             X4 X5 X6 X8 0.51319753   0.473458556   98.366387
## 100   136 4             X1 X2 X4 X7 0.51093403   0.471010279   99.028353
## 160   137 4             X4 X5 X7 X8 0.51009251   0.470100058   99.274459
## 141   138 4             X2 X4 X6 X7 0.50719957   0.466970959  100.120505
## 139   139 4             X2 X4 X5 X7 0.50516062   0.464765563  100.716800
## 99    140 4             X1 X2 X4 X6 0.48790530   0.446101649  105.763153
## 98    141 4             X1 X2 X4 X5 0.48715464   0.445289709  105.982686
## 138   142 4             X2 X4 X5 X6 0.48483977   0.442785875  106.659673
## 154   143 4             X3 X5 X6 X7 0.47022848   0.426981824  110.932776
## 121   144 4             X1 X4 X6 X7 0.44596325   0.400735755  118.029193
## 119   145 4             X1 X4 X5 X7 0.44505096   0.399748996  118.295993
## 106   146 4             X1 X2 X6 X8 0.44087433   0.395231417  119.517457
## 158   147 4             X4 X5 X6 X7 0.43622212   0.390199434  120.878006
## 145   148 4             X2 X5 X6 X8 0.43610769   0.390075662  120.911471
## 147   149 4             X2 X6 X7 X8 0.43104657   0.384601390  122.391606
## 118   150 4             X1 X4 X5 X6 0.43103933   0.384593556  122.393724
## 104   151 4             X1 X2 X5 X8 0.41001507   0.361853036  128.542311
## 107   152 4             X1 X2 X7 X8 0.40489454   0.356314504  130.039821
## 146   153 4             X2 X5 X7 X8 0.39870369   0.349618273  131.850349
## 105   154 4             X1 X2 X6 X7 0.32061032   0.265149937  154.688915
## 102   155 4             X1 X2 X5 X6 0.30593119   0.249272512  158.981857
## 103   156 4             X1 X2 X5 X7 0.29930989   0.242110694  160.918271
## 144   157 4             X2 X5 X6 X7 0.29770259   0.240372187  161.388329
## 125   158 4             X1 X5 X6 X8 0.23802809   0.175826301  178.840259
## 127   159 4             X1 X6 X7 X8 0.23193438   0.169235145  180.622377
## 162   160 4             X5 X6 X7 X8 0.22161023   0.158068213  183.641695
## 126   161 4             X1 X5 X7 X8 0.18465576   0.118097046  194.449107
## 124   162 4             X1 X5 X6 X7 0.13864225   0.068327335  207.905852
## 171   163 5          X1 X2 X3 X6 X8 0.83744126   0.820508061    5.540639
## 169   164 5          X1 X2 X3 X5 X8 0.83580826   0.818704958    6.018212
## 166   165 5          X1 X2 X3 X4 X8 0.83313992   0.815758658    6.798576
## 172   166 5          X1 X2 X3 X7 X8 0.83167542   0.814141607    7.226872
## 202   167 5          X2 X3 X4 X6 X8 0.81788212   0.798911508   11.260750
## 203   168 5          X2 X3 X4 X7 X8 0.81594726   0.796775100   11.826604
## 200   169 5          X2 X3 X4 X5 X8 0.81566233   0.796460487   11.909933
## 205   170 5          X2 X3 X5 X6 X8 0.79438537   0.772967184   18.132422
## 207   171 5          X2 X3 X6 X7 X8 0.79062800   0.768818413   19.231275
## 206   172 5          X2 X3 X5 X7 X8 0.78633272   0.764075717   20.487436
## 168   173 5          X1 X2 X3 X5 X7 0.77531394   0.751909146   23.709901
## 167   174 5          X1 X2 X3 X5 X6 0.77314833   0.749517945   24.343240
## 170   175 5          X1 X2 X3 X6 X7 0.77116415   0.747327077   24.923518
## 163   176 5          X1 X2 X3 X4 X5 0.76908567   0.745032092   25.531373
## 165   177 5          X1 X2 X3 X4 X7 0.76885947   0.744782335   25.597525
## 164   178 5          X1 X2 X3 X4 X6 0.76214345   0.737366725   27.561640
## 199   179 5          X2 X3 X4 X5 X7 0.73495391   0.707344937   35.513278
## 201   180 5          X2 X3 X4 X6 X7 0.73401319   0.706306235   35.788391
## 198   181 5          X2 X3 X4 X5 X6 0.72273768   0.693856183   39.085940
## 204   182 5          X2 X3 X5 X6 X7 0.69720963   0.665668966   46.551669
## 216   183 5          X3 X4 X6 X7 X8 0.67754959   0.643961003   52.301289
## 188   184 5          X1 X3 X4 X7 X8 0.67627657   0.642555380   52.673586
## 187   185 5          X1 X3 X4 X6 X8 0.67601873   0.642270684   52.748991
## 215   186 5          X3 X4 X5 X7 X8 0.67457994   0.640682020   53.169768
## 214   187 5          X3 X4 X5 X6 X8 0.67351159   0.639502381   53.482210
## 185   188 5          X1 X3 X4 X5 X8 0.67232795   0.638195447   53.828367
## 190   189 5          X1 X3 X5 X6 X8 0.62774183   0.588964941   66.867670
## 192   190 5          X1 X3 X6 X7 X8 0.62414055   0.584988524   67.920873
## 183   191 5          X1 X3 X4 X5 X6 0.61926220   0.579602015   69.347555
## 184   192 5          X1 X3 X4 X5 X7 0.61860448   0.578875783   69.539907
## 186   193 5          X1 X3 X4 X6 X7 0.61840093   0.578651025   69.599437
## 191   194 5          X1 X3 X5 X7 X8 0.61775534   0.577938188   69.788240
## 177   195 5          X1 X2 X4 X6 X8 0.61177573   0.571335697   71.536991
## 175   196 5          X1 X2 X4 X5 X8 0.60901545   0.568287894   72.344239
## 178   197 5          X1 X2 X4 X7 X8 0.60887957   0.568137857   72.383978
## 213   198 5          X3 X4 X5 X6 X7 0.60814052   0.567321819   72.600116
## 211   199 5          X2 X4 X6 X7 X8 0.59878708   0.556994067   75.335547
## 209   200 5          X2 X4 X5 X6 X8 0.59835369   0.556515537   75.462292
## 210   201 5          X2 X4 X5 X7 X8 0.59369913   0.551376123   76.823529
## 189   202 5          X1 X3 X5 X6 X7 0.58028658   0.536566431   80.746057
## 217   203 5          X3 X5 X6 X7 X8 0.56501667   0.519705905   85.211774
## 196   204 5          X1 X4 X6 X7 X8 0.53573260   0.487371409   93.775961
## 194   205 5          X1 X4 X5 X6 X8 0.53445535   0.485961112   94.149495
## 195   206 5          X1 X4 X5 X7 X8 0.53340375   0.484799976   94.457037
## 218   207 5          X4 X5 X6 X7 X8 0.51527319   0.464780813   99.759357
## 176   208 5          X1 X2 X4 X6 X7 0.51223908   0.461430646  100.646690
## 174   209 5          X1 X2 X4 X5 X7 0.51108042   0.460151296  100.985541
## 208   210 5          X2 X4 X5 X6 X7 0.50719961   0.455866234  102.120493
## 173   211 5          X1 X2 X4 X5 X6 0.48811657   0.434795382  107.701366
## 180   212 5          X1 X2 X5 X6 X8 0.44679328   0.389167576  119.786449
## 193   213 5          X1 X4 X5 X6 X7 0.44619513   0.388507123  119.961378
## 182   214 5          X1 X2 X6 X7 X8 0.44149715   0.383319773  121.335311
## 212   215 5          X2 X5 X6 X7 X8 0.43732990   0.378718435  122.554032
## 181   216 5          X1 X2 X5 X7 X8 0.41117687   0.349841124  130.202541
## 179   217 5          X1 X2 X5 X6 X7 0.32981166   0.260000371  153.997965
## 197   218 5          X1 X5 X6 X7 X8 0.24447620   0.165775801  178.954497
## 226   219 6       X1 X2 X3 X5 X6 X8 0.84343626   0.823449402    5.787389
## 228   220 6       X1 X2 X3 X6 X7 X8 0.83918918   0.818660139    7.029456
## 223   221 6       X1 X2 X3 X4 X6 X8 0.83872170   0.818132982    7.166172
## 227   222 6       X1 X2 X3 X5 X7 X8 0.83844803   0.817824375    7.246207
## 221   223 6       X1 X2 X3 X4 X5 X8 0.83714610   0.816356244    7.626958
## 224   224 6       X1 X2 X3 X4 X7 X8 0.83479488   0.813704868    8.314578
## 243   225 6       X2 X3 X4 X6 X7 X8 0.81938834   0.796331530   12.820254
## 241   226 6       X2 X3 X4 X5 X6 X8 0.81937072   0.796311668   12.825405
## 242   227 6       X2 X3 X4 X5 X7 X8 0.81754448   0.794252288   13.359493
## 244   228 6       X2 X3 X5 X6 X7 X8 0.79709267   0.771189602   19.340669
## 225   229 6       X1 X2 X3 X5 X6 X7 0.78008280   0.752008264   24.315239
## 220   230 6       X1 X2 X3 X4 X5 X7 0.77581462   0.747195207   25.563478
## 219   231 6       X1 X2 X3 X4 X5 X6 0.77316271   0.744204756   26.339035
## 222   232 6       X1 X2 X3 X4 X6 X7 0.77218459   0.743101776   26.625086
## 240   233 6       X2 X3 X4 X5 X6 X7 0.73623378   0.702561492   37.138977
## 237   234 6       X1 X3 X4 X6 X7 X8 0.68087463   0.640135221   53.328874
## 246   235 6       X3 X4 X5 X6 X7 X8 0.67862460   0.637597958   53.986898
## 236   236 6       X1 X3 X4 X5 X7 X8 0.67781288   0.636682605   54.224290
## 235   237 6       X1 X3 X4 X5 X6 X8 0.67727606   0.636077263   54.381282
## 238   238 6       X1 X3 X5 X6 X7 X8 0.63731987   0.591020282   66.066552
## 234   239 6       X1 X3 X4 X5 X6 X7 0.62179533   0.573513886   70.606736
## 232   240 6       X1 X2 X4 X6 X7 X8 0.61227672   0.562780136   73.390473
## 230   241 6       X1 X2 X4 X5 X6 X8 0.61220642   0.562700859   73.411033
## 231   242 6       X1 X2 X4 X5 X7 X8 0.60939705   0.559532848   74.232638
## 245   243 6       X2 X4 X5 X6 X7 X8 0.59879690   0.547579483   77.332675
## 239   244 6       X1 X4 X5 X6 X7 X8 0.53631780   0.477124325   95.604818
## 229   245 6       X1 X2 X4 X5 X6 X7 0.51232412   0.450067624  102.621819
## 233   246 6       X1 X2 X5 X6 X7 X8 0.44794155   0.377466003  121.450634
## 251   247 7    X1 X2 X3 X5 X6 X7 X8 0.84602791   0.822597377    7.029455
## 248   248 7    X1 X2 X3 X4 X5 X6 X8 0.84361461   0.819816836    7.735230
## 250   249 7    X1 X2 X3 X4 X6 X7 X8 0.84038662   0.816097625    8.679263
## 249   250 7    X1 X2 X3 X4 X5 X7 X8 0.83955867   0.815143680    8.921400
## 254   251 7    X2 X3 X4 X5 X6 X7 X8 0.82129305   0.794098517   14.263216
## 247   252 7    X1 X2 X3 X4 X5 X6 X7 0.78009570   0.746632006   26.311466
## 253   253 7    X1 X3 X4 X5 X6 X7 X8 0.68291063   0.634657905   54.733440
## 252   254 7    X1 X2 X4 X5 X6 X7 X8 0.61260940   0.553658658   75.293181
## 255   255 8 X1 X2 X3 X4 X5 X6 X7 X8 0.84612863   0.818773720    9.000000
plot(obj3)  # plot the selection criteria across the candidate subsets

# To get Table 9.3
library(leaps)
# Note: regsubsets() has no "which" argument; each criterion (adjr2, cp, rss)
# is read from the summary object after the exhaustive search.
obj <- regsubsets(x = surgicalData[1:8], y = surgicalData[, 10], nbest = 1)
objSummary <- summary(obj)
objSummary
## Subset selection object
## 8 Variables  (and intercept)
##    Forced in Forced out
## X1     FALSE      FALSE
## X2     FALSE      FALSE
## X3     FALSE      FALSE
## X4     FALSE      FALSE
## X5     FALSE      FALSE
## X6     FALSE      FALSE
## X7     FALSE      FALSE
## X8     FALSE      FALSE
## 1 subsets of each size up to 8
## Selection Algorithm: exhaustive
##          X1  X2  X3  X4  X5  X6  X7  X8 
## 1  ( 1 ) " " " " "*" " " " " " " " " " "
## 2  ( 1 ) " " "*" "*" " " " " " " " " " "
## 3  ( 1 ) " " "*" "*" " " " " " " " " "*"
## 4  ( 1 ) "*" "*" "*" " " " " " " " " "*"
## 5  ( 1 ) "*" "*" "*" " " " " "*" " " "*"
## 6  ( 1 ) "*" "*" "*" " " "*" "*" " " "*"
## 7  ( 1 ) "*" "*" "*" " " "*" "*" "*" "*"
## 8  ( 1 ) "*" "*" "*" "*" "*" "*" "*" "*"
objSummary$adjr2
## [1] 0.4165579 0.6500855 0.7647157 0.8159970 0.8205081 0.8234494 0.8225974
## [8] 0.8187737
obj <- regsubsets(x = surgicalData[1:8], y = surgicalData[, 10], nbest = 1)  # Cp is read from summary(obj)$cp
objSummary <- summary(obj)
objSummary
## Subset selection object
## 8 Variables  (and intercept)
##    Forced in Forced out
## X1     FALSE      FALSE
## X2     FALSE      FALSE
## X3     FALSE      FALSE
## X4     FALSE      FALSE
## X5     FALSE      FALSE
## X6     FALSE      FALSE
## X7     FALSE      FALSE
## X8     FALSE      FALSE
## 1 subsets of each size up to 8
## Selection Algorithm: exhaustive
##          X1  X2  X3  X4  X5  X6  X7  X8 
## 1  ( 1 ) " " " " "*" " " " " " " " " " "
## 2  ( 1 ) " " "*" "*" " " " " " " " " " "
## 3  ( 1 ) " " "*" "*" " " " " " " " " "*"
## 4  ( 1 ) "*" "*" "*" " " " " " " " " "*"
## 5  ( 1 ) "*" "*" "*" " " " " "*" " " "*"
## 6  ( 1 ) "*" "*" "*" " " "*" "*" " " "*"
## 7  ( 1 ) "*" "*" "*" " " "*" "*" "*" "*"
## 8  ( 1 ) "*" "*" "*" "*" "*" "*" "*" "*"
objSummary$cp
## [1] 117.409441  50.471575  18.914496   5.750774   5.540639   5.787389   7.029455
## [8]   9.000000
obj <- regsubsets(x = surgicalData[1:8], y = surgicalData[, 10], nbest = 1)  # RSS is read from summary(obj)$rss
objSummary <- summary(obj)
objSummary
## Subset selection object
## 8 Variables  (and intercept)
##    Forced in Forced out
## X1     FALSE      FALSE
## X2     FALSE      FALSE
## X3     FALSE      FALSE
## X4     FALSE      FALSE
## X5     FALSE      FALSE
## X6     FALSE      FALSE
## X7     FALSE      FALSE
## X8     FALSE      FALSE
## 1 subsets of each size up to 8
## Selection Algorithm: exhaustive
##          X1  X2  X3  X4  X5  X6  X7  X8 
## 1  ( 1 ) " " " " "*" " " " " " " " " " "
## 2  ( 1 ) " " "*" "*" " " " " " " " " " "
## 3  ( 1 ) " " "*" "*" " " " " " " " " "*"
## 4  ( 1 ) "*" "*" "*" " " " " " " " " "*"
## 5  ( 1 ) "*" "*" "*" " " " " "*" " " "*"
## 6  ( 1 ) "*" "*" "*" " " "*" "*" " " "*"
## 7  ( 1 ) "*" "*" "*" " " "*" "*" "*" "*"
## 8  ( 1 ) "*" "*" "*" "*" "*" "*" "*" "*"
objSummary$rss
## [1] 7.331575 4.312491 2.842883 2.178799 2.082008 2.005225 1.972032 1.970742
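The three criteria are linked: for a subset with \(p\) parameters, \(C_p = SSE_p / MSE_{full} - (n - 2p)\). A minimal check, assuming `objSummary` is the summary object from the `regsubsets()` run above:

```r
n <- nrow(surgicalData)                   # 54 patients in the model-building set
p <- 2:9                                  # parameters: intercept + 1..8 predictors
mseFull <- objSummary$rss[8] / (n - 9)    # MSE of the full eight-predictor model
cpManual <- objSummary$rss / mseFull - (n - 2 * p)
cpManual                                  # should reproduce objSummary$cp
```

For the full model this reduces to \(C_p = p\), which matches the value 9.000000 reported above.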

Forward Stepwise Regression

fit <- lm(lnY ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8, data = surgicalData)  # Full model
# Note: with direction = "forward", step() can only add terms, so starting from
# the full model returns the full model unchanged.
o <- step(fit, direction = "forward")
## Start:  AIC=-160.77
## lnY ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8
summary(o)
## 
## Call:
## lm(formula = lnY ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8, data = surgicalData)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35562 -0.13833 -0.05158  0.14949  0.46472 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.050515   0.251756  16.089  < 2e-16 ***
## X1           0.068512   0.025422   2.695  0.00986 ** 
## X2           0.013452   0.001947   6.909 1.39e-08 ***
## X3           0.014954   0.001809   8.264 1.43e-10 ***
## X4           0.008016   0.046708   0.172  0.86450    
## X5          -0.003566   0.002752  -1.296  0.20163    
## X6           0.084208   0.060750   1.386  0.17253    
## X7           0.057864   0.067483   0.857  0.39574    
## X8           0.388383   0.088380   4.394 6.69e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2093 on 45 degrees of freedom
## Multiple R-squared:  0.8461, Adjusted R-squared:  0.8188 
## F-statistic: 30.93 on 8 and 45 DF,  p-value: 7.8e-16
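Because `step()` with `direction = "forward"` can only add terms, a genuine forward search must start from the intercept-only model and be given a scope of candidate predictors. A sketch (output omitted):

```r
# Intercept-only starting model on the same model-building set
nullFit <- lm(lnY ~ 1, data = surgicalData)

# Forward stepwise search over the eight candidate predictors
fwd <- step(nullFit,
            scope = ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8,
            direction = "forward")
summary(fwd)
```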

Model Validation

# Get the Validation set
surgicalDataValidation <- read.table("http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%20%209%20Data%20Sets/CH09TA05.txt", sep ="" , header = FALSE)

names(surgicalDataValidation) <- c(paste("X", 1:8, sep = ""), "Y" , "lnY")

dim(surgicalDataValidation)
## [1] 54 10
#------------------------------------------------------------------------------------
# We selected X1, X2, X3, X6, X8 using the Cp criterion in the earlier section
lmCpVal <- lm(lnY ~ X1 + X2 + X3 + X6 + X8, data = surgicalDataValidation)

summary(lmCpVal)
## 
## Call:
## lm(formula = lnY ~ X1 + X2 + X3 + X6 + X8, data = surgicalDataValidation)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.57757 -0.13516  0.01021  0.13439  0.49539 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.614319   0.290665  12.435  < 2e-16 ***
## X1          0.099885   0.032295   3.093   0.0033 ** 
## X2          0.015910   0.002374   6.702  2.1e-08 ***
## X3          0.015447   0.002042   7.566  1.0e-09 ***
## X6          0.073091   0.079154   0.923   0.3604    
## X8          0.188575   0.096622   1.952   0.0568 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2787 on 48 degrees of freedom
## Multiple R-squared:  0.7115, Adjusted R-squared:  0.6815 
## F-statistic: 23.68 on 5 and 48 DF,  p-value: 6.482e-12
MSPR1 <- summary(lmCpVal)$sigma^2 # Validation-set MSE, used here as an estimate of the MSPR
MSPR1
## [1] 0.07768364
adjRsq1 <- summary(lmCpVal)$adj.r.squared
adjRsq1
## [1] 0.681457
#names(summary(lmCpVal))

#------------------------------------------------------------------------------------
# We selected X1, X2, X3, X5, X6, X8 using the adjusted R-squared criterion in the earlier section

lmRVal <- lm(lnY ~ X1 + X2 + X3 + X5 + X6 + X8, data = surgicalDataValidation)

summary(lmRVal)
## 
## Call:
## lm(formula = lnY ~ X1 + X2 + X3 + X5 + X6 + X8, data = surgicalDataValidation)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.60092 -0.12580  0.01123  0.13493  0.49742 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.469928   0.346781  10.006 3.14e-13 ***
## X1          0.098737   0.032466   3.041  0.00385 ** 
## X2          0.016203   0.002414   6.712 2.23e-08 ***
## X3          0.015582   0.002058   7.572 1.12e-09 ***
## X5          0.002540   0.003293   0.771  0.44443    
## X6          0.072651   0.079493   0.914  0.36542    
## X8          0.193054   0.097206   1.986  0.05288 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2799 on 47 degrees of freedom
## Multiple R-squared:  0.7151, Adjusted R-squared:  0.6787 
## F-statistic: 19.66 on 6 and 47 DF,  p-value: 2.525e-11
MSPR2 <- summary(lmRVal)$sigma^2 # Validation-set MSE, used here as an estimate of the MSPR
MSPR2
## [1] 0.07834499
adjRsq2 <- summary(lmRVal)$adj.r.squared
adjRsq2
## [1] 0.6787451
#------------------------------------------------------------------------------------
# Making a table so it is easier to compare the models

compare <- data.frame(matrix(c(MSPR1, MSPR2, adjRsq1, adjRsq2), ncol = 2, byrow = TRUE))
colnames(compare) <- c( "Model 1" , "Model 2")
rownames(compare) <- c( "MSPR" , "Adj R square")
compare
##                 Model 1    Model 2
## MSPR         0.07768364 0.07834499
## Adj R square 0.68145696 0.67874506
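Note that refitting on the validation set, as above, gives the validation-set MSE. The textbook MSPR instead applies the training-set fit to the validation responses. A hedged sketch, assuming both `surgicalData` (model-building set) and `surgicalDataValidation` are loaded:

```r
# Fit the Cp-selected model on the model-building (training) set
lmCpTrain <- lm(lnY ~ X1 + X2 + X3 + X6 + X8, data = surgicalData)

# Predict the validation responses using the training-set coefficients
predVal <- predict(lmCpTrain, newdata = surgicalDataValidation)

# MSPR = mean squared prediction error over the validation cases
MSPR <- mean((surgicalDataValidation$lnY - predVal)^2)
MSPR   # compare with the training-set MSE, summary(lmCpTrain)$sigma^2
```

An MSPR close to the training-set MSE indicates that the selected model generalizes well to new cases.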