Fitting and interpreting models
# Data: Paris Paintings
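library(tidyverse)  # read_csv() and %>%
library(tidymodels) # linear_reg(), set_engine(), fit(), tidy()
pp <- read_csv("paris-paintings.csv", na = c("n/a", "", "NA"))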
##
## -- Column specification --------------------------------------------------------
## cols(
## .default = col_double(),
## name = col_character(),
## sale = col_character(),
## lot = col_character(),
## dealer = col_character(),
## origin_author = col_character(),
## origin_cat = col_character(),
## school_pntg = col_character(),
## price = col_number(),
## subject = col_character(),
## authorstandard = col_character(),
## authorstyle = col_character(),
## author = col_character(),
## winningbidder = col_character(),
## winningbiddertype = col_character(),
## endbuyer = col_character(),
## type_intermed = col_character(),
## Shape = col_character(),
## material = col_character(),
## mat = col_character(),
## materialCat = col_character()
## )
## i Use `spec()` for the full column specifications.
# Goal: Predict height from width
\[\widehat{height}_{i} = \beta_0 + \beta_1 \times width_{i}\]


Step 1: Specify model
linear_reg()
## Linear Regression Model Specification (regression)
Step 2: Set model fitting engine
linear_reg() %>%
  set_engine("lm") # lm: linear model
## Linear Regression Model Specification (regression)
##
## Computational engine: lm
Step 3: Fit model & estimate parameters
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ Width_in, data = pp)
## parsnip model object
##
## Fit time: 21ms
##
## Call:
## stats::lm(formula = Height_in ~ Width_in, data = data)
##
## Coefficients:
## (Intercept) Width_in
## 3.6214 0.7808
A tidy look at model output
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ Width_in, data = pp) %>%
  tidy()
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 3.62 0.254 14.3 8.82e-45
## 2 Width_in 0.781 0.00950 82.1 0
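Read off the `tidy()` output, the fitted line can be written out explicitly, with coefficients rounded as in the table. The 20-inch width is only an illustrative value, not a painting from the data:

\[\widehat{height}_{i} = 3.62 + 0.781 \times width_{i}\]

\[width = 20: \quad \widehat{height} = 3.62 + 0.781 \times 20 = 19.24 \text{ inches}\]

- Slope: for each additional inch of width, height is expected to be higher, on average, by 0.781 inches.
- Intercept: paintings that are 0 inches wide are expected to be 3.62 inches tall, on average, which has no practical meaning here.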
Visualizing residuals
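The plot for this slide did not survive in the text. A minimal sketch of one way to draw it, assuming a residuals-versus-predicted scatterplot is what is wanted; `plot_height_residuals()` is a hypothetical helper name, not a tidymodels function:

```r
library(tidymodels) # attaches parsnip, broom, ggplot2 and the %>% pipe

# Hypothetical helper: fit Height_in ~ Width_in on `data` and
# plot the residuals against the predicted heights.
plot_height_residuals <- function(data) {
  height_fit <- linear_reg() %>%
    set_engine("lm") %>%
    fit(Height_in ~ Width_in, data = data)

  # broom's augment() on the underlying lm object adds
  # .fitted (predicted height) and .resid (observed - predicted)
  height_aug <- augment(height_fit$fit)

  ggplot(height_aug, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
    labs(x = "Predicted height (in)", y = "Residual (in)")
}
```

With the data loaded earlier, `plot_height_residuals(pp)` draws the plot. Points scattered evenly around the dashed zero line suggest a straight line is a reasonable description of the relationship.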

Models with categorical explanatory variables (Height & landscape features)
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ factor(landsALL), data = pp) %>%
  tidy()
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 22.7 0.328 69.1 0
## 2 factor(landsALL)1 -5.65 0.532 -10.6 7.97e-26
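The two coefficients translate into one predicted height per group. A short worked version with the rounded estimates, assuming (from the slide title) that `landsALL = 1` marks paintings with landscape features:

\[\widehat{height}_{i} = 22.7 - 5.65 \times landsALL_{i}\]

\[landsALL = 0: \quad 22.7 \qquad\qquad landsALL = 1: \quad 22.7 - 5.65 = 17.05\]

- Intercept: paintings without landscape features are expected, on average, to be 22.7 inches tall.
- Slope: paintings with landscape features are expected, on average, to be 5.65 inches shorter than paintings without them.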
# Relationship between height and school
linear_reg() %>%
  set_engine("lm") %>%
  fit(Height_in ~ school_pntg, data = pp) %>%
  tidy()
## # A tibble: 7 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 14.0 10.0 1.40 0.162
## 2 school_pntgD/FL 2.33 10.0 0.232 0.816
## 3 school_pntgF 10.2 10.0 1.02 0.309
## 4 school_pntgG 1.65 11.9 0.139 0.889
## 5 school_pntgI 10.3 10.0 1.02 0.306
## 6 school_pntgS 30.4 11.4 2.68 0.00744
## 7 school_pntgX 2.87 10.3 0.279 0.780
# Categorical predictor with 3+ levels
.pull-left-wide[
| school_pntg | D_FL | F | G | I | S | X |
|---|---|---|---|---|---|---|
| A | 0 | 0 | 0 | 0 | 0 | 0 |
| D/FL | 1 | 0 | 0 | 0 | 0 | 0 |
| F | 0 | 1 | 0 | 0 | 0 | 0 |
| G | 0 | 0 | 1 | 0 | 0 | 0 |
| I | 0 | 0 | 0 | 1 | 0 | 0 |
| S | 0 | 0 | 0 | 0 | 1 | 0 |
| X | 0 | 0 | 0 | 0 | 0 | 1 |
] .pull-right-narrow[ .small[
## # A tibble: 3,393 x 3
## name Height_in school_pntg
## <chr> <dbl> <chr>
## 1 L1764-2 37 F
## 2 L1764-3 18 I
## 3 L1764-4 13 D/FL
## 4 L1764-5a 14 F
## 5 L1764-5b 14 F
## 6 L1764-6 7 I
## 7 L1764-7a 6 F
## 8 L1764-7b 6 F
## 9 L1764-8 15 I
## 10 L1764-9a 9 D/FL
## 11 L1764-9b 9 D/FL
## 12 L1764-10a 16 X
## 13 L1764-10b 16 X
## 14 L1764-10c 16 X
## 15 L1764-11 20 D/FL
## 16 L1764-12a 14 D/FL
## 17 L1764-12b 14 D/FL
## 18 L1764-13a 15 D/FL
## 19 L1764-13b 15 D/FL
## 20 L1764-14 37 F
## # ... with 3,373 more rows
] ]
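The indicator columns in the table do not need to be built by hand. A minimal sketch, assuming base R's `model.matrix()` with its default treatment contrasts; the seven-row `schools` data frame is made up here purely to show one row per level:

```r
# One row per school code (illustrative data, not the paintings themselves)
schools <- data.frame(
  school_pntg = c("A", "D/FL", "F", "G", "I", "S", "X")
)

# model.matrix() builds the design matrix for the formula:
# an intercept column plus one 0/1 indicator per non-baseline level.
dummies <- model.matrix(~ school_pntg, data = schools)

# Drop the intercept column to see just the indicators
dummies[, -1]
```

The first level in alphabetical order (A) gets no column of its own, so its row is all zeros and it serves as the baseline that the other schools are compared against.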
Relationship between height and school
.small[
- Austrian school (A) is the baseline level: its paintings are expected, on average, to be 14 inches tall.
- Dutch/Flemish school (D/FL) paintings are expected, on average, to be 2.33 inches taller than Austrian school paintings.
- French school (F) paintings are expected, on average, to be 10.2 inches taller than Austrian school paintings.
- German school (G) paintings are expected, on average, to be 1.65 inches taller than Austrian school paintings.
- Italian school (I) paintings are expected, on average, to be 10.3 inches taller than Austrian school paintings.
- Spanish school (S) paintings are expected, on average, to be 30.4 inches taller than Austrian school paintings.
- Paintings whose school is unknown (X) are expected, on average, to be 2.87 inches taller than Austrian school paintings. ]
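Each statement above is the intercept plus at most one school coefficient, because at most one indicator equals 1 for any painting. As a worked check with the rounded estimates:

\[\widehat{height}_{i} = 14.0 + 2.33\,D\_FL_{i} + 10.2\,F_{i} + 1.65\,G_{i} + 10.3\,I_{i} + 30.4\,S_{i} + 2.87\,X_{i}\]

\[\text{French: } 14.0 + 10.2 = 24.2 \text{ inches} \qquad\qquad \text{Spanish: } 14.0 + 30.4 = 44.4 \text{ inches}\]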