With statistical software such as R, the manual calculations described in Box D are unnecessary, except to show you what is going on behind the scenes. Here is one way to carry out the calculations as described in the box:
x2.bar <- mean(x^2)                 # mean of the squared x values
x2.bar
[1] 7.5
xy.bar <- mean(x*y)                 # mean of the xy products
xy.bar
[1] 17.75
xy.cov <- xy.bar - (x.bar*y.bar)    # covariance of x and y
xy.cov
[1] 2.125
x.var <- x2.bar - x.bar^2           # variance of x
x.var
[1] 1.25
m <- xy.cov/x.var                   # slope
m
[1] 1.7
b <- y.bar - m*x.bar                # y intercept
b
[1] 2
# So the best fit line is y = mx + b, or
# y = 1.7x + 2
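The step-by-step calculation above can be collected into a single function and checked against lm. This is a minimal sketch rather than part of Box D: the function name fit_line_by_moments is made up for illustration, and x and y are assumed to be the four-point data set used in this section.

```r
# Hypothetical helper (not from Box D): least-squares line from sample moments.
fit_line_by_moments <- function(x, y) {
  x.bar  <- mean(x)
  y.bar  <- mean(y)
  xy.cov <- mean(x * y) - x.bar * y.bar   # covariance, dividing by n
  x.var  <- mean(x^2) - x.bar^2           # variance, dividing by n
  m <- xy.cov / x.var                     # slope
  b <- y.bar - m * x.bar                  # y intercept
  c(intercept = b, slope = m)
}

# Assumed to be the x, y vectors used in this section
x <- c(1, 2, 3, 4)
y <- c(3, 7, 6, 9)

by_hand <- fit_line_by_moments(x, y)
by_lm   <- coef(lm(y ~ x))

by_hand
# TRUE when the hand calculation and lm agree to numerical precision
all.equal(unname(by_hand), unname(by_lm))
```

The covariance and variance here divide by n rather than n - 1, unlike R's cov and var. The slope is unaffected, because the divisor cancels in the ratio.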
# Here is what you will normally do using R,
# using the original x, y vectors from above
model <- lm(y~x)
summary(model)

Call:
lm(formula = y ~ x)

Residuals:
   1    2    3    4
-0.7  1.6 -1.1  0.2

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.0000     1.7958   1.114    0.381
x             1.7000     0.6557   2.592    0.122

Residual standard error: 1.466 on 2 degrees of freedom
Multiple R-squared: 0.7707, Adjusted R-squared: 0.656
F-statistic: 6.721 on 1 and 2 DF, p-value: 0.1221
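The numbers printed by summary can also be pulled out of the fitted model individually, which is convenient when they feed into later calculations. A short sketch, assuming the same four-point x and y vectors; the variable name fit is chosen here for illustration.

```r
# Assumed to be the x, y vectors used in this section
x <- c(1, 2, 3, 4)
y <- c(3, 7, 6, 9)
model <- lm(y ~ x)

coef(model)          # intercept and slope
residuals(model)     # observed y minus fitted y, one value per point
fitted(model)        # y values predicted by the line

fit <- summary(model)
fit$coefficients     # matrix of estimates, standard errors, t and p values
fit$sigma            # residual standard error
fit$r.squared        # multiple R-squared
```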
## Now let's incorporate weights
x <- c(1,2,3,4)
y <- c(3,7,6,9)
w <- c(10,10,1,1)
w.sum <- sum(w)                       # sum of the weights
w.sum
[1] 22
xw.bar <- sum(x*w)/w.sum              # weighted mean of the x values
xw.bar
[1] 1.681818
yw.bar <- sum(y*w)/w.sum              # weighted mean of the y values
yw.bar
[1] 5.227273
x2w.bar <- sum(x^2*w)/w.sum           # weighted mean of the squared x values
x2w.bar
[1] 3.409091
xyw.bar <- sum(x*y*w)/w.sum           # weighted mean of the xy products
xyw.bar
[1] 10.18182
xyw.cov <- xyw.bar - (xw.bar*yw.bar)  # weighted covariance of x and y
xyw.cov
[1] 1.390496
xw.var <- x2w.bar - xw.bar^2          # weighted variance of x
xw.var
[1] 0.5805785
mw <- xyw.cov/xw.var                  # slope
mw
[1] 2.395018
bw <- yw.bar - mw*xw.bar              # y intercept
bw
[1] 1.199288
# So the best fit line to the weighted data is y = mx + b, or
# y = 2.4x + 1.2
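The weighted calculation follows the same pattern as the unweighted one, with each mean replaced by a weighted mean. The sketch below uses R's built-in weighted.mean in place of the explicit sum(...)/w.sum expressions; the function name fit_line_weighted is made up for illustration.

```r
# Hypothetical helper: weighted least-squares line from weighted moments.
fit_line_weighted <- function(x, y, w) {
  xw.bar  <- weighted.mean(x, w)
  yw.bar  <- weighted.mean(y, w)
  xyw.cov <- weighted.mean(x * y, w) - xw.bar * yw.bar  # weighted covariance
  xw.var  <- weighted.mean(x^2, w) - xw.bar^2           # weighted variance
  mw <- xyw.cov / xw.var                                # slope
  bw <- yw.bar - mw * xw.bar                            # y intercept
  c(intercept = bw, slope = mw)
}

x <- c(1, 2, 3, 4)
y <- c(3, 7, 6, 9)
w <- c(10, 10, 1, 1)

weighted_fit <- fit_line_weighted(x, y, w)
weighted_fit

# With equal weights the result reduces to the unweighted line y = 1.7x + 2
equal_fit <- fit_line_weighted(x, y, rep(1, length(x)))
equal_fit
```

Only the relative sizes of the weights matter in this calculation: multiplying every weight by the same constant leaves the slope and intercept unchanged.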
# lm has a weights argument
model <- lm(y~x, weights = w)
summary(model)

Call:
lm(formula = y ~ x, weights = w)

Weighted Residuals:
     1      2      3      4
-1.879  3.196 -2.384 -1.779

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.1993     1.7366   0.691    0.561
x             2.3950     0.9405   2.546    0.126

Residual standard error: 3.361 on 2 degrees of freedom
Multiple R-squared: 0.7643, Adjusted R-squared: 0.6464
F-statistic: 6.484 on 1 and 2 DF, p-value: 0.1258
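Two features of the weighted fit are worth checking directly. The "Weighted Residuals" reported by summary are the ordinary residuals multiplied by the square root of each weight, and for whole-number weights the coefficients match those from an unweighted fit in which each point is repeated as many times as its weight. The sketch below illustrates both; the names raw, x.rep and y.rep are chosen here for illustration.

```r
x <- c(1, 2, 3, 4)
y <- c(3, 7, 6, 9)
w <- c(10, 10, 1, 1)
model <- lm(y ~ x, weights = w)

# Ordinary residuals (observed minus fitted), then scaled by sqrt(w)
raw <- residuals(model)
raw * sqrt(w)
weighted.residuals(model)   # same values, as printed by summary(model)

# Repeat each point according to its weight and fit without weights
x.rep <- rep(x, times = w)
y.rep <- rep(y, times = w)
rep_model <- lm(y.rep ~ x.rep)
coef(rep_model)             # same intercept and slope as coef(model)
```

The standard errors and p-values of the two fits differ, because the replicated data set appears to contain 22 observations rather than 4.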
## Box D a & b
Obtaining quantities 8 and 9 (the slope and y intercept) for problems a and b is now trivial using the built-in R function lm.
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.5000     3.4857   1.578   0.2553
x             3.2000     0.6364   5.028   0.0373 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.846 on 2 degrees of freedom
Multiple R-squared: 0.9267, Adjusted R-squared: 0.89
F-statistic: 25.28 on 1 and 2 DF, p-value: 0.03735
w <- c(1,1,20,20)
model <- lm(y~x, weights = w)
summary(model)

Call:
lm(formula = y ~ x, weights = w)

Weighted Residuals:
     1      2      3      4
 4.021  5.913 -5.342  3.121

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.1288     5.4466  -0.207   0.8550
x             4.0539     0.7854   5.162   0.0355 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.686 on 2 degrees of freedom
Multiple R-squared: 0.9302, Adjusted R-squared: 0.8953
F-statistic: 26.64 on 1 and 2 DF, p-value: 0.03554