>rem The first thing to do in SYSTAT is to click Edit / Options /
>rem File locations, then Set project directory ... to a directory
>rem that holds your data set and where the output can be written
>FPATH "Z:\mydocs\s209\
>rem Problem 5
>rem Then click File / Open / Data and select file knnch01pr19
>USE "Z:\mydocs\s209\knnch01pr19.syd"
SYSTAT Rectangular file Z:\mydocs\s209\knnch01pr19.syd,
created Wed Jan 26, 2005 at 12:42:36, contains variables:
 Y X
>rem next, estimate the regression function
>regress
>model y=constant+x
>estimate
Dep Var: Y   N: 120   Multiple R: 0.269   Squared multiple R: 0.073
Adjusted squared multiple R: 0.065   Standard error of estimate: 0.623

Effect      Coefficient   Std Error   Std Coef   Tolerance       t   P(2 Tail)
CONSTANT          2.114       0.321      0.000           .   6.588       0.000
X                 0.039       0.013      0.269       1.000   3.040       0.003

Analysis of Variance
Source       Sum-of-Squares    df   Mean-Square   F-ratio       P
Regression            3.588     1         3.588     9.240   0.003
Residual             45.818   118         0.388
-------------------------------------------------------------------------------
*** WARNING ***
Case 9 is an outlier (Studentized Residual = -4.835)

Durbin-Watson D Statistic 1.831
First Order Autocorrelation 0.074
>rem thus the estimated regression function is
>rem ^Y = 2.114 + 0.039X
>plot y*x/stick=out smooth=linear short

>rem is that a good fit, with r-square=.073 and r=.269?
>rem what's the predicted GPA (^Y) for ACT (X) = 30?
>calc 2.114+0.039*30
3.284
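The same point prediction can be reproduced outside SYSTAT. This is a minimal Python sketch, not part of the session; it uses the rounded coefficients from the printout, and the name `predict_gpa` is made up for illustration.

```python
# Rounded coefficients copied from the SYSTAT printout (b0 = CONSTANT, b1 = X).
B0 = 2.114
B1 = 0.039

def predict_gpa(act_score):
    """Return the fitted mean GPA (^Y) for a given ACT score (X)."""
    return B0 + B1 * act_score

# Same arithmetic as the calc line: 2.114 + 0.039*30
print(round(predict_gpa(30), 3))
```

Because the printout shows only three decimals, predictions from these rounded coefficients can differ in the third decimal from the ESTIMATE values that SYSTAT saves, which presumably come from unrounded coefficients.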
>rem what's the change in mean response when the test score increases
>rem by 1 point? see the printout!
>rem what's the change in mean response, in standard deviations of Y,
>rem when test score X increases by 1 standard deviation? see the printout!
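Both questions point at the coefficient table: the Coefficient column for X is the change in mean GPA per ACT point, and the Std Coef column is the change in standard deviations of Y per standard deviation of X. In simple regression the standardized slope equals r, so it can be cross-checked against the ANOVA table. The sketch below is a Python check, not SYSTAT output; the variable names are chosen here, and small differences from the printout come from the printout's three-decimal rounding.

```python
import math

# Values copied from the SYSTAT printout (rounded to three decimals there).
b1, se_b1 = 0.039, 0.013          # slope and its standard error
ssr, sse = 3.588, 45.818          # regression and residual sums of squares
df_resid = 118                    # n - 2 with n = 120

r_squared = ssr / (ssr + sse)                 # SSR / SSTO
r = math.copysign(math.sqrt(r_squared), b1)   # r takes the sign of the slope
mse = sse / df_resid                          # mean square error
f_ratio = (ssr / 1) / mse                     # MSR / MSE
t_from_f = math.sqrt(f_ratio)                 # in simple regression, t^2 = F
t_from_se = b1 / se_b1                        # coarser, both inputs are rounded

print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
print(f"F = {f_ratio:.2f}, t from F = {t_from_f:.2f}, t from b1/se = {t_from_se:.2f}")
```

Read this way, a one-point increase in ACT changes the estimated mean GPA by about 0.039, and a one-standard-deviation increase in ACT changes it by about 0.269 standard deviations of GPA.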
>rem Problem 6
>regress
>model y=constant+x
>save p6resid/resid data
>rem this will save residuals and other stuff in file p6resid
>estimate
Dep Var: Y   N: 120   Multiple R: 0.269   Squared multiple R: 0.073
Adjusted squared multiple R: 0.065   Standard error of estimate: 0.623

Effect      Coefficient   Std Error   Std Coef   Tolerance       t   P(2 Tail)
CONSTANT          2.114       0.321      0.000           .   6.588       0.000
X                 0.039       0.013      0.269       1.000   3.040       0.003

Analysis of Variance
Source       Sum-of-Squares    df   Mean-Square   F-ratio       P
Regression            3.588     1         3.588     9.240   0.003
Residual             45.818   118         0.388

*** WARNING ***
Case 9 is an outlier (Studentized Residual = -4.835)

Durbin-Watson D Statistic 1.831
First Order Autocorrelation 0.074

Residuals and data have been saved.
-------------------------------------------------------------------------------
>use p6resid
SYSTAT Rectangular file Z:\mydocs\s209\p6resid.SYD,
created Thu Jan 27, 2005 at 13:00:36, contains variables:
 ESTIMATE RESIDUAL LEVERAGE COOK STUDENT SEPRED
 Y X
>rem get the file with the residuals and other stuff
>names
Variables in the SYSTAT Rectangular file are:
 ESTIMATE RESIDUAL LEVERAGE COOK STUDENT SEPRED
 Y X
>stats
>stat residual/sum mean variance sd
               RESIDUAL
Sum               0.000
Mean              0.000
Standard Dev      0.621
Variance          0.385
>rem so what are the estimates of sigma squared and sigma?
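One thing to keep in mind when answering: the Variance reported by STATS divides the residual sum of squares by n - 1, while the usual estimate of sigma squared in simple regression, MSE, divides by n - 2 because two parameters were estimated. The Python sketch below, which is not part of the SYSTAT session, computes both from the Residual sum of squares in the ANOVA table so the two can be compared with the printouts.

```python
import math

sse = 45.818      # Residual sum of squares from the ANOVA table
n = 120           # number of cases

# Sample variance of the residuals, as STATS reports it (divisor n - 1).
var_resid = sse / (n - 1)
# MSE, the usual estimate of sigma^2 in simple regression (divisor n - 2).
mse = sse / (n - 2)

print(f"divisor n-1: variance = {var_resid:.4f}, sd = {math.sqrt(var_resid):.4f}")
print(f"divisor n-2: MSE      = {mse:.4f}, s  = {math.sqrt(mse):.4f}")
```

The n - 2 version lines up with the Mean-Square for Residual and the standard error of estimate in the regression printout.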
>rem Problem 12. To see individual data click on View / Data
>rem or list values for the first 10 cases as
>list estimate sepred y x/n=10
Case number   ESTIMATE   SEPRED       Y        X
          1      2.929    0.074   3.897   21.000
          2      2.658    0.148   3.885   14.000
          3      3.201    0.071   3.778   28.000
          4      2.968    0.067   2.540   22.000
          5      2.929    0.074   3.028   21.000
          6      3.318    0.098   3.865   31.000
          7      3.357    0.109   2.962   32.000
          8      3.162    0.064   3.961   27.000
          9      3.240    0.079   0.500   29.000
         10      3.124    0.059   3.178   26.000
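The listing also shows why SYSTAT warned about case 9: its observed Y is far below its fitted value. The Python sketch below, an illustration rather than SYSTAT output, recomputes the raw residuals Y - ESTIMATE for the ten listed cases and recovers the slope from two fitted values. The tuple layout and variable names are chosen here for convenience.

```python
# (case, ESTIMATE, Y, X) copied from the listing of the first 10 cases.
cases = [
    (1, 2.929, 3.897, 21), (2, 2.658, 3.885, 14), (3, 3.201, 3.778, 28),
    (4, 2.968, 2.540, 22), (5, 2.929, 3.028, 21), (6, 3.318, 3.865, 31),
    (7, 3.357, 2.962, 32), (8, 3.162, 3.961, 27), (9, 3.240, 0.500, 29),
    (10, 3.124, 3.178, 26),
]

# Raw residual for each case: observed Y minus fitted value.
residuals = {case: round(y - est, 3) for case, est, y, x in cases}

# Case with the largest residual in absolute value among the ten listed.
worst = max(residuals, key=lambda c: abs(residuals[c]))
print("largest |residual|: case", worst, "residual", residuals[worst])

# Fitted values lie on the regression line, so any two of them with
# different X give back the slope (up to rounding in the listing).
(_, est_lo, _, x_lo), (_, est_hi, _, x_hi) = cases[1], cases[6]
slope = (est_hi - est_lo) / (x_hi - x_lo)
print("slope recovered from fitted values:", round(slope, 4))
```

The raw residual is not the same quantity as the Studentized Residual in the warning, which also accounts for leverage and the residual standard deviation.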
>rem fortunately case # 3 has value X=28 (otherwise there is a trick)
>rem so we can see that ^Yh is 3.201 and s{^Yh} is .0706
>rem from the regression printout we find MSE=.3883 (see it?)
>rem thus the lower and upper bounds of the CI for E{Yh} are:
>calc 3.2012 - tif(0.975,118)*0.0706
3.061
>calc 3.2012 + tif(0.975,118)*0.0706
3.341
>rem so our 95% CI is (3.061, 3.341)
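The interval is ^Yh plus or minus t(0.975; 118) times s{^Yh}. The Python sketch below repeats the calculation with the standard library only. The helper `t_quantile` is a stand-in written for this example, not SYSTAT's `tif`: it approximates the t quantile with a Cornish-Fisher expansion around the normal quantile, which is adequate for 118 degrees of freedom but should not be trusted for small df.

```python
from statistics import NormalDist

def t_quantile(p, df):
    """Approximate Student-t quantile (Cornish-Fisher expansion in 1/df).

    Illustrative stand-in for SYSTAT's tif(p, df); intended for large df.
    """
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2

y_hat = 3.2012     # ^Yh at X = 28 (case 3)
s_yhat = 0.0706    # s{^Yh}, the SEPRED value for case 3

t = t_quantile(0.975, 118)
lower = y_hat - t * s_yhat
upper = y_hat + t * s_yhat
print(f"t = {t:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With these inputs the sketch agrees with the two calc results to three decimals.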
>rem Mary Jones got X=28; what is the 95% prediction interval for her GPA (Y)?
>rem first we need s{pred}
>calc sqr(0.3883+0.0706^2)
0.627
>rem thus the lower and upper bounds of the prediction interval for Yh(new) are:
>calc 3.2012 - tif(0.975,118)*0.627
1.960
>calc 3.2012 + tif(0.975,118)*0.627
4.443
>rem thus the interval for an individual prediction is much wider
>rem at (1.960, 4.443)
>rem why is that?
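A hint toward the answer: the variance used for an individual prediction is s2{pred} = MSE + s2{^Yh}, so it carries the scatter of a single observation around the regression line in addition to the uncertainty in the estimated mean. The Python sketch below, not part of the SYSTAT session, breaks that sum into its two parts using the values already quoted; the variable names are chosen here.

```python
import math

mse = 0.3883       # estimate of sigma^2 from the regression printout
s_yhat = 0.0706    # s{^Yh} at X = 28

var_pred = mse + s_yhat**2            # s^2{pred} = MSE + s^2{^Yh}
s_pred = math.sqrt(var_pred)

share_from_mse = mse / var_pred       # part due to scatter of individual Y values
width_ratio = s_pred / s_yhat         # both intervals use the same t multiplier

print(f"s{{pred}} = {s_pred:.3f}")
print(f"share of s^2{{pred}} due to MSE: {share_from_mse:.1%}")
print(f"prediction interval is about {width_ratio:.1f} times as wide")
```

Since both intervals use the same t multiplier, the ratio of their widths is just s{pred} / s{^Yh}, and nearly all of s2{pred} comes from MSE rather than from uncertainty about the regression line.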
27 Jan 2005