Handout for A1 Problems 5, 6, and 12 Using SYSTAT

 

>rem The first thing to do in SYSTAT is to click Edit / Options /

>rem File locations, then Set project directory ... to the directory

>rem that holds your data set and where output can be written

 

>FPATH "Z:\mydocs\s209\"

 

>rem Problem 5

>rem Then click File / Open / Data and select file knnch01pr19

 

>USE "Z:\mydocs\s209\knnch01pr19.syd"

SYSTAT Rectangular file Z:\mydocs\s209\knnch01pr19.syd,

created Wed Jan 26, 2005 at 12:42:36, contains variables:

 Y            X

 

>rem next estimate regression function

>regress

>model y=constant+x

>estimate

 

Dep Var: Y   N: 120   Multiple R: 0.269   Squared multiple R: 0.073

 

Adjusted squared multiple R: 0.065   Standard error of estimate: 0.623

 

Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)

 

CONSTANT             2.114        0.321        0.000      .       6.588    0.000

X                    0.039        0.013        0.269     1.000    3.040    0.003

 

 

Analysis of Variance

Source             Sum-of-Squares   df  Mean-Square     F-ratio       P

 

Regression                 3.588     1        3.588       9.240       0.003

Residual                  45.818   118        0.388

 

 

-------------------------------------------------------------------------------

*** WARNING ***

Case            9 is an outlier        (Studentized Residual =       -4.835)

 

Durbin-Watson D Statistic          1.831

First Order Autocorrelation        0.074

>rem thus estimated regression function is

>rem ^Y = 2.114 + 0.039X

>plot y*x/stick=out smooth=linear short

 

 

 

>rem is that a good fit? with r-square=.073, r=.269?
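The summary line of the printout can be rebuilt from the Analysis of Variance table, which may help in judging the fit. The following is a minimal Python sketch, not part of the SYSTAT session; the variable names are illustrative and the numbers are copied from the printout above. In simple linear regression the Std Coef for X equals r, which is why both show 0.269.

```python
import math

# Values copied from the Analysis of Variance table in the printout.
ssr, df_reg = 3.588, 1      # Regression sum of squares and df
sse, df_res = 45.818, 118   # Residual sum of squares and df

r_squared = ssr / (ssr + sse)      # share of the variation in Y explained by X
r = math.sqrt(r_squared)           # positive root, because the slope is positive
mse = sse / df_res                 # Mean-Square for Residual
f_ratio = (ssr / df_reg) / mse     # F-ratio = MSR / MSE
std_error = math.sqrt(mse)         # "Standard error of estimate"

print(round(r_squared, 3), round(r, 3), round(mse, 3),
      round(f_ratio, 2), round(std_error, 3))
# prints 0.073 0.269 0.388 9.24 0.623
```

An R-square of 0.073 means that ACT score accounts for only about 7% of the variation in GPA, even though the slope is statistically significant.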

>rem what's GPA (^Y) for ACT (X) = 30?

>calc 2.114+0.039*30

           3.284
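The same point estimate can be checked outside SYSTAT. This is a small Python sketch under the assumption that the rounded coefficients from the printout are used; the function name predicted_gpa is hypothetical.

```python
# Coefficients copied from the regression printout (rounded to 3 decimals).
B0 = 2.114  # CONSTANT
B1 = 0.039  # coefficient of X (ACT score)

def predicted_gpa(act):
    """Point estimate of mean GPA at a given ACT score."""
    return B0 + B1 * act

print(round(predicted_gpa(30), 3))  # prints 3.284
```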

 

>rem what's the change in mean response when test score increases

>rem by 1 point?  see the printout!

>rem what's the change in mean response in standard deviations of Y

>rem when test score X increases by 1 standard deviation?  see printout!

>rem Problem 6

>regress

>model y=constant+x

>save p6resid/resid data

>rem this will save residuals and other stuff in file p6resid

 

 

>estimate

 

Dep Var: Y   N: 120   Multiple R: 0.269   Squared multiple R: 0.073

 

Adjusted squared multiple R: 0.065   Standard error of estimate: 0.623

 

Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)

 

CONSTANT             2.114        0.321        0.000      .       6.588    0.000

X                    0.039        0.013        0.269     1.000    3.040    0.003

 

 

Analysis of Variance

Source             Sum-of-Squares   df  Mean-Square     F-ratio       P

 

Regression                 3.588     1        3.588       9.240       0.003

Residual                  45.818   118        0.388

 

*** WARNING ***

Case            9 is an outlier        (Studentized Residual =       -4.835)

 

Durbin-Watson D Statistic          1.831

First Order Autocorrelation        0.074

 

Residuals and data have been saved.

 

-------------------------------------------------------------------------------

 

 

 

>use p6resid

SYSTAT Rectangular file Z:\mydocs\s209\p6resid.SYD,

created Thu Jan 27, 2005 at 13:00:36, contains variables:

 ESTIMATE     RESIDUAL     LEVERAGE     COOK         STUDENT      SEPRED

 Y            X

 

>rem get the file with the residuals and other stuff

>names

 

Variables in the SYSTAT Rectangular file are:

 ESTIMATE     RESIDUAL     LEVERAGE     COOK         STUDENT      SEPRED

 Y            X

 

>stats

>stat residual/sum mean variance sd

 

                      RESIDUAL

  Sum                   0.000

  Mean                  0.000

  Standard Dev          0.621

  Variance              0.385

 

>rem so what are the estimates of sigma squared and sigma?
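One point worth checking here is the divisor. The Variance reported by STATS divides the sum of squared residuals by n-1, while MSE in the regression printout divides by n-2, so the two differ slightly. The following Python sketch, with illustrative variable names and values copied from the printouts above, shows how they relate.

```python
import math

n = 120
resid_variance = 0.385   # Variance of RESIDUAL from STATS (divisor n - 1)
sse = 45.818             # Residual Sum-of-Squares from the regression printout

# Because the residuals have mean 0, variance * (n - 1) recovers SSE
# up to rounding in the printed variance.
sse_from_variance = resid_variance * (n - 1)

mse = sse / (n - 2)          # estimate of sigma squared
sigma_hat = math.sqrt(mse)   # estimate of sigma

print(round(sse_from_variance, 3), round(mse, 4), round(sigma_hat, 3))
# prints 45.815 0.3883 0.623
```

The value 0.623 agrees with the Standard error of estimate on the regression printout rather than with the Standard Dev of 0.621 shown by STATS.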

>rem Problem 12.  To see individual data click on View / Data

>rem or list values for first 10 cases as

 

>list estimate sepred y x/n=10

  Case number     ESTIMATE       SEPRED            Y            X

        1            2.929        0.074        3.897       21.000

        2            2.658        0.148        3.885       14.000

        3            3.201        0.071        3.778       28.000

        4            2.968        0.067        2.540       22.000

        5            2.929        0.074        3.028       21.000

        6            3.318        0.098        3.865       31.000

        7            3.357        0.109        2.962       32.000

        8            3.162        0.064        3.961       27.000

        9            3.240        0.079        0.500       29.000

       10            3.124        0.059        3.178       26.000

 

>rem fortunately case # 3 has value X=28 (otherwise there is a trick)

>rem so we can see that ^Yh is 3.201 and s{^Yh} is .0706

>rem from regression printout we find MSE=.3883 (see it?)

>rem thus lower and upper bounds of the 95% CI for E{Yh} are:

>calc 3.2012 - tif(0.975,118)*0.0706

           3.061

 

>calc 3.2012 + tif(0.975,118)*0.0706

           3.341

 

>rem so our 95% CI is (3.061, 3.341)
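For readers without SYSTAT at hand, the interval can be reproduced with the Python standard library. This is a sketch only: t_pdf, t_cdf and t_quantile are hypothetical helpers standing in for SYSTAT's tif, and they obtain the t quantile by numerical integration and bisection rather than by whatever method SYSTAT uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Inverse CDF for 0.5 < p < 1, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y_hat = 3.2012     # ESTIMATE for case 3 (X = 28), as used in the calc lines
se_mean = 0.0706   # SEPRED for case 3, i.e. s{^Yh}
t_mult = t_quantile(0.975, 118)

lower = y_hat - t_mult * se_mean
upper = y_hat + t_mult * se_mean
print(round(lower, 3), round(upper, 3))  # prints 3.061 3.341
```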

>rem Mary Jones got X=28; what is the 95% prediction interval for her GPA (Y)?

>rem first we need s{pred}

>calc sqr(0.3883+0.0706^2)

           0.627

 

>rem thus lower and upper bounds of the prediction interval for Yh(new) are:

>calc 3.2012 - tif(0.975,118)*0.627

           1.960

 

>calc 3.2012 + tif(0.975,118)*0.627

           4.443

 

>rem thus the interval for an individual prediction is much wider

>rem at (1.960, 4.443)

>rem why is that?
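One way to approach that question is to split the prediction variance into its two parts, following the calc line for s{pred} above. The Python sketch below uses illustrative variable names; the t multiplier 1.9803 is an assumed tabled value for the 0.975 quantile with 118 degrees of freedom, not a number taken from the SYSTAT output.

```python
import math

mse = 0.3883       # estimate of sigma squared from the regression printout
se_mean = 0.0706   # s{^Yh} at X = 28 (SEPRED for case 3)
t_mult = 1.9803    # ASSUMPTION: tabled t(0.975; 118)

# Prediction variance = variance of one new observation around the line
#                     + variance of the estimated mean response.
var_pred = mse + se_mean ** 2
s_pred = math.sqrt(var_pred)
mse_share = mse / var_pred   # fraction of var_pred that comes from MSE alone

half_width_mean = t_mult * se_mean   # half-width of the CI for the mean response
half_width_pred = t_mult * s_pred    # half-width of the prediction interval

print(round(s_pred, 3), round(mse_share, 3))
# prints 0.627 0.987
print(round(half_width_mean, 2), round(half_width_pred, 2))
# prints 0.14 1.24
```

Nearly all of the prediction variance comes from the scatter of individual GPAs around the regression line, which does not shrink as the sample grows, whereas s{^Yh} does.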

 


27 Jan 2005