Loyola College M.Sc. Statistics April 2007 Applied Regression Analysis Question Paper

 

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

AC 28

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – APRIL 2007

ST 1811  – APPLIED REGRESSION ANALYSIS

 

 

 

Date & Time: 02/05/2007 / 1:00 – 4:00      Dept. No.                                                  Max. : 100 Marks

 

 

Answer ALL the questions                                          SECTION – A                                                    (10 x 2 = 20 marks)

  1. Explain the term ‘partial regression coefficients’.
  2. State an unbiased estimate of the error variance in a multiple linear regression model.
  3. Define ‘PRESS Residuals’.
  4. What is the variance stabilizing transformation used when σ² is proportional to E(Y)?
  5. Give the expression for the GLS estimator explaining the notations.
  6. Mention any two sources of multicollinearity.
  7. Define ‘Variance Inflation Factor’ of a regression coefficient.
  8. Define a ‘Hierarchical Polynomial Model’.
  9. What is ‘Link function’ in a GLM?
  10. Give the interpretation for a positive coefficient in a logit model.

 

Answer any  FIVE questions:                                      SECTION – B                                                (5 x 8 = 40 marks)

 

  11. Briefly explain the limitations to be recognized and the cautions needed in applying regression models in practice.

 

  12. A model (with an intercept) relating a response variable to four regressors is to be built based on the following sample of size 10:
   Y      X1    X2    X3    X4
  23.5     2    12    38     7
  15.7     3    22    33    18
  22.8     7    18    27     9
  18.9     1    14    29    14
  17.3     5    20    34    11
  28.4     8    25    40    16
  16.6     3    24    32    10
  23.1     7    17    37     8
  20.0     3    13    28    13
  19.8     4    24    30    15

Write down the full data matrix. Also, if we wish to test the linear hypothesis  H0: β2 = 2β3, β1 = 0, write down the reduced model under the H0 and also the reduced data matrix.
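A minimal sketch of the two matrices asked for, assuming NumPy is available. Substituting β1 = 0 and β2 = 2β3 into the full model gives Y = β0 + β3(2X2 + X3) + β4X4 + ε, so the reduced data matrix has an intercept column, the combined regressor 2X2 + X3, and X4. The variable names (`X_full`, `X_reduced`, `z`) are illustrative and not part of the paper.

```python
import numpy as np

# Sample of size 10 from the question
y  = np.array([23.5, 15.7, 22.8, 18.9, 17.3, 28.4, 16.6, 23.1, 20.0, 19.8])
x1 = np.array([2, 3, 7, 1, 5, 8, 3, 7, 3, 4], dtype=float)
x2 = np.array([12, 22, 18, 14, 20, 25, 24, 17, 13, 24], dtype=float)
x3 = np.array([38, 33, 27, 29, 34, 40, 32, 37, 28, 30], dtype=float)
x4 = np.array([7, 18, 9, 14, 11, 16, 10, 8, 13, 15], dtype=float)

# Full data matrix: a column of ones for the intercept, then X1..X4 (10 x 5)
X_full = np.column_stack([np.ones_like(y), x1, x2, x3, x4])

# Under H0 (beta1 = 0, beta2 = 2*beta3) the model is
# Y = beta0 + beta3 * (2*X2 + X3) + beta4 * X4 + error
z = 2 * x2 + x3
X_reduced = np.column_stack([np.ones_like(y), z, x4])  # 10 x 3

print(X_full)
print(X_reduced)
```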

 

  13. Explain the motivation and give the expressions for ‘studentized’ and ‘externally studentized’ residuals.

 

  14. The following residuals were obtained after a linear regression model was built: -0.15, 0.03, -0.06, 0.01, 0.23, -0.31, 0.19, 0.15, -0.08, -0.01

Plot the ‘normal probability plot’ on a graph sheet and draw appropriate conclusions.
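The coordinates for such a plot can be tabulated as in the sketch below, which uses only the Python standard library. It assumes the ten residuals are as listed (reading the second value as 0.03) and uses the plotting positions (i - 0.5)/n, which is one common convention among several. Plotting the ordered residuals against the normal scores and checking for an approximately straight line is the usual reading of the plot.

```python
from statistics import NormalDist

# Residuals from the question (second value read as 0.03; an assumption)
residuals = [-0.15, 0.03, -0.06, 0.01, 0.23, -0.31, 0.19, 0.15, -0.08, -0.01]
n = len(residuals)

# Order the residuals and attach cumulative probabilities (i - 0.5)/n
ordered = sorted(residuals)
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Expected standard normal quantiles for those probabilities
z_scores = [NormalDist().inv_cdf(p) for p in probs]

print(" rank  residual   prob   normal score")
for rank, (e, p, z) in enumerate(zip(ordered, probs, z_scores), start=1):
    print(f"{rank:5d}  {e:8.2f}  {p:5.2f}  {z:13.3f}")
```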

  15. An investigator has the following data:

Y:   3.2     5.1     4.5     2.4

X:    5        9         6        4

Guide the investigator as to whether the model Y = β0 + β1X or Y^(1/2) = β0 + β1X is more appropriate.
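One possible way to compare the two candidate models is the scale-adjusted Box-Cox comparison, in which the response is transformed as (Y^λ - 1) / (λ · G^(λ-1)), with G the geometric mean of Y, so that residual sums of squares for λ = 1 and λ = 1/2 are on a comparable scale. The sketch below assumes NumPy and that this is the intended method; the paper does not say which approach is expected, and with only four observations any conclusion is tentative. The helper names `ss_res` and `box_cox` are illustrative.

```python
import numpy as np

y = np.array([3.2, 5.1, 4.5, 2.4])
x = np.array([5.0, 9.0, 6.0, 4.0])
X = np.column_stack([np.ones_like(x), x])  # intercept and X

def ss_res(response):
    """Residual sum of squares from a least-squares fit of response on X."""
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ beta
    return float(resid @ resid)

gm = float(np.exp(np.mean(np.log(y))))  # geometric mean of Y

def box_cox(lam):
    """Scale-adjusted power transformation (lam must be nonzero here)."""
    return (y**lam - 1.0) / (lam * gm**(lam - 1.0))

ss_linear = ss_res(box_cox(1.0))  # same as SSRes for Y itself
ss_sqrt = ss_res(box_cox(0.5))    # comparable SSRes for the square-root model

print(f"SSRes (lambda = 1.0): {ss_linear:.5f}")
print(f"SSRes (lambda = 0.5): {ss_sqrt:.5f}")
```

Under this approach the model with the smaller scale-adjusted SSRes would be preferred.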

 

  16. Discuss the need for ‘Generalized Least Squares’ pointing out the requirements for it. Briefly indicate the ANOVA for a model built using GLS.

 

  17. The following is part of the output obtained while investigating the presence of multicollinearity in the data used for building a linear model. Fill in the missing entries and identify which regressors are involved in collinear relationship(s), if any.
Eigenvalue   Singular value   Condition   Variance Decomposition Proportions
(of X’X)     (of X)           Indices     X1       X2       X3       X4       X5       X6
2.525        ?                ?           0.0180   0.0355   0.0004   0.0005   ?        0.0350
1.783        ?                ?           0.0029   0.1590   0.0305   0.0987   0.0032   ?
1.380        ?                ?           0.0168   0.0006   ?        0.0500   0.0006   0.0018
0.952        ?                ?           0.6830   ?        0.0001   0.0033   0.1004   0.4845
0.245        ?                ?           ?        0.1785   0.0025   0.0231   0.7175   0.4199
0.002        ?                ?           0.2040   0.2642   0.9664   ?        0.0172   0.0029
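The missing entries can be filled in along the lines of the sketch below, which uses only the Python standard library. It rests on three assumptions: singular values of X are the square roots of the eigenvalues of X’X, condition indices are taken as the ratio of the largest singular value to each singular value (some texts instead use the ratio of eigenvalues, which is the square of this), and the variance decomposition proportions for each regressor sum to 1 down its column, so the single missing entry per column is 1 minus the sum of the known ones.

```python
import math

# Eigenvalues of X'X from the table, largest first
eigenvalues = [2.525, 1.783, 1.380, 0.952, 0.245, 0.002]

# Singular values of X and condition indices (largest singular value / each)
singular_values = [math.sqrt(ev) for ev in eigenvalues]
condition_indices = [singular_values[0] / s for s in singular_values]

# Variance decomposition proportions; None marks a missing entry
props = [
    [0.0180, 0.0355, 0.0004, 0.0005, None,   0.0350],
    [0.0029, 0.1590, 0.0305, 0.0987, 0.0032, None],
    [0.0168, 0.0006, None,   0.0500, 0.0006, 0.0018],
    [0.6830, None,   0.0001, 0.0033, 0.1004, 0.4845],
    [None,   0.1785, 0.0025, 0.0231, 0.7175, 0.4199],
    [0.2040, 0.2642, 0.9664, None,   0.0172, 0.0029],
]

# Each column has exactly one missing value; fill it so the column sums to 1
filled = [row[:] for row in props]
for j in range(6):
    known = sum(row[j] for row in props if row[j] is not None)
    for i in range(6):
        if props[i][j] is None:
            filled[i][j] = round(1.0 - known, 4)

for ev, sv, ci, row in zip(eigenvalues, singular_values, condition_indices, filled):
    cells = "  ".join(f"{p:.4f}" for p in row)
    print(f"{ev:6.3f}  {sv:6.3f}  {ci:7.2f}  {cells}")
```

A row with a large condition index in which two or more regressors have high proportions is the usual signal of a near-linear dependence among those regressors.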

 

 

  18. Discuss ‘Spline’ fitting.

 

Answer any TWO Questions                         SECTION – C                                                           (2 x 20 = 40 marks)

 

  19. (a) Obtain the decomposition of the total variation in the data under a multiple linear regression model. Hence, define SST, SSR and SSRes and indicate the ANOVA.

(b) Develop the Partial F-Test for the contribution of some ‘r’ of the ‘k’ regressors in a multiple regression model.                                                               (10 + 10)

 

  20. A model with an intercept is to be built with the monthly mobile phone bill amount (average over the past six months) of students (Y) as the DV and the following IDVs: monthly income of parents (X1), age of the student (X2), number of telephone numbers saved in the mobile (X3), and dummy variables indicating gender (male / female), class (UG / PG / M.Phil.) and residence (day scholar / hostel inmate). The following data collected from 15 students are available:

 

Bill Amt.   Income          Age   # of Saved   Gender   Class     Residence
(in Rs.)    (in ‘000 Rs.)         numbers
230         25              17    50           F        UG        Day scholar
150         12              22    38           F        PG        Hostel inmate
300         35              21    62           M        PG        Day scholar
225         40              18    43           M        UG        Hostel inmate
400         45              21    45           M        UG        Hostel inmate
180         20              24    33           F        M.Phil.   Hostel inmate
125         15              19    27           F        UG        Day scholar
170         18              18    36           M        UG        Day scholar
200         12              20    22           F        PG        Hostel inmate
350         15              25    35           M        M.Phil.   Day scholar
280         21              19    39           F        UG        Hostel inmate
375         35              23    50           F        PG        Day scholar
450         42              22    47           M        PG        Hostel inmate
390         37              26    43           M        M.Phil.   Hostel inmate
195         18              17    25           M        UG        Day scholar

 

(a) Construct the data matrix for building the model.

 

 

(b) If interaction effects of ‘Class’ with ‘parental income’ and the interaction effect of ‘Gender’ with ‘Residence type’ are also to be incorporated in the model, write down the appropriate data matrix. [You need not build the models].                                                          (10 + 10)
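A minimal sketch of both data matrices, assuming NumPy. The coding scheme is an assumption, since the paper leaves it open: male = 1, hostel inmate = 1, and UG is the baseline class with two indicator columns for PG and M.Phil. A different baseline gives an equivalent model with different columns. All variable names are illustrative.

```python
import numpy as np

# Data for the 15 students, in the order given in the question
bill   = np.array([230, 150, 300, 225, 400, 180, 125, 170, 200, 350,
                   280, 375, 450, 390, 195], dtype=float)  # response Y
income = np.array([25, 12, 35, 40, 45, 20, 15, 18, 12, 15,
                   21, 35, 42, 37, 18], dtype=float)        # X1
age    = np.array([17, 22, 21, 18, 21, 24, 19, 18, 20, 25,
                   19, 23, 22, 26, 17], dtype=float)        # X2
saved  = np.array([50, 38, 62, 43, 45, 33, 27, 36, 22, 35,
                   39, 50, 47, 43, 25], dtype=float)        # X3
gender = list("FFMMMFFMFMFFMMM")
klass  = ["UG", "PG", "PG", "UG", "UG", "MPhil", "UG", "UG", "PG", "MPhil",
          "UG", "PG", "PG", "MPhil", "UG"]
residence = list("DHDHHHDDHDHDHHD")  # D = day scholar, H = hostel inmate

# Dummy variables (assumed coding: baseline is female, UG, day scholar)
male   = np.array([g == "M" for g in gender], dtype=float)
pg     = np.array([c == "PG" for c in klass], dtype=float)
mphil  = np.array([c == "MPhil" for c in klass], dtype=float)
hostel = np.array([r == "H" for r in residence], dtype=float)

# (a) Main-effects data matrix: intercept, X1, X2, X3 and four dummies (15 x 8)
X_a = np.column_stack([np.ones(15), income, age, saved, male, pg, mphil, hostel])

# (b) Add Class x Income (two columns) and Gender x Residence (one column)
X_b = np.column_stack([X_a, pg * income, mphil * income, male * hostel])

print(X_a)
print(X_b)
```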

 

  21. Build a linear model for a DV with a maximum of four regressors using the Forward Selection method based on a sample of size 25, given the following information:

SST = 2800, SSRes(X1) = 1500, SSRes(X2)  = 1650, SSRes(X3) = 1800,

SSRes(X4) = 1200, SSRes(X1,X2) = 1150, SSRes(X1,X3) = 1380,

SSRes(X1,X4) = 1050, SSRes(X2,X3) = 1300, SSRes(X2,X4) = 1020,

SSRes(X3,X4) = 990, SSRes(X1, X2, X3) = 1000, SSRes(X1, X2, X4) = 900,

SSRes(X1,X3, X4) = 850, SSRes(X2,X3,X4) = 750, SSRes(X1,X2,X3, X4) = 720.
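The forward selection steps can be traced with the sketch below. At each step the candidate giving the smallest SSRes is tested with the partial F statistic F = [SSRes(current) - SSRes(current + candidate)] / [SSRes(current + candidate) / (n - p - 1)], where p is the number of regressors after entry. The F-to-enter cutoff `F_IN = 4.0` is a hypothetical choice, since the paper does not specify one; a tabulated critical value could be used instead.

```python
n = 25
sst = 2800.0

# Residual sums of squares for every subset given in the question
ss_res = {
    frozenset(): sst,
    frozenset({"X1"}): 1500, frozenset({"X2"}): 1650,
    frozenset({"X3"}): 1800, frozenset({"X4"}): 1200,
    frozenset({"X1", "X2"}): 1150, frozenset({"X1", "X3"}): 1380,
    frozenset({"X1", "X4"}): 1050, frozenset({"X2", "X3"}): 1300,
    frozenset({"X2", "X4"}): 1020, frozenset({"X3", "X4"}): 990,
    frozenset({"X1", "X2", "X3"}): 1000, frozenset({"X1", "X2", "X4"}): 900,
    frozenset({"X1", "X3", "X4"}): 850, frozenset({"X2", "X3", "X4"}): 750,
    frozenset({"X1", "X2", "X3", "X4"}): 720,
}

F_IN = 4.0  # hypothetical F-to-enter threshold (not given in the paper)

selected = []
candidates = ["X1", "X2", "X3", "X4"]
steps = []  # (candidate, partial F, entered?) for each step

while candidates:
    current = frozenset(selected)
    # Candidate whose addition leaves the smallest residual sum of squares
    best = min(candidates, key=lambda v: ss_res[current | {v}])
    new_ss = ss_res[current | {best}]
    df_res = n - (len(selected) + 1) - 1  # n - p - 1 after adding the candidate
    f_stat = (ss_res[current] - new_ss) / (new_ss / df_res)
    entered = f_stat >= F_IN
    steps.append((best, f_stat, entered))
    if not entered:
        break
    selected.append(best)
    candidates.remove(best)

for name, f_stat, entered in steps:
    print(f"{name}: partial F = {f_stat:.3f}, entered = {entered}")
print("Selected regressors:", selected)
```

With this cutoff the procedure enters X4, then X3, then X2, and stops when X1 fails the test.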

 

  22. (a) Discuss ‘sensitivity’, ‘specificity’ and ‘ROC’ of a logistic regression model and the objective behind these measures.

(b) The following data were used to build a logistic model and the estimates were

β0 = 3.8, β1 = -5.2, β2 = 2.2

DV 1 1 0 1 0 0 1 0 1 0 0 0 1 1 0 1 1 1 1 0
X1 -3 1 0 2 -2 4 1 -1 5 2 3 -2 0 -4 1 2 -1 -2 -3 4
X2 0 2 -3 2 -4 -1 0 3 2 -3 4 -5 1 -1 -4 3 4 -3 1 1

 

Compute the logit score for each record. Construct the Gains Table and compute the KS statistic.                                                                            (8 + 12)
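A sketch of the computation in plain Python follows. It takes the logit score to be the linear predictor β0 + β1X1 + β2X2, ranks the 20 records by score in descending order, and splits them into ten equal groups of two for the gains table; the group size is an assumption, as the paper does not fix it. The KS statistic is taken as the largest absolute gap between the cumulative proportion of events (DV = 1) and of non-events (DV = 0).

```python
b0, b1, b2 = 3.8, -5.2, 2.2

dv = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0]
x1 = [-3, 1, 0, 2, -2, 4, 1, -1, 5, 2, 3, -2, 0, -4, 1, 2, -1, -2, -3, 4]
x2 = [0, 2, -3, 2, -4, -1, 0, 3, 2, -3, 4, -5, 1, -1, -4, 3, 4, -3, 1, 1]

# Logit score (linear predictor) for each record
scores = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

# Rank records from highest to lowest score
ranked = sorted(zip(scores, dv), key=lambda t: t[0], reverse=True)

n_groups = 10                      # assumed: deciles of two records each
size = len(ranked) // n_groups
total_events = sum(dv)
total_non_events = len(dv) - total_events

gains = []  # (group, events, non-events, cum % events, cum % non-events, gap)
cum_events = cum_non_events = 0
for g in range(n_groups):
    chunk = ranked[g * size:(g + 1) * size]
    events = sum(d for _, d in chunk)
    cum_events += events
    cum_non_events += len(chunk) - events
    pct_e = cum_events / total_events
    pct_n = cum_non_events / total_non_events
    gains.append((g + 1, events, len(chunk) - events, pct_e, pct_n, abs(pct_e - pct_n)))

ks = max(row[5] for row in gains)

for row in gains:
    print(f"{row[0]:2d}  {row[1]}  {row[2]}  {row[3]:.4f}  {row[4]:.4f}  {row[5]:.4f}")
print(f"KS statistic = {ks:.4f}")
```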

 
