Loyola College M.Sc. Statistics April 2006 Applied Regression Analysis Question Paper PDF Download

             LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

M.Sc. DEGREE EXAMINATION – STATISTICS

AC 28

FIRST SEMESTER – APRIL 2006

                                            ST 1811 – APPLIED REGRESSION ANALYSIS

 

 

Date & Time : 27-04-2006/1.00-4.00 P.M.   Dept. No.                                                       Max. : 100 Marks

SECTION – A

Answer ALL the Questions                                                                     (2 x 10 = 20 marks)

  1. Define ‘Residuals’ of a linear model.
  2. What is the Partial F-test?
  3. What are the two scaling techniques for computing standardized regression coefficients?
  4. Define ‘Externally Studentized Residuals’.
  5. State the variance stabilizing transformation if V(Y) is proportional to [E(Y)]³.
  6. What is FOUT in the Backward selection process?
  7. How is the multicollinearity trap avoided in regression models with dummy variables?
  8. State any one method of detecting multicollinearity.
  9. Give an example of a polynomial regression model.
  10. Give the motivation for Generalized Linear Models.

SECTION – B

Answer any FIVE Questions                                                                   (5 x 8 = 40 marks)

  11. Fill up the missing entries in the following ANOVA for a regression model with 5 regressors and an intercept:

Source        d.f.    S.S.    Mean S.S.    F ratio
Regression     ?       ?        40.5        13.5
Residual      14       ?         ?          ——-
Total          ?       ?        ——-         ——-

Also, test for the overall fit of the model.
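One plausible reading of the table above takes 40.5 as the regression mean square, 13.5 as the overall F ratio and 14 residual degrees of freedom; under that assumption the remaining cells follow by arithmetic. A quick sketch (an illustration, not an answer key):

```python
# Completing the ANOVA under the assumed reading of the given cells:
# MS(Regression) = 40.5, overall F = 13.5, df(Residual) = 14, 5 regressors.
k = 5            # number of regressors
df_res = 14      # residual degrees of freedom (so n = k + df_res + 1 = 20)
ms_reg = 40.5    # mean square for regression
f_ratio = 13.5   # overall F ratio

ms_res = ms_reg / f_ratio        # residual mean square = 40.5 / 13.5
ss_reg = ms_reg * k              # regression sum of squares
ss_res = ms_res * df_res         # residual sum of squares
ss_total = ss_reg + ss_res       # total SS on k + df_res = 19 d.f.

print(ms_res, ss_reg, ss_res, ss_total)
```

The overall fit is then tested by comparing the F ratio against the tabulated F with (5, 14) degrees of freedom.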

 

  12. The following table gives the data matrix corresponding to a model
    Y = b0 + b1X1 + b2X2 + b3X3. Suppose we wish to test H0: b2 = b3. Write down the restricted model under H0 and the reduced data matrix that is used to build the restricted model.

X =   1    2   -3    4
      1   -1    2    5
      1    3    4   -3
      1   -2    1    2
      1    4    5   -2
      1   -3    4    3
      1    2    3    1
      1    1    2    5
      1    4   -2    2
      1   -3    4    2

  13. Explain how residual plots are used to check the assumption of normality of the errors in a linear model.

 

  14. Discuss ‘Generalized Least Squares’ and obtain the form of the GLS estimate.

 

  15. Explain the variance decomposition method of detecting multicollinearity and derive the expression for ‘Variance Inflation Factor’.
  16. Discuss ‘Ridge Regression’ and obtain the expression for the ridge estimate.

 

  17. Suggest some strategies to decide on the degree of a polynomial regression model.

 

  18. Describe Cubic-Spline fitting.

SECTION – C

Answer any TWO Questions                                                                 (2 x 20 = 40 marks)

  19. Build a linear regression model with the following data and test for overall fit. Also, test for the individual significance of X1 and of X2.

Y:   12.8   13.9   15.2   18.3   14.5   12.4
X1:     2      3      5      5      4      1
X2:     4      2      5      1      2      3

 

  20. (a) Decide whether “Y = b0 + b1X” or “Y² = b0 + b1X” is the more appropriate model for the following data:

X:    1      2       3      4

Y:  1.2   1.8    2.3   2.5

 

(b) The starting salaries of PG students selected in campus interviews are given below, along with the percentage of marks they scored in their PG and their academic stream:

Salary (in ‘000 Rs)   Stream     Gender   % in PG
12                    Arts       Male     75
8                     Science    Male     70
15                    Commerce   Female   85
12.5                  Science    Male     80
7.5                   Arts       Female   75
6                     Commerce   Female   60
10                    Science    Male     70
18                    Science    Male     87
14                    Commerce   Female   82

It is believed that there could be a possible interaction between Stream and % in PG and between Gender and % in PG. Incorporate this view and create the data matrix. (You need not build the model.)                                                      (10+10)

  21. Based on a sample of size 16, a model is to be built for a response variable with four regressors X1, …, X4. Carry out the Forward selection process to decide on the significant regressors, given the following information:

SST = 1810.509, SSRes(X1) = 843.88, SSRes(X2) = 604.224, SSRes(X3) = 1292.923, SSRes(X4) = 589.24, SSRes(X1,X2) = 38.603, SSRes(X1,X3) = 818.048, SSRes(X1,X4) = 49.84, SSRes(X2,X3) = 276.96, SSRes(X2,X4) = 579.23, SSRes(X3,X4) = 117.14, SSRes(X1,X2,X3) = 32.074, SSRes(X1,X2,X4) = 31.98, SSRes(X1,X3,X4) = 33.89, SSRes(X2,X3,X4) = 49.22, SSRes(X1,X2,X3,X4) = 31.91.

 

  22. (a) Obtain the likelihood equation for estimating the parameters of a logistic regression model.

(b) If the logit score (linear predictor) is given by –2.4 + 1.5 X1 + 2 X2, find the estimated P(Y = 1) for each of the following combinations of the IDVs:

X1:  0       1.5        2       3       -2      -2.5

X2:  1         0       1.5     -1        2       2.5                                    (12+8)
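The estimated probabilities follow from applying the inverse-logit (sigmoid) transform to each score. A small sketch (an illustration, not an answer key):

```python
import math

# Estimated P(Y = 1) from the given logit score -2.4 + 1.5*X1 + 2*X2,
# via the inverse-logit transform p = 1 / (1 + exp(-z)).
def prob(x1, x2):
    z = -2.4 + 1.5 * x1 + 2.0 * x2   # linear predictor (logit score)
    return 1.0 / (1.0 + math.exp(-z))

pairs = [(0, 1), (1.5, 0), (2, 1.5), (3, -1), (-2, 2), (-2.5, 2.5)]
for x1, x2 in pairs:
    print(x1, x2, round(prob(x1, x2), 4))
```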

 

 

 


Loyola College M.Sc. Statistics April 2006 Applied Regression Analysis Question Paper PDF Download

             LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

M.Sc. DEGREE EXAMINATION – STATISTICS

AC 52

FOURTH SEMESTER – APRIL 2006

                                            ST 4954 – APPLIED REGRESSION ANALYSIS

 

 

Date & Time : 27-04-2006/9.00-12.00         Dept. No.                                                       Max. : 100 Marks

 

 

SECTION –A

Answer ALL the Questions                                                                   (10 X 2 = 20 marks)

 

  1. State the statistic for testing the overall fit of a linear model with ‘k’ regressors.
  2. Define ‘Extra Sum of Squares’.
  3. Define ‘Studentized’ Residuals.
  4. What is a ‘Variance Stabilizing Transformation’?
  5. State the consequence of using OLS in a situation when GLS is required.
  6. Define ‘Variance Inflation Factor’.
  7. Give the form of the Ridge Estimate when a constant ‘λ’ is added to the diagonal elements of X’X.
  8. What is a hierarchical polynomial regression model?
  9. Mention the components of a ‘Generalized Regression Model’ (GLM).
  10. Define ‘Sensitivity’ of a Binary Logit Model.

 

SECTION – B

Answer any FIVE Questions                                                                   (5 X 8 = 40 marks)

 

  11. The following table gives the data on four independent variables used to build a linear model with an intercept for a dependent variable:

X1   X2   X3   X4
 2   -1    3    5
 1    4    2    3
 5    2   -3    4
 4    3    1   -1
-2   -2    4    2
 3    3   -1    1
 2    2    2    4
-3    3    5    1
 2    2   -2    3
 1    1   -3   -2

If one wishes to test the hypothesis H0: b1 = b3, b2 = 2b4, write down the reduced data matrix and the restricted model under H0. Briefly indicate the test procedure.

 

  12. Depict the different possibilities that occur when the residuals are plotted against the fitted values. How are they interpreted?

 

  13. Define ‘Standardized Regression Coefficient’ and discuss any one method of scaling the variables.

 

  14. Decide whether “Y = b0 + b1X” or “Y1/2 = b0 + b1X” is the more appropriate model for the following data:

X   1     2     3     4
Y   3.5   4.7   6.5   9.2
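One way to compare the two candidate models is to see whether X is more nearly linear in Y or in √Y, e.g. by comparing r² for the two simple regressions. A stdlib-only sketch using the data above (an illustration of the comparison, not the prescribed solution method):

```python
import math

# Compare the linearity of X vs Y against X vs sqrt(Y) through r^2.
X = [1, 2, 3, 4]
Y = [3.5, 4.7, 6.5, 9.2]

def r_squared(x, y):
    """Squared sample correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

r2_linear = r_squared(X, Y)
r2_sqrt = r_squared(X, [math.sqrt(v) for v in Y])
print(r2_linear, r2_sqrt)   # the larger r^2 suggests the better-fitting form
```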

 

  15. Discuss the issue of ‘multicollinearity’ and its ill-effects.

 

  16. Fill up the missing entries in the following table and investigate the presence of collinearity in the data, indicating which variables are involved in collinear relationships, if any:

Eigen Values   Singular      Condition   Variance Decomposition Proportions
of X’X         Values of X   Indices     X1       X2       X3       X4       X5       X6
3.4784         ?             ?           0.0003   0.0005   0.0004   0.0004   ?        0.0350
2.1832         ?             ?           ?        0.0031   0.0001   0.3001   0.0006   0.0018
1.4548         ?             ?           0.0004   ?        0.0005   0.0012   0.0032   0.2559
0.9404         ?             ?           0.0011   0.6937   0.5010   0.0002   0.7175   ?
0.2204         ?             ?           0.0100   0.0000   ?        0.0003   0.0083   0.2845
0.0725         ?             ?           0.8853   0.3024   0.4964   ?        0.2172   0.0029

 

  17. Explain ‘Cubic Spline’ fitting.

 

  18. Describe the components of a GLM. Show how the log link arises naturally in modeling a Poisson (Count) response variable.

 

SECTION – C

 

Answer any TWO Questions                                                                 (2 X 20 = 40 marks)

 

  19. The observed and predicted values of a response variable (based on a model using 25 data points) and the diagonal elements of the ‘Hat’ matrix are given below:

Yi    16.68   11.50   12.03   14.88   13.75   18.11    8.00   17.83   79.24   21.50   40.33   21.00   13.50
Ŷi    21.71   10.35   12.08    9.96   14.19   18.40    7.16   16.67   71.82   19.12   38.09   21.59   12.47
hii   0.102   0.071   0.089   0.058   0.075   0.043   0.082   0.064   0.498   0.196   0.086   0.114   0.061

Yi    19.75   24.00   29.00   15.35   19.00    9.50   35.10   17.90   52.32   18.75   19.83   10.75
Ŷi    18.68   23.33   29.66   14.91   15.55    7.71   40.89   20.51   56.01   23.36   24.40   10.96
hii   0.078   0.041   0.166   0.059   0.096   0.096   0.102   0.165   0.392   0.041   0.121   0.067

 

Compute the PRESS statistic and R² of prediction. Comment on the predictive power of the underlying model.
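Both quantities are mechanical once the residuals and hat-diagonals are in hand: PRESS sums the squared PRESS residuals e_i/(1 − h_ii), and R² of prediction is 1 − PRESS/SST. A sketch using the standard definitions, illustrated on a tiny made-up data set rather than the exam data:

```python
# PRESS and R^2 of prediction from observed values, fitted values and
# hat-matrix diagonals (standard definitions).
def press_statistic(y, y_hat, h):
    return sum(((yi - fi) / (1.0 - hi)) ** 2
               for yi, fi, hi in zip(y, y_hat, h))

def r2_prediction(y, y_hat, h):
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    return 1.0 - press_statistic(y, y_hat, h) / sst

# Tiny made-up illustration (not the exam data).
y     = [10.0, 12.0, 9.0, 14.0]
y_hat = [10.5, 11.5, 9.5, 13.5]
h     = [0.5, 0.25, 0.25, 0.5]
print(press_statistic(y, y_hat, h), r2_prediction(y, y_hat, h))
```

An R² of prediction close to the ordinary R² suggests the model predicts new observations about as well as it fits the sample.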

 

  20. (a) In a study on the mileage performance of cars, three brands of cars (A, B and C) and two types of fuel (OR and HG) were used. The speed of driving was also observed and the data are reported below:

Mileage (Y)   14.5   12.6   13.7   15.8   16.4   13.9   14.6   16.7   11.8   15.3   16.8   17.0   15.0   16.5
Speed           45     60     50     60     55     52     59     50     40     53     62     56     62     55
Car              A      B      C      B      A      A      C      A      B      B      C      C      A      B
Fuel            OR     HG     OR     HG     HG     OR     HG     OR     OR     HG     HG     OR     HG     OR

 

Create the data matrix so as to build a model with an intercept term and interaction terms between Fuel and Driving Speed and also between Car-type and Driving Speed.

(You need not build any model).

 

(b) Discuss GLS and obtain an expression for the GLS estimate.                (14 + 6)

 

  21. Based on a sample of size 15, a linear model is to be built for a response variable Y with four regressors X1, …, X4. Carry out the Forward Selection Process to decide which of the regressors would finally be significant for Y, given the following information:

SST = 543.15, SSRes(X1) = 253.14, SSRes(X2) = 181.26, SSRes(X3) = 387.88, SSRes(X4) = 176.77, SSRes(X1,X2) = 11.58, SSRes(X1,X3) = 245.41, SSRes(X1,X4) = 14.95, SSRes(X2,X3) = 83.09, SSRes(X2,X4) = 173.77, SSRes(X3,X4) = 35.15, SSRes(X1,X2,X3) = 9.62, SSRes(X1,X2,X4) = 9.59, SSRes(X1,X3,X4) = 10.17, SSRes(X2,X3,X4) = 14.76, SSRes(X1,X2,X3,X4) = 9.57
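The forward-selection bookkeeping can be sketched directly from these residual sums of squares: at each step add the candidate that most reduces SSRes, and stop when its partial F falls below the entry threshold. In the sketch below F_IN = 4.0 is an assumed placeholder; the exam would use the tabulated F value at the chosen level.

```python
# Forward selection driven by the residual sums of squares quoted above.
n = 15
sst = 543.15
ss_res = {
    frozenset(): sst,
    frozenset({'X1'}): 253.14, frozenset({'X2'}): 181.26,
    frozenset({'X3'}): 387.88, frozenset({'X4'}): 176.77,
    frozenset({'X1', 'X2'}): 11.58,  frozenset({'X1', 'X3'}): 245.41,
    frozenset({'X1', 'X4'}): 14.95,  frozenset({'X2', 'X3'}): 83.09,
    frozenset({'X2', 'X4'}): 173.77, frozenset({'X3', 'X4'}): 35.15,
    frozenset({'X1', 'X2', 'X3'}): 9.62, frozenset({'X1', 'X2', 'X4'}): 9.59,
    frozenset({'X1', 'X3', 'X4'}): 10.17, frozenset({'X2', 'X3', 'X4'}): 14.76,
    frozenset({'X1', 'X2', 'X3', 'X4'}): 9.57,
}

def forward_selection(f_in=4.0):
    selected, remaining = [], {'X1', 'X2', 'X3', 'X4'}
    while remaining:
        # candidate that most reduces the residual sum of squares
        best = min(remaining, key=lambda x: ss_res[frozenset(selected) | {x}])
        new = frozenset(selected) | {best}
        df_res = n - len(new) - 1   # intercept included in the model
        f = (ss_res[frozenset(selected)] - ss_res[new]) / (ss_res[new] / df_res)
        if f < f_in:
            break                   # no candidate clears the entry threshold
        selected.append(best)
        remaining.remove(best)
    return selected

print(forward_selection())
```

With these numbers the procedure enters X4 first, then X1, then X2, and rejects X3 at the final step.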

 

  22. The laborers in a coal mine were screened for symptoms of pneumoconiosis to study the effect of ‘number of years of work’ (X) on the laborers’ health. The response variable Y is defined as 1 if symptoms were found and 0 if not. The data on 20 employees are given below:

Y    0    1    1    0    1    1    0    0    0    1    1    1    0    0    1    0    0    1    1    1
X   10   30   28   14   25   35   15   12   20   24   33   27   13   12   18   17   11   28   32   30

 

The logit model built for the purpose had the linear predictor (logit score) –4.8 + 0.1 X. Construct the Gains Table and compute the KS statistic. Comment on the discriminatory power of the model.
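The KS statistic here is the largest gap between the cumulative capture rates of events (Y = 1) and non-events (Y = 0) as the score cut-off is lowered; since –4.8 + 0.1X is monotone in X, ranking by score is just ranking by X. A sketch of the computation (a condensed version of the gains-table arithmetic, not the full table asked for):

```python
# KS separation for the scorecard -4.8 + 0.1*X on the 20 records above.
Y = [0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1]
X = [10, 30, 28, 14, 25, 35, 15, 12, 20, 24, 33, 27, 13, 12, 18, 17, 11, 28, 32, 30]

scores = [-4.8 + 0.1 * x for x in X]
n_pos = sum(Y)            # events
n_neg = len(Y) - n_pos    # non-events

ks = 0.0
cum_pos = cum_neg = 0
# Walk the score cut-offs from highest to lowest, grouping tied scores,
# and track the gap between the two cumulative proportions.
for s in sorted(set(scores), reverse=True):
    for sc, y in zip(scores, Y):
        if abs(sc - s) < 1e-12:
            cum_pos += y
            cum_neg += 1 - y
    ks = max(ks, abs(cum_pos / n_pos - cum_neg / n_neg))
print(round(ks, 4))
```

A KS this close to 1 would indicate very strong discrimination between the two groups.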

 

 


 

 

 

Loyola College M.Sc. Statistics Nov 2006 Applied Regression Analysis Question Paper PDF Download

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI 600 034

M.Sc. Degree Examination – Statistics

 I Semester – November 2006

ST 1811 – APPLIED REGRESSION ANALYSIS

02/11/2006                                         Time: 1.00 – 4.00                    Max. Marks: 100

 

SECTION – A

Answer ALL the Questions                                                                    (10 x 2 = 20 marks)

  1. Define ‘residuals’ and ‘residual sum of squares’ in a linear model.
  2. State the test for the overall fit of a linear regression model.
  3. Define Adjusted R2 of a linear model.
  4. Give an example of a relationship that can be linearized.
  5. What is the variance stabilizing transformation used when σ2 is proportional to E(Y)[1 – E(Y)]?
  6. State any one criterion for assessing and comparing performances of linear models.
  7. State any one ill-effect of multicollinearity.
  8. Illustrate with an example why both X and X2 can be considered for inclusion as regressors in a model.
  9. Define the logit link used for modeling a binary dependent variable.
  10. Define any one measure of performance of a logistic model.

 

SECTION – B

 

 

Answer any FIVE Questions                                                                    (5 x 8 = 40 marks)

  11. Discuss “No-Intercept Model” and give an illustrative example where such a model is appropriate. State how you will favour such a model against a model with intercept. Indicate the ANOVA for such a model.

 

  12. A model (with an intercept) relating a response variable to four regressors is to be built based on the following sample of size 10:

Y      X1   X2   X3   X4
13.8    3   14   33    5
22.9    1   26   35    7
23.7    6   13   28    9
16.8    2   17   27   12
21.6    7   23   39    8
25.5    6   21   38   15
16.6    4   29   28   11
17.4    9   17   25    7
19.9    4   16   30   13
24.6    5   27   32   15

Write down the full data matrix. Also, if we wish to test the linear hypothesis H0: β4 = 2β1 + β2, write down the reduced model under the H0 and also the reduced data matrix.

 

  13. Give the motivation for standardized regression coefficients and explain any one method for scaling the variables.

 

  14. The following residuals were obtained after a linear regression model was built:

0.17, – 1.04, 1.24, 0.48, – 1.83, 1.57, 0.50, – 0.32, – 0.77

Plot the ‘normal probability plot’ on a graph sheet and draw appropriate conclusions.
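The plotting coordinates for such a plot pair the ordered residuals with standard normal quantiles at plotting positions; the sketch below uses the (i − 0.375)/(n + 0.25) convention, one common choice (textbooks also use (i − 0.5)/n). Roughly linear points support the normality assumption.

```python
from statistics import NormalDist

# Coordinates for a normal probability plot of the nine residuals above:
# ordered residuals against standard normal quantiles at the pp-points
# (i - 0.375)/(n + 0.25) -- an assumed (common) plotting convention.
residuals = [0.17, -1.04, 1.24, 0.48, -1.83, 1.57, 0.50, -0.32, -0.77]

ordered = sorted(residuals)
n = len(ordered)
quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
             for i in range(1, n + 1)]
for q, r in zip(quantiles, ordered):
    print(round(q, 3), r)   # a roughly linear pattern suggests normality
```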

 

  15. Describe the Box-Cox method of analytical selection of a transformation of the dependent variable.

 

  16. Discuss the role of dummy variables in linear models, explaining clearly how they are used to indicate different intercepts and different slopes among categories of respondents/subjects. Illustrate with examples.

 

  17. The following is part of the output obtained while investigating the presence of multicollinearity in the data used for building a linear model. Fill up the missing entries and point out which regressors are involved in collinear relationship(s), if any:

 

Eigen Value   Singular       Condition   Variance Decomposition Proportions
(of X’X)      value (of X)   Indices     X1       X2       X3       X4       X5       X6
2.429         ?              ?           0.0003   0.0005   0.0004   0.0000   0.0531   ?
1.546         ?              ?           0.0004   0.0000   ?        0.0012   0.0032   0.0559
0.922         ?              ?           ?        0.0033   0.9964   0.0001   0.0006   0.0018
0.794         ?              ?           0.0000   0.0000   0.0002   0.0003   ?        0.4845
0.308         ?              ?           0.0011   ?        0.0025   0.0000   0.7175   0.4199
0.001         ?              ?           0.9953   0.0024   0.0001   ?        0.0172   0.0029
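The two blank columns in such tables are mechanical: each singular value of X is the square root of the corresponding eigenvalue of X’X, and the j-th condition index is √(λmax/λj). A sketch using the eigenvalues above (the cut-off of 30 is a common rule of thumb for flagging strong collinearity, not part of the question):

```python
import math

# Singular values and condition indices from the eigenvalues of X'X.
eigenvalues = [2.429, 1.546, 0.922, 0.794, 0.308, 0.001]

singular_values = [math.sqrt(ev) for ev in eigenvalues]
cond_indices = [math.sqrt(max(eigenvalues) / ev) for ev in eigenvalues]
for sv, ci in zip(singular_values, cond_indices):
    print(round(sv, 4), round(ci, 2))   # an index above ~30 flags collinearity
```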

 

  18. Discuss ‘Spline’ fitting.

 

 

SECTION – C

 

 

Answer any TWO Questions                                                                 (2 x 20 = 40 marks)

  19. (a) Depict the different possibilities that can arise when residuals are plotted against the fitted (predicted) values and explain how they can be used for detecting model inadequacies.

(b) Explain ‘partial regression plots’ and state how they are useful in model building.                                                                                                        (13 + 7)

 

  20. The following data were used to regress Y on X1, X2, X3 and X4 with an intercept term, and the coefficients were estimated to be β0^ = 45.1225, β1^ = 1.5894, β2^ = 0.7525, β3^ = 0.0629, β4^ = 0.054. Carry out the ANOVA and test for the overall significance of the model. Also test the significance of the intercept and each of the individual slope coefficients.
Y (Heat in calories) X1 (Tricalcium Aluminate) X2 (Tricalcium Silicate) X3 (Tetracalcium alumino ferrite) X4 (Dicalcium silicate)
78.5 7 26 6 60
74.3 1 29 15 52
104.3 11 56 8 20
87.6 11 31 8 47
95.9 7 52 6 3
109.2 11 55 9 22
102.7 3 71 17 6
72.5 1 31 22 44
93.1 2 54 18 22
115.9 21 47 4 26

The following is also given for your benefit:

(X’X)–1 =

15.90911472   -0.068104115   -0.216989375   -0.042460127   -0.165914393
-0.068104115   0.008693142   -0.001317006    0.007363424   -0.000687829
-0.216989375  -0.001317006    0.003723258   -0.001844902    0.002629903
-0.042460127   0.007363424   -0.001844902    0.009317298   -0.001147731
-0.165914393  -0.000687829    0.002629903   -0.001147731    0.002157976

  21. Build a linear model for a DV with a maximum of four regressors using the Stepwise Procedure, based on a sample of size 25, given the following information:

SST = 5600, SSRes(X1) = 3000, SSRes(X2)  = 3300, SSRes(X3) = 3600,

SSRes(X4) = 2400, SSRes(X1,X2) = 2300, SSRes(X1,X3) = 2760,

SSRes(X1,X4) = 2100, SSRes(X2,X3) = 2600, SSRes(X2,X4) = 2040,

SSRes(X3,X4) = 1980, SSRes(X1, X2, X3) = 2000, SSRes(X1, X2, X4) = 1800,

SSRes(X1,X3, X4) = 1700, SSRes(X2,X3,X4) = 1500, SSRes(X1,X2,X3, X4) = 1440.

 

  22. (a) Briefly indicate Wilks’ Likelihood Ratio Test and Wald’s Test for testing the significance of a subset of the parameters in a Logistic Regression model.

(b) The following data were used to build a logistic model:

DV 1 1 0 1 0 0 1 0 1 1 1 0 0 1 0 1 1 0 0 0
X1 2 4 1 0 -1 3 5 -2 3 -2 3 0 -4 2 -3 1 -1 3 4 -2
X2 -2 -4 2 0 4 -2 1 3 -4 2 1 3 0 -2 -4 -3 1 -1 2 0

The estimates were found to be β0 = 2.57, β1 = 3.78, β2 = – 3.2. Construct the Gains Table and compute KS Statistic.                                                          (8+12)

 


Loyola College M.Sc. Statistics April 2007 Applied Regression Analysis Question Paper PDF Download

 

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

AC 28

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – APRIL 2007

ST 1811  – APPLIED REGRESSION ANALYSIS

 

 

 

Date & Time: 02/05/2007 / 1:00 – 4:00      Dept. No.                                                  Max. : 100 Marks

 

 

SECTION – A

Answer ALL the questions                                                                    (10 x 2 = 20 marks)

  1. Explain the term ‘partial regression coefficients’.
  2. State an unbiased estimate of the error variance in a multiple linear regression model.
  3. Define ‘PRESS Residuals’.
  4. What is the variance stabilizing transformation used when σ2 is proportional to E(Y)?
  5. Give the expression for the GLS estimator explaining the notations.
  6. Mention any two sources of multicollinearity.
  7. Define ‘Variance Inflation Factor’ of a regression coefficient.
  8. Define a ‘Hierarchical Polynomial Model’.
  9. What is ‘Link function’ in a GLM?
  10. Give the interpretation for a positive coefficient in a logit model.

 

SECTION – B

Answer any FIVE questions                                                                    (5 x 8 = 40 marks)

 

  11. Briefly explain the limitations to be recognized and cautions that are needed in applying regression models in practice.

 

  12. A model (with an intercept) relating a response variable to four regressors is to be built based on the following sample of size 10:

Y      X1   X2   X3   X4
23.5    2   12   38    7
15.7    3   22   33   18
22.8    7   18   27    9
18.9    1   14   29   14
17.3    5   20   34   11
28.4    8   25   40   16
16.6    3   24   32   10
23.1    7   17   37    8
20.0    3   13   28   13
19.8    4   24   30   15

Write down the full data matrix. Also, if we wish to test the linear hypothesis H0: β2 = 2β3, β1 = 0, write down the reduced model under the H0 and also the reduced data matrix.

 

  13. Explain the motivation and give the expressions for ‘studentized’ and ‘externally studentized’ residuals.

 

  14. The following residuals were obtained after a linear regression model was built: -0.15, 0.03, -0.06, 0.01, 0.23, -0.31, 0.19, 0.15, -0.08, -0.01

Plot the ‘normal probability plot’ on a graph sheet and draw appropriate conclusions.

  15. An investigator has the following data:

Y:   3.2     5.1     4.5     2.4

X:    5        9         6        4

Guide the investigator as to whether the model Y = β0 + β1X or Y1/2 = β0 + β1X is  more appropriate.

 

  16. Discuss the need for ‘Generalized Least Squares’ pointing out the requirements for it. Briefly indicate the ANOVA for a model built using GLS.

 

  17. The following is part of the output obtained while investigating the presence of multicollinearity in the data used for building a linear model. Fill up the missing entries and identify which regressors are involved in collinear relationship(s), if any.

Eigen Value   Singular       Condition   Variance Decomposition Proportions
(of X’X)      value (of X)   Indices     X1       X2       X3       X4       X5       X6
2.525         ?              ?           0.0180   0.0355   0.0004   0.0005   ?        0.0350
1.783         ?              ?           0.0029   0.1590   0.0305   0.0987   0.0032   ?
1.380         ?              ?           0.0168   0.0006   ?        0.0500   0.0006   0.0018
0.952         ?              ?           0.6830   ?        0.0001   0.0033   0.1004   0.4845
0.245         ?              ?           ?        0.1785   0.0025   0.0231   0.7175   0.4199
0.002         ?              ?           0.2040   0.2642   0.9664   ?        0.0172   0.0029

 

 

  18. Discuss ‘Spline’ fitting.

 

SECTION – C

Answer any TWO Questions                                                                 (2 x 20 = 40 marks)

 

  19. (a) Obtain the decomposition of the total variation in the data under a multiple linear regression model. Hence, define SST, SSR and SSRes and indicate the ANOVA.

(b) Develop the Partial F-Test for the contribution of some ‘r’ of the ‘k’ regressors in a multiple regression model.                                                               (10 + 10)

 

  20. A model with an intercept is to be built with the monthly mobile phone bill amount (average over the past six months) of students (Y) as the DV and IDVs as: monthly income of parents (X1), age of the student (X2), number of telephone numbers saved in the mobile (X3), and also dummy variables indicating gender (male/female), class (UG/PG/M.Phil.), residence (day scholar/hostel inmate). The following data collected from 15 students are available:

 

Bill Amt.   Income          Age   # of Saved   Gender   Class     Residence
(in Rs.)    (in ‘000 Rs.)         numbers
230         25              17    50           F        UG        Day scholar
150         12              22    38           F        PG        Hostel inmate
300         35              21    62           M        PG        Day scholar
225         40              18    43           M        UG        Hostel inmate
400         45              21    45           M        UG        Hostel inmate
180         20              24    33           F        M.Phil.   Hostel inmate
125         15              19    27           F        UG        Day scholar
170         18              18    36           M        UG        Day scholar
200         12              20    22           F        PG        Hostel inmate
350         15              25    35           M        M.Phil.   Day scholar
280         21              19    39           F        UG        Hostel inmate
375         35              23    50           F        PG        Day scholar
450         42              22    47           M        PG        Hostel inmate
390         37              26    43           M        M.Phil.   Hostel inmate
195         18              17    25           M        UG        Day scholar

 

(a) Construct the data matrix for building the model.

(b) If interaction effects of ‘Class’ with ‘parental income’ and interaction effects of ‘Gender’ with ‘Residence type’ are also to be incorporated in the model, write down the appropriate data matrix. [You need not build the models].                                                          (10 + 10)

 

  21. Build a linear model for a DV with a maximum of four regressors using the Forward Selection method, based on a sample of size 25, given the following information:

SST = 2800, SSRes(X1) = 1500, SSRes(X2)  = 1650, SSRes(X3) = 1800,

SSRes(X4) = 1200, SSRes(X1,X2) = 1150, SSRes(X1,X3) = 1380,

SSRes(X1,X4) = 1050, SSRes(X2,X3) = 1300, SSRes(X2,X4) = 1020,

SSRes(X3,X4) = 990, SSRes(X1, X2, X3) = 1000, SSRes(X1, X2, X4) = 900,

SSRes(X1,X3, X4) = 850, SSRes(X2,X3,X4) = 750, SSRes(X1,X2,X3, X4) = 720.

 

  22. (a) Discuss ‘sensitivity’, ‘specificity’ and ‘ROC’ of a logistic regression model and the objective behind these measures.

(b) The following data were used to build a logistic model and the estimates were β0 = 3.8, β1 = –5.2, β2 = 2.2:

DV 1 1 0 1 0 0 1 0 1 0 0 0 1 1 0 1 1 1 1 0
X1 -3 1 0 2 -2 4 1 -1 5 2 3 -2 0 -4 1 2 -1 -2 -3 4
X2 0 2 -3 2 -4 -1 0 3 2 -3 4 -5 1 -1 -4 3 4 -3 1 1

 

Compute the logit score for each record. Construct the Gains Table and compute the KS statistic.                                                                            (8 + 12)

 


Loyola College M.Sc. Statistics Nov 2007 Applied Regression Analysis Question Paper PDF Download


Loyola College M.Sc. Statistics April 2008 Applied Regression Analysis Question Paper PDF Download

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

NO 34

 

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – APRIL 2008

ST 1811 – APPLIED REGRESSION ANALYSIS

 

 

 

Date : 05-05-08                  Dept. No.                                        Max. : 100 Marks

Time : 1:00 – 4:00

 

Section A

Answer ALL the questions   

Each question carries 2 marks                                               (10 X 2 =20 Marks)

 

1)      Define the linear regression model in the context of any applied scenario.

2)      What are the basic assumptions of a linear regression model?

3)      Mention any four major areas where categorical data analysis is used.

4)      Explain nominal and ordinal variables with examples.

5)      Explain interval variable with an example.

6)      Explain the link function of a generalized linear model.

7)      What is the role of binary data in the generalized linear model?

8)      What is meant by multicollinearity?

9)      Write down the sampling variance of the slope coefficients in the multiple regression model.

10)    What is the role of the variance inflation factor?

 

Section B

Answer any 5 questions        

Each question carries 8 marks                                               (5  X 8 = 40 Marks)

 

11)    Derive the estimate of the parameters of a linear regression model using the method of least squares.

12)    Explain the three components of a generalized linear model.

13)    Identify the natural parameter for the binomial logit model in the context of a decision taken to purchase a particular product.

14)    Identify the natural parameter for the Poisson log linear model for a count data in the context of the number of silicon wafers used in the production of a computer chip.

15)    Explain Poisson log linear model.

16)    Explain the regression model of the status of the employees on the education and income level in the context of the usage of a dummy variable.

 

17)    Explain the concept of multicollinearity in the context of regressing the delivery time of an item on the distance traveled and the gasoline consumed.

18)    Write a short note on stepwise regression methods.

 

Section C

Answer any 2 questions        

Each question carries 20 marks                                             (2  X 20 = 40 Marks)

 

19 a) How do you write the distribution of the transformed mean of a response variable of a Poisson log linear model in the natural exponential family form?    (10 Marks)

19 b) Give an illustration for multinomial responses using baseline logit models.    (10 Marks)

20 a) Illustrate the Poisson generalized linear model from a study of nesting horseshoe crabs. (10 Marks)

20 b) Explain the use of dummy variables in the logit models with an example.

(10 Marks)

21 a) Explain the regression model of the status of the employees on the education and income level in the context of the usage of a dummy variable. ( 10 Marks)

21 b) Explain the concept of multicollinearity in the context of regressing the delivery time of an item on the distance traveled and the gasoline consumed.    (10 Marks)

22 a) Job satisfaction of the employees of a company is categorized into 1. Not satisfied, 2. A little satisfied, 3. Satisfied and 4. Very much satisfied. Construct a multinomial model for regressing job satisfaction on income and gender.

( 10 Marks)

22 b) Explain the four methods for scaling residuals bringing out the relationship between them.          (10 Marks)

 


Loyola College M.Sc. Statistics Nov 2008 Applied Regression Analysis Question Paper PDF Download

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

BA 22

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – November 2008

    ST 1811 – APPLIED REGRESSION ANALYSIS

 

 

 

Date : 11-11-08                 Dept. No.                                        Max. : 100 Marks

Time : 1:00 – 4:00

Section A

Answer All the Questions                                                          (10 x 2 = 20 Marks)

 

  1. What are the distributions of the components (yi, εi) of a simple regression model?
  2. What do you mean by the Heteroscedasticity property?
  3. Explain the Linear Probability model.
  4. What is the Linear Predictor in a Generalized Linear Model?
  5. What is the Identity Link in a Generalized Linear Model?
  6. Explain the Simple Regression Model.
  7. List the four methods of scaling residuals.
  8. Give two examples of nominal variables.
  9. Give an example of a variable that can be classified as nominal, ordinal and interval.
  10. Give an example of a Poisson Log linear Model.

 

Section B

Answer Any Five Questions                                                          (5 x 8 = 40 Marks)

 

  • Show that the least square estimates β0^, β1^ of a simple regression model are unbiased.
  • Explain the procedures for finding the confidence intervals for β0 and β1 of a simple regression model.
  • Discuss multicollinearity with an example.
  • Explain the purpose of Unit Normal Scaling
  • Explain Binomial logit model for the binary data
  • What are the properties of the least square estimates of the fitted regression model?
  • Discuss any two methods of scaling residuals
  • Explain the Logistic regression model with an example

 

Section C

Answer Any Two Questions                                      (2 x 20 = 40 Marks)

 

19 (a) Derive V(β1^) of a simple regression model.

(b) Write down the test procedure to test the intercept of a simple regression model.

 

20 (a) Write down the test procedure to test H0: β1 = 0 against H1: β1 ≠ 0 using Analysis of Variance.

(b) Derive the interval estimate of the mean response of a simple regression model.

 

21 (a) Fit a Logistic regression model.

(b) How do you interpret the Poisson Loglinear model for count data?

 

22 (a) Estimate the parameters of a multiple linear regression model by the method of Maximum Likelihood Estimation.

(b)  Write short notes on Relative Risk, Odds Ratio and Cross Product Ratio.

 

 


Loyola College M.Sc. Statistics April 2009 Applied Regression Analysis Question Paper PDF Download

    LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

M.Sc. DEGREE EXAMINATION – STATISTICS

YB 34

FIRST SEMESTER – April 2009

ST 1811 – APPLIED REGRESSION ANALYSIS

 

 

 

Date & Time: 30/04/2009 / 1:00 – 4:00 Dept. No.                                                   Max. : 100 Marks

 

 

SECTION – A

Answer All questions.                                                                           (10 x 2 = 20 marks)

  1. What is a multiple linear regression model?
  2. Why do regressions have negative signs? Give reasons.
  3. Explain BLUE.
  4. Explain the coefficient of determination.
  5. State any two ways in which ‘Specification Error’ occurs.
  6. What is multicollinearity?
  7. What is the formula for finding the adjusted R-square?
  8. What are residuals?
  9. Why do we use Dummy variables in a model?
  10. What are response and explanatory variables?

SECTION – B

Answer any Five questions. Each carries 8 marks.                             (5 x 8 = 40 marks)

  11. What are the three components specified in a generalized linear model? Explain in detail.
  12. Explain in detail categorical data analysis with examples. What are the two primary types of scales of categorical variables? Give examples.
  13. What is the form of the logistic regression model? Also give the link function for a logistic regression model.
  14. Explain the four methods of scaling the residuals.
  15. Write short notes on Residual Plots.
  16. Estimate β0, β1 and σ² of a simple linear regression model by MLE.
  17. Give an application scenario to illustrate the simple regression model.
  18. Write a short note on detecting multicollinearity.

SECTION –C

Answer any TWO questions. Each carries 20 marks.                         (2 x 20 = 40 marks)

  19. Give an illustration and explain the following in detail:

a) Binomial logit models for binary data

b) Poisson log-linear model for count data                   (10+10 Marks)

  20. a) Explain the procedure of standardizing the regression model using the

(i) unit normal scale and (ii) unit length scale                 (5+5 Marks)

b) Explain the probit and complementary log-log models           (10 Marks)

  21. Explain the following methods for scaling the residuals:

(i)  Standardized residuals

(ii) Studentized residuals

(iii) PRESS residuals

(iv) R-student residuals                                  (5 Marks each)

  22. a) Derive the procedure for testing the hypothesis that all of the regression slopes are zero.                                                              (10 Marks)

b) Derive the least squares estimates of the parameters of a simple regression model.                                                                            (10 Marks)

 

 


Loyola College M.Sc. Statistics Nov 2010 Applied Regression Analysis Question Paper PDF Download

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – NOVEMBER 2010

    ST 1816  – APPLIED REGRESSION ANALYSIS

 

 

 

Date : 03-11-10                 Dept. No.                                        Max. : 100 Marks

Time : 1:00 – 4:00

 

SECTION – A

Answer all the questions.                                                                                                     10 x 2 = 20 marks

  1. Write any two properties of least squares estimators of multiple linear regression model.
  2. Distinguish between R2 and adjusted R2 statistics.
  3. Provide any two examples for linearizing non-linear models.
  4. Give an example for a quantitative regressor expressed in terms of indicator variables.
  5. Define Mallows’s Cp statistic.
  6. Define ridge estimator.
  7. When do we use piecewise polynomial fitting?
  8. Write a note on kernel regression.
  9. Define logistic response function.
  10. Write a note on Poisson regression.

 

                                                                      SECTION-B

Answer any five questions                                                                                                    5 x 8 = 40 marks

  11. Show that the maximum likelihood estimators of the model parameters in multiple linear regression, when the model errors are normally and independently distributed, are also the least squares estimators.
  12. Explain the two popular scaling techniques for computing standardized regression coefficients.
  13. Explain the fitting of a regression model with two indicator variables.
  14. Write about the four primary sources of multicollinearity among regressors.
  15. Write the procedure of principal components regression for obtaining biased estimators of regression coefficients.
  16. How will you predict the response over the range of the data using the locally weighted regression approach?
  17. How will you estimate parameters in a non-linear system?
  18. Briefly explain models with a binary response variable.

 

 

 

 

 


Section -C

 Answer any two questions                                                                                                2 x 20 = 40 Marks.

 

  19. (a) Derive the least squares estimators of the model parameters of the multiple linear regression model.

(b) Carry out the test for significance of regression for a multiple linear regression model.

(10 + 10 Marks)

  20. (a) Present a formal statistical test for the lack of fit of a regression model.

(b) Explain some variance-stabilizing transformations.                                    (15 + 5 Marks)

  21. (a) Explain piecewise polynomial fitting (splines).

(b) Write elaborately on the use of orthogonal polynomials in fitting regression models.

(10 + 10 Marks)

  22. (a) Explain the fitting of polynomial models in two or more variables.

(b) Write about link functions, the linear predictor and the canonical link for the generalized linear model.                                                       (10 + 10 Marks)

 

Loyola College M.Sc. Statistics Nov 2012 Applied Regression Analysis Question Paper PDF Download

LOYOLA COLLEGE (AUTONOMOUS), CHENNAI – 600 034

M.Sc. DEGREE EXAMINATION – STATISTICS

FIRST SEMESTER – NOVEMBER 2012

ST 1821 – APPLIED REGRESSION ANALYSIS

 

 

Date : 05/11/2012            Dept. No.                                        Max. : 100 Marks

Time : 1:00 – 4:00

 

Part-A

 

Answer all the questions:                                                                                                                                                (10×2=20)

 

1) Define ‘residual’ in a regression model.

2) Explain adjusted R2.

3) What is the variance stabilizing transformation used when σ2 ∝ E(Y)[1 − E(Y)]?

4) Mention any two sources of multicollinearity.

5) What is the need for standardized regression coefficients?

6) When is a regression model said to be hierarchical?

7) Explain the term autocorrelation.

8) Explain the AR(1) process.

9) Explain the partial correlation coefficient.

10) Explain the dummy variable trap.

 

Part-B

 

Answer any 5 questions:                                                                                                                                                  (5×8=40)

 

11) How will you verify the assumptions of normality and constant variance in a linear regression model? Explain.

12) Consider the model

Y = β0 + β1x1 + β2x2 + β3x3.

It is decided to test H0: β1 = β3, β2 = 0.

Write the reduced model and the data matrix relevant for the hypothesis, given the data matrix as

 

X=

13) Explain studentized residuals and externally studentized residuals.

14) Consider the following ANOVA table used for fitting a linear regression model with 6 regressors:

 

 

 

 

 

ANOVA

Source       df    Sum of Squares    Mean Square    F
Regression    6    524.661
Residuals           1149
Total        29

 

 

  • Fill in the blanks. (4)
  • What is the total number of observations? (1)
  • What conclusion do you draw about the overall fit of the model? (2)
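As a check on the arithmetic, the missing entries follow mechanically from the ANOVA identities. A minimal Python sketch, assuming (as the column layout suggests) that 524.661 is the regression sum of squares and 1149 the residual sum of squares:

```python
# Fill in the missing ANOVA entries for a model with 6 regressors,
# assuming SS(Regression) = 524.661, SS(Residual) = 1149, total df = 29.
df_reg, df_total = 6, 29
ss_reg, ss_res = 524.661, 1149.0

df_res = df_total - df_reg     # residual degrees of freedom
ss_total = ss_reg + ss_res     # total sum of squares
ms_reg = ss_reg / df_reg       # regression mean square
ms_res = ss_res / df_res       # residual mean square
f_stat = ms_reg / ms_res       # overall F statistic
n_obs = df_total + 1           # total number of observations

print(df_res, ss_total, ms_reg, ms_res, f_stat, n_obs)
```

Since the upper 5% point of F(6, 23) is about 2.5, an F statistic of roughly 1.75 would not reject H0, i.e. the overall regression is not significant.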

 

15) Explain generalized least squares.

16) What are the points to be considered in fitting a polynomial regression model?

17) Explain splines in detail.

18) Explain the random walk model in time series.

 

Part-C

Answer any 2 questions:                                                                                                                                                 (2×20=40)

 

19) a) An investigator has the following data:

 

Y 3.2 5.1 4.5 2.4
X 5 9 6 4

Guide the investigator as to whether the model Y = β0 + β1X or Y^(1/2) = β0 + β1X is appropriate.
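A hedged numerical sketch of how one might advise the investigator: fit both candidate models by ordinary least squares and compare the coefficient of determination on each response scale (R2 values computed on different response scales are only a rough guide, not a formal test):

```python
import math

x = [5.0, 9.0, 6.0, 4.0]
y = [3.2, 5.1, 4.5, 2.4]

def r_squared(xs, ys):
    """R^2 of a simple least squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((u - mx) ** 2 for u in xs)
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    syy = sum((v - my) ** 2 for v in ys)
    return sxy ** 2 / (sxx * syy)

r2_linear = r_squared(x, y)                        # Y = b0 + b1 X
r2_sqrt = r_squared(x, [math.sqrt(v) for v in y])  # sqrt(Y) = b0 + b1 X
print(r2_linear, r2_sqrt)
```

On these four points the untransformed model fits slightly better (R2 ≈ 0.846 versus ≈ 0.823 for the square-root model), though with only n = 4 observations residual plots are more informative than the small R2 gap.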

b) Suppose theory suggests that annual income (Y) depends on sex (S), highest degree received (D), and years of experience (E).

The following data are obtained for 10 employees:

S No. Y E D S
1 13876 1 UG M
2 11608 2 PG F
3 18701 1 PG M
4 11283 2 H.Sc M
5 11767 2 UG F
6 20872 2 PG M
7 11772 4 UG F
8 10535 3 H.Sc F
9 12195 3 PG M
10 12313 2 H.Sc M

It is also decided to study the interaction effect of sex with education on Y. Write a suitable linear regression model, with the relevant data matrix.

 

20) Explain the various methods of diagnosing multicollinearity and suggest methods for removing it.

 

21) Given the following information for fitting a regression model with 4 regressors, use the forward selection method to find the significant variables that enter at each iteration.

SST=2715.7635                                                   SSRes(x2, x3) =415.4

SSRes(x1) = 1265.6867                                       SSRes(x2, x4) =868.8

SSRes(x2) = 906.3363                                         SSRes(x3, x4) =175.7

SSRes(x3) = 1939.4                                              SSRes(x1, x2, x3) =48.1

SSRes(x4) =883.87                                               SSRes(x1, x2, x4) =47.9

SSRes(x1, x2) =57.9                                              SSRes(x1, x3, x4) =50.8

SSRes(x1, x3) =1227                                             SSRes(x2, x3, x4) =73.8

SSRes(x1, x4) =74.76                                           SSRes(x1, x2, x3, x4) =47.86
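Since the sample size is not stated, the F-to-enter values cannot be computed here, but the entry order can: at each step forward selection admits the candidate giving the largest reduction in SSRes (equivalently, the largest partial F within that step). A sketch using the residual sums of squares listed above:

```python
# Forward selection by maximum reduction in SS_Res, using the given values.
# The intercept-only residual SS equals SST.
ss_res = {
    frozenset(): 2715.7635,
    frozenset({'x1'}): 1265.6867, frozenset({'x2'}): 906.3363,
    frozenset({'x3'}): 1939.4,    frozenset({'x4'}): 883.87,
    frozenset({'x1', 'x2'}): 57.9,   frozenset({'x1', 'x3'}): 1227.0,
    frozenset({'x1', 'x4'}): 74.76,  frozenset({'x2', 'x3'}): 415.4,
    frozenset({'x2', 'x4'}): 868.8,  frozenset({'x3', 'x4'}): 175.7,
    frozenset({'x1', 'x2', 'x3'}): 48.1, frozenset({'x1', 'x2', 'x4'}): 47.9,
    frozenset({'x1', 'x3', 'x4'}): 50.8, frozenset({'x2', 'x3', 'x4'}): 73.8,
    frozenset({'x1', 'x2', 'x3', 'x4'}): 47.86,
}

model, order = frozenset(), []
candidates = {'x1', 'x2', 'x3', 'x4'}
while candidates:
    # admit the candidate whose addition minimizes the new residual SS
    best = min(candidates, key=lambda v: ss_res[model | {v}])
    if ss_res[model] - ss_res[model | {best}] <= 0:
        break
    order.append(best)
    model |= {best}
    candidates.discard(best)

print(order)
```

This pure-reduction rule admits x4, then x1, then x2; the final reduction from adding x3 (47.90 − 47.86 = 0.04) is negligible and would fail any usual F-to-enter threshold once the sample size is known.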

 

22) a) Explain the methods of studying autocorrelation in a linear regression model.

b) Explain the Box-Jenkins methodology of ARIMA modelling.

 
