AMS 274 – Generalized Linear Models

Homework 4

1. The table below reports results from a developmental toxicity study involving ordinal categorical outcomes. This study administered diethylene glycol dimethyl ether (an industrial solvent used in the manufacture of protective coatings) to pregnant mice. Each mouse was exposed to one of five concentration levels for ten days early in the pregnancy (with concentration 0 corresponding to controls). Two days later, the uterine contents of the pregnant mice were examined for defects. One of three (ordered) outcomes (“Dead”, “Malformation”, “Normal”) was recorded for each fetus.

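| Concentration (mg/kg per day), $x_i$ | Dead, $y_{i1}$ | Malformation, $y_{i2}$ | Normal, $y_{i3}$ | Total number of subjects, $m_i$ |
|---|---|---|---|---|
| 0 | 15 | 1 | 281 | 297 |
| 62.5 | 17 | 0 | 225 | 242 |
| 125 | 22 | 7 | 283 | 312 |
| 250 | 38 | 59 | 202 | 299 |
| 500 | 144 | 132 | 9 | 285 |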

Build a multinomial regression model for these data using continuation-ratio logits for the response probabilities $\pi_j(x)$, $j = 1, 2, 3$, as a function of concentration level, $x$. Specifically, consider the following model

$$
L_1^{(cr)} = \log\frac{\pi_1}{\pi_2 + \pi_3} = \alpha_1 + \beta_1 x; \qquad
L_2^{(cr)} = \log\frac{\pi_2}{\pi_3} = \alpha_2 + \beta_2 x
$$

for the multinomial response probabilities $\pi_j \equiv \pi_j(x)$, $j = 1, 2, 3$.
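One way to read the two continuation-ratio logits is through their inverse map: the first logit determines $\pi_1$, and the second splits the remaining mass $1 - \pi_1$ between the other two categories. The following is a minimal Python sketch of that map; the names `expit` and `cr_probs` are illustrative and not part of the assignment.

```python
import math

def expit(t):
    """Numerically stable inverse logit, 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def cr_probs(x, alpha1, beta1, alpha2, beta2):
    """Return (pi1, pi2, pi3) implied by the two continuation-ratio logits."""
    pi1 = expit(alpha1 + beta1 * x)   # from L1 = log(pi1 / (pi2 + pi3))
    rho = expit(alpha2 + beta2 * x)   # pi2 / (pi2 + pi3), from L2 = log(pi2 / pi3)
    return pi1, (1.0 - pi1) * rho, (1.0 - pi1) * (1.0 - rho)
```

By construction the three returned values are positive and sum to one for any real parameter values, so no constraint on the coefficients is needed.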

(a) Show that the model, involving the multinomial likelihood for the data = $\{(y_{i1}, y_{i2}, y_{i3}, x_i) : i = 1, \ldots, 5\}$, can be fitted by fitting separately two Binomial GLMs. Provide details for your argument, including the specific form of the Binomial GLMs.
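As a possible starting point for part (a), the multinomial kernel can be rewritten in terms of $\pi_1$ and the conditional probability $\rho = \pi_2 / (\pi_2 + \pi_3)$. The symbol $\rho$ is introduced here only for illustration; the sketch below shows the key algebraic step and is not a complete argument.

```latex
% With \pi_2 = (1 - \pi_1)\rho and \pi_3 = (1 - \pi_1)(1 - \rho),
% where \pi_j = \pi_j(x_i) and \rho = \rho(x_i):
\begin{align*}
\prod_{i=1}^{5} \pi_1^{y_{i1}} \pi_2^{y_{i2}} \pi_3^{y_{i3}}
  &= \prod_{i=1}^{5} \pi_1^{y_{i1}} (1 - \pi_1)^{m_i - y_{i1}}
     \times
     \prod_{i=1}^{5} \rho^{y_{i2}} (1 - \rho)^{m_i - y_{i1} - y_{i2}}, \\
\operatorname{logit}(\pi_1) &= \alpha_1 + \beta_1 x_i, \qquad
\operatorname{logit}(\rho) = \log\frac{\pi_2}{\pi_3} = \alpha_2 + \beta_2 x_i .
\end{align*}
```

The first factor involves only $(\alpha_1, \beta_1)$ and the second only $(\alpha_2, \beta_2)$, which is what allows the two pieces to be maximized separately.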

(b) Use the result from part (a) to obtain the maximum likelihood estimates (MLEs) and corresponding standard errors for parameters $(\alpha_1, \alpha_2, \beta_1, \beta_2)$. Plot the estimated response curves $\hat{\pi}_j(x)$, for $j = 1, 2, 3$, and discuss the results.
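The following is one possible way to carry out part (b) with the Python standard library only, assuming the two-Binomial decomposition from part (a): dead out of all fetuses, then malformed out of the non-dead. It uses Newton-Raphson with step halving, which is a choice made for this sketch; any GLM routine with a Binomial family and logit link should give the same estimates. All function and variable names are illustrative.

```python
import math

# Data from the table: concentration x_i and counts (dead, malformation, normal).
x = [0.0, 62.5, 125.0, 250.0, 500.0]
dead = [15, 17, 22, 38, 144]
malf = [1, 0, 7, 59, 132]
normal = [281, 225, 283, 202, 9]
total = [d + f + n for d, f, n in zip(dead, malf, normal)]  # m_i
alive = [f + n for f, n in zip(malf, normal)]               # m_i - y_i1

def expit(t):
    """Numerically stable inverse logit."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def log1pexp(t):
    """Stable evaluation of log(1 + exp(t))."""
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))

def loglik(a, b, x, y, n):
    """Binomial log-likelihood (up to a constant) under logit(p) = a + b * x."""
    return sum(yi * (a + b * xi) - ni * log1pexp(a + b * xi)
               for xi, yi, ni in zip(x, y, n))

def fit_binomial_logit(x, y, n, tol=1e-8, max_iter=50):
    """Newton-Raphson for a two-parameter Binomial logit GLM.

    Returns (a_hat, b_hat, se_a, se_b); standard errors come from the
    inverse Fisher information evaluated at the final estimate.
    """
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        p = [expit(a + b * xi) for xi in x]
        w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]
        # Score vector.
        ua = sum(yi - ni * pi for yi, ni, pi in zip(y, n, p))
        ub = sum(xi * (yi - ni * pi) for xi, yi, ni, pi in zip(x, y, n, p))
        # Fisher information (2 x 2, symmetric).
        iaa = sum(w)
        iab = sum(wi * xi for wi, xi in zip(w, x))
        ibb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = iaa * ibb - iab * iab
        da = (ibb * ua - iab * ub) / det
        db = (iaa * ub - iab * ua) / det
        if abs(da) + abs(db) * max(x) < tol:
            break
        # Step halving: shrink the step if the log-likelihood would drop.
        step, old = 1.0, loglik(a, b, x, y, n)
        while step > 1e-8 and loglik(a + step * da, b + step * db, x, y, n) < old - 1e-9:
            step /= 2.0
        a, b = a + step * da, b + step * db
    return a, b, math.sqrt(ibb / det), math.sqrt(iaa / det)

a1, b1, se_a1, se_b1 = fit_binomial_logit(x, dead, total)
a2, b2, se_a2, se_b2 = fit_binomial_logit(x, malf, alive)

def response_probs(conc):
    """Estimated (pi1, pi2, pi3) at a given concentration."""
    p1 = expit(a1 + b1 * conc)
    rho = expit(a2 + b2 * conc)
    return p1, (1.0 - p1) * rho, (1.0 - p1) * (1.0 - rho)

print("alpha1, beta1:", a1, b1, "SE:", se_a1, se_b1)
print("alpha2, beta2:", a2, b2, "SE:", se_a2, se_b2)
for conc in range(0, 501, 100):
    print(conc, [round(p, 3) for p in response_probs(conc)])
```

Evaluating `response_probs` on a fine grid of concentrations gives the values needed to plot the three estimated curves with any plotting tool.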

(c) Develop and implement a Bayesian version of the model above. Discuss your prior choice, and provide details for the posterior simulation method. Provide point and interval estimates for the response curves $\pi_j(x)$, for $j = 1, 2, 3$.
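One possible route for part (c) is a random-walk Metropolis sampler. The sketch below treats only the first logit, $(\alpha_1, \beta_1)$, using the "dead out of total" Binomial piece; the second pair could be handled the same way. The independent normal priors, their scales, the proposal step sizes, the chain length and the seed are all assumptions made for illustration, not choices prescribed by the assignment.

```python
import math
import random

# Data from the table: concentration, dead counts, total counts.
x = [0.0, 62.5, 125.0, 250.0, 500.0]
dead = [15, 17, 22, 38, 144]
total = [297, 242, 312, 299, 285]

def log1pexp(t):
    """Stable evaluation of log(1 + exp(t))."""
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))

def log_post(a, b, x, y, n, sd_a=10.0, sd_b=1.0):
    """Log posterior up to a constant: Binomial-logit likelihood plus
    independent N(0, sd^2) priors (the prior is an assumption of this sketch)."""
    ll = sum(yi * (a + b * xi) - ni * log1pexp(a + b * xi)
             for xi, yi, ni in zip(x, y, n))
    return ll - 0.5 * (a / sd_a) ** 2 - 0.5 * (b / sd_b) ** 2

def metropolis(x, y, n, n_iter=6000, burn=2000, step_a=0.1, step_b=0.0003, seed=1):
    """Random-walk Metropolis; returns post-burn-in draws and the acceptance rate."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    cur = log_post(a, b, x, y, n)
    draws, accepted = [], 0
    for it in range(n_iter):
        a_new = a + rng.gauss(0.0, step_a)
        b_new = b + rng.gauss(0.0, step_b)
        new = log_post(a_new, b_new, x, y, n)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < new - cur:
            a, b, cur = a_new, b_new, new
            accepted += 1
        if it >= burn:
            draws.append((a, b))
    return draws, accepted / n_iter

def curve_summary(draws, conc):
    """Posterior mean and 95% equal-tail interval for pi1 at one concentration."""
    vals = sorted(1.0 / (1.0 + math.exp(-(a + b * conc))) for a, b in draws)
    k = len(vals)
    return sum(vals) / k, vals[int(0.025 * k)], vals[int(0.975 * k)]

draws, acc_rate = metropolis(x, dead, total)
print("acceptance rate:", acc_rate)
print("pi1(250): mean, 2.5%, 97.5% =", curve_summary(draws, 250.0))
```

In practice the step sizes would be tuned and convergence checked (for example with several chains from different starting points) before reporting any interval.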

2. Consider the “alligator food choice” data example, the full version of which is discussed in Section 7.1 of Agresti (2002), Categorical Data Analysis, Second Edition. Here, consider the subset of the data reported in Table 7.16 (page 304) of the above book. This data set involves observations on the primary food choice for n = 63 alligators caught in Lake George, Florida. The nominal response variable is the primary food type (in volume) found in each alligator’s stomach, with three categories: “fish”, “invertebrate”, and “other”. The invertebrates were mainly apple snails, aquatic insects, and crayfish. The “other” category included amphibian, mammal, bird, reptile, and plant material. Also available for each alligator is covariate information on its length (in meters) and gender.

(a) Focus first on length as the single covariate to explain the response probabilities for the “fish”, “invertebrate” and “other” food choice categories. Develop a Bayesian multinomial regression model, using the baseline-category logits formulation with “fish” as the baseline category, to estimate (with point and interval estimates) the response probabilities as a function of length. (Note that in this data example, we have $m_i = 1$, for $i = 1, \ldots, n$.) Discuss your prior choice and approach to MCMC posterior simulation.
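For the baseline-category logits formulation with “fish” as baseline, the response probabilities and the log-likelihood (with $m_i = 1$) can be written compactly. The sketch below is one way to code them in Python; the function names, the argument layout, and the integer coding of the categories (0 = fish, 1 = invertebrate, 2 = other) are assumptions for illustration. The data from Table 7.16 are not reproduced here and must be supplied by the user.

```python
import math

def baseline_category_probs(length, alpha, beta):
    """Probabilities (fish, invertebrate, other) with fish as the baseline.

    alpha and beta are pairs (intercepts, slopes) for the two logits
    log(pi_invertebrate / pi_fish) and log(pi_other / pi_fish).
    """
    etas = [0.0] + [a + b * length for a, b in zip(alpha, beta)]
    mx = max(etas)                                # subtract the max for stability
    expo = [math.exp(e - mx) for e in etas]
    tot = sum(expo)
    return [e / tot for e in expo]

def log_likelihood(lengths, choices, alpha, beta):
    """Multinomial log-likelihood with one observation per alligator.

    choices[i] is the index of the observed category for alligator i
    (0 = fish, 1 = invertebrate, 2 = other; this coding is an assumption).
    """
    return sum(math.log(baseline_category_probs(ln, alpha, beta)[c])
               for ln, c in zip(lengths, choices))
```

Adding a log-prior to `log_likelihood` gives an unnormalized log-posterior that an MCMC sampler can target; evaluating `baseline_category_probs` at each posterior draw over a grid of lengths yields point and interval estimates for the curves.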

(b) Extend the model from part (a) to describe the effects of both length and gender on food choice. Based on your proposed model, provide point and interval estimates for the length-dependent response probabilities for male and female alligators.

3. Consider the inverse Gaussian distribution with density function

$$
f(y \mid \mu, \phi) = (2\pi\phi y^3)^{-1/2} \exp\left\{ -\frac{(y - \mu)^2}{2\phi\mu^2 y} \right\}, \qquad y > 0;\ \mu > 0,\ \phi > 0.
$$

Denote the inverse Gaussian distribution with parameters µ and φ by IG(µ, φ).

(a) Show that the inverse Gaussian distribution is a member of the exponential dispersion family. Show that µ is the mean of the distribution and obtain the variance function.
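A numerical check can help catch algebra slips in part (a). The sketch below integrates the density as written in the problem with a midpoint rule and returns the total mass, mean and variance, which can then be compared with the analytical results. The function names, the grid size and the integration cutoff are choices made for this illustration.

```python
import math

def ig_density(y, mu, phi):
    """Inverse Gaussian density f(y | mu, phi) as written in the problem."""
    return (2.0 * math.pi * phi * y**3) ** -0.5 * math.exp(
        -(y - mu) ** 2 / (2.0 * phi * mu**2 * y))

def ig_moments(mu, phi, n=200_000):
    """Midpoint-rule approximations to the total mass, mean and variance."""
    # Heuristic upper cutoff for the integral (an assumption of this sketch).
    upper = mu + 100.0 * phi * mu**2 + 10.0 * math.sqrt(phi * mu**3)
    h = upper / n
    mass = m1 = m2 = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        f = ig_density(y, mu, phi) * h
        mass += f
        m1 += y * f
        m2 += y * y * f
    return mass, m1, m2 - m1 * m1

# Illustrative parameter values; compare the output with your derivation.
print(ig_moments(2.0, 0.5))
```

Repeating the call for a few values of µ with φ held fixed shows how the variance scales with the mean, which is a useful check on the variance function.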

(b) Consider a GLM with random component defined by the inverse Gaussian distribution. That is, assume that $y_i$ are realizations of independent random variables $Y_i$ with $\mathrm{IG}(\mu_i, \phi)$ distributions, for $i = 1, \ldots, n$. Here, $g(\mu_i) = x_i^T \beta$, where $\beta = (\beta_1, \ldots, \beta_p)$ ($p < n$) is the vector of regression coefficients, and $x_i = (x_{i1}, \ldots, x_{ip})^T$ is the covariate vector for the $i$th response, $i = 1, \ldots, n$. Define the full model so that the $y_i$ are realizations of independent $\mathrm{IG}(\mu_i, \phi)$ distributed random variables $Y_i$, with a distinct $\mu_i$ for each $y_i$. Obtain the scaled deviance for the comparison of the full model with the inverse Gaussian GLM.
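A closed-form answer to part (b) can be checked numerically against the definition of the scaled deviance as twice the log-likelihood difference between the full model and the fitted GLM. The sketch below evaluates that definition directly, taking the full-model fit to be $\hat{\mu}_i = y_i$. The function names are illustrative, and the fitted means `mu_hat` are assumed to come from a GLM fit carried out separately.

```python
import math

def ig_logpdf(y, mu, phi):
    """Log of the inverse Gaussian density f(y | mu, phi) from the problem."""
    return (-0.5 * math.log(2.0 * math.pi * phi * y**3)
            - (y - mu) ** 2 / (2.0 * phi * mu**2 * y))

def scaled_deviance(y, mu_hat, phi):
    """2 * [loglik(full model) - loglik(fitted GLM)].

    The full model uses a distinct mean for each observation, fitted as
    mu_i = y_i; mu_hat holds the fitted means under the GLM.
    """
    full = sum(ig_logpdf(yi, yi, phi) for yi in y)
    fitted = sum(ig_logpdf(yi, mi, phi) for yi, mi in zip(y, mu_hat))
    return 2.0 * (full - fitted)
```

Comparing the value returned by `scaled_deviance` with the derived expression on a few made-up inputs is a quick way to confirm the algebra.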
