CSCI 3202 Problem Set 3.

Problem 3.1 (30 points)

Consider the Bayesian network whose graph is shown on the final page.

The probability entries are as follows:

N means “Nolan Arenado has a good day”

P(N) = 0.7

C means “Charlie Blackmon has a good day”

P(C) = 0.4

L means “The Rockies lose”

P(L | (not N) and (not C)) = 0.8

P(L | (not N) and C) = 0.6

P(L | N and (not C)) = 0.5

P(L | N and C) = 0.2

B means “Bud Black is grumpy”

P(B | L) = 0.9

P(B | not L) = 0.2

M means “Mike is grumpy”

P(M | L) = 0.6

P(M | not L) = 0.3

S means “Stella is grumpy”

P(S | M) = 0.8

P(S | not M) = 0.1
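As a sanity check for the questions below, these tables can be encoded directly and queried by brute-force enumeration over all 2^6 assignments. This is an illustrative sketch, not part of the assignment; the data layout and function names are my own.

```python
from itertools import product

# Prior and conditional probability tables from the problem statement.
P_N = 0.7
P_C = 0.4
P_L = {(False, False): 0.8, (False, True): 0.6,
       (True, False): 0.5, (True, True): 0.2}   # keyed by (N, C)
P_B = {True: 0.9, False: 0.2}                   # keyed by L
P_M = {True: 0.6, False: 0.3}                   # keyed by L
P_S = {True: 0.8, False: 0.1}                   # keyed by M

def joint(n, c, l, b, m, s):
    """Joint probability of one full assignment, via the chain rule
    for this network: P(N)P(C)P(L|N,C)P(B|L)P(M|L)P(S|M)."""
    p = (P_N if n else 1 - P_N) * (P_C if c else 1 - P_C)
    p *= P_L[(n, c)] if l else 1 - P_L[(n, c)]
    p *= P_B[l] if b else 1 - P_B[l]
    p *= P_M[l] if m else 1 - P_M[l]
    p *= P_S[m] if s else 1 - P_S[m]
    return p

def query(target, evidence=None):
    """P(target is True | evidence), by summing the joint distribution
    over every assignment consistent with the evidence."""
    evidence = evidence or {}
    names = ["n", "c", "l", "b", "m", "s"]
    num = den = 0.0
    for values in product([False, True], repeat=6):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*values)
        den += p
        if world[target]:
            num += p
    return num / den
```

For example, query("l") gives the prior probability that the Rockies lose (question 3.1a), and query("m", {"n": False, "b": True}) matches the setup of question 3.1e.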

3.1a (2 points) What are the a priori probabilities for L, B, M, and S? (That is, in

the absence of any other information, what is the probability that the Rockies will

lose; the probability that Bud will be grumpy; the probability that Mike will be

grumpy; and the probability that Stella will be grumpy?)

3.1b (2 points) You are told that Charlie Blackmon had a good day today. What’s

the probability that both Bud and Mike are grumpy?

3.1c (6 points) You run into Stella in the evening and she growls at you. What is

the probability that Charlie had a good day?

3.1d (10 points) You run into Bud in the evening and he is grumpy. What is the

probability that both Nolan and Charlie had a good day today?

3.1e (10 points) You are told that Nolan did NOT have a good day today, and

that Bud is grumpy. What’s the probability that Mike is also grumpy?

Problem 3.2 (20 points)

3.2a (10 points) Suppose you have a set of 200 videos, with 100 “Yes” (Like) and

100 “No” (Dislike). You now have two choices of attributes to ask about:

A is a binary attribute. If you ask about attribute A you get two resulting sets, one

with 80 Yes and 40 No, and the other with 20 Yes and 60 No.

B is a binary attribute. If you ask about attribute B you get two resulting sets, one

with 100 Yes and 75 No, and the other with 0 Yes and 25 No.

Which of these two attributes is the more informative one to ask about?
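“More informative” here means the larger expected reduction in entropy (information gain). A minimal sketch of that computation for the split counts given above, assuming binary Yes/No labels (the function names are my own):

```python
from math import log2

def entropy(yes, no):
    """Entropy (in bits) of a set with the given Yes/No counts."""
    total = yes + no
    h = 0.0
    for count in (yes, no):
        if count:
            p = count / total
            h -= p * log2(p)
    return h

def gain(splits):
    """Information gain of a split, given [(yes, no), ...] child counts."""
    yes = sum(y for y, n in splits)
    no = sum(n for y, n in splits)
    total = yes + no
    remainder = sum((y + n) / total * entropy(y, n) for y, n in splits)
    return entropy(yes, no) - remainder

gain_A = gain([(80, 40), (20, 60)])
gain_B = gain([(100, 75), (0, 25)])
```

Whichever attribute yields the larger gain is the more informative one to ask about.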

3.2b (10 points) Suppose you have a set of 200 videos, with 100 “Yes” (Like) and

100 “No” (Dislike). You now have two choices of attributes to ask about:

A is a binary attribute. If you ask about attribute A you get two resulting sets, one

with 80 Yes and 20 No; and the other with 20 Yes and 80 No.

B is a three-valued attribute. If you ask about attribute B, you get three resulting

sets, one with X Yes and 50 No; the second with X Yes and 50 No; and the third

with (100 – 2X) Yes and 0 No.

Note that if X is 0, we definitely prefer Attribute B; if X is 50, we definitely prefer

Attribute A. What is the largest value for X such that Attribute B is more

informative than Attribute A?
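One way to attack this is numerically: compute each attribute's information gain as a function of X and sweep the integer values from 0 to 50 (X cannot exceed 50, since the third set has 100 − 2X Yes examples). A sketch assuming the standard entropy and information-gain definitions; helper names are my own.

```python
from math import log2

def entropy(yes, no):
    """Entropy (in bits) of a set with the given Yes/No counts."""
    total = yes + no
    h = 0.0
    for count in (yes, no):
        if count:
            p = count / total
            h -= p * log2(p)
    return h

def gain(splits):
    """Information gain of a split, given [(yes, no), ...] child counts."""
    yes = sum(y for y, n in splits)
    no = sum(n for y, n in splits)
    total = yes + no
    remainder = sum((y + n) / total * entropy(y, n) for y, n in splits)
    return entropy(yes, no) - remainder

# Attribute A's gain is fixed; Attribute B's gain depends on X.
gain_A = gain([(80, 20), (20, 80)])

def gain_B(x):
    return gain([(x, 50), (x, 50), (100 - 2 * x, 0)])

# Largest integer X for which B is strictly more informative than A.
largest = max(x for x in range(0, 51) if gain_B(x) > gain_A)
```

Since gain_B decreases monotonically as X grows, the sweep pins down the crossover exactly.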

Problem 3.3 (50 points)

In this problem, you will write a program to create a decision tree for some

particular situation of interest to you. Here’s the basic outline:

Imagine a type of complex decision that you might have to make. (Would I be

interested or uninterested in buying this car? Would I be favorable or unfavorable

toward living in this city? Etc.) Your decision should involve at least 6 attributes


(for example, for the car these might include: does it have four doors, is the price

greater than $30K, does it have all-wheel drive, etc.).

a. First, create what you believe to be your own decision rule in tree form. In

other words, this is your cognitive model of what your own decision tree

looks like for this particular situation.

b. Now, create 25 examples consistent with this decision tree. Twenty of

these will be the “training examples” (you can choose these at random),

and the remaining five will be the “test examples”. Write a program that

creates decision trees from examples, using the entropy test for choosing

attributes that we described in class; and use that program to produce a

tree from the 20 training examples.

c. Compare the tree produced by your program to the tree in part (a) above.

How similar are they? Can you recognize your own decision process in

the computer-produced tree?

d. Finally, test the computer-produced tree against your cognitive tree by

comparing their decisions on the five test examples. (You can do this by

hand for this small test set.) How well does your computer-produced tree

do on the new examples?
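For part (b), the learner is the standard ID3-style recursion: choose the attribute whose split minimizes expected entropy, split the examples on it, and recurse. A minimal sketch for examples stored as dictionaries with a "label" key; the data layout and names are assumptions, not a required interface.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total)
                for c in Counter(labels).values())

def best_attribute(examples, attributes):
    """Attribute whose split minimizes expected entropy of the children."""
    def remainder(attr):
        r = 0.0
        for value in set(ex[attr] for ex in examples):
            subset = [ex["label"] for ex in examples if ex[attr] == value]
            r += len(subset) / len(examples) * entropy(subset)
        return r
    return min(attributes, key=remainder)

def id3(examples, attributes):
    """Return a tree: either a label, or (attribute, {value: subtree})."""
    labels = [ex["label"] for ex in examples]
    if len(set(labels)) == 1:
        return labels[0]                              # pure node
    if not attributes:
        return Counter(labels).most_common(1)[0][0]   # majority vote
    attr = best_attribute(examples, attributes)
    rest = [a for a in attributes if a != attr]
    branches = {}
    for value in set(ex[attr] for ex in examples):
        subset = [ex for ex in examples if ex[attr] == value]
        branches[value] = id3(subset, rest)
    return (attr, branches)
```

Each training example would be a dict such as {"four_doors": True, "under_30k": False, ..., "label": "Yes"}, with the attribute names drawn from your own scenario.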

[Figure: Bayesian network graph. N and C are parents of L; L is the parent of B and M; M is the parent of S.]
