Investigating Absence of Conditional Independence Guarantees


1 Losing Independence

We will investigate the absence of conditional independence guarantees between two random variables when an arbitrary descendant of a common effect is observed. We will consider the simple case of a causal chain of descendants:

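[Figure: Bayes net in which A and B are parents of the common effect D0, followed by the chain of descendants D1, ..., Dk]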

Suppose that all random variables are binary. The marginal distributions of the parents A and B are both uniform (0.5, 0.5), and the distributions of the common effect D0 and its descendants are

Table 1

(a) Give a minimal, analytical, and un-normalized expression for the joint probability of the parents A and B conditioned on an arbitrary descendant dk: Pr(A, B | dk). You should only include probability distributions from the Bayes net and sums and products (no division). Be sure to clearly label indices of cumulative sums or products.

(b) Suppose we observe Dk = +dk. What is the joint distribution Pr(D1, D2, ..., Dk−1, +dk | D0)? The full distribution will have 2^k rows, but you can simply give a verbal description of it.

(c) What is Pr(+dk | D0)? Plug this into a simplified form of the expression that you wrote down in (a) and compute Pr(A, B | +dk) (please do normalize this). Show that A and B are not independent conditioned on dk.
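The computation in parts (a) through (c) can be checked numerically. The sketch below is a minimal illustration, not the intended solution: Table 1 is not reproduced in this text, so the conditional probabilities in P_D0_TRUE and P_STEP_TRUE are made-up assumptions, and the helper names (bern, prob_dk_given_d0, posterior_ab, independence_gap) are hypothetical. It sums out D1, ..., Dk−1 by pushing a distribution down the chain, forms the un-normalized product Pr(A) Pr(B) Σ_d0 Pr(d0 | A, B) Pr(dk | d0), normalizes, and compares Pr(+a, +b | +dk) with Pr(+a | +dk) Pr(+b | +dk).

```python
from itertools import product

VALS = (True, False)

# ASSUMPTION: Table 1 is not available here, so every number below is made up.
P_A = {True: 0.5, False: 0.5}          # uniform prior on A (from the problem)
P_B = {True: 0.5, False: 0.5}          # uniform prior on B (from the problem)
P_D0_TRUE = {                          # hypothetical Pr(+d0 | a, b)
    (True, True): 0.9, (True, False): 0.6,
    (False, True): 0.6, (False, False): 0.1,
}
P_STEP_TRUE = {True: 0.8, False: 0.2}  # hypothetical Pr(+d_i | d_{i-1})


def bern(p_true, value):
    """Probability of `value` for a binary variable with Pr(True) = p_true."""
    return p_true if value else 1.0 - p_true


def prob_dk_given_d0(k, d0, dk=True):
    """Pr(dk | d0): sum out D1..D_{k-1} one link at a time."""
    dist = {v: 1.0 if v == d0 else 0.0 for v in VALS}
    for _ in range(k):
        dist = {
            cur: sum(dist[prev] * bern(P_STEP_TRUE[prev], cur) for prev in VALS)
            for cur in VALS
        }
    return dist[dk]


def posterior_ab(k, dk=True):
    """Normalized Pr(A, B | dk) for a chain with k links below D0."""
    unnorm = {}
    for a, b in product(VALS, repeat=2):
        inner = sum(
            bern(P_D0_TRUE[(a, b)], d0) * prob_dk_given_d0(k, d0, dk)
            for d0 in VALS
        )
        unnorm[(a, b)] = P_A[a] * P_B[b] * inner
    total = sum(unnorm.values())
    return {ab: p / total for ab, p in unnorm.items()}


def independence_gap(post):
    """Pr(+a, +b | dk) - Pr(+a | dk) Pr(+b | dk); zero iff A and B are independent."""
    p_a = sum(p for (a, _), p in post.items() if a)
    p_b = sum(p for (_, b), p in post.items() if b)
    return post[(True, True)] - p_a * p_b


if __name__ == "__main__":
    post = posterior_ab(k=3)
    print(post)
    print("gap:", independence_gap(post))
```

With these assumed tables the gap is nonzero for every finite k, though it shrinks as k grows because the evidence about D0 weakens with each link in the chain.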

2 Starting a Car

For this problem you will be investigating the “Car Starting” sample Bayes net provided by the Beliefs and Decision Networks tool on AIspace (download the applet, run it, and go to File → Load Sample Problem → Car Starting Problem). The network has 19 nodes, many of them non-binary:

[Figure: the Car Starting network in the Beliefs and Decision Networks tool]

While you should be able to complete all parts of this problem on your own, you are highly encouraged to use the tool to check your answers and understanding as well.

(a) Suppose we observe no variables. Which variables, if any, are guaranteed to be independent of “Car Starts”? Which variables, if any, are not guaranteed to be independent of “Spark Timing”?

(b) Suppose we observe “Battery Voltage”. Which variables lose their guarantee of independence? Now suppose we observe “Main Fuse OK” instead. Which variables gain a new guarantee of independence relative to the no-observation case?

(c) Write an analytical expression for Pr(D | ST = bad) = Pr(Distributor OK | Spark Timing = bad). You should only use the probability tables of the Bayes net. Then compute this numerically by plugging in the probabilities given in the applet. Please show your work.

(d) Repeat for Pr(A, CS | BV = dead) = Pr(Alternator OK, Charging System OK | Battery Voltage = dead). While you are not required to do so, try to see if you can make your computations more efficient by distributing the sums and avoiding building a full joint distribution prior to marginalization.
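The efficiency hint in (d) can be illustrated on a toy network. The sketch below deliberately does not use the Car Starting network: the structure X → Y → Z ← W, the variable names, and all probabilities are assumptions chosen for illustration. It computes Pr(X, Y | z) twice, once by building the full joint and marginalizing W, and once by distributing the sum over W inward so that it is evaluated only once per value of Y.

```python
from itertools import product

VALS = (True, False)

# ASSUMPTION: toy network X -> Y -> Z <- W with made-up CPTs
# (these are NOT the tables from the applet).
P_X = {True: 0.7, False: 0.3}
P_W = {True: 0.4, False: 0.6}
P_Y_TRUE = {True: 0.9, False: 0.2}     # Pr(+y | x)
P_Z_TRUE = {                           # Pr(+z | y, w)
    (True, True): 0.95, (True, False): 0.6,
    (False, True): 0.5, (False, False): 0.05,
}


def bern(p_true, value):
    """Probability of `value` for a binary variable with Pr(True) = p_true."""
    return p_true if value else 1.0 - p_true


def normalize(table):
    total = sum(table.values())
    return {key: val / total for key, val in table.items()}


def posterior_full_joint(z):
    """Build every row of Pr(x, y, w, z), then marginalize out w."""
    out = {}
    for x, y, w in product(VALS, repeat=3):
        p = P_X[x] * bern(P_Y_TRUE[x], y) * P_W[w] * bern(P_Z_TRUE[(y, w)], z)
        out[(x, y)] = out.get((x, y), 0.0) + p
    return normalize(out)


def posterior_distributed(z):
    """Push the sum over w inside: it depends on y only, so cache it per y."""
    z_given_y = {
        y: sum(P_W[w] * bern(P_Z_TRUE[(y, w)], z) for w in VALS) for y in VALS
    }
    return normalize({
        (x, y): P_X[x] * bern(P_Y_TRUE[x], y) * z_given_y[y]
        for x, y in product(VALS, repeat=2)
    })


if __name__ == "__main__":
    print(posterior_full_joint(z=False))
    print(posterior_distributed(z=False))
```

Both functions return the same table. The second avoids repeating the inner sum for each value of X, and the saving grows with the number of hidden variables and the size of their domains.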

3 Holiday Shopping

In anticipation of Black Friday, your company is planning how to market a classic product for the holidays this year. You want to predict whether the product will oversell (Y = +y) or undersell (Y = −y) the company’s projections. Three features will be considered: price level (X1), amount of promotion (X2), and distribution (X3). Each feature is discrete and can take one of three values: low (−1), medium (0), or high (+1). From historical data, we have the following observations:

(a) Estimate the parameters for the naïve Bayes model. Use additive smoothing with k = 1 for the three features. You should have four probability tables (P(Y), P(X1 | Y), P(X2 | Y), P(X3 | Y)) and no zero probabilities in any of them.
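Additive smoothing for part (a) can be sketched as follows. The historical observations are not reproduced in this text, so the rows in EXAMPLE_ROWS are made-up placeholders, and the function name estimate_naive_bayes is hypothetical. Each smoothed entry is (count(x, y) + k) / (count(y) + 3k), since every feature has three possible values; the class prior is left unsmoothed because the problem asks for smoothing on the three features only.

```python
from collections import Counter

FEATURE_VALUES = (-1, 0, 1)  # low, medium, high

# ASSUMPTION: placeholder rows (x1, x2, x3, y); not the data from the problem.
EXAMPLE_ROWS = [
    (-1, 0, 1, "+y"),
    (0, 1, 1, "+y"),
    (0, 1, 0, "+y"),
    (1, -1, 0, "-y"),
    (1, 0, -1, "-y"),
]


def estimate_naive_bayes(rows, k=1):
    """Return (prior, cond), where cond[j][(value, y)] estimates P(X_{j+1} = value | y)."""
    class_counts = Counter(row[-1] for row in rows)
    prior = {y: count / len(rows) for y, count in class_counts.items()}

    cond = []
    for j in range(3):
        pair_counts = Counter((row[j], row[-1]) for row in rows)
        table = {
            (value, y): (pair_counts[(value, y)] + k)
            / (class_counts[y] + k * len(FEATURE_VALUES))
            for y in class_counts
            for value in FEATURE_VALUES
        }
        cond.append(table)
    return prior, cond


if __name__ == "__main__":
    prior, cond = estimate_naive_bayes(EXAMPLE_ROWS, k=1)
    print(prior)
    for j, table in enumerate(cond, start=1):
        print(f"P(X{j} | Y):", table)
```

Because k is added to every count, a feature value never seen with a class still receives probability k / (count(y) + 3k) rather than zero, and each conditional table still sums to one for each class.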

(b) What is the prediction for the sales of this product if it has a low price and low promotion but high distribution (X1 = −1, X2 = −1, X3 = +1)? Please compute and show the log-likelihoods for each class.
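The log-likelihood comparison in part (b) can be sketched as below. The tables PRIOR and COND are illustrative assumptions, not estimates from the problem's data, and the function names log_scores and predict are hypothetical. For each class the score is log P(y) + Σ_j log P(x_j | y), and the prediction is the class with the larger score.

```python
import math

# ASSUMPTION: illustrative probability tables, not derived from the problem's data.
PRIOR = {"+y": 0.6, "-y": 0.4}
COND = [
    # P(X1 = value | y), price level
    {(-1, "+y"): 0.5, (0, "+y"): 0.3, (1, "+y"): 0.2,
     (-1, "-y"): 0.2, (0, "-y"): 0.3, (1, "-y"): 0.5},
    # P(X2 = value | y), amount of promotion
    {(-1, "+y"): 0.2, (0, "+y"): 0.3, (1, "+y"): 0.5,
     (-1, "-y"): 0.5, (0, "-y"): 0.3, (1, "-y"): 0.2},
    # P(X3 = value | y), distribution
    {(-1, "+y"): 0.2, (0, "+y"): 0.2, (1, "+y"): 0.6,
     (-1, "-y"): 0.4, (0, "-y"): 0.4, (1, "-y"): 0.2},
]


def log_scores(x, prior, cond):
    """Joint log-likelihood log P(y) + sum_j log P(x_j | y) for each class y."""
    return {
        y: math.log(prior[y])
        + sum(math.log(cond[j][(x[j], y)]) for j in range(len(x)))
        for y in prior
    }


def predict(x, prior, cond):
    """Class with the highest joint log-likelihood."""
    scores = log_scores(x, prior, cond)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    query = (-1, -1, 1)  # low price, low promotion, high distribution
    print(log_scores(query, PRIOR, COND))
    print(predict(query, PRIOR, COND))
```

Working in log space turns the product of probabilities into a sum, which avoids numerical underflow when there are many features and does not change which class scores highest.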
