1 Losing Independence
We will investigate the absence of conditional independence guarantees between two random variables when an arbitrary descendant of a common effect is observed. We will consider the simple case of a causal chain of descendants:
Suppose that all random variables are binary. The marginal distributions of the parents A and B are both uniform (0.5, 0.5), and the distributions of the common effect D0 and its descendants are
(a) Give a minimal, analytical, and un-normalized expression for the joint probability of the parents A and B conditioned on an arbitrary descendant dk: Pr(A, B | dk). You should only include probability distributions from the Bayes net and sums and products (no division). Be sure to clearly label indices of cumulative sums or products.
(b) Suppose we observe Dk = +dk. What is the joint distribution Pr(D1, D2, . . . , Dk−1, +dk | D0)? The full distribution will have 2^k rows, but you can simply give a verbal description of it.
(c) What is Pr(+dk | D0)? Plug this into a simplified form of the expression that you wrote down in (a) and compute Pr(A, B | +dk) (please do normalize this). Show that A and B are not independent conditioned on dk.
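One way to sanity-check a hand computation for this kind of chain is a small enumeration script. The sketch below is only illustrative: the conditional probabilities in `p_d0` and `p_step` are placeholder values chosen for the example, not the tables given in the problem, and all function names are hypothetical. It sums out the intermediate descendants one link at a time, normalizes over (A, B), and compares the joint against the product of its marginals.

```python
from itertools import product

BOOL = (True, False)

# Uniform priors on the parents, as stated in the problem.
P_A = {True: 0.5, False: 0.5}
P_B = {True: 0.5, False: 0.5}


def p_d0(d0, a, b):
    """Pr(D0 = d0 | A = a, B = b). Placeholder values, NOT the problem's table."""
    p_true = 0.9 if (a or b) else 0.1
    return p_true if d0 else 1.0 - p_true


def p_step(d_next, d_prev):
    """Pr(D_i = d_next | D_{i-1} = d_prev). Placeholder values, NOT the problem's table."""
    p_true = 0.8 if d_prev else 0.2
    return p_true if d_next else 1.0 - p_true


def p_dk_true_given_d0(k, d0):
    """Pr(+d_k | D0 = d0), summing out D1..D_{k-1} one link at a time."""
    # msg[d] holds Pr(D_i = d | D0 = d0); it starts as an indicator on d0.
    msg = {d: 1.0 if d == d0 else 0.0 for d in BOOL}
    for _ in range(k):
        msg = {d: sum(p_step(d, prev) * msg[prev] for prev in BOOL) for d in BOOL}
    return msg[True]


def posterior_ab(k):
    """Normalized Pr(A, B | +d_k) as a dict keyed by (a, b)."""
    unnorm = {
        (a, b): P_A[a] * P_B[b]
        * sum(p_d0(d0, a, b) * p_dk_true_given_d0(k, d0) for d0 in BOOL)
        for a, b in product(BOOL, repeat=2)
    }
    z = sum(unnorm.values())
    return {ab: v / z for ab, v in unnorm.items()}


def is_independent(joint, tol=1e-9):
    """True if the joint over (A, B) factors into the product of its marginals."""
    p_a = {a: sum(joint[(a, b)] for b in BOOL) for a in BOOL}
    p_b = {b: sum(joint[(a, b)] for a in BOOL) for b in BOOL}
    return all(
        abs(joint[(a, b)] - p_a[a] * p_b[b]) < tol
        for a, b in product(BOOL, repeat=2)
    )


if __name__ == "__main__":
    joint = posterior_ab(k=3)
    for ab, p in joint.items():
        print(ab, round(p, 4))
    print("independent:", is_independent(joint))
```

Swapping in the actual tables from the problem only requires editing the two placeholder functions; the rest of the sketch does not depend on the specific numbers.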
2 Starting a Car
For this problem you will be investigating the “Car Starting” sample Bayes net provided by the Beliefs and Decision Networks tool on AIspace (download the applet, run it, and go to File → Load Sample Problem → Car Starting Problem). The network has 19 nodes, many of them non-binary:
While you should be able to complete all parts of this problem on your own, you are highly encouraged to use the tool to check your answers and understanding as well.
(a) Suppose we observe no variables. What variables, if any, are guaranteed to be independent of “Car Starts”? What variables, if any, are not guaranteed to be independent of “Spark Timing”?
(b) Suppose we observe “Battery Voltage”. Which variables lose their guarantee of independence? Now suppose we observe “Main Fuse OK” instead. Which variables gain a new guarantee of independence relative to the no-observation case?
(c) Write an analytical expression for Pr(D | ST = bad) = Pr(Distributor OK | Spark Timing = bad). You should only use the probability tables of the Bayes net. Then compute this numerically by plugging in the probabilities given in the applet. Please show your work.
(d) Repeat for Pr(A, CS | BV = dead) = Pr(Alternator OK, Charging System OK | Battery Voltage = dead). While you are not required to do so, try to see if you can make your computations more efficient by distributing the sums and avoiding building a full joint distribution prior to marginalization.
3 Holiday Shopping
In anticipation of Black Friday, your company is planning how to market a classic product for the holidays this year. You want to predict whether the product will oversell (Y = +y) or undersell (Y = −y) the company’s projections. Three features will be considered: price level (X1), amount of promotion (X2), and distribution (X3). Each feature is discrete and can take one of three values: low (−1), medium (0), or high (+1). From historical data, we have the following observations:
(a) Estimate the parameters for the naïve Bayes model. Use additive smoothing with k = 1 for the three features. You should have four probability tables (P(Y ), P(X1 | Y ), P(X2 | Y ), P(X3 | Y )) and no zero probabilities in any of them.
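For checking hand-computed tables, additive smoothing can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the rows in `TOY_ROWS` are made-up data, not the historical observations from the problem, and the function name `estimate_naive_bayes` is hypothetical. Following the wording of the problem, only the three feature tables are smoothed; the class prior is left as a plain relative frequency.

```python
from collections import Counter

VALUES = (-1, 0, 1)  # low, medium, high

# Made-up rows of (x1, x2, x3, y); NOT the data from the problem.
TOY_ROWS = [
    (-1, 1, 1, "+y"),
    (0, 1, 0, "+y"),
    (-1, 0, 1, "+y"),
    (1, -1, 0, "-y"),
    (1, 0, -1, "-y"),
]


def estimate_naive_bayes(rows, k=1):
    """Return (prior, cond), where cond[j][(v, y)] estimates P(X_{j+1} = v | Y = y)."""
    label_counts = Counter(row[-1] for row in rows)
    n = len(rows)
    # Class prior: unsmoothed relative frequencies.
    prior = {y: count / n for y, count in label_counts.items()}

    n_features = len(rows[0]) - 1
    cond = []
    for j in range(n_features):
        counts = Counter((row[j], row[-1]) for row in rows)
        table = {}
        for y, n_y in label_counts.items():
            for v in VALUES:
                # Add k to every count; the denominator grows by k per possible value.
                table[(v, y)] = (counts[(v, y)] + k) / (n_y + k * len(VALUES))
        cond.append(table)
    return prior, cond


if __name__ == "__main__":
    prior, cond = estimate_naive_bayes(TOY_ROWS, k=1)
    print("P(Y):", prior)
    for j, table in enumerate(cond, start=1):
        print(f"P(X{j} | Y):", table)
```

With k = 1 and three possible values per feature, every entry is at least 1 / (n_y + 3), which is why no table can contain a zero.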
(b) What is the prediction for the sales of this product if it is priced low and promoted lightly but distributed heavily (X1 = −1, X2 = −1, X3 = +1)? Please compute and show the log-likelihoods for each class.
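The per-class score for a naïve Bayes prediction is the log prior plus the sum of the log conditional probabilities of each feature value. A minimal sketch follows; the tables `PRIOR` and `COND` hold invented numbers for illustration only (they are not the estimates the problem asks for), and the function names are hypothetical.

```python
import math

# Invented probability tables, NOT estimated from the problem's data.
PRIOR = {"+y": 0.5, "-y": 0.5}
COND = [
    # P(X1 = v | Y = y), keyed by (v, y)
    {(-1, "+y"): 0.5, (0, "+y"): 0.3, (1, "+y"): 0.2,
     (-1, "-y"): 0.2, (0, "-y"): 0.3, (1, "-y"): 0.5},
    # P(X2 = v | Y = y)
    {(-1, "+y"): 0.2, (0, "+y"): 0.3, (1, "+y"): 0.5,
     (-1, "-y"): 0.5, (0, "-y"): 0.3, (1, "-y"): 0.2},
    # P(X3 = v | Y = y)
    {(-1, "+y"): 0.2, (0, "+y"): 0.3, (1, "+y"): 0.5,
     (-1, "-y"): 0.4, (0, "-y"): 0.4, (1, "-y"): 0.2},
]


def log_joint(x, y, prior, cond):
    """log P(y) + sum_j log P(x_j | y) for a feature tuple x."""
    return math.log(prior[y]) + sum(
        math.log(cond[j][(x_j, y)]) for j, x_j in enumerate(x)
    )


def predict(x, prior, cond):
    """Return (best_label, scores), where scores maps each class to its log joint."""
    scores = {y: log_joint(x, y, prior, cond) for y in prior}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    label, scores = predict((-1, -1, 1), PRIOR, COND)
    for y, s in scores.items():
        print(y, round(s, 4))
    print("prediction:", label)
```

Working in log space turns the product of probabilities into a sum, which avoids underflow when there are many features and makes each feature's contribution to the score easy to read off.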
(c) What combination of the features maximizes the probability of overselling (Y = +y)?