Definition:
- Calculating some useful quantity from a joint probability distribution
- We have a joint probability distribution and wish to find P(Q∣e1,e2,...)
- Evidence variables are the observed variables that condition the query
- Query variables are the variables whose values we wish to learn
- Hidden variables are variables that are neither query nor evidence
- O(d^n) time complexity to find a probability by enumerating the full joint (n variables, domain size d)
- Ex:
- Posterior probability: P(Q∣E1=e1,...,Ek=ek), the query given the evidence
- Most likely explanation: argmax_q P(Q=q∣E1=e1,...)
Basic Inference:
- ex:
- P(B∣+j,+m)∝P(B,+j,+m)=∑_{e,a}P(B,e,a,+j,+m)=∑_{e,a}P(B)P(e)P(a∣B,e)P(+j∣a)P(+m∣a)
- P(B∣+j,+m)=P(B,+j,+m)/P(+j,+m); the two are proportional, and the denominator is easy to recover by normalizing, so we ignore it for now
- the sum over hidden variables grows exponentially with their number
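As a sketch of the enumeration above, the burglary network B,E → A → J,M can be queried in a few lines of Python. The CPT numbers below are illustrative assumptions, not values from these notes:

```python
# Enumeration for P(B | +j, +m); CPT numbers are illustrative assumptions.
P_B = {True: 0.001, False: 0.999}          # P(+b)
P_E = {True: 0.002, False: 0.998}          # P(+e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(+a | B, E)
P_J = {True: 0.90, False: 0.05}            # P(+j | A)
P_M = {True: 0.70, False: 0.01}            # P(+m | A)

def unnormalized(b):
    # P(b, +j, +m) = sum over e, a of P(b) P(e) P(a|b,e) P(+j|a) P(+m|a)
    total = 0.0
    for e in (True, False):
        for a in (True, False):
            p_a = P_A[(b, e)] if a else 1.0 - P_A[(b, e)]
            total += P_B[b] * P_E[e] * p_a * P_J[a] * P_M[a]
    return total

scores = {b: unnormalized(b) for b in (True, False)}
z = sum(scores.values())                   # the "easy" denominator P(+j, +m)
posterior = {b: s / z for b, s in scores.items()}   # P(B | +j, +m)
```

With these illustrative numbers the posterior P(+b∣+j,+m) comes out near 0.28.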
Factors:
- Joint Distribution factor
- Joint distributions: entries sum to 1
- Selected joint: only the entries of the joint distribution consistent with some fixed values of some variables
- lowercase x is assigned (fixed), capital Y is varied
- the dimensionality of the table is the number of varying (capital) variables
- Conditional Distribution
- Single conditional: entries P(y∣x), sums to 1
- Family of conditionals: P(Y∣X): entries sum to |domain(X)|, one distribution summing to 1 per value of X
- varied x and y
- has multiple conditionals
- Specified Family: P(y∣X)
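The factor types above can be sketched as dictionaries keyed by assignments; the numbers are assumptions for illustration:

```python
# Family of conditionals P(T | R) as one table; illustrative numbers.
P_T_given_R = {
    ('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
    ('-r', '+t'): 0.1, ('-r', '-t'): 0.9,
}

# Single conditional P(T | +r): fix r = +r, vary T; its entries sum to 1.
single = {t: p for (r, t), p in P_T_given_R.items() if r == '+r'}
assert abs(sum(single.values()) - 1.0) < 1e-9

# The whole family sums to |domain(R)| = 2: one distribution per value of R.
assert abs(sum(P_T_given_R.values()) - 2.0) < 1e-9
```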
Inference by Enumeration:
- Procedure: Join all factors, eliminate all hidden variables, normalize
- ex:
- R (rain) → T (traffic) → L (late)
- we have table for R, T|R and L|T
- and we wish to query P(L)
- Join all factors: similar to a database JOIN
- Build a new table by multiplying the conditional tables with the prior
- P(R)×P(T∣R)=P(R,T) then P(R,T)×P(L∣T)=P(R,T,L)
- Eliminate all hidden variables:
- Marginalize the joint table, ∑RP(R,T,L)=P(T,L)→∑TP(T,L)=P(L)
- Normalize so the entries sum to 1
- P(L)=∑_t∑_rP(L∣t)P(r)P(t∣r); evaluated inside-out: join on r, join on t, eliminate r, eliminate t
- This builds up the whole joint distribution before summing out the hidden variables: huge space complexity
- We can marginalize early instead, which is Variable Elimination
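The join-everything-then-marginalize procedure for the R → T → L example might look like this sketch; the CPT values are assumptions:

```python
# Inference by enumeration for P(L) in R -> T -> L; CPT values are assumptions.
P_R = {'+r': 0.1, '-r': 0.9}
P_T = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
       ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}   # P(T | R)
P_L = {('+t', '+l'): 0.3, ('+t', '-l'): 0.7,
       ('-t', '+l'): 0.1, ('-t', '-l'): 0.9}   # P(L | T)

# Join all factors into the full joint P(R, T, L): d^n entries.
joint = {(r, t, l): P_R[r] * P_T[(r, t)] * P_L[(t, l)]
         for r in ('+r', '-r') for t in ('+t', '-t') for l in ('+l', '-l')}

# Eliminate the hidden variables by summing them out: first R, then T.
P_TL, P_L_query = {}, {}
for (r, t, l), p in joint.items():
    P_TL[(t, l)] = P_TL.get((t, l), 0.0) + p
for (t, l), p in P_TL.items():
    P_L_query[l] = P_L_query.get(l, 0.0) + p
# With no evidence selected, P_L_query already sums to 1.
```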
Variable Elimination:
Procedure:
- Similar to inference by enumeration but we marginalize early to save space
- Join the factors mentioning one hidden variable, marginalize it out, join the next, marginalize, …
- With the same example:
- Join P(R)×P(T∣R)=P(R,T)→∑RP(R,T)=P(T)
- Join P(T)×P(L∣T)=P(T,L)→∑TP(T,L)=P(L)
- While there are still hidden variables, pick the hidden variable and join it and sum it out (eliminate)
- If there is evidence in the query, start with the factors that mention the evidence, and select only the rows consistent with the evidence when joining
- ex: query P(L∣+r): P(+r)→P(+r)×P(T∣+r)=P(T,+r)→P(T,+r)×P(L∣T)=P(L,T,+r)
- →∑TP(L,T,+r)=P(L,+r)
- If the final table is P(L,+r) rather than P(L∣+r), condition on the evidence
- P(+l∣+r)=P(+l,+r)/P(+r), where P(+r)=∑_LP(L,+r)
- Normalizing the final table therefore gives the answer
- ex: normalize over L so that ∑_LP(L∣+r)=1
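The P(L∣+r) walk-through above, sketched with assumed CPT values: select the evidence, join, sum out T, then normalize:

```python
# Variable elimination for P(L | +r); CPT values are illustrative assumptions.
P_R = {'+r': 0.1, '-r': 0.9}
P_T = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
       ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}   # P(T | R)
P_L = {('+t', '+l'): 0.3, ('+t', '-l'): 0.7,
       ('-t', '+l'): 0.1, ('-t', '-l'): 0.9}   # P(L | T)

# Select +r in every factor mentioning R, then join: f1(T) = P(+r) P(T | +r).
f1 = {t: P_R['+r'] * P_T[('+r', t)] for t in ('+t', '-t')}

# Join with P(L | T) and sum out the hidden T: f2(L) = sum_t f1(t) P(L | t).
f2 = {l: sum(f1[t] * P_L[(t, l)] for t in ('+t', '-t')) for l in ('+l', '-l')}

# f2 is P(L, +r); normalizing over L divides by z = P(+r) and yields P(L | +r).
z = sum(f2.values())
posterior = {l: p / z for l, p in f2.items()}
```

Note that z equals P(+r) here, so the final division is exactly the Bayes' theorem step from the notes.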
Variable Elimination Ordering:
- The order in which variables are eliminated can greatly reduce the complexity
- ex:
- With query P(Xn∣y1,...,yn) in the network Z→Xi, Xi→Yi, compare two elimination orders
- first way: eliminate Z last: join Z×X1×X2×... into Z·∏Xi, then multiply in each evidence factor on Yi and sum out the Xi
- at that point one factor mentions all the Xi and Z: 2^(n+1) entries (binary variables)
- second way: eliminate each Xi first: join each pair Xi×Yi (i.e. P(Xi∣Z)×P(yi∣Xi)) and sum out Xi, then join the resulting factors with Z one by one
- at the Xi×Yi steps each factor has at most 2^2 entries, and joining with Z and normalizing stays just as small
- Complexity is determined by the size of the largest factor generated during elimination
- There is no ordering that always results in small factors
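A back-of-the-envelope count of table sizes for the Z→Xi→Yi example (binary variables, with an assumed n = 20) shows why the ordering matters:

```python
# Largest-factor sizes for two elimination orderings; a counting sketch only.
n = 20  # assumed number of Xi / Yi pairs, all variables binary

# Bad ordering: join P(Z) with every P(Xi | Z) before summing anything out.
# The intermediate factor mentions Z and X1..Xn: 2^(n+1) entries.
bad_largest = 2 ** (n + 1)

# Good ordering: join each P(Xi | Z) with its evidence factor and sum Xi
# out immediately; no intermediate factor mentions more than {Xi, Z}.
good_largest = 2 ** 2

print(bad_largest // good_largest)   # 524288 -- the gap grows exponentially in n
```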