====== And, Or, and the Two KL Projections ======
> I discuss the difference between minimizing the KL-divergence with respect to the first and second argument, and will conclude that they correspond to AND and OR operations on distributions, respectively.

//Cite as: Ortega, P.A. "And, Or, and the Two KL Projections", 2024.//
Oftentimes I see people wondering about the meaning of the two KL-projections:
Personally, I find this explanation somewhat unsatisfying. These intuitions are abstract, and
their application on two distributions can be quite challenging. Instead, a
clearer grasp of the difference can be attained through the examination
of mixture distributions. Let's take a look.
==== Linear mixture ====
Let's say we have N distributions q1,q2,…,qN over a finite set X.
Given a set of positive weights w1,w2,…,wN that sum up to one, their
//linear mixture// is
$$
p(x) = \sum_{i=1}^N w_i \, q_i(x).
$$
The //linear mixture// expresses N mutually exclusive hypotheses qi(x) that
could be true with probabilities wi. That is, either q1 **or** q2 **or**
... **or** qN is true, with probability w1, w2, ..., wN respectively.
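As a quick illustration, here is a minimal numerical sketch of the linear mixture; the two distributions and the weights below are made up for the example:

<code python>
import numpy as np

# Two made-up distributions over a finite set X = {0, 1, 2}
q1 = np.array([0.7, 0.2, 0.1])
q2 = np.array([0.1, 0.3, 0.6])
w  = np.array([0.4, 0.6])        # positive weights that sum up to one

# Linear ("or") mixture: p(x) = sum_i w_i * q_i(x)
p_or = w[0] * q1 + w[1] * q2
print(p_or, p_or.sum())          # already a valid distribution: sums to one
</code>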
==== Exponential mixture ====
Given a set of positive coefficients α1,α2,…,αN (not necessarily summing up to one), their //exponential mixture// (a.k.a. geometric mixture) is
$$
p(x) = \frac{1}{Z} \prod_{i=1}^N q_i(x)^{\alpha_i},
\qquad\text{where}\qquad
Z = \sum_{x' \in X} \prod_{i=1}^N q_i(x')^{\alpha_i}.
$$
It's important to highlight that in order for the exponential mixture to yield a valid probability distribution, the product must be divided by the normalizing constant Z.
The //exponential mixture// expresses N constraints qi(x) that must be true
simultaneously with precisions αi. That is, q1 **and** q2 **and** ...
**and** qN are true, with precisions α1, α2, ..., αN, respectively.
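Again as a sketch with made-up inputs, the exponential mixture can be computed as follows; note the explicit renormalization:

<code python>
import numpy as np

q1 = np.array([0.7, 0.2, 0.1])
q2 = np.array([0.1, 0.3, 0.6])
alpha = np.array([1.0, 2.0])     # precisions, not necessarily summing up to one

# Exponential ("and") mixture: p(x) proportional to prod_i q_i(x)^alpha_i
unnormalized = q1 ** alpha[0] * q2 ** alpha[1]
p_and = unnormalized / unnormalized.sum()   # divide by Z to obtain a valid distribution
print(p_and)
</code>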
Now, here is the connection to the two KL-projections. The linear mixture can be characterized as the solution of a variational problem:

$$
\arg\min_{p} \sum_{i=1}^N w_i \, D_{KL}(q_i \,\|\, p) \;=\; \sum_{i=1}^N w_i \, q_i,
$$

that is, it minimizes a weighted sum of KL-divergences where p is in the second argument.
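This identity is easy to check numerically. The sketch below reuses the made-up inputs from above and verifies that perturbing the linear mixture increases the objective:

<code python>
import numpy as np
from scipy.special import rel_entr   # elementwise q * log(q / p)

q1 = np.array([0.7, 0.2, 0.1])
q2 = np.array([0.1, 0.3, 0.6])
w  = np.array([0.4, 0.6])

def or_objective(p):
    # sum_i w_i KL(q_i || p): the candidate p sits in the SECOND argument
    return w[0] * rel_entr(q1, p).sum() + w[1] * rel_entr(q2, p).sum()

p_or  = w[0] * q1 + w[1] * q2                  # the claimed minimizer
other = p_or + np.array([0.05, -0.02, -0.03])  # a perturbed, still valid distribution
print(or_objective(p_or) < or_objective(other))   # expected: True
</code>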
Similarly, the exponential mixture is obtained by minimizing the weighted sum

$$
\sum_{i=1}^N \alpha_i \, D_{KL}(p \,\|\, q_i).
$$
That is, using the KL-divergences where p is in the first argument. And in fact,
the minimizer is precisely the exponential mixture, with the precisions rescaled so that they sum up to one:
$$
\arg\min_{p} \sum_{i=1}^N \alpha_i \, D_{KL}(p \,\|\, q_i)
\;=\; \frac{1}{Z'} \prod_{i=1}^N q_i^{\alpha_i / A},
\qquad
A = \sum_{j=1}^N \alpha_j,
$$

where Z' is the corresponding normalizing constant.
These two identities are the crux of my argument. Basically, we have found a relation between the two KL-projections
and the two logical operators **and** and **or**. The two KL-divergences then measure
how far a candidate distribution is from realizing each of these two operations.
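The second identity can be checked the same way; the sketch below (again with made-up inputs) builds the exponential mixture with rescaled precisions and verifies that perturbing it increases the objective:

<code python>
import numpy as np
from scipy.special import rel_entr

q1 = np.array([0.7, 0.2, 0.1])
q2 = np.array([0.1, 0.3, 0.6])
alpha = np.array([1.0, 2.0])

def and_objective(p):
    # sum_i alpha_i KL(p || q_i): the candidate p sits in the FIRST argument
    return alpha[0] * rel_entr(p, q1).sum() + alpha[1] * rel_entr(p, q2).sum()

# Exponential mixture with the precisions rescaled to sum up to one
g = q1 ** (alpha[0] / alpha.sum()) * q2 ** (alpha[1] / alpha.sum())
p_and = g / g.sum()

other = p_and + np.array([0.05, -0.02, -0.03])
print(and_objective(p_and) < and_objective(other))   # expected: True
</code>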
Thus, it turns out that sequential predictions can be regarded as an alternation
between OR and AND operations, first to express our uncertainty over the hypotheses,
and second to incorporate new evidence, respectively.
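To make this concrete, here is a minimal sketch of the alternation for a made-up coin-flip example with two bias hypotheses: the OR step forms the predictive (linear) mixture, and the AND step multiplies in the likelihood of the new observation and renormalizes:

<code python>
import numpy as np

theta = np.array([0.3, 0.8])   # made-up P(heads) under each hypothesis
w = np.array([0.5, 0.5])       # prior weights over the hypotheses

for obs in [1, 1, 0, 1]:       # observed flips (1 = heads, 0 = tails)
    # OR step: the predictive distribution is the linear mixture of the hypotheses
    p_heads = float(np.sum(w * theta))
    print(f"predict P(heads) = {p_heads:.3f}, observe {obs}")
    # AND step: incorporate the evidence by multiplying in the likelihood
    # of the observation and renormalizing (Bayes' rule)
    likelihood = theta if obs == 1 else 1.0 - theta
    w = w * likelihood / np.sum(w * likelihood)
</code>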