Evidence Lower Bound
where we use Jensen's inequality and the factorization assumption.
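The bound itself is standard; as a sketch in generic notation (with $\mathbf{h}$ standing for all latent variables, a symbol of ours, not from the model above), Jensen's inequality gives
$$\begin{align}
\log \Pr(\mathbf{w})
= \log \int q(\mathbf{h}) \, \frac{\Pr(\mathbf{w}, \mathbf{h})}{q(\mathbf{h})} \, d\mathbf{h}
\ \geq\ \mathbb{E}_{q(\mathbf{h})} \left[ \log \frac{\Pr(\mathbf{w}, \mathbf{h})}{q(\mathbf{h})} \right],
\end{align}$$
and the factorization (mean-field) assumption takes $q(\mathbf{h}) = \prod_j q(h_j)$, which lets the expectation split over factors.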
Update parameters
Update $q(z_{di})$
Extract the terms related to $z_{di}$ from the ELBO.
We combine the results of Variational Bayes and Collapsed Gibbs
Sampling. From Variational Bayes,
Results of the Collapsed Gibbs Sampling show,
$$\begin{align}
&\quad \Pr(z_{di}=k \mid \mathbf{z}^{-di}, \mathbf{w}, \mathbf{s},
\boldsymbol{\alpha}, \boldsymbol{\beta},
\tilde{\boldsymbol{\beta}}, \boldsymbol{\gamma})
\propto
\begin{cases} %
\frac{\displaystyle \beta_v + n_{k v}^{- di} }{\displaystyle V
\beta_v + n_{k}^{- di}} \cdot %
\frac{\displaystyle n^{- di}_{k} + \gamma_1 }{\displaystyle
\tilde{n}_{k}^{- di} + \gamma_1 + n^{- di}_{k} + \gamma_2 } \cdot %
\left(n_{d{k}}^{- di} + \alpha_{dk} \right) & \ {\rm if \ } s_{di}
= 0, \\
\frac{\displaystyle \tilde{\beta}_v + \tilde{n}_{k v}^{-
di} }{\displaystyle L_{k} \tilde{\beta}_v + \tilde{n}_{k }^{- di} }
\cdot%
\frac{\displaystyle \tilde{n}^{ - di}_{k} + \gamma_2 }{\displaystyle
\tilde{n}^{- di}_{k} + \gamma_1 + n^{- di}_{k} + \gamma_2 } \cdot %
\left(n_{d{k}}^{- di} + \alpha_{dk} \right) & \ {\rm if \ } s_{di}
= 1.
\end{cases}\label{eq:sample-z-base}
\end{align}$$
We replace some of the integrals with the results of the Collapsed Gibbs
Sampling.
Note that
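As a concrete reading of the sampling equation above, here is a minimal NumPy sketch; all array names are our own, every count array is assumed to already exclude position $(d,i)$, and we simplify $\beta_v$ and $\tilde{\beta}_v$ to symmetric scalars:

```python
import numpy as np

def sample_z(v, s, n_kv, n_k, nt_kv, nt_k, n_dk, alpha_d,
             beta, beta_t, L_k, gamma1, gamma2, rng):
    """Draw z_{di} given s_{di}; every count array already excludes (d, i).

    n_kv[k, v] / nt_kv[k, v]: topic-word counts for s = 0 / s = 1 tokens,
    n_k / nt_k: their row sums, n_dk: topic counts in document d,
    beta / beta_t: symmetric priors (a simplifying assumption of ours).
    """
    V = n_kv.shape[1]
    if s == 0:
        p = ((beta + n_kv[:, v]) / (V * beta + n_k)
             * (n_k + gamma1) / (nt_k + gamma1 + n_k + gamma2)
             * (n_dk + alpha_d))
    else:
        p = ((beta_t + nt_kv[:, v]) / (L_k * beta_t + nt_k)
             * (nt_k + gamma2) / (nt_k + gamma1 + n_k + gamma2)
             * (n_dk + alpha_d))
    p = p / p.sum()                      # normalize over topics k
    return rng.choice(len(p), p=p)
```

Each branch is the corresponding case of the proportionality above, evaluated for all $k$ at once.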
Next, we approximate the expectations. We use a Taylor expansion around the
expected counts; keeping only the zeroth-order term,
$\mathbb{E}_{q}[f(n)] \approx f(\mathbb{E}_{q}[n])$,
gives the approximation known as CVB0. Hence, the update for $q(z_{di})$ is
obtained by replacing each count in \eqref{eq:sample-z-base} with its
expectation under $q$,
where $$\begin{align}
%%%%%%%%%%%%
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}[n_{k v}^{- di}] &=
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}
\bigg[\sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} \unicode{x1D7D9}(z_{di'} = k)
\unicode{x1D7D9}(s_{di'} = 0) \unicode{x1D7D9}(w_{di'} = v) \bigg] \\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d}
\mathbb{E}_{q(\mathbf{z}^{-di})}[\unicode{x1D7D9}(z_{di'} = k)]
\mathbb{E}_{q(\mathbf{s})}[\unicode{x1D7D9}(s_{di'} = 0)]
\unicode{x1D7D9}(w_{di'} = v)\\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} q(z_{di'} = k) q(s_{di'} =
0) \unicode{x1D7D9}(w_{di'} = v) \\
%%%%%%%%%%%%
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}[\tilde{n}_{k v}^{- di}]
&= \mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}
\bigg[\sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} \unicode{x1D7D9}(z_{di'} = k)
\unicode{x1D7D9}(s_{di'} = 1) \unicode{x1D7D9}(w_{di'} = v) \bigg] \\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d}
\mathbb{E}_{q(\mathbf{z}^{-di})}[\unicode{x1D7D9}(z_{di'} = k)]
\mathbb{E}_{q(\mathbf{s})}[\unicode{x1D7D9}(s_{di'} = 1)]
\unicode{x1D7D9}(w_{di'} = v)\\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} q(z_{di'} = k) q(s_{di'} =
1) \unicode{x1D7D9}(w_{di'} = v)\\
%%%%%%%%%%%%
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}[{n}_{k}^{- di}]
&= \mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}
\bigg[\sum_{d=1}^{D}\sum_{i' \neq i}^{N_d} \unicode{x1D7D9}(z_{di'} = k)
\unicode{x1D7D9}(s_{di'} = 0) \bigg] \\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} q(z_{di'} = k) q(s_{di'} =
0) \\
%%%%%%%%%%%%
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}[\tilde{n}_{k}^{- di}]
&= \mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})}
\bigg[\sum_{d=1}^{D}\sum_{i' \neq i}^{N_d} \unicode{x1D7D9}(z_{di'} = k)
\unicode{x1D7D9}(s_{di'} = 1) \bigg] \\
&= \sum_{d=1}^{D}\sum_{i'\neq i}^{N_d} q(z_{di'} = k) q(s_{di'} =
1) \\
%%%%%%%%%%%%
\mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})} [n_{d{k}}^{- di}]
&= \mathbb{E}_{q(\mathbf{z}^{-di}) q(\mathbf{s})} \bigg[ \sum_{i'
\neq i}^{N_d} \unicode{x1D7D9}(z_{di'} = k) \bigg] \\
&= \sum_{i' \neq i}^{N_d} q(z_{di'} = k)
%%%%%%%%%%%%
% \E_{q(\bzremove) q(\bs)} [\tilde{n}_{d{k}}^{- di}] &=
% \E_{q(\bzremove) q(\bs)} \bigg[ \sum_{i' \neq i}^{N_d} \I(z_{di'} = k)
% \I(s_{di'} = 1) \bigg] \\
% &= \sum_{i' \neq i}^{N_d} q(z_{di'} = k) q(s_{di'} = 1)
\end{align}$$
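The expectations above are just sums of per-token responsibilities, so they can be accumulated in a vectorized way. In this sketch (names are ours) the corpus is a flat token list, `phi[t, k]` $= q(z_t = k)$, `qs[t]` $= q(s_t = 1)$, and `w[t]` is the word id of token $t$:

```python
import numpy as np

def expected_counts(phi, qs, w, V):
    """Full expected counts; the `- di` versions used in the updates are
    obtained by subtracting token t's own responsibility afterwards."""
    T, K = phi.shape
    r0 = phi * (1.0 - qs)[:, None]    # q(z_t = k) q(s_t = 0)
    r1 = phi * qs[:, None]            # q(z_t = k) q(s_t = 1)
    n_vk = np.zeros((V, K))
    nt_vk = np.zeros((V, K))
    np.add.at(n_vk, w, r0)            # accumulate responsibilities by word id
    np.add.at(nt_vk, w, r1)
    return n_vk.T, nt_vk.T            # shape (K, V), matching n_{kv}
```

For token $t$, the leave-one-out count $\mathbb{E}[n_{k, w_t}^{-t}]$ is then `n_kv[:, w[t]] - r0[t]` (and analogously with `r1` for the tilde counts), which is the cheap bookkeeping that makes CVB0 practical.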
Update $q(s_{di})$
Extract the terms related to $s_{di}$ from the ELBO.
Results of the Collapsed Gibbs Sampling show, $$\begin{align}
\Pr(s_{di} = s \mid \mathbf{s}^{- di}, \mathbf{z},
\mathbf{w}, \boldsymbol{\beta}, \tilde{\boldsymbol{\beta}},
\boldsymbol{\gamma})
& \ \propto \
\begin{cases}
\frac{\displaystyle \beta_v + n_{z_{di}, v}^{- di} }{
\displaystyle V \beta_v + n_{z_{di}}^{ - di} } \cdot %
(\displaystyle n^{- di}_{z_{di}} + \gamma_1 ) %
& {\rm if} \quad s = 0, \\
\frac{\displaystyle \tilde{\beta}_v + \tilde{n}_{z_{di}, v}^{-
di} }{\displaystyle L_{z_{di}} \tilde{\beta}_v +
\tilde{n}_{z_{di}}^{- di} } \cdot%
(\displaystyle \tilde{n}^{- di}_{z_{di}} + \gamma_2 ) %
& {\rm if} \quad s = 1.
\end{cases}
\end{align}$$
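Since $s_{di}$ is binary, the two cases above can be normalized directly; the following sketch (our names, with the same symmetric-prior simplification as before) returns $q(s_{di} = 1)$:

```python
import numpy as np

def prob_s1(v, k, n_kv, n_k, nt_kv, nt_k,
            beta, beta_t, L_k, gamma1, gamma2):
    """q(s_{di} = 1) for word v with z_{di} = k; counts exclude (d, i)."""
    V = n_kv.shape[1]
    # Unnormalized weights of the s = 0 and s = 1 cases.
    p0 = (beta + n_kv[k, v]) / (V * beta + n_k[k]) * (n_k[k] + gamma1)
    p1 = (beta_t + nt_kv[k, v]) / (L_k[k] * beta_t + nt_k[k]) * (nt_k[k] + gamma2)
    return p1 / (p0 + p1)
```

When the two vocabularies, counts, and hyperparameters are symmetric, the result is exactly $1/2$, which is a convenient sanity check.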
From Variational Bayes,
We replace some of the parameters with the results of the Collapsed Gibbs
Sampling and approximate the expectations,
Calculating perplexity
We cannot compute the log-likelihood explicitly, so we check the approximate
perplexity instead.
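A common definition, in our notation, with $p(w_{di})$ the model's predictive probability of word $i$ in document $d$ computed from the variational posterior means:
$$\begin{align}
\text{perplexity} = \exp\!\left( - \frac{\sum_{d=1}^{D} \sum_{i=1}^{N_d} \log p(w_{di})}{\sum_{d=1}^{D} N_d} \right),
\end{align}$$
so lower perplexity corresponds to a higher approximate held-out likelihood.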