Model
An alternative approach to modeling covariates is to use Pólya-Gamma
augmentation. Polson et al. (2013) propose a strategy to use Pólya-Gamma
latent variables for fully Bayesian inference in binomial likelihoods.
Linderman et al. (2015) use it to develop models for categorical and
multinomial data with dependencies among the multinomial parameters. We
extend their method to incorporate covariates.
We use the same notation as in the main text. Our corpus contains $D$
documents. We observe a matrix $\mathbf{X}$ of $M$ covariates whose
dimension is $D \times M$. The topic model assigns one of $K$ topics to
each observed word, and $\mathbf{z}_d$ is the vector of assigned topics
for document $d$.
We introduce the stick-breaking representation of the multinomial
distribution described in Linderman et al. (2015), which rewrites the
$K$-dimensional multinomial distribution as a product of $K-1$ binomial
distributions.
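For reference, the stick-breaking identity of Linderman et al. (2015) can be written as follows. The symbols used here ($\mathbf{x}$ for the vector of counts, $N$ for the total count, $\boldsymbol{\pi}$ for the multinomial probabilities, and $\boldsymbol{\psi}$ for the real-valued parameters) follow that paper and are assumptions about notation; they may differ from the notation of the main text.

```latex
\begin{align*}
\mathrm{Mult}(\mathbf{x} \mid N, \boldsymbol{\pi})
  &= \prod_{k=1}^{K-1} \mathrm{Bin}(x_k \mid N_k, \tilde{\pi}_k), \\
N_k &= N - \sum_{j<k} x_j, \qquad
\tilde{\pi}_k = \frac{\pi_k}{1 - \sum_{j<k} \pi_j}, \qquad
\tilde{\pi}_k = \sigma(\psi_k) = \frac{e^{\psi_k}}{1 + e^{\psi_k}}.
\end{align*}
```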
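We model the parameters of the stick-breaking representation with the
covariates. First, we assume that the coefficients follow a multivariate
normal distribution. The coefficients form a matrix with one entry for
each of the $M$ covariates and each of the $K-1$ stick-breaking
parameters, so we introduce the vectorization transformation to draw all
elements from a single draw of the multivariate normal distribution.
The prior covariance of the vectorized coefficients combines a
$(K-1) \times (K-1)$ identity matrix (for topics) with an $M \times M$
identity matrix (for covariates). Their Kronecker product is a diagonal
matrix, so equation (2) is the same as drawing each coefficient
independently from a univariate normal distribution.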
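The equivalence between one vectorized draw and independent element-wise draws can be checked numerically. The sketch below is only an illustration: the zero prior mean, the unit prior variance, the column-stacking order of the vectorization, and the function name `draw_coefficients` are all assumptions made for this example, not details taken from the text.

```python
import numpy as np

def draw_coefficients(n_covariates, n_topics, rng):
    """Draw a coefficient matrix through a single multivariate normal draw.

    Assumptions (hypothetical, for illustration only): zero prior mean,
    unit prior variance, and column-stacking vec().
    """
    k1 = n_topics - 1
    # Kronecker product of the (K-1)x(K-1) identity (topics) and the
    # MxM identity (covariates): the result is itself an identity matrix.
    cov = np.kron(np.eye(k1), np.eye(n_covariates))
    vec_draw = rng.multivariate_normal(np.zeros(k1 * n_covariates), cov)
    # Undo the column-stacking vectorization.
    coef = vec_draw.reshape((n_covariates, k1), order="F")
    return coef, cov

rng = np.random.default_rng(0)
coef, cov = draw_coefficients(n_covariates=2, n_topics=4, rng=rng)
```

Because the covariance is an identity matrix, the same draw could be obtained with `rng.standard_normal((2, 3))`; the vectorized form matters only once the covariance is no longer diagonal.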
Next, we use covariates to model the parameters of the stick-breaking
representation.
Social scientists often use
categorical variables (e.g., authorship of the document) as covariates.
Modeling the mean of the multivariate normal distribution with
covariates allows us to create variation in the document-topic
distribution when two or more documents have the same set of covariates.
The multivariate normal distribution can be generalized to the matrix
normal distribution.
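In this formulation, the matrix of stick-breaking parameters has one row
per document, each row of the mean matrix is equal to the covariate-based
mean of the corresponding document, and the row covariance is the
$D \times D$ identity matrix (documents are independent). This
generalization allows a vectorized implementation.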
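A minimal sketch of the vectorized draw is given below. It assumes that the mean matrix is the product of the covariate matrix and the coefficient matrix, and that the stick-breaking map uses the logistic function as in Linderman et al. (2015). The names `sample_psi`, `stick_breaking`, `Lambda`, and `Sigma` are hypothetical and chosen only for this example.

```python
import numpy as np

def sample_psi(X, Lambda, Sigma, rng):
    """Draw all documents at once from a matrix normal distribution
    with identity row covariance and column covariance Sigma."""
    mean = X @ Lambda                  # (D, K-1): row d is the mean of document d
    chol = np.linalg.cholesky(Sigma)   # Sigma = chol @ chol.T
    noise = rng.standard_normal(mean.shape)
    # Each row of noise @ chol.T has covariance Sigma; rows are independent.
    return mean + noise @ chol.T

def stick_breaking(psi):
    """Map (D, K-1) real-valued parameters to (D, K) topic proportions."""
    sig = 1.0 / (1.0 + np.exp(-psi))
    # Length of the stick left after breaking off components 1..k.
    remaining = np.cumprod(1.0 - sig, axis=1)
    # Length of the stick left before breaking off component k.
    before = np.hstack([np.ones((psi.shape[0], 1)), remaining[:, :-1]])
    return np.hstack([sig * before, remaining[:, -1:]])

rng = np.random.default_rng(1)
D, M, K = 5, 3, 4
X = rng.standard_normal((D, M))
Lambda = rng.standard_normal((M, K - 1))
Sigma = np.eye(K - 1)
psi = sample_psi(X, Lambda, Sigma, rng)
theta = stick_breaking(psi)
```

Two documents with identical rows in `X` share the same mean but receive different noise, which is the source of the document-level variation described above.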
Estimation
We sample the stick-breaking parameters, the regression coefficients,
the covariance matrix, and the Pólya-Gamma auxiliary variables.
Sampling the stick-breaking parameters
Equation (1) has the same form as Theorem 1 of Polson et al. (2013)
and we can introduce Pólya-Gamma auxiliary variables.
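For reference, the integral identity in Theorem 1 of Polson et al. (2013) is sketched below in generic notation. The symbols $\psi$, $a$, $b$, $\kappa$, and $\omega$ are placeholders and are not necessarily those of equation (1); in a binomial likelihood, $a$ plays the role of the number of successes and $b$ the number of trials.

```latex
\begin{align*}
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
  &= 2^{-b} e^{\kappa \psi}
     \int_0^\infty e^{-\omega \psi^2 / 2}\, p(\omega)\, d\omega,
  \qquad \kappa = a - \frac{b}{2}, \\
\omega &\sim \mathrm{PG}(b, 0), \qquad
\omega \mid \psi \sim \mathrm{PG}(b, \psi).
\end{align*}
```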
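Conditional on the auxiliary variables, we can sample the stick-breaking
parameters from a multivariate normal distribution. The second
proportionality in the derivation of this conditional distribution
comes from the product of Gaussians in the Matrix Cookbook
(Section 8.1.8).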
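The product-of-Gaussians step can be sketched as follows. The sketch assumes the standard Pólya-Gamma conditional for a single document: the likelihood term acts as a Gaussian pseudo-observation with mean `kappa / omega` and diagonal covariance `1 / omega`, which is multiplied by the normal prior with mean `mu` and covariance `Sigma`. All names and the numerical values are hypothetical.

```python
import numpy as np

def gaussian_product(m1, S1, m2, S2):
    """Mean and covariance of the normalized product N(m1, S1) * N(m2, S2)."""
    P1 = np.linalg.inv(S1)
    P2 = np.linalg.inv(S2)
    S = np.linalg.inv(P1 + P2)
    m = S @ (P1 @ m1 + P2 @ m2)
    return m, S

def psi_conditional(kappa, omega, mu, Sigma):
    """Conditional normal of the parameters given auxiliary variables omega."""
    # Pseudo-observation kappa / omega with covariance diag(1 / omega).
    return gaussian_product(kappa / omega, np.diag(1.0 / omega), mu, Sigma)

rng = np.random.default_rng(2)
kappa = np.array([0.5, -1.0, 2.0])   # illustrative values only
omega = np.array([0.8, 1.5, 0.3])    # auxiliary variables must be positive
mu = np.array([0.1, 0.0, -0.2])
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 1.0, 0.1],
                  [0.0, 0.1, 1.0]])
post_mean, post_cov = psi_conditional(kappa, omega, mu, Sigma)
psi_draw = rng.multivariate_normal(post_mean, post_cov)
```

The result agrees with the closed form in which the posterior precision is `diag(omega) + inv(Sigma)` and the posterior mean is the posterior covariance times `kappa + inv(Sigma) @ mu`.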
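Sampling the regression coefficients and the covariance matrix
Sampling the regression coefficients and the covariance matrix is the
same as in Bayesian multivariate linear regression; see Rossi et
al. (2012, pp. 31-34).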
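One draw from a conjugate Bayesian multivariate regression can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: it assumes an inverse-Wishart prior with integer degrees of freedom `nu0` and scale `V0` on the covariance, and a normal prior on the coefficients with mean `B_bar` and precision `A`. The function name and all hyperparameter names are hypothetical; the exact parameterization should be checked against Rossi et al.

```python
import numpy as np

def draw_multivariate_regression(Y, X, B_bar, A, nu0, V0, rng):
    """One joint draw of (B, Sigma) for the model Y = X B + E."""
    n, k = Y.shape
    XtX_A = X.T @ X + A
    # Posterior mean of the coefficients.
    B_tilde = np.linalg.solve(XtX_A, X.T @ Y + A @ B_bar)
    resid = Y - X @ B_tilde
    dev = B_tilde - B_bar
    S = resid.T @ resid + dev.T @ A @ dev
    # Sigma from an inverse-Wishart with nu0 + n degrees of freedom and
    # scale V0 + S, obtained by inverting a Wishart draw built from
    # outer products of normal vectors.
    df = nu0 + n
    L = np.linalg.cholesky(np.linalg.inv(V0 + S))
    G = rng.standard_normal((df, k)) @ L.T
    Sigma = np.linalg.inv(G.T @ G)
    # B given Sigma is matrix normal: row covariance inv(X'X + A),
    # column covariance Sigma.
    L_row = np.linalg.cholesky(np.linalg.inv(XtX_A))
    L_col = np.linalg.cholesky(Sigma)
    B = B_tilde + L_row @ rng.standard_normal(B_tilde.shape) @ L_col.T
    return B, Sigma

rng = np.random.default_rng(3)
D, M, K1 = 200, 3, 2                      # K1 stands for K - 1
X = rng.standard_normal((D, M))
Y = X @ rng.standard_normal((M, K1)) + rng.standard_normal((D, K1))
B, Sigma = draw_multivariate_regression(
    Y, X, B_bar=np.zeros((M, K1)), A=np.eye(M), nu0=K1 + 2, V0=np.eye(K1), rng=rng
)
```

In the sampler described here, the matrix of stick-breaking parameters would take the place of `Y` and the covariate matrix the place of `X`.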