I’m going to sketch out the next few things to plan and code for the DSL library. So far I have provided the ability to describe the network, but I haven’t yet provided a way to describe the distributions. For instance, in the HMM model, we ought to be able to say what the distribution of `Symbols` is, and so on. I’ll start with the ability to pick two distributions: Dirichlet and Multinomial. We can cover many models with just these two. Once I provide a way to specify each node’s distribution type, I should be able to change the distributions at will without affecting the network; for instance, using a continuous response as opposed to a discrete response in an HMM.
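To make that decoupling concrete, here is a minimal sketch of how per-node distribution assignments could live alongside the network rather than inside it. The `Dirichlet` and `Multinomial` classes and the `hmm_distributions` dict are hypothetical names for illustration, not the library's actual API.

```python
# Hypothetical sketch (not the library's real API): distribution choices
# are stored beside the network, so swapping one leaves the edges intact.

class Dirichlet:
    """Conjugate prior over discrete distributions."""
    def __init__(self, alpha):
        self.alpha = alpha  # symmetric concentration parameter

class Multinomial:
    """Discrete distribution over a finite support."""
    def __init__(self, support):
        self.support = list(support)

# Assigning distributions to the HMM's nodes by name.
hmm_distributions = {
    "Transition": Dirichlet(alpha=1.0),  # prior over rows of the transition matrix
    "Symbols":    Dirichlet(alpha=0.1),  # prior over per-state emission distributions
    "Topic":      Multinomial(support=range(10)),  # latent state variable
}
```

Because the network description never mentions `Dirichlet` or `Multinomial`, replacing an entry here (say, with a continuous distribution for a continuous response) changes the model without touching the graph.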

After this, I will want to create a function in the `Gibbs` module that can take in a `Reader` and sample the distributions. In the case of the HMM, this would mean sampling the `Transition` distributions and the `Symbols` distributions by reading the network to figure out their priors and support.
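As a sketch of what that sampling step could look like, assuming the `Reader` has already been reduced to a matrix of transition counts (the function name and the NumPy representation are my assumptions, not the `Gibbs` module's actual interface):

```python
import numpy as np

def sample_transition(counts, alpha=1.0, rng=None):
    """Sample each row of a transition matrix from its Dirichlet posterior.

    counts[i, j] is the number of i -> j transitions read from the network;
    conjugacy makes row i's posterior Dirichlet(alpha + counts[i, :]).
    """
    rng = np.random.default_rng() if rng is None else rng
    return np.vstack([rng.dirichlet(alpha + row) for row in counts])

# Two hidden states; the same idea applies to the `Symbols` (emission) counts.
counts = np.array([[5.0, 1.0], [2.0, 8.0]])
T = sample_transition(counts, alpha=1.0, rng=np.random.default_rng(0))
```

Each sampled row is a proper distribution over next states, so the result can be plugged straight into the next sampling stage.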

Finally, with the sampled distributions and a `Reader`, I will write a sampler that produces a new `Reader`. In the case of the HMM, this means sampling the new `Topic` variables. These steps cover the (uncollapsed) Gibbs sampling technique.
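A minimal sketch of that resampling pass for the HMM, again assuming the `Reader`'s contents have been flattened into arrays (observations as integer symbol indices, the sampled distributions as matrices); `resample_topics` is a hypothetical name:

```python
import numpy as np

def resample_topics(obs, trans, emit, rng):
    """One uncollapsed Gibbs sweep over an HMM's hidden state sequence.

    Each state z[t] is drawn from its full conditional, proportional to
    trans[z[t-1], z[t]] * emit[z[t], obs[t]] * trans[z[t], z[t+1]].
    """
    K, T = trans.shape[0], len(obs)
    z = rng.integers(0, K, size=T)  # arbitrary initial assignment
    for t in range(T):
        p = emit[:, obs[t]].copy()
        if t > 0:
            p *= trans[z[t - 1], :]   # incoming transition
        if t < T - 1:
            p *= trans[:, z[t + 1]]   # outgoing transition
        z[t] = rng.choice(K, p=p / p.sum())
    return z
```

Alternating this sweep with the distribution-sampling stage — sample the distributions given the states, then the states given the distributions — is the uncollapsed Gibbs loop.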

Looking ahead even further, I intend to write a method to compute the density of the observed variables (having marginalized out the latent variables). I will do this using the annealed importance sampling method as described in the paper “Evaluation Methods for Topic Models” by Hanna M. Wallach et al. In the case of the HMM, this amounts to computing the probability of `Symbol` given `Symbols` and `Transition` while marginalizing out the `Topic` variables.
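As a sketch of the annealed importance sampling idea (a generic estimator, not the paper's HMM-specific variant): anneal from the prior toward the posterior through the densities p(x)·p(data|x)^β, accumulating importance weights along the way. The function and its arguments are hypothetical, and the toy check below uses a conjugate Gaussian where the evidence is known in closed form.

```python
import numpy as np

def ais_log_evidence(log_lik, sample_prior, transition, betas, n_chains, rng):
    """Annealed importance sampling estimate of log Z = log E_prior[exp(log_lik)].

    `betas` is an increasing temperature ladder ending at 1; `transition`
    must leave p(x) * exp(beta * log_lik(x)) invariant at each temperature.
    """
    log_w = np.zeros(n_chains)
    for c in range(n_chains):
        x = sample_prior(rng)
        prev = 0.0
        for beta in betas:
            log_w[c] += (beta - prev) * log_lik(x)  # standard AIS weight update
            x = transition(x, beta, rng)
            prev = beta
    m = log_w.max()  # average the weights in log space for stability
    return m + np.log(np.mean(np.exp(log_w - m)))

# Toy check: prior N(0, 1), likelihood N(y=0 | x, 1), so the true evidence
# is p(y=0) = N(0; 0, sqrt(2)).  The tempered target is N(0, 1/(1+beta)),
# which we can sample exactly, making the transition kernel trivial.
rng = np.random.default_rng(0)
est = ais_log_evidence(
    log_lik=lambda x: -0.5 * np.log(2 * np.pi) - 0.5 * x ** 2,
    sample_prior=lambda rng: rng.standard_normal(),
    transition=lambda x, beta, rng: rng.standard_normal() / np.sqrt(1 + beta),
    betas=np.linspace(0.01, 1.0, 100),
    n_chains=500,
    rng=rng,
)
true_log_evidence = -np.log(2) - 0.5 * np.log(np.pi)
```

For the HMM the transition kernel would instead be a tempered version of the Gibbs sweep over `Topic` variables, but the weight bookkeeping is the same.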