Reference

RestrictedBoltzmannMachines.Potts - Type
Potts(θ)

Layer with Potts units, with external fields θ. Encodes categorical variables as one-hot vectors. The number of classes is the size of the first dimension.
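To illustrate the encoding, here is a minimal sketch in plain Julia. The helper `onehot` is hypothetical and not part of the package; it only shows how labels in 1:q map to columns of a Boolean array whose first dimension indexes the classes.

```julia
# Hypothetical helper: one-hot encode integer labels in 1:q.
# Rows index the q classes (first dimension), columns index the N units.
onehot(labels, q) = [k == c for k in 1:q, c in labels]

labels = [2, 1, 3, 3]      # N = 4 categorical variables with q = 3 classes
x = onehot(labels, 3)      # 3 × 4 Boolean matrix

# Each column has exactly one active entry, at the row given by the label.
column_sums = vec(sum(x; dims = 1))
```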

RestrictedBoltzmannMachines.Spin - Type
Spin(θ)

Layer with spin units, with external fields θ. The energy of a layer with units $s_i$ is given by:

\[E = -\sum_i \theta_i s_i\]

where each spin $s_i$ takes values $\pm 1$.

RestrictedBoltzmannMachines.BinaryRBM - Method
BinaryRBM(a, b, w)
BinaryRBM(N, M)

Construct an RBM with binary visible and hidden units, which has an energy function:

\[E(v, h) = -a'v - b'h - v'wh\]

Equivalent to RBM(Binary(a), Binary(b), w).
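As a rough illustration of the energy function, the following plain-Julia sketch evaluates it for a single configuration. The helper `binary_energy` is hypothetical; it assumes a and b are vectors of lengths N and M and w is an N × M matrix.

```julia
using LinearAlgebra: dot

# Hypothetical helper: E(v, h) = -a'v - b'h - v'wh for one configuration (v, h).
binary_energy(a, b, w, v, h) = -dot(a, v) - dot(b, h) - dot(v, w * h)

a = [0.1, -0.2, 0.3]                # visible fields, N = 3
b = [0.5, -0.5]                     # hidden fields, M = 2
w = [1.0 0.0; 0.0 1.0; -1.0 2.0]    # N × M weights
v = [1, 0, 1]                       # binary visible configuration
h = [0, 1]                          # binary hidden configuration

E = binary_energy(a, b, w, v, h)
```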

RestrictedBoltzmannMachines.HopfieldRBM - Method
HopfieldRBM(g, θ, γ, w)
HopfieldRBM(g, w)

Construct an RBM with spin visible units and Gaussian hidden units. If not given, θ = 0 and γ = 1 by default.

\[E(v, h) = -g'v - θ'h + \sum_\mu \frac{γ_\mu}{2} h_\mu^2 - v'wh\]
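The energy above can be sketched in plain Julia as follows. The helper `hopfield_energy` is hypothetical and not the package API. The example also checks a standard consequence of the formula: with θ = 0 and γ = 1, minimizing over h at fixed v gives h = w'v, and the energy reduces to the Hopfield form -g'v - |w'v|²/2.

```julia
using LinearAlgebra: dot

# Hypothetical helper implementing the energy above for one configuration (v, h).
function hopfield_energy(g, θ, γ, w, v, h)
    return -dot(g, v) - dot(θ, h) + sum(γ .* h .^ 2) / 2 - dot(v, w * h)
end

# Four-argument form using the defaults θ = 0 and γ = 1.
hopfield_energy(g, w, v, h) =
    hopfield_energy(g, zeros(size(w, 2)), ones(size(w, 2)), w, v, h)

g = [0.2, -0.1, 0.3]               # visible fields, N = 3
w = [0.5 -0.5; 1.0 0.0; 0.0 2.0]   # N × M weights, M = 2
v = [1, -1, 1]                     # spins take values ±1
h = w' * v                         # minimizer of E over h when θ = 0, γ = 1

E = hopfield_energy(g, w, v, h)
E_hopfield = -dot(g, v) - sum(abs2, w' * v) / 2
```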

RestrictedBoltzmannMachines.ais - Method
ais(rbm0, rbm1, v0, βs)

Provided v0 is an equilibrated sample from rbm0, returns F such that mean(exp.(F)) is an unbiased estimator of Z1/Z0, the ratio of partition functions of rbm1 and rbm0.

Tip: Use logmeanexp

logmeanexp(F), using the function logmeanexp provided in this package, tends to give a better approximation of log(Z1) - log(Z0) than mean(F).
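To see why a log-mean-exp is used rather than a naive log(mean(exp.(F))), consider this minimal sketch. The function `stable_logmeanexp` is a hypothetical stand-in written for illustration; the package's own logmeanexp may be implemented differently.

```julia
using Statistics: mean

# Hypothetical stand-in: subtract the maximum before exponentiating,
# so that the largest term becomes exp(0) = 1 and nothing overflows.
function stable_logmeanexp(F)
    m = maximum(F)
    return m + log(mean(exp.(F .- m)))
end

F = [1000.0, 1001.0, 1002.0]
naive = log(mean(exp.(F)))        # exp overflows, so this is Inf
stable = stable_logmeanexp(F)     # finite
```

By Jensen's inequality the log-mean-exp is never smaller than mean(F), which is consistent with mean(F) being a biased (low) estimate of the log-ratio.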

RestrictedBoltzmannMachines.aise - Method
aise(rbm, [βs]; [nbetas], init=rbm.visible, nsamples=1)

AIS estimator of the log-partition function of rbm. It is recommended to fit init to the single-site statistics of rbm (or the data).

Tip: Use large nbetas

For more accurate estimates, use larger nbetas. It is usually better to have large nbetas and small nsamples, rather than large nsamples and small nbetas.

RestrictedBoltzmannMachines.anneal - Method
anneal(rbm0, rbm1; β)

Returns an RBM that interpolates between rbm0 and rbm1. Denoting by E0(v, h) and E1(v, h) the energies assigned by rbm0 and rbm1, respectively, the returned RBM assigns energies given by:

E(v, h) = (1 - β) * E0(v, h) + β * E1(v, h)
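For binary units the energy is linear in the parameters (a, b, w), so one way to obtain the interpolated energy is to interpolate the parameters. The sketch below checks this numerically; the NamedTuple representation and the helpers `energy` and `interpolate` are assumptions made for illustration and say nothing about how the package handles other layer types.

```julia
using LinearAlgebra: dot

# Binary RBM parameters stored as a NamedTuple (illustrative, not the package's RBM type).
energy(p, v, h) = -dot(p.a, v) - dot(p.b, h) - dot(v, p.w * h)

# Linear interpolation of the parameters between p0 (β = 0) and p1 (β = 1).
interpolate(p0, p1, β) = (
    a = (1 - β) .* p0.a .+ β .* p1.a,
    b = (1 - β) .* p0.b .+ β .* p1.b,
    w = (1 - β) .* p0.w .+ β .* p1.w,
)

p0 = (a = [0.0, 0.0], b = [0.0], w = zeros(2, 1))
p1 = (a = [0.5, -1.0], b = [0.3], w = reshape([1.0, -2.0], 2, 1))
β = 0.25
v, h = [1, 1], [1]

E_interp = energy(interpolate(p0, p1, β), v, h)
E_mix = (1 - β) * energy(p0, v, h) + β * energy(p1, v, h)
```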

RestrictedBoltzmannMachines.block_matrix_invert - Method
block_matrix_invert(A, B, C, D)

Inversion of a block matrix, using the formula:

\[\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} = \begin{bmatrix} \left(\mathbf{A} - \mathbf{B} \mathbf{D}^{-1} \mathbf{C}\right)^{-1} & \mathbf{0} \\ \mathbf{0} & \left(\mathbf{D} - \mathbf{C} \mathbf{A}^{-1} \mathbf{B}\right)^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{I} & -\mathbf{B} \mathbf{D}^{-1} \\ -\mathbf{C} \mathbf{A}^{-1} & \mathbf{I} \end{bmatrix}\]

Assumes that A and D are square and invertible.
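The formula can be checked numerically with the standard library alone. This is a small sketch with arbitrarily chosen blocks, not the package implementation; all variable names are illustrative.

```julia
using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]          # 2 × 2, invertible
B = reshape([1.0, 2.0], 2, 1)   # 2 × 1
C = [0.5 1.0]                   # 1 × 2
D = fill(5.0, 1, 1)             # 1 × 1, invertible

SA = A - B * inv(D) * C         # A - B D⁻¹ C
SD = D - C * inv(A) * B         # D - C A⁻¹ B

# First factor: block-diagonal matrix of the inverted Schur complements.
P = [inv(SA) zeros(2, 1); zeros(1, 2) inv(SD)]

# Second factor: identities on the diagonal, -B D⁻¹ and -C A⁻¹ off the diagonal.
I2 = Matrix(1.0I, 2, 2)
I1 = Matrix(1.0I, 1, 1)
negBD = -(B * inv(D))
negCA = -(C * inv(A))
Q = [I2 negBD; negCA I1]

M = [A B; C D]
max_error = maximum(abs.(P * Q - inv(M)))
```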

RestrictedBoltzmannMachines.block_matrix_logdet - Method
block_matrix_logdet(A, B, C, D)

Log-determinant of a block matrix using the determinant lemma.

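\[\det\left( \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix} \right) = \det(\mathbf{A}) \det\left(\mathbf{D} - \mathbf{C} \mathbf{A}^{-1} \mathbf{B}\right) = \det(\mathbf{D}) \det\left(\mathbf{A} - \mathbf{B} \mathbf{D}^{-1} \mathbf{C}\right)\]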

Here we assume that A and D are invertible, and moreover are easy to invert (for example, if they are diagonal). We use this to choose one or the other of the two formulas above.
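Both factorizations can be verified against a direct computation. The following sketch uses only the standard library, with blocks chosen so that all determinants are positive (logdet of a real matrix requires this); it is an illustration, not the package code.

```julia
using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]          # 2 × 2
B = reshape([1.0, 2.0], 2, 1)   # 2 × 1
C = [0.5 1.0]                   # 1 × 2
D = fill(5.0, 1, 1)             # 1 × 1

direct = logdet([A B; C D])

# log det(A) + log det(D - C A⁻¹ B)
via_A = logdet(A) + logdet(D - C * inv(A) * B)

# log det(D) + log det(A - B D⁻¹ C)
via_D = logdet(D) + logdet(A - B * inv(D) * C)
```

Which form is cheaper depends on which of A or D is easier to invert and on the size of the remaining Schur complement.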

RestrictedBoltzmannMachines.categorical_sample - Method
categorical_sample(P)

Given a probability array P of size (q, *), returns an array C of size (*), such that C[i] ∈ 1:q is a random sample from the categorical distribution P[:,i]. You must ensure that P defines a proper probability distribution.
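One common way to draw such samples is to invert the cumulative distribution of each column. The sketch below does this for a q × B matrix; the function `sample_columns` and its `rng` argument are hypothetical, and the package may use a different method.

```julia
using Random

# Hypothetical helper: one categorical draw per column of a q × B probability matrix.
function sample_columns(rng, P)
    q, B = size(P)
    C = Vector{Int}(undef, B)
    for b in 1:B
        u = rand(rng)            # uniform in [0, 1)
        acc = 0.0
        C[b] = q                 # fallback guards against round-off in the running sum
        for k in 1:q
            acc += P[k, b]
            if u < acc           # first class whose cumulative probability exceeds u
                C[b] = k
                break
            end
        end
    end
    return C
end

P = [1.0 0.0 0.2; 0.0 1.0 0.8]   # each column sums to 1
C = sample_columns(MersenneTwister(0), P)
```

If a column of P does not sum to 1, the draws are silently biased, which is why the input must be a proper probability distribution.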

RestrictedBoltzmannMachines.collect_states - Method
collect_states(layer)

Returns an array of all states of layer. Only defined for discrete layers.

Warning

Use only for small layers. For large layers, the exponential number of states will not fit in memory.
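The growth in memory is easy to see in a plain-Julia sketch that enumerates all states of N binary units. The helper `all_binary_states` is hypothetical, and the layout it returns (one state per column) is an assumption of this example rather than the package's output format.

```julia
# Hypothetical helper: all 2^N states of N binary units, one state per column.
function all_binary_states(N)
    # Cartesian product of N copies of 0:1, flattened to a vector of N-tuples.
    states = vec(collect(Iterators.product(ntuple(_ -> 0:1, N)...)))
    return [s[i] for i in 1:N, s in states]   # N × 2^N matrix
end

S = all_binary_states(3)
n_states = size(S, 2)    # doubles with every additional unit
```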

RestrictedBoltzmannMachines.initialize! - Function
initialize!(rbm, [data]; ϵ = 1e-6)

Initializes the RBM and returns it. If provided, matches average visible unit activities from data.

initialize!(layer, [data]; ϵ = 1e-6)

Initializes a layer and returns it. If provided, matches average unit activities from data.

RestrictedBoltzmannMachines.log_likelihood - Method
log_likelihood(rbm, v)

Log-likelihood of v under rbm, with the partition function computed by exhaustive enumeration. For discrete layers, this is exponentially slow for large machines.

RestrictedBoltzmannMachines.log_partition - Method
log_partition(rbm)

Log-partition of rbm, computed by exhaustive enumeration of visible states (except for particular cases such as Gaussian-Gaussian RBM). This is exponentially slow for large machines.

If your RBM has a smaller hidden layer, consider mirroring the layers of the rbm first (see mirror), so that the enumeration runs over the smaller layer.
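For a tiny binary RBM the enumeration can be written out directly. In this sketch one layer is enumerated and the other is summed analytically, and swapping the roles of the two layers (the idea behind mirroring) gives the same value with fewer terms. The helpers `states`, `log_prod_1p_exp` and `logZ_enumerate` are hypothetical and restricted to binary units.

```julia
using LinearAlgebra: dot

# All 0/1 vectors of length N.
states(N) = vec([collect(s) for s in Iterators.product(ntuple(_ -> 0:1, N)...)])

# log of Π_μ (1 + exp(x_μ)): the analytic sum over one binary layer.
log_prod_1p_exp(x) = sum(log1p.(exp.(x)))

# Enumerate the layer with fields `a`; marginalize the layer with fields `b`.
function logZ_enumerate(a, b, w)
    return log(sum(exp(dot(a, v) + log_prod_1p_exp(b .+ w' * v)) for v in states(length(a))))
end

a = [0.1, -0.3, 0.2]                 # visible fields, N = 3
b = [0.4, -0.1]                      # hidden fields, M = 2
w = [0.5 -0.2; 0.1 0.3; -0.4 0.6]    # N × M weights

logZ_visible = logZ_enumerate(a, b, w)    # 2^3 terms
logZ_hidden = logZ_enumerate(b, a, w')    # 2^2 terms, layers swapped
```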

RestrictedBoltzmannMachines.log_pseudolikelihood - Method
log_pseudolikelihood(rbm, v; exact = false)

Log-pseudolikelihood of v. If exact is true, the exact pseudolikelihood is returned, but this is slow if v consists of many samples. Therefore exact is false by default, in which case the result is a stochastic approximation: a random site is selected for each sample, and its conditional probability is calculated. On average, the results with exact = false coincide with the deterministic result, and the estimate becomes more precise as the number of samples increases.
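The difference between the two modes can be sketched for a binary RBM in plain Julia. All helpers here are hypothetical, and normalizing the exact value by the number of sites is an assumption of this example (chosen so that the single-site estimate has the same expectation); the package's convention may differ.

```julia
using LinearAlgebra: dot

# Free energy of a binary RBM after summing out the binary hidden units.
free_energy(a, b, w, v) = -dot(a, v) - sum(log1p.(exp.(b .+ w' * v)))

# log p(v_i | all other sites) for a binary visible unit i.
function log_conditional(a, b, w, v, i)
    v0 = copy(v); v0[i] = 0
    v1 = copy(v); v1[i] = 1
    f0 = -free_energy(a, b, w, v0)
    f1 = -free_energy(a, b, w, v1)
    m = max(f0, f1)
    lognorm = m + log(exp(f0 - m) + exp(f1 - m))   # stable log(exp(f0) + exp(f1))
    return (v[i] == 1 ? f1 : f0) - lognorm
end

# Exact: average the conditional log-probability over all sites.
exact_pl(a, b, w, v) = sum(log_conditional(a, b, w, v, i) for i in eachindex(v)) / length(v)

# Stochastic: evaluate a single, uniformly chosen site.
stoch_pl(a, b, w, v) = log_conditional(a, b, w, v, rand(eachindex(v)))

a = [0.1, -0.3, 0.2]
b = [0.4, -0.1]
w = [0.5 -0.2; 0.1 0.3; -0.4 0.6]
v = [1, 0, 1]

pl_exact = exact_pl(a, b, w, v)
pl_stoch = stoch_pl(a, b, w, v)
```

The exact version needs one pass per site, while the stochastic version needs a single pass per sample, which is where the speed difference comes from.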

RestrictedBoltzmannMachines.log_pseudolikelihood_sites - Method
log_pseudolikelihood_sites(rbm, v, sites)

Log-pseudolikelihood of a site conditioned on the other sites, where sites is an array of site indices (CartesianIndex), one for each sample. Returns an array of log-pseudolikelihood values, one for each sample.

RestrictedBoltzmannMachines.log_pseudolikelihood_stoch - Method
log_pseudolikelihood_stoch(rbm, v)

Log-pseudolikelihood of v. This function computes a stochastic approximation, by doing a trace over random sites for each sample. For a large number of samples, this is on average close to the exact value of the pseudolikelihood.

RestrictedBoltzmannMachines.metropolis! - Method
metropolis!(v, rbm; β = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β. Uses v[:,:,..,:,1] as initial configurations, and writes the Monte-Carlo chains in v[:,:,..,:,2:end].

RestrictedBoltzmannMachines.metropolis - Method
metropolis(rbm, v; β = 1, steps = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β, starting from configuration v. Moves are proposed by normal Gibbs sampling.

RestrictedBoltzmannMachines.raise - Method
raise(rbm::RBM, βs; v, init)

Reverse AIS estimator of the log-partition function of rbm. While aise tends to underestimate the log of the partition function, raise tends to overestimate it. v must be an equilibrated sample from rbm.

Tip: Use logmeanexp

If F = raise(...), then -logmeanexp(-F), using the function logmeanexp provided in this package, tends to give a better approximation of log(Z) than mean(F).

Tip: Sandwiching the log-partition function

If Rf = aise(...), Rr = raise(...) are the AIS and reverse AIS estimators, we have the stochastic bounds logmeanexp(Rf) ≤ log(Z) ≤ -logmeanexp(-Rr).

RestrictedBoltzmannMachines.rescale_activations! - Method
rescale_activations!(layer, λ::AbstractArray)

For continuous layers with scale parameters, re-parameterizes such that unit activations are divided by λ, and returns true. For other layers, does nothing and returns false.

RestrictedBoltzmannMachines.rescale_hidden! - Method
rescale_hidden!(rbm, λ::AbstractArray)

For continuous hidden units with a scale parameter, scales parameters such that hidden unit activations are divided by λ, and returns true. For other hidden units does nothing and returns false. The modified RBM is equivalent to the original one.

RestrictedBoltzmannMachines.sample_h_from_h - Method
sample_h_from_h(rbm, h; steps=1)

Samples a hidden configuration conditional on another hidden configuration h. Ensures type stability by requiring that the returned array is of the same type as h.

RestrictedBoltzmannMachines.sample_v_from_v - Method
sample_v_from_v(rbm, v; steps=1)

Samples a visible configuration conditional on another visible configuration v. Ensures type stability by requiring that the returned array is of the same type as v.

RestrictedBoltzmannMachines.substitution_matrix_exhaustive - Function
substitution_matrix_exhaustive(rbm, v)

Returns a q × N × B tensor of free energies F, where q is the number of possible values of each site, N the sequence length, and B the number of data points:

q, N, B = size(v)

Thus F and v have the same size. The entry F[x,i,b] gives the free energy cost of flipping site i of v[b] from its original value to x, that is:

F[x,i,b] = free_energy(rbm, v_) - free_energy(rbm, v[b])

where v_ is the same as v[b] in all sites but i, where v_ has the value x.

Note that i can be a set of indices.

RestrictedBoltzmannMachines.substitution_matrix_sites - Function
substitution_matrix_sites(rbm, v, sites)

Returns a q × B matrix of free energies F, where q is the number of possible values of each site, and B the number of data points. The entry F[x,b] equals the free energy cost of flipping sites[b] of v[b] to x, that is (schematically):

F[x, b] = free_energy(rbm, v_) - free_energy(rbm, v)

where v = v[b], and v_ is the same as v in all sites except sites[b], where v_ has the value x.

RestrictedBoltzmannMachines.tnmeanvar - Method
tnmeanvar(a)

Mean and variance of the standard normal distribution truncated to the interval (a, +∞). Equivalent to tnmean(a), tnvar(a) but saves some common computations. Warning: tnvar(a) can fail for very large values of a.

RestrictedBoltzmannMachines.∂cgf - Function
∂cgf(layer, inputs = 0; wts = 1)

Unit activation moments, conjugate to layer parameters. These are obtained by differentiating cgfs with respect to the layer parameters. Averages over configurations (weighted by wts).
