Reference
RestrictedBoltzmannMachines.Binary
— TypeBinary(θ)
Layer with binary units, with external fields θ
.
RestrictedBoltzmannMachines.CenteredRBM
— MethodCenteredRBM(visible, hidden, w)
Creates a centered RBM, with offsets initialized to zero.
RestrictedBoltzmannMachines.CenteredRBM
— MethodCenteredRBM(rbm, λv, λh)
Creates a centered RBM, with offsets λv
(visible) and λh
(hidden). See http://jmlr.org/papers/v17/14-237.html for details. The resulting model is not equivalent to the original rbm
, unless λv = 0
and λh = 0
.
RestrictedBoltzmannMachines.Gaussian
— TypeGaussian(θ, γ)
Gaussian layer, with location parameters θ
and scale parameters γ
.
RestrictedBoltzmannMachines.Potts
— TypePotts(θ)
Layer with Potts units, with external fields θ
. Encodes categorical variables as one-hot vectors. The number of classes is the size of the first dimension.
RestrictedBoltzmannMachines.RBM
— TypeRBM{V,H,W}
RBM, with visible layer of type V
, hidden layer of type H
, and weights of type W
.
RestrictedBoltzmannMachines.RBM
— MethodRBM(centered_rbm::CenteredRBM)
Returns an (uncentered) RBM
which neglects the offsets of centered_rbm
. The resulting model is not equivalent to the original centered_rbm
. To construct an equivalent model, use the function uncenter(centered_rbm)
instead (see uncenter
). Shares parameters with centered_rbm
.
RestrictedBoltzmannMachines.ReLU
— TypeReLU(θ, γ)
Layer with ReLU units, with location parameters θ
and scale parameters γ
.
RestrictedBoltzmannMachines.Spin
— TypeSpin(θ)
Layer with spin units, with external fields θ
. The energy of a layer with units $s_i$ is given by:
\[E = -\sum_i \theta_i s_i\]
where each spin $s_i$ takes values $\pm 1$.
RestrictedBoltzmannMachines.nsReLU
— TypensReLU
A variant of xReLU
units without scale parameter γ (which is fixed at 1). This is done to remove the gauge invariance between the weights and the hidden units scale.
RestrictedBoltzmannMachines.BinaryRBM
— MethodBinaryRBM(a, b, w)
BinaryRBM(N, M)
Construct an RBM with binary visible and hidden units, which has an energy function:
\[E(v, h) = -a'v - b'h - v'wh\]
Equivalent to RBM(Binary(a), Binary(b), w)
.
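As a minimal sketch of the energy function above, the following stand-alone snippet evaluates E(v, h) for one configuration using plain arrays. The helper name `binary_rbm_energy` and the numbers are hypothetical and are not part of the package.

```julia
using LinearAlgebra: dot

# Hypothetical helper (not the package's implementation):
# energy of a single configuration (v, h) of a binary RBM.
binary_rbm_energy(a, b, w, v, h) = -dot(a, v) - dot(b, h) - dot(v, w * h)

a = [0.5, -1.0, 0.0]                 # visible fields, N = 3
b = [0.2, 0.3]                       # hidden fields, M = 2
w = [1.0 0.0; 0.0 -2.0; 0.5 0.5]     # N x M weights
v = [1, 0, 1]                        # visible configuration
h = [0, 1]                           # hidden configuration

E = binary_rbm_energy(a, b, w, v, h)
```

Here a'v = 0.5, b'h = 0.3 and v'wh = 0.5, so E = -1.3.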
RestrictedBoltzmannMachines.CenteredBinaryRBM
— MethodCenteredBinaryRBM(a, b, w, λv = 0, λh = 0)
Construct a centered binary RBM. The energy function is given by:
\[E(v,h) = -a'v - b'h - (v - λv)'w(h - λh)\]
RestrictedBoltzmannMachines.HopfieldRBM
— MethodHopfieldRBM(g, θ, γ, w)
HopfieldRBM(g, w)
Construct an RBM with spin visible units and Gaussian hidden units. If not given, θ = 0
and γ = 1
by default.
\[E(v, h) = -g'v - θ'h + \sum_\mu \frac{γ_\mu}{2} h_\mu^2 - v'wh\]
RestrictedBoltzmannMachines.ais
— Methodais(rbm0, rbm1, v0, βs)
Provided v0
is an equilibrated sample from rbm0
, returns F
such that mean(exp.(F))
is an unbiased estimator of Z1/Z0
, the ratio of partition functions of rbm1
and rbm0
.
Tip: logmeanexp(F), using the function logmeanexp provided in this package, tends to give a better approximation of log(Z1) - log(Z0) than mean(F).
RestrictedBoltzmannMachines.aise
— Methodaise(rbm, [βs]; [nbetas], init=rbm.visible, nsamples=1)
AIS estimator of the log-partition function of rbm
. It is recommended to fit init
to the single-site statistics of rbm
(or the data).
Tip: For more accurate estimates, use larger nbetas. It is usually better to have large nbetas and small nsamples, rather than large nsamples and small nbetas.
RestrictedBoltzmannMachines.anneal
— Methodanneal(rbm0, rbm1; β)
Returns an RBM that interpolates between rbm0
and rbm1
. Denoting by E0(v, h)
and E1(v, h)
the energies assigned by rbm0
and rbm1
, respectively, the returned RBM assigns energies given by:
E(v,h) = (1 - β) * E0(v, h) + β * E1(v, h)
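Because the energy of an RBM is linear in its fields and weights, one way to obtain such an interpolated energy is to interpolate every parameter linearly. The sketch below checks this for two hypothetical binary RBMs of the same shape, stored as plain `(a, b, w)` tuples. It is an illustration under that assumption, not a description of how `anneal` is implemented.

```julia
using LinearAlgebra: dot

# Hypothetical stand-alone energy of a binary RBM with parameters (a, b, w)
binary_energy(a, b, w, v, h) = -dot(a, v) - dot(b, h) - dot(v, w * h)

# Two made-up parameter sets with 2 visible units and 1 hidden unit
p0 = ([0.5, -1.0], [0.2], reshape([0.0, 0.0], 2, 1))
p1 = ([1.5, 0.3], [-0.7], reshape([2.0, -1.0], 2, 1))

β = 0.3
# Linear interpolation of each parameter array
pβ = map((x0, x1) -> (1 - β) .* x0 .+ β .* x1, p0, p1)

v, h = [1, 0], [1]
E0 = binary_energy(p0..., v, h)
E1 = binary_energy(p1..., v, h)
Eβ = binary_energy(pβ..., v, h)
```

For this configuration, Eβ agrees with (1 - β) * E0 + β * E1.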
RestrictedBoltzmannMachines.batch_size
— Methodbatch_size(rbm, v, h)
Returns the batch size if energy(rbm, v, h)
were computed.
RestrictedBoltzmannMachines.batch_size
— Methodbatch_size(layer, x)
Batch sizes of x
, with respect to layer
.
RestrictedBoltzmannMachines.batchcov
— Methodbatchcov(layer, x; wts = nothing, [mean])
Covariance of x over batch dimensions, weighted by wts.
RestrictedBoltzmannMachines.batchdims
— Methodbatchdims(layer, x)
Indices of batch dimensions in x
, with respect to layer
.
RestrictedBoltzmannMachines.batchmean
— Methodbatchmean(layer, x; wts = nothing)
Mean of x over batch dimensions, weighted by wts.
RestrictedBoltzmannMachines.batchstd
— Methodbatchstd(layer, x; wts = nothing, [mean])
Standard deviation of x over batch dimensions, weighted by wts.
RestrictedBoltzmannMachines.batchvar
— Methodbatchvar(layer, x; wts = nothing, [mean])
Variance of x over batch dimensions, weighted by wts.
RestrictedBoltzmannMachines.block_matrix_invert
— Methodblock_matrix_invert(A, B, C, D)
Inversion of a block matrix, using the formula:
\[\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} = \begin{bmatrix} \left(\mathbf{A} - \mathbf{B} \mathbf{D}^{-1} \mathbf{C}\right)^{-1} & \mathbf{0} \\ \mathbf{0} & \left(\mathbf{D} - \mathbf{C} \mathbf{A}^{-1} \mathbf{B}\right)^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{I} & -\mathbf{B} \mathbf{D}^{-1} \\ -\mathbf{C} \mathbf{A}^{-1} & \mathbf{I} \end{bmatrix}\]
Assumes that A
and D
are square and invertible.
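The formula can be checked numerically with a few lines of stand-alone Julia. The helper `block_inverse_sketch` and the test matrices below are hypothetical; the snippet only illustrates the identity and is not the package's implementation.

```julia
using LinearAlgebra

# Hypothetical helper: assemble the inverse from the two Schur complements.
function block_inverse_sketch(A, B, C, D)
    Ainv, Dinv = inv(A), inv(D)
    SA = inv(A - B * Dinv * C)          # (A - B D^-1 C)^-1
    SD = inv(D - C * Ainv * B)          # (D - C A^-1 B)^-1
    n, m = size(A, 1), size(D, 1)
    eye_n = Matrix{Float64}(I, n, n)
    eye_m = Matrix{Float64}(I, m, m)
    BDinv = B * Dinv
    CAinv = C * Ainv
    left = [SA zeros(n, m); zeros(m, n) SD]
    right = [eye_n -BDinv; -CAinv eye_m]
    return left * right
end

A = [6.0 1.0; 1.0 5.0]
B = [1.0 0.0 2.0; 0.0 1.0 0.0]
C = [0.5 0.0; 0.0 0.5; 1.0 1.0]
D = [5.0 0.0 0.0; 0.0 6.0 0.0; 0.0 0.0 7.0]

Minv = block_inverse_sketch(A, B, C, D)
```

The result can be compared against `inv([A B; C D])`, which inverts the assembled matrix directly.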
RestrictedBoltzmannMachines.block_matrix_logdet
— Methodblock_matrix_logdet(A, B, C, D)
Log-determinant of a block matrix using the determinant lemma.
\[\det\left( \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix} \right) = \det(A) \det(D - CA^{-1}B) = \det(D) \det(A - BD^{-1}C)\]
Here we assume that A and D are invertible, and moreover are easy to invert (for example, if they are diagonal). We use this to choose one or the other of the two formulas above.
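A small numerical check of both forms of the identity is sketched below. The function names and matrices are made up for illustration, and D is taken to be diagonal as an example of a block that is cheap to invert.

```julia
using LinearAlgebra

# Hypothetical helpers, one per form of the identity
logdet_via_A(A, B, C, D) = logdet(A) + logdet(D - C * inv(A) * B)
logdet_via_D(A, B, C, D) = logdet(D) + logdet(A - B * inv(D) * C)

A = [6.0 1.0; 1.0 5.0]
B = [1.0 0.0 2.0; 0.0 1.0 0.0]
C = [0.5 0.0; 0.0 0.5; 1.0 1.0]
D = Diagonal([5.0, 6.0, 7.0])          # diagonal block: inv(D) is cheap

M = [A B; C Matrix(D)]                 # full block matrix, for reference
reference = logdet(M)
```

Both helpers agree with `reference`. In practice one would pick the form whose inverted block is the cheaper one.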
RestrictedBoltzmannMachines.broadlike
— Methodbroadlike(A, B...)
Broadcasts A
into the size of A .+ B .+ ...
(without actually doing a sum).
RestrictedBoltzmannMachines.categorical_rand
— Methodcategorical_rand(ps)
Randomly draw i
with probability ps[i]
. You must ensure that ps
defines a proper probability distribution.
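One common way to draw such an index is inverse-CDF sampling: draw a uniform number and return the first index at which the running sum of probabilities exceeds it. The sketch below is a hypothetical stand-alone version; the package's own implementation may differ.

```julia
using Random

# Hypothetical inverse-CDF sampler (not the package's code)
function categorical_rand_sketch(rng::AbstractRNG, ps::AbstractVector{<:Real})
    u = rand(rng)
    c = zero(u)
    for (i, p) in enumerate(ps)
        c += p
        u < c && return i
    end
    return length(ps)   # guard against round-off when sum(ps) is slightly below 1
end

rng = MersenneTwister(0)
ps = [0.1, 0.7, 0.2]
draws = [categorical_rand_sketch(rng, ps) for _ in 1:10_000]
freqs = [count(==(i), draws) / length(draws) for i in 1:3]
```

With 10,000 draws the empirical frequencies `freqs` should lie close to `ps`.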
RestrictedBoltzmannMachines.categorical_sample
— Methodcategorical_sample(P)
Given a probability array P
of size (q, *)
, returns an array C
of size (*)
, such that C[i] ∈ 1:q
is a random sample from the categorical distribution P[:,i]
. You must ensure that P
defines a proper probability distribution.
RestrictedBoltzmannMachines.categorical_sample_from_logits
— Methodcategorical_sample_from_logits(logits)
Given a logits array logits
of size (q, *)
(where q
is the number of classes), returns an array X
of size (*)
, such that X[i]
is a categorical random sample from the distribution with logits logits[:,i]
.
RestrictedBoltzmannMachines.categorical_sample_from_logits_gumbel
— Methodcategorical_sample_from_logits_gumbel(logits)
Like categorical_sample_from_logits, but using the Gumbel trick.
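The Gumbel trick samples from a categorical distribution by adding independent standard Gumbel noise to each logit and taking the argmax. The following is a minimal sketch for a single vector of logits; the names `gumbel` and `gumbel_argmax_sample` are hypothetical and the package's batched implementation may look different.

```julia
using Random

# Standard Gumbel variate from a uniform draw
gumbel(rng) = -log(-log(rand(rng)))

# Hypothetical helper: sample a class index from unnormalized log-probabilities
function gumbel_argmax_sample(rng::AbstractRNG, logits::AbstractVector{<:Real})
    noisy = logits .+ [gumbel(rng) for _ in logits]
    return argmax(noisy)
end

rng = MersenneTwister(1)
probs = [0.1, 0.7, 0.2]
draws = [gumbel_argmax_sample(rng, log.(probs)) for _ in 1:10_000]
freqs = [count(==(k), draws) / length(draws) for k in 1:3]
```

The logits need not be normalized: adding a constant to all of them leaves the argmax, and hence the sampled distribution, unchanged.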
RestrictedBoltzmannMachines.center!
— Methodcenter!(centered_rbm, offset_v = 0, offset_h = 0)
Transforms the offsets of centered_rbm
. The transformed model is equivalent to the original one (energies differ by a constant).
RestrictedBoltzmannMachines.center
— Methodcenter(rbm::RBM, offset_v = 0, offset_h = 0)
Constructs a CenteredRBM
equivalent to the given rbm
. The energies assigned by the two models differ by a constant amount,
\[E(v,h) - E_c(v,h) = \sum_{i\mu}w_{i\mu}\lambda_i\lambda_\mu\]
where $E(v,h)$ is the energy assigned by the original rbm
, and $E_c(v,h)$ is the energy assigned by the returned CenteredRBM
.
This is the inverse operation of uncenter
.
To construct a CenteredRBM
that simply includes these offsets, call CenteredRBM(rbm, offset_v, offset_h)
instead.
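The constant energy difference can be verified numerically for the binary case. The sketch below assumes the centered energy E_c(v,h) = -a'v - b'h - (v - λv)'w(h - λh), and assumes that equivalence is obtained by shifting the fields to a + w λh and b + w' λv. That shift follows from expanding the centered energy; it is not taken from the package's source. All names are hypothetical.

```julia
using LinearAlgebra: dot

plain_energy(a, b, w, v, h) = -dot(a, v) - dot(b, h) - dot(v, w * h)
centered_energy(a, b, w, λv, λh, v, h) =
    -dot(a, v) - dot(b, h) - dot(v .- λv, w * (h .- λh))

a = [0.5, -1.0, 0.2]
b = [0.3, -0.4]
w = [1.0 -0.5; 0.25 2.0; -1.5 0.75]
λv = [0.1, 0.6, 0.3]
λh = [0.2, 0.9]

# Assumed field shift that makes the centered model equivalent to the plain one
ac = a .+ w * λh
bc = b .+ w' * λv

shift = dot(λv, w * λh)    # sum over i, μ of w[i,μ] * λv[i] * λh[μ]

vs = ([0, 0, 0], [1, 0, 1], [1, 1, 1])
hs = ([0, 0], [1, 0], [1, 1])
diffs = [plain_energy(a, b, w, v, h) - centered_energy(ac, bc, w, λv, λh, v, h) for v in vs, h in hs]
```

Every entry of `diffs` equals `shift`, independently of the configuration (v, h).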
RestrictedBoltzmannMachines.cgf
— Functioncgf(layer, inputs = 0)
Cumulant generating function of layer, reduced over layer dimensions.
RestrictedBoltzmannMachines.cold_metropolis
— Methodcold_metropolis(rbm, v; steps = 1)
Samples the rbm
at zero temperature, starting from configuration v
.
RestrictedBoltzmannMachines.collect_states
— Methodcollect_states(layer)
Returns an array of all states of layer
. Only defined for discrete layers.
RestrictedBoltzmannMachines.colors
— Methodcolors(layer)
Number of possible states of units in discrete layers.
RestrictedBoltzmannMachines.delta_energy
— Methoddelta_energy(rbm)
Compute the (constant) energy shift with respect to the equivalent normal RBM.
RestrictedBoltzmannMachines.energies
— Methodenergies(layer, x)
Energies of units in layer (not reduced over layer dimensions).
RestrictedBoltzmannMachines.energy
— Methodenergy(rbm, v, h)
Energy of the rbm in the configuration (v,h)
.
RestrictedBoltzmannMachines.energy
— Methodenergy(layer, x)
Layer energy, reduced over layer dimensions.
RestrictedBoltzmannMachines.flatten
— Methodflatten(layer, x)
Returns a vectorized version of x
.
RestrictedBoltzmannMachines.free_energy
— Methodfree_energy(rbm, v)
Free energy of visible configuration (after marginalizing hidden configurations).
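For a binary RBM the marginalization over hidden units can be done in closed form, F(v) = -a'v - Σ_μ log(1 + exp(b_μ + (w'v)_μ)). The stand-alone sketch below compares this expression with a brute-force sum over all hidden configurations. The helper names are hypothetical, and only the binary-binary case is covered.

```julia
using LinearAlgebra: dot

# Numerically stable log(1 + exp(x))
softplus(x) = max(x, zero(x)) + log1p(exp(-abs(x)))

# Closed-form free energy of a visible configuration (binary hidden units)
function binary_free_energy(a, b, w, v)
    inputs = b .+ w' * v
    return -dot(a, v) - sum(softplus, inputs)
end

# Brute force: F(v) = -log(sum over h of exp(-E(v, h)))
function free_energy_bruteforce(a, b, w, v)
    Z = 0.0
    for bits in Iterators.product(ntuple(_ -> 0:1, length(b))...)
        h = collect(bits)
        E = -dot(a, v) - dot(b, h) - dot(v, w * h)
        Z += exp(-E)
    end
    return -log(Z)
end

a = [0.5, -1.0, 0.2]
b = [0.3, -0.4]
w = [1.0 -0.5; 0.25 2.0; -1.5 0.75]
v = [1, 0, 1]

F_closed = binary_free_energy(a, b, w, v)
F_brute = free_energy_bruteforce(a, b, w, v)
```

The brute-force version enumerates 2^M hidden states, so it is only useful as a check on small models.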
RestrictedBoltzmannMachines.generate_sequences
— Functiongenerate_sequences(n, A = 0:1)
Returns an iterator over all sequences of length n out of the alphabet A.
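An iterator of this kind can be sketched with `Iterators.product` from Base. The function `all_sequences` below is hypothetical and yields tuples; the element type returned by the package's function may be different.

```julia
# Hypothetical helper: all length-n sequences over alphabet A, as tuples
all_sequences(n, A = 0:1) = Iterators.product(ntuple(_ -> A, n)...)

binary3 = collect(all_sequences(3))          # 2 x 2 x 2 array of 3-tuples
letters2 = collect(all_sequences(2, 'a':'c'))  # 3 x 3 array of 2-tuples
```

The number of sequences is `length(A)^n`, so enumeration is only practical for short sequences or small alphabets.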
RestrictedBoltzmannMachines.gumbel_to_potts
— Methodgumbel_to_potts(rbm)
Converts PottsGumbel layers to Potts layers.
RestrictedBoltzmannMachines.initialize!
— Functioninitialize!(rbm, [data]; ϵ = 1e-6)
Initializes the RBM and returns it. If provided, matches average visible unit activities from data
.
initialize!(layer, [data]; ϵ = 1e-6)
Initializes a layer and returns it. If provided, matches average unit activities from data
.
RestrictedBoltzmannMachines.initialize_w!
— Methodinitialize_w!(rbm, data; λ = 0.1)
Initializes rbm.w
such that typical inputs to hidden units are λ.
RestrictedBoltzmannMachines.inputs_h_from_v
— Methodinputs_h_from_v(rbm, v)
Interaction inputs from visible to hidden layer.
RestrictedBoltzmannMachines.inputs_v_from_h
— Methodinputs_v_from_h(rbm, h)
Interaction inputs from hidden to visible layer.
RestrictedBoltzmannMachines.interaction_energy
— Methodinteraction_energy(rbm, v, h)
Weight mediated interaction energy.
RestrictedBoltzmannMachines.log_likelihood
— Methodlog_likelihood(rbm, v)
Log-likelihood of v under rbm, with the partition function computed by extensive enumeration. For discrete layers, this is exponentially slow for large machines.
RestrictedBoltzmannMachines.log_partition
— Methodlog_partition(rbm)
Log-partition of rbm
, computed by extensive enumeration of visible states (except for particular cases such as Gaussian-Gaussian RBM). This is exponentially slow for large machines.
If your RBM has a smaller hidden layer, consider mirroring the layers of the rbm first (see mirror).
RestrictedBoltzmannMachines.log_partition_zero_weight
— Methodlog_partition_zero_weight(rbm)
Log-partition function of a zero-weight version of rbm
.
RestrictedBoltzmannMachines.log_pseudolikelihood
— Methodlog_pseudolikelihood(rbm, v; exact = false)
Log-pseudolikelihood of v. If exact is true, the exact pseudolikelihood is returned, but this is slow if v consists of many samples. Therefore exact is false by default, in which case the result is a stochastic approximation: a random site is selected for each sample, and its conditional probability is calculated. On average, the results with exact = false coincide with the deterministic result, and the estimate becomes more precise as the number of samples increases.
RestrictedBoltzmannMachines.log_pseudolikelihood_exact
— Methodlog_pseudolikelihood_exact(rbm, v)
Log-pseudolikelihood of v. This function computes the exact pseudolikelihood, doing traces over all sites. Note that this can be slow for a large number of samples.
RestrictedBoltzmannMachines.log_pseudolikelihood_sites
— Methodlog_pseudolikelihood_sites(rbm, v, sites)
Log-pseudolikelihood of a site conditioned on the other sites, where sites
is an array of site indices (CartesianIndex), one for each sample. Returns an array of log-pseudolikelihood values, for each sample.
RestrictedBoltzmannMachines.log_pseudolikelihood_stoch
— Methodlog_pseudolikelihood_stoch(rbm, v)
Log-pseudolikelihood of v. This function computes a stochastic approximation, by doing a trace over random sites for each sample. For a large number of samples, this is on average close to the exact value of the pseudolikelihood.
RestrictedBoltzmannMachines.logmeanexp
— Methodlogmeanexp(A; dims=:)
Computes log.(mean(exp.(A); dims))
, in a numerically stable way.
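The usual way to make this computation stable is to subtract the maximum before exponentiating and add it back after taking the logarithm. The sketch below shows the idea for the `dims = :` case only; `logmeanexp_sketch` is a hypothetical name and not the package's implementation.

```julia
using Statistics: mean

# Hypothetical helper: log(mean(exp.(A))) without overflow
function logmeanexp_sketch(A)
    m = maximum(A)
    return m + log(mean(exp.(A .- m)))
end

small = [0.1, -0.3, 1.2]
large = [1000.0, 1000.0]

stable_small = logmeanexp_sketch(small)
stable_large = logmeanexp_sketch(large)
naive_large = log(mean(exp.(large)))     # exp(1000.0) overflows to Inf
```

For moderate inputs the result matches the naive formula, while for large inputs the naive formula overflows and the shifted version does not.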
RestrictedBoltzmannMachines.logstdexp
— Methodlogstdexp(A; dims=:)
Computes log.(std(exp.(A); dims))
, in a numerically stable way.
RestrictedBoltzmannMachines.logvarexp
— Methodlogvarexp(A; dims=:)
Computes log.(var(exp.(A); dims))
, in a numerically stable way.
RestrictedBoltzmannMachines.mean_h_from_v
— Methodmean_h_from_v(rbm, v)
Mean unit activation values, conditioned on the other layer, <h | v>.
RestrictedBoltzmannMachines.mean_v_from_h
— Methodmean_v_from_h(rbm, h)
Mean unit activation values, conditioned on the other layer, <v | h>.
RestrictedBoltzmannMachines.metropolis!
— Methodmetropolis!(v, rbm; β = 1)
Metropolis-Hastings sampling from rbm
at inverse temperature β. Uses v[:,:,..,:,1]
as initial configurations, and writes the Monte-Carlo chains in v[:,:,..,:,2:end]
.
RestrictedBoltzmannMachines.metropolis
— Methodmetropolis(rbm, v; β = 1, steps = 1)
Metropolis-Hastings sampling from rbm
at inverse temperature β
, starting from configuration v
. Moves are proposed by normal Gibbs sampling.
RestrictedBoltzmannMachines.mirror
— Methodmirror(rbm)
Returns a new RBM with visible and hidden layers flipped.
RestrictedBoltzmannMachines.mode_h_from_v
— Methodmode_h_from_v(rbm, v)
Mode unit activations, conditioned on the other layer.
RestrictedBoltzmannMachines.mode_v_from_h
— Methodmode_v_from_h(rbm, h)
Mode unit activations, conditioned on the other layer.
RestrictedBoltzmannMachines.moving_average
— Methodmoving_average(A, m)
Moving average of A
with window size m
.
RestrictedBoltzmannMachines.onehot_decode
— Methodonehot_decode(X)
Given a one-hot encoded array X of N + 1 dimensions, returns the equivalent categorical array of N dimensions.
RestrictedBoltzmannMachines.onehot_encode
— Functiononehot_encode(A, code)
Given an array A
of N
dimensions, returns a one-hot encoded BitArray
of N + 1
dimensions where single entries of the first dimension are one.
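The round trip between categorical and one-hot arrays can be sketched as follows. The snippet assumes classes are labelled by the integers 1:q, which may not match the meaning of the `code` argument above, and both function names are hypothetical.

```julia
# Hypothetical encoder: classes are assumed to be the integers 1:q
function onehot_encode_sketch(A::AbstractArray{<:Integer}, q::Integer)
    X = falses(q, size(A)...)            # BitArray with one extra leading dimension
    for I in CartesianIndices(A)
        X[A[I], I] = true
    end
    return X
end

# Hypothetical decoder: position of the single `true` along the first dimension
function onehot_decode_sketch(X::AbstractArray{Bool})
    trailing = size(X)[2:end]
    A = zeros(Int, trailing)
    for I in CartesianIndices(trailing)
        A[I] = findfirst(view(X, :, I))
    end
    return A
end

A = [1 3; 2 2]
X = onehot_encode_sketch(A, 3)
A_back = onehot_decode_sketch(X)
```

Here `X` has size (3, 2, 2), with exactly one `true` entry along the first dimension for every site.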
RestrictedBoltzmannMachines.pcd!
— Methodpcd!(rbm, data)
Trains the RBM on data using persistent contrastive divergence (PCD).
RestrictedBoltzmannMachines.potts_to_gumbel
— Methodpotts_to_gumbel(rbm)
Converts Potts layers to PottsGumbel layers.
RestrictedBoltzmannMachines.raise
— Methodraise(rbm::RBM, βs; v, init)
Reverse AIS estimator of the log-partition function of rbm. While aise tends to underestimate the log of the partition function, raise tends to overestimate it. v must be an equilibrated sample from rbm.
Tip: If F = raise(...), then -logmeanexp(-F), using the function logmeanexp provided in this package, tends to give a better approximation of log(Z) than mean(F).
Tip (sandwiching the log-partition function): If Rf = aise(...) and Rr = raise(...) are the AIS and reverse AIS estimators, we have the stochastic bounds logmeanexp(Rf) ≤ log(Z) ≤ -logmeanexp(-Rr).
RestrictedBoltzmannMachines.randgumbel
— Methodrandgumbel(T = Float64)
Generates a random Gumbel variate.
RestrictedBoltzmannMachines.randnt
— Methodrandnt([rng], a)
Random standard normal lower truncated at a
(that is, Z ≥ a).
RestrictedBoltzmannMachines.randnt_half
— Methodrandnt_half([rng], μ, σ)
Samples the normal distribution with mean μ
and standard deviation σ
truncated to positive values.
RestrictedBoltzmannMachines.reconstruction_error
— Methodreconstruction_error(rbm, v; steps = 1)
Stochastic reconstruction error of v
.
RestrictedBoltzmannMachines.rescale_activations!
— Methodrescale_activations!(layer, λ::AbstractArray)
For continuous layers with scale parameters, re-parameterizes such that unit activations are divided by λ
, and returns true
. For other layers, does nothing and returns false
.
rescale_hidden!(rbm, λ::AbstractArray)
For continuous hidden units with a scale parameter, scales parameters such that hidden unit activations are divided by λ
, and returns true
. For other hidden units does nothing and returns false
. The modified RBM is equivalent to the original one.
rescale_hidden!(rbm::StandardizedRBM, λ::AbstractArray)
Rescale hidden unit activities by λ
, which should be an array of the same size as the hidden units. This assumes the hidden units have a scale parameter, otherwise it does nothing and returns false
.
RestrictedBoltzmannMachines.rescale_weights!
— Methodrescale_weights!(rbm, λ::AbstractArray)
For continuous hidden units with a scale parameter, scales parameters such that the weights attached to each hidden unit have norm 1.
RestrictedBoltzmannMachines.rescale_weights!
— Methodrescale_weights!(rbm::StandardizedRBM)
Rescale weights so that the unstandardized weights have norm 1, by re-scaling hidden units. This assumes the hidden units have a scale parameter, otherwise it does nothing. Note that the standardized weights are invariant under such rescaling of hidden unit activities, and therefore cannot be constrained to have unit norm. So the only sensible choice is to rescale the unstandardized weights to have unit norm, as we do here.
RestrictedBoltzmannMachines.reshape_maybe
— Methodreshape_maybe(x, shape)
Like reshape(x, shape)
, except that zero-dimensional outputs are returned as scalars.
RestrictedBoltzmannMachines.sample_h_from_h
— Methodsample_h_from_h(rbm, h; steps=1)
Samples a hidden configuration conditional on another hidden configuration h
. Ensures type stability by requiring that the returned array is of the same type as h
.
RestrictedBoltzmannMachines.sample_h_from_v
— Methodsample_h_from_v(rbm, v)
Samples a hidden configuration conditional on the visible configuration v
.
RestrictedBoltzmannMachines.sample_v_from_h
— Methodsample_v_from_h(rbm, h)
Samples a visible configuration conditional on the hidden configuration h
.
RestrictedBoltzmannMachines.sample_v_from_v
— Methodsample_v_from_v(rbm, v; steps=1)
Samples a visible configuration conditional on another visible configuration v
. Ensures type stability by requiring that the returned array is of the same type as v
.
RestrictedBoltzmannMachines.sitedims
— Methodsitedims(layer)
Number of dimensions of layer, with special handling of Potts layer, for which the first dimension doesn't count as a site dimension.
RestrictedBoltzmannMachines.sitesize
— Methodsitesize(layer)
Size of layer, with special handling of Potts layer, for which the first dimension doesn't count as a site dimension.
RestrictedBoltzmannMachines.sqrt1half
— Methodsqrt1half(x)
Accurate computation of sqrt(1 + (x/2)^2) + |x|/2.
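One difficulty with evaluating this expression directly is that (x/2)^2 overflows for very large |x|. A possible workaround, shown below purely as an illustration, is to use `hypot` from Base. Whether the package takes this route is an assumption, and both function names are hypothetical.

```julia
# Hypothetical overflow-safe version using hypot(1, x/2) = sqrt(1 + (x/2)^2)
sqrt1half_sketch(x::Real) = hypot(one(x), x / 2) + abs(x) / 2

# Direct transcription of the formula, for comparison
sqrt1half_naive(x::Real) = sqrt(1 + (x / 2)^2) + abs(x) / 2

at_zero = sqrt1half_sketch(0.0)
at_huge = sqrt1half_sketch(1e200)
naive_huge = sqrt1half_naive(1e200)    # (5e199)^2 overflows to Inf
```

Both versions agree for moderate arguments; only the `hypot` version stays finite at 1e200.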
RestrictedBoltzmannMachines.substitution_matrix_exhaustive
— Functionsubstitution_matrix_exhaustive(rbm, v)
Returns a q x N x B tensor of free energies F, where q is the number of possible values of each site, B the number of data points, and N the sequence length:
q, N, B = size(v)
Thus F and v have the same size. The entry F[x,i,b] gives the free energy cost of flipping site i of v[b] from its original value to x, that is:
F[x,i,b] = free_energy(rbm, v_) - free_energy(rbm, v[b])
where v_ is the same as v[b] in all sites but i, where v_ has the value x.
Note that i can be a set of indices.
RestrictedBoltzmannMachines.substitution_matrix_sites
— Functionsubstitution_matrix_sites(rbm, v, sites)
Returns a q x B matrix of free energies F, where q is the number of possible values of each site, and B the number of data points. The entry F[x,b] equals the free energy cost of flipping sites[b] of v[b] to x, that is (schematically):
F[x, b] = free_energy(rbm, v_) - free_energy(rbm, v)
where v = v[b], and v_ is the same as v in all sites except sites[b], where v_ has the value x.
RestrictedBoltzmannMachines.tnmean
— Methodtnmean(a)
Mean of the standard normal distribution, truncated to the interval (a, +∞).
RestrictedBoltzmannMachines.tnmeanvar
— Methodtnmeanvar(a)
Mean and variance of the standard normal distribution truncated to the interval (a, +∞). Equivalent to tnmean(a), tnvar(a) but saves some common computations. WARNING: tnvar(a) can fail for very large values of a.
RestrictedBoltzmannMachines.tnvar
— Methodtnvar(a)
Variance of the standard normal distribution, truncated to the interval (a, +∞). WARNING: Fails for very large values of a.
RestrictedBoltzmannMachines.total_mean_from_inputs
— Functiontotal_mean_from_inputs(layer, inputs; wts = nothing)
Total mean of unit activations from inputs.
RestrictedBoltzmannMachines.total_mean_h_from_v
— Methodtotal_mean_h_from_v(rbm, v; wts = nothing)
Total mean of hidden unit activations from visible activities.
RestrictedBoltzmannMachines.total_mean_v_from_h
— Methodtotal_mean_v_from_h(rbm, h; wts = nothing)
Total mean of visible unit activations from given hidden activities.
RestrictedBoltzmannMachines.total_meanvar_from_inputs
— Functiontotal_meanvar_from_inputs(layer, inputs; wts = nothing)
Total mean and total variance of unit activations from inputs.
RestrictedBoltzmannMachines.total_meanvar_h_from_v
— Methodtotal_meanvar_h_from_v(rbm, v; wts = nothing)
Total mean and total variance of hidden unit activations from visible activities.
RestrictedBoltzmannMachines.total_meanvar_v_from_h
— Methodtotal_meanvar_v_from_h(rbm, h; wts = nothing)
Total mean and total variance of visible unit activations from hidden activities.
RestrictedBoltzmannMachines.total_var_from_inputs
— Functiontotal_var_from_inputs(layer, inputs; wts = nothing)
Total variance of unit activations from inputs.
RestrictedBoltzmannMachines.total_var_h_from_v
— Methodtotal_var_h_from_v(rbm, v; wts = nothing)
Total variance of hidden unit activations from given visible activities.
RestrictedBoltzmannMachines.total_var_v_from_h
— Methodtotal_var_v_from_h(rbm, h; wts = nothing)
Total variance of unit activations from given hidden activities.
RestrictedBoltzmannMachines.uncenter
— Methoduncenter(centered_rbm::CenteredRBM)
Constructs an RBM
equivalent to the given CenteredRBM
. The energies assigned by the two models differ by a constant amount,
\[E(v,h) - E_c(v,h) = \sum_{i\mu}w_{i\mu}\lambda_i\lambda_\mu\]
where $E_c(v,h)$ is the energy assigned by centered_rbm
and $E(v,h)$ is the energy assigned by the RBM
constructed by this method.
This is the inverse operation of center
.
To construct an RBM
that simply neglects the offsets, call RBM(centered_rbm)
instead.
RestrictedBoltzmannMachines.var_h_from_v
— Methodvar_h_from_v(rbm, v)
Variance of unit activation values, conditioned on the other layer, var(h | v).
RestrictedBoltzmannMachines.var_v_from_h
— Methodvar_v_from_h(rbm, h)
Variance of unit activation values, conditioned on the other layer, var(v | h).
RestrictedBoltzmannMachines.vstack
— Methodvstack(x)
Stack arrays along a new dimension inserted on the left.
RestrictedBoltzmannMachines.vwiden
— Methodvwiden(x)
Adds a singleton dimension on the left.
RestrictedBoltzmannMachines.weight_norms
— Methodweight_norms(std_rbm::StandardizedRBM)
Computes the norms of the unstandardized weights for each hidden unit. If you want the norms of the standardized weights, use weight_norms(RBM(std_rbm))
.
RestrictedBoltzmannMachines.wmean
— Methodwmean(A; wts = nothing, dims = :)
Weighted mean of A
along dimensions dims
, weighted by wts
.
\[\frac{\sum_i A_i w_i}{\sum_i w_i}\]
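As a worked instance of the formula, the sketch below computes a weighted mean over all entries (the `dims = :` case). It assumes that `wts = nothing` means uniform weights; `wmean_sketch` is a hypothetical name and not the package's implementation.

```julia
# Hypothetical helper: weighted mean over all entries of A
function wmean_sketch(A; wts = nothing)
    wts === nothing && return sum(A) / length(A)   # assumed: uniform weights
    return sum(A .* wts) / sum(wts)
end

A = [1.0, 2.0, 4.0]
w = [1.0, 1.0, 2.0]

weighted = wmean_sketch(A; wts = w)    # (1 + 2 + 8) / 4
unweighted = wmean_sketch(A)           # (1 + 2 + 4) / 3
```

With these numbers the weighted mean is 2.75, while the unweighted mean is 7/3.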
RestrictedBoltzmannMachines.wsum
— Methodwsum(A; wts = nothing, dims = :)
Weighted sum of A
along dimensions dims
, weighted by wts
.
\[\sum_i A_i w_i\]
RestrictedBoltzmannMachines.zerosum!
— Methodzerosum!(rbm)
In-place zero-sum gauge on rbm
.
RestrictedBoltzmannMachines.zerosum!
— Methodzerosum!(∂, rbm)
Projects the gradient so that it doesn't modify the zerosum gauge.
RestrictedBoltzmannMachines.zerosum
— Methodzerosum(rbm)
Returns an equivalent rbm
in zerosum gauge. Only affects Potts layers. If the rbm
doesn't have Potts
layers, does nothing.
RestrictedBoltzmannMachines.∂cgf
— Function∂cgf(layer, inputs = 0; wts = 1)
Unit activation moments, conjugate to layer parameters. These are obtained by differentiating cgfs with respect to the layer parameters. Averages over configurations (weighted by wts).
RestrictedBoltzmannMachines.∂energy
— Method∂energy(layer, data; wts = nothing)
Derivative of average energy of data
with respect to layer
parameters.
RestrictedBoltzmannMachines.∂free_energy
— Method∂free_energy(rbm, v)
Gradient of free_energy(rbm, v)
with respect to model parameters. If v
consists of multiple samples (batches), then an average is taken.
RestrictedBoltzmannMachines.∂regularize!
— Method∂regularize!(∂, rbm; l2_fields = 0, l1_weights = 0, l2_weights = 0, l2l1_weights = 0)
Updates RBM gradients ∂
, with the regularization gradient.