Layer Types
An RBM is composed of a visible layer and a hidden layer, each of which can be any of the types listed below. The RBM energy function is:
\[E(\mathbf{v}, \mathbf{h}) = U_v(\mathbf{v}) + U_h(\mathbf{h}) - \mathbf{v}^\top \mathbf{w}\, \mathbf{h}\]
where $U_v$ and $U_h$ are the layer potential functions defined below. Each layer type defines a different family of conditional distributions for its units.
Discrete layers
Binary
Units take values in $\{0, 1\}$. The potential function is:
\[U(\mathbf{x}) = -\sum_i \theta_i x_i\]
where $\theta_i$ are external fields. Conditioned on the other layer, each unit follows an independent Bernoulli distribution with probability $\sigma(\theta_i + I_i)$, where $\sigma$ is the sigmoid function and $I_i$ is the input from the other layer.
Constructed with Binary or the convenience function BinaryRBM.
Spin
Units take values in $\{-1, +1\}$. The potential function is:
\[U(\mathbf{s}) = -\sum_i \theta_i s_i\]
Conditioned on the other layer, each unit takes value $+1$ with probability $\sigma(2(\theta_i + I_i))$.
Constructed with Spin or SpinRBM.
Potts
Units are one-hot encoded categorical variables with $q$ categories. The potential function is:
\[U(\mathbf{x}) = -\sum_{i,c} \theta_{c,i}\, x_{c,i}\]
Conditioned on the other layer, each unit follows a categorical distribution (softmax over the $q$ categories).
Constructed with Potts. A GPU-optimized variant PottsGumbel is also available, which uses the Gumbel-softmax trick for sampling.
Continuous layers
Gaussian
Units take values in $\mathbb{R}$. The potential function is:
\[U(\mathbf{x}) = \sum_i \left(\frac{|\gamma_i|}{2} x_i - \theta_i \right) x_i\]
where $\theta_i$ is a location parameter and $\gamma_i$ controls the precision (inverse variance). Conditioned on the other layer, each unit follows a Gaussian distribution with mean $(\theta_i + I_i) / |\gamma_i|$ and variance $1 / |\gamma_i|$.
Constructed with Gaussian or GaussianRBM.
ReLU
Units take values in $[0, \infty)$. The potential function is:
\[U(\mathbf{x}) = \sum_i \left(\frac{|\gamma_i|}{2} x_i - \theta_i\right) x_i\]
for $x_i \geq 0$ (with $U = \infty$ for $x_i < 0$). This is a truncated Gaussian: conditioned on the other layer, each unit follows a rectified Gaussian distribution.
Constructed with ReLU.
dReLU, pReLU, xReLU
These three layer types represent the same family of asymmetric piecewise-quadratic distributions, differing only in parameterization. They can be converted to each other without loss of information.
The distribution is defined by a potential that allows different curvatures and locations for positive and negative values of $x$:
\[U(x) = \begin{cases} \frac{\gamma^+}{2} x^2 + \theta^+ x & \text{if } x \geq 0 \\[4pt] \frac{\gamma^-}{2} x^2 + \theta^- x & \text{if } x < 0 \end{cases}\]
The three types differ in how they parameterize this distribution:
| Type | Parameters | Notes |
|---|---|---|
dReLU | $\theta^+, \theta^-, \gamma^+, \gamma^-$ | Separate parameters for positive and negative parts. Direct but redundant. |
pReLU | $\theta, \gamma, \Delta, \eta$ | Shared scale $\gamma$ with asymmetry ratio $\eta \in (-1, 1)$. |
xReLU | $\theta, \gamma, \Delta, \xi$ | Like pReLU but with unbounded $\xi \in \mathbb{R}$ (related to $\eta$ by ``\xi = \eta / (1 - |
The conversions between parameterizations are given by:
\[\gamma = \frac{2|\gamma^+|\,|\gamma^-|}{|\gamma^+| + |\gamma^-|}, \qquad \eta = \frac{|\gamma^-| - |\gamma^+|}{|\gamma^+| + |\gamma^-|}\]
Use whichever parameterization is most convenient; dReLU is the most explicit, while pReLU and xReLU separate the overall scale from the asymmetry.
Constructing an RBM
You can construct an RBM from any pair of layer types:
import RestrictedBoltzmannMachines as RBMs
# Generic constructor
visible = RBMs.Binary(; θ = zeros(28, 28))
hidden = RBMs.ReLU(; θ = zeros(400), γ = ones(400))
weights = randn(28 * 28, 400) / 100
rbm = RBMs.RBM(visible, hidden, weights)Or use convenience constructors:
| Constructor | Visible | Hidden |
|---|---|---|
BinaryRBM | Binary | Binary |
SpinRBM | Spin | Spin |
GaussianRBM | Gaussian | Gaussian |
HopfieldRBM | Spin | Gaussian |