DR.GEEK

Layer-Wise Pretraining

(5th-Dec-2020)


• Unfortunately, training a DBM using stochastic maximum likelihood (as described above) from a random initialization usually results in failure. In some cases, the model fails to learn to represent the distribution adequately. In other cases, the DBM may represent the distribution well, but with no higher likelihood than could be obtained with just an RBM. A DBM with very small weights in all but the first layer represents approximately the same distribution as an RBM. Various techniques that permit joint training have been developed and are described in section 20.4.5. However, the original and most popular method for overcoming the joint training problem of DBMs is greedy layer-wise pretraining. In this method, each layer of the DBM is trained in isolation as an RBM. The first layer is trained to model the input data. Each subsequent RBM is trained to model samples from the previous RBM’s posterior distribution. A minimal sketch of this procedure appears below.
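Since the section describes greedy layer-wise pretraining only in prose, here is a minimal sketch of the idea: each RBM is trained in isolation, the first on the raw data and each subsequent one on samples of the previous RBM's hidden units. The `RBM` class, the layer sizes, the learning rate, and the use of CD-1 updates (rather than full stochastic maximum likelihood) are illustrative assumptions, not details taken from the original text.

```python
# Greedy layer-wise pretraining sketch: a stack of binary RBMs trained with CD-1.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0):
        # One step of contrastive divergence (CD-1).
        h0 = self.hidden_probs(v0)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (h0 - h1).mean(axis=0)

def pretrain_stack(data, layer_sizes, epochs=10, batch=64):
    """Greedy layer-wise pretraining: train each RBM in isolation,
    then feed samples of its hidden units to the next RBM as data."""
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(inputs.shape[1], n_hidden)
        for _ in range(epochs):
            for i in range(0, len(inputs), batch):
                rbm.cd1_step(inputs[i:i + batch])
        rbms.append(rbm)
        # Samples from this RBM's posterior become the next layer's training data.
        probs = rbm.hidden_probs(inputs)
        inputs = (rng.random(probs.shape) < probs).astype(float)
    return rbms

# Toy usage: random binary "data" standing in for a real training set.
X = (rng.random((512, 100)) < 0.3).astype(float)
stack = pretrain_stack(X, layer_sizes=[64, 32])
```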

Algorithm 20.1 The variational stochastic maximum likelihood algorithm for training a DBM with two hidden layers.
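The algorithm listing itself did not survive the page export, so the following is only a rough sketch of what variational stochastic maximum likelihood for a two-hidden-layer DBM looks like: mean-field fixed-point inference in the positive phase and persistent block Gibbs sampling in the negative phase. The function name, the omission of bias terms, and all hyperparameters are assumptions for illustration, not a reproduction of the book's Algorithm 20.1.

```python
# Sketch of variational stochastic maximum likelihood for a two-hidden-layer DBM.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

def vsml_train(data, n_h1, n_h2, lr=0.01, n_chains=20,
               mf_steps=10, gibbs_steps=1, epochs=5, batch=64):
    n_v = data.shape[1]
    W1 = 0.01 * rng.standard_normal((n_v, n_h1))   # visible-to-h1 weights
    W2 = 0.01 * rng.standard_normal((n_h1, n_h2))  # h1-to-h2 weights
    # Persistent Gibbs chains used for the negative phase.
    v_t = sample(0.5 * np.ones((n_chains, n_v)))
    h1_t = sample(0.5 * np.ones((n_chains, n_h1)))
    h2_t = sample(0.5 * np.ones((n_chains, n_h2)))

    for _ in range(epochs):
        for i in range(0, len(data), batch):
            v = data[i:i + batch]
            m = len(v)
            # Positive phase: mean-field fixed-point updates for q(h1, h2 | v).
            q2 = 0.5 * np.ones((m, n_h2))
            for _ in range(mf_steps):
                q1 = sigmoid(v @ W1 + q2 @ W2.T)
                q2 = sigmoid(q1 @ W2)
            # Negative phase: block Gibbs updates on the persistent chains
            # (h1 given v and h2, then v and h2 given h1).
            for _ in range(gibbs_steps):
                h1_t = sample(sigmoid(v_t @ W1 + h2_t @ W2.T))
                v_t = sample(sigmoid(h1_t @ W1.T))
                h2_t = sample(sigmoid(h1_t @ W2))
            # Stochastic gradient ascent on the variational lower bound.
            W1 += lr * (v.T @ q1 / m - v_t.T @ h1_t / n_chains)
            W2 += lr * (q1.T @ q2 / m - h1_t.T @ h2_t / n_chains)
    return W1, W2
```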


