(10th-Dec-2020)
• Many other variants of Boltzmann machines are possible. Boltzmann machines may be extended with different training criteria. We have focused on Boltzmann machines trained to approximately maximize the generative criterion log p(v). It is also possible to train discriminative RBMs that aim to maximize log p(y | v) instead (Larochelle and Bengio, 2008). This approach often performs best when using a linear combination of both the generative and the discriminative criteria. Unfortunately, RBMs do not seem to be as powerful supervised learners as MLPs, at least using existing methodology.

Most Boltzmann machines used in practice have only second-order interactions in their energy functions, meaning that their energy functions are sums of many terms, and each individual term includes only the product between two random variables. An example of such a term is v_i W_{i,j} h_j. It is also possible to train higher-order Boltzmann machines (Sejnowski, 1987) whose energy function terms involve the products between many variables.

Three-way interactions between a hidden unit and two different images can model spatial transformations from one frame of video to the next (Memisevic and Hinton, 2007, 2010). Multiplication by a one-hot class variable can change the relationship between visible and hidden units depending on which class is present (Nair and Hinton, 2009). One recent example of the use of higher-order interactions is a Boltzmann machine with two groups of hidden units: one group that interacts with both the visible units v and the class label y, and another group that interacts only with the input values v (Luo et al., 2011). This can be interpreted as encouraging some hidden units to learn to model the input using features that are relevant to the class, but also to learn extra hidden units that explain nuisance details that are necessary for the samples of v to be realistic yet do not determine the class of the example.
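The distinction between second-order and higher-order energy terms can be made concrete with a small numerical sketch. The snippet below (toy dimensions and random weights are my own assumptions, not from the text) computes a standard pairwise energy contribution of the form v_i W_{i,j} h_j, and then a three-way term in which a weight tensor couples two image vectors and a hidden unit, in the spirit of the spatial-transformation models mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen only for illustration.
n_v, n_h = 4, 3

v = rng.integers(0, 2, size=n_v).astype(float)   # visible units
h = rng.integers(0, 2, size=n_h).astype(float)   # hidden units
W = rng.normal(size=(n_v, n_h))                  # pairwise weights

# Second-order energy contribution: a sum of terms v_i * W[i, j] * h_j,
# where each term is a product of exactly two random variables.
second_order_energy = -v @ W @ h

# Higher-order (three-way) interaction: a rank-3 weight tensor couples
# two "images" x and y with each hidden unit, so every term is a
# product of three random variables: x_i * y_j * h_k * T[i, j, k].
x = rng.integers(0, 2, size=n_v).astype(float)   # e.g. video frame t
y = rng.integers(0, 2, size=n_v).astype(float)   # e.g. video frame t+1
T = rng.normal(size=(n_v, n_v, n_h))             # three-way weights

third_order_energy = -np.einsum('i,j,k,ijk->', x, y, h, T)

print(second_order_energy, third_order_energy)
```

Note the parameter count: the pairwise model needs n_v * n_h weights, while the three-way tensor needs n_v * n_v * n_h, which is one reason higher-order machines are often factorized in practice.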