(17th-December-2020)
• Auto-regressive networks are directed probabilistic models with no latent random variables. The conditional probability distributions in these models are represented by neural networks (sometimes extremely simple neural networks such as logistic regression). The graph structure of these models is the complete graph. They decompose a joint probability over the observed variables using the chain rule of probability to obtain a product of conditionals of the form P(x_d | x_{d−1}, ..., x_1). Such models have been called fully-visible Bayes networks (FVBNs) and used successfully in many forms, first with logistic regression for each conditional distribution (Frey, 1998) and then with neural networks with hidden units (Bengio and Bengio, 2000b; Larochelle and Murray, 2011). In some forms of autoregressive networks, such as NADE (Larochelle and Murray, 2011), described in section 20.10.10 below, we can introduce a form of parameter sharing that brings both a statistical advantage (fewer unique parameters) and a computational advantage (less computation). This is one more instance of the recurring deep learning motif of reuse of features.
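To make the chain-rule decomposition concrete, here is a minimal sketch (my own illustration, not code from the book) of a fully-visible sigmoid belief network in the spirit of Frey (1998): each conditional P(x_d = 1 | x_{d−1}, ..., x_1) is a logistic regression on the earlier variables, enforced by a strictly lower-triangular weight matrix. The class name FVSBN and all parameter choices are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FVSBN:
    """Fully-visible sigmoid belief network over binary x: one
    logistic-regression conditional P(x_d = 1 | x_{<d}) per dimension."""

    def __init__(self, dim, rng=None):
        rng = rng or np.random.default_rng(0)
        # Strictly lower-triangular weights: conditional d sees only
        # the earlier dimensions x_1, ..., x_{d-1}.
        self.W = np.tril(rng.normal(scale=0.1, size=(dim, dim)), k=-1)
        self.b = np.zeros(dim)
        self.dim = dim

    def log_prob(self, x):
        # p[d] = P(x_d = 1 | x_{<d}); the triangular mask makes this
        # exactly the chain-rule product P(x) = prod_d P(x_d | x_{<d}).
        p = sigmoid(self.W @ x + self.b)
        return np.sum(x * np.log(p) + (1 - x) * np.log1p(-p))

    def sample(self, rng=None):
        rng = rng or np.random.default_rng()
        x = np.zeros(self.dim)
        for d in range(self.dim):  # ancestral sampling, one dimension at a time
            p_d = sigmoid(self.W[d] @ x + self.b[d])
            x[d] = float(rng.random() < p_d)
        return x

model = FVSBN(dim=4)
x = model.sample()
print(x, model.log_prob(x))
```

Note that this plain FVBN has a separate weight vector per conditional; NADE (section 20.10.10) improves on it by sharing parameters across the conditionals, which gives the statistical and computational advantages mentioned above.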