(12th-December-2020)
When a model emits a discrete variable y, the reparametrization trick is not applicable. Suppose that the model takes inputs x and parameters θ, both encapsulated in the vector ω, and combines them with random noise z to produce y:
y = f(z; ω). (20.58)
Because y is discrete, f must be a step function. The derivatives of a step function are not useful at any point. Right at each step boundary, the derivatives are undefined, but that is a small problem. The large problem is that the derivatives are zero almost everywhere, on the regions between step boundaries. The derivatives of any cost function J(y) therefore give no information about how to update the model parameters θ.
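The zero-gradient problem can be checked numerically. The following is a minimal sketch, not taken from the text: the sampler f (a Bernoulli draw obtained by thresholding uniform noise z against sigmoid(ω)), the cost J(y) = (y - 1)^2, the value ω = 0.3, and the evenly spaced grid standing in for noise samples are all assumptions chosen for illustration.

```python
import math


def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))


def f(z, omega):
    # Hypothetical discrete sampler: y = 1 if the noise z falls below
    # sigmoid(omega), else y = 0. As a function of omega this is a step.
    return 1.0 if z < sigmoid(omega) else 0.0


def cost(y):
    # Hypothetical cost J(y), assumed only for this illustration.
    return (y - 1.0) ** 2


def finite_diff_grad(z, omega, eps=1e-6):
    # Central finite difference of J(f(z; omega)) with respect to omega,
    # holding the noise z fixed.
    return (cost(f(z, omega + eps)) - cost(f(z, omega - eps))) / (2 * eps)


omega = 0.3
# Evenly spaced values in (0, 1) standing in for draws of the noise z.
noise = [(i + 0.5) / 1000 for i in range(1000)]
grads = [finite_diff_grad(z, omega) for z in noise]
num_zero = sum(1 for g in grads if g == 0.0)
print(num_zero, "of", len(grads), "per-sample gradients are exactly zero")

# For this particular f and J, the expected cost is 1 - sigmoid(omega),
# so its derivative with respect to omega is nonzero even though the
# per-sample derivatives carry no signal.
expected_cost_grad = -sigmoid(omega) * (1.0 - sigmoid(omega))
print("derivative of the expected cost:", expected_cost_grad)
```

Under these assumptions, a per-sample gradient is nonzero only when z lands within about eps of the step boundary at sigmoid(ω), so differentiating through the sample gives no direction in which to move ω, even though the expected cost does depend on ω.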