angusturner27:
I was anticipating a plus...

mathnathan, 2017-05-12:
I believe the last formula for reverse KL should be an expectation over q, not over p. Great post. Thanks for your effort.

SunFish7, 2017-05-07:
Thanks for this, it is a key resource for our reading group discussion on VAE today: https://github.com/p-i-/machinelearning-IRC-freenode/blob/master/ReadingGroup/README.md

SunFish7, 2017-05-07:
Probabilities sum to 1, i.e., given a probability distribution q over Z, summing q(z) over all possible z in Z must give 1.

Magdiel Jiménez Guarneros, 2017-05-03:
Hi, can you explain why the sum over q(z) equals 1 in equation (1)? Thanks, I don't follow it.

2017-04-25:
I read a few blogs/articles/slides about variational autoencoders, and I personally think this is the best one. The key ideas are pointed out clearly. The technical terms (e.g., ELBO) are well explained, too.

Eric:
I didn't know that! Thank you for sharing this. I hope that interested readers will scroll down and find your comment.

2016-10-18:
Given the title of your post, it's worth giving some motivation behind the name "mean-field approximation".

From a statistical physics point of view, "mean-field" refers to the relaxation of a difficult optimization problem to a simpler one which ignores second-order effects. For example, in the context of graphical models, one can approximate the partition function of a Markov random field via maximization of the Gibbs free energy (i.e., log partition function minus relative entropy) over the set of product measures, which is significantly more tractable than global optimization over the space of all probability measures (see, e.g., M. Mezard and A.

Eric:
I added the minus in front of the KL term.

Incognito, 2016-08-08:
There should be a minus in equation (3) for E[log p(x|z)], i.e. E[-log p(x|z)]; otherwise your definition of KL-divergence isn't consistent.

Ankur.
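
Several of the comments above turn on signs and on which distribution the expectation is taken over. The post's numbered equations are not reproduced in this thread, so the following is only a minimal sketch using the standard definitions: reverse KL as KL(q || p) = E_q[log q(z) - log p(z)], and the ELBO as E_q[log p(x|z)] - KL(q(z) || p(z)). The three-state toy model and all variable names are hypothetical, chosen only so the identity log p(x) = ELBO + KL(q(z) || p(z|x)) can be checked numerically.

```python
import math

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
prior = [0.5, 0.3, 0.2]        # p(z)
likelihood = [0.9, 0.4, 0.1]   # p(x | z) for the observed x
q = [0.6, 0.3, 0.1]            # approximate posterior q(z); sums to 1

# Evidence p(x) and exact posterior p(z | x) by Bayes' rule.
evidence = sum(pz * lx for pz, lx in zip(prior, likelihood))
posterior = [pz * lx / evidence for pz, lx in zip(prior, likelihood)]


def reverse_kl(q_dist, p_dist):
    """KL(q || p) = E_q[log q(z) - log p(z)]; the expectation is over q."""
    return sum(qz * (math.log(qz) - math.log(pz))
               for qz, pz in zip(q_dist, p_dist))


# E_q[log p(x | z)] enters the ELBO with a plus sign, the KL term with a minus.
expected_log_lik = sum(qz * math.log(lx) for qz, lx in zip(q, likelihood))
elbo = expected_log_lik - reverse_kl(q, prior)

# Gap between log evidence and ELBO is the reverse KL to the true posterior.
gap = reverse_kl(q, posterior)

print("log p(x)      :", math.log(evidence))
print("ELBO + KL gap :", elbo + gap)
```

Expanding KL(q(z) || p(z|x)) produces a term log p(x) multiplied by the sum of q(z) over z. Because q is a probability distribution, that sum is 1 and the term reduces to log p(x), which is why the two printed values agree up to floating-point error.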