Dr. Keizo Kato, Fujitsu Laboratories Ltd., and
Dr. Akira Nakagawa, Artificial Intelligence Laboratory, Fujitsu Laboratories Ltd.
Please note the change in start time. This research meeting will start at 6pm EST.
Abstract: To analyze high-dimensional, complex real-world data, deep generative models such as the variational autoencoder (VAE) embed data in a low-dimensional space (the latent space) and learn a probabilistic model in that space. However, they struggle to accurately reproduce the probability distribution function (PDF) in the input space from the one in the latent space. If the embedding is isometric, this issue can be solved, because the relation between the PDFs becomes tractable. To achieve the isometric property, we propose a Rate-Distortion Optimization guided autoencoder inspired by orthonormal transform coding. We show that our method has the following properties: (i) the Jacobian matrix between the input space and a Euclidean latent space forms a constantly scaled orthonormal system and enables isometric data embedding; (ii) the relations of inner products, distances, and PDFs between the two spaces become tractable, for example proportional. Thanks to these properties, our method outperforms state-of-the-art methods in unsupervised anomaly detection on four public datasets.
Furthermore, we show that a VAE can be mapped to an implicit isometric embedding with a scale factor derived from the posterior parameter. By interpreting the VAE as a non-linearly scaled isometric embedding, we provide a quantitative understanding of VAE properties. From this analysis, we have found that previous discussions of the rate-distortion trade-off in beta-VAE have been inconsistent with the rate-distortion theory of transform coding.
Our method and analysis will promote the development of quantitatively interpretable deep generative models. It’s time to be free from the stress of interpreting the behavior of VAEs.
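Properties (i) and (ii) in the abstract can be illustrated with a small numerical sketch. This is not the speakers' method: it assumes a toy linear map z = J x between spaces of equal dimension, where J is a hypothetical Jacobian built as an orthonormal matrix times a constant scale c. Under that assumption, inner products scale by c^2, distances by c, and the change-of-variables factor |det J| is the constant c^n, so the PDFs in the two spaces are proportional.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3        # dimension of both spaces (assumed equal for this toy case)
c = 2.5      # hypothetical constant scale factor

# Orthonormal matrix Q from a QR decomposition of a random matrix.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
J = c * Q    # constantly scaled orthonormal Jacobian: J.T @ J = c^2 * I

u = rng.normal(size=n)
v = rng.normal(size=n)

# Inner products in the latent space are c^2 times those in the input space.
inner_latent = (J @ u) @ (J @ v)
inner_input = u @ v

# Distances in the latent space are c times those in the input space.
dist_latent = np.linalg.norm(J @ (u - v))
dist_input = np.linalg.norm(u - v)

# Change of variables: p_x(x) = p_z(J x) * |det J|, and |det J| = c^n
# is the same everywhere, so the two PDFs are proportional.
density_factor = abs(np.linalg.det(J))

print("inner product ratio:", inner_latent / inner_input)
print("distance ratio:", dist_latent / dist_input)
print("density factor:", density_factor)
```

In a real autoencoder the map is non-linear and the latent space has lower dimension than the input space, so the same relations would hold only locally; the sketch covers the simplest case only.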
This research meeting will be hosted remotely via Zoom.
Zoom Webinar link: https://mit.zoom.us/j/97502771611?pwd=QkZ3cGkvQTM4OEt4N0R5Qmg5Q1Q0Zz09