000 02081nam a22002657a 4500
003 IIITD
005 20240506131932.0
008 240406b xxu||||| |||| 00| 0 eng d
020 _a9781680839128
040 _aIIITD
082 _a006.1
_bGIR-D
245 _aDynamical variational autoencoders :
_ba comprehensive review
_cby Laurent Girin ... [et al.]
260 _aBoston :
_bNow Publishers,
_c©2021
300 _a187 p. ;
_c23 cm.
520 _aVariational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, the input data vectors are processed independently. Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models. In this monograph, we perform a literature review of these models. We introduce and discuss a general class of models, called dynamical variational autoencoders (DVAEs), which encompasses a large subset of these temporal VAE extensions. Then, we present in detail seven recently proposed DVAE models, with an aim to homogenize the notations and presentation lines, as well as to relate these models with existing classical temporal models. We have reimplemented those seven DVAE models and present the results of an experimental benchmark conducted on the speech analysis-resynthesis task (the PyTorch code is made publicly available). The monograph concludes with a discussion on important issues concerning the DVAE class of models and future research guidelines.
650 _aDynamical variational autoencoders
700 _aGirin, Laurent
700 _aLeglaive, Simon
700 _aBie, Xiaoyu
700 _aDiard, Julien
700 _aHueber, Thomas
700 _aAlameda-Pineda, Xavier
942 _2ddc
_cBK
999 _c172349
_d172349