000 02081nam a22002657a 4500
003 IIITD
005 20240506131932.0
008 240406b xxu||||| |||| 00| 0 eng d
020 _a9781680839128
040 _aIIITD
082 _a006.1
_bGIR-D
245 _aDynamical variational autoencoders :
_ba comprehensive review
_cby Laurent Girin ... [et al.]
260 _aBoston :
_bNow Publishers,
_c©2021
300 _a187 p. ;
_c23 cm.
520 _aVariational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, the input data vectors are processed independently. Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models. In this monograph, we perform a literature review of these models. We introduce and discuss a general class of models, called dynamical variational autoencoders (DVAEs), which encompasses a large subset of these temporal VAE extensions. Then, we present in detail seven recently proposed DVAE models, with an aim to homogenize the notations and presentation lines, as well as to relate these models with existing classical temporal models. We have reimplemented those seven DVAE models and present the results of an experimental benchmark conducted on the speech analysis-resynthesis task (the PyTorch code is made publicly available). The monograph concludes with a discussion on important issues concerning the DVAE class of models and future research guidelines.
650 _aDynamical variational autoencoders
700 _aGirin, Laurent
700 _aLeglaive, Simon
700 _aBie, Xiaoyu
700 _aDiard, Julien
700 _aHueber, Thomas
700 _aAlameda-Pineda, Xavier
942 _2ddc
_cBK
999 _c172349
_d172349