Joint density Gaussian mixture model (JD-GMM) based method has been widely used in voice conversion task due to its flexible implementation. However, the statistical averaging effect during estimating the model parameters will result in over-smoothing the target spectral trajectories. Motivated by the local linear transformation method, which uses neighboring data rather than all the training data to estimate the transformation function for each feature vector, we proposed a local partial least square method to avoid the over-smoothing problem of JD-GMM and the over-fitting problem of local linear transformation when training data are limited. We conducted experiments using the VOICES database and measure both spectral distortion and correlation coefficient of the spectral parameter trajectory. The experimental results show that our proposed method obtain better performance as compared to baseline methods.
|Title of host publication||Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Number of pages||6|
|Publication status||Published - 2013|