Recent neural encoder-decoder models often generate dull and generic responses. We present a novel framework that captures the discourse-level diversity in the encoder. The training procedure is improved by introducing a bag-of-word loss.
While recent neural encoder-decoder models have shown great promise in
modeling open-domain conversations, they often generate dull and generic
responses. Unlike past work that has focused on diversifying the output of the
decoder at word-level to alleviate this problem, we present a novel framework
based on conditional variational autoencoders that captures the discourse-level
diversity in the encoder. Our model uses latent variables to learn a
distribution over potential conversational intents and generates diverse
responses using only greedy decoders. We have further developed a novel variant
that is integrated with linguistic prior knowledge for better performance.
Finally, the training procedure is improved by introducing a bag-of-word loss.
Our proposed models have been validated to generate significantly more diverse
responses than baseline approaches and exhibit competence in discourse-level