Deep Text Generation – Using Hierarchical Decomposition to Mitigate the Effect of Rare Data Points
- First Online:
Deep learning has recently been adopted for the task of natural language generation (NLG) and shown remarkable results. However, learning can go awry when the input dataset is too small or not well balanced with regards to the examples it contains for various input sequences. This is relevant to naturally occurring datasets such as many that were not prepared for the task of natural language processing but scraped off the web and originally prepared for a different purpose. As a mitigation to the problem of unbalanced training data, we therefore propose to decompose a large natural language dataset into several subsets that “talk about” the same thing. We show that the decomposition helps to focus each learner’s attention during training. Results from a proof-of-concept study show 73% times faster learning over a flat model and better results.