Logical Natural Language Generation from Open-Domain Tables. (arXiv:2004.10404v1 [cs.CL])

[Submitted on 22 Apr 2020]

Abstract: Neural natural language generation (NLG) models have recently shown remarkable progress in fluency and coherence. However, existing studies on neural NLG focus primarily on surface-level realizations, with limited emphasis on logical inference, an important aspect of human thinking and language. In this paper, we suggest a new NLG task in which a model must generate natural language statements that are logically entailed by the facts in an open-domain semi-structured table. To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences, as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models with respect to logical inference. The new task poses challenges to existing monotonic generation frameworks due to the mismatch between sequence order and logical order. In our experiments, we comprehensively survey different generation architectures (LSTM, Transformer, pre-trained LM) trained with different algorithms (RL, adversarial training, coarse-to-fine) on the dataset and make the following observations: 1) pre-trained LMs significantly boost both fluency and logical fidelity metrics; 2) RL and adversarial training trade fluency for fidelity; 3) coarse-to-fine generation partially alleviates the fidelity issue while maintaining high language fluency. The code and data are available at this https URL.
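To make the task setting concrete, the following is a minimal, hypothetical sketch of logical NLG over a table: a generated statement should be verifiable against the table via symbolic operations such as counting, comparison, or superlatives, rather than by copying cell values. The toy table, statements, and helper functions below are illustrative assumptions, not the paper's code, data, or proposed metrics.

```python
# Toy semi-structured table (illustrative, not from TabFact).
table = [
    {"nation": "Canada", "gold": 3, "silver": 2},
    {"nation": "Norway", "gold": 1, "silver": 4},
    {"nation": "Japan",  "gold": 1, "silver": 1},
]

def count_rows(rows, pred):
    """Count rows satisfying a predicate (a 'count' style inference)."""
    return sum(1 for row in rows if pred(row))

def argmax_row(rows, key):
    """Row with the maximum value in a column (a 'superlative' style inference)."""
    return max(rows, key=lambda row: row[key])

# Statements a logical NLG model should be able to produce: each is entailed
# by the table only after a symbolic operation, not surface-level copying.
statement_1 = "Canada won the most gold medals."
entailed_1 = argmax_row(table, "gold")["nation"] == "Canada"

statement_2 = "Two nations won exactly one gold medal."
entailed_2 = count_rows(table, lambda row: row["gold"] == 1) == 2

print(statement_1, entailed_1)  # True, via a superlative over the 'gold' column
print(statement_2, entailed_2)  # True, via counting matching rows
```

This illustrates why the task strains monotonic left-to-right decoders: the surface form of statement_2 begins with "Two", a token that can only be justified after the whole table has been aggregated, so the logical order of inference does not match the sequence order of generation.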

Submission history

From: Wenhu Chen
[v1]
Wed, 22 Apr 2020 06:03:10 UTC (1,342 KB)

Source: https://arxiv.org/abs/2004.10404
