Content Driven Enrichment of Formal Text using Concept Definitions and Applications

Abstract

Formal text is objective and unambiguous, and tends to use complex sentence constructions intended for its target demographic. For general readers who lack domain knowledge, however, correct interpretation requires that the key concepts in the text and the relationships among them be defined. To address this, we propose a text enrichment framework that identifies the key concepts in the input text, highlights their definitions, and fetches a definition from external data sources when a concept is left undefined. Beyond concept definitions, the system enriches the input text with concept applications and a prerequisite concept graph that shows the interdependencies among the extracted concepts. While the problem of learning definition statements has been attempted in the literature, the task of learning application statements is novel. We manually annotated a dataset to train a deep learning network that identifies application statements in text. We quantitatively compared the results of both the application and definition identification models with standard baselines. To validate the utility of the proposed framework for general readers, we report enrichment accuracy and show promising results.

Publication
In Proceedings of the 29th ACM Conference on Hypertext and Social Media, 2018
Abhinav Jain
Machine Learning Engineer

My research interests include computer vision, machine learning and deep reinforcement learning.
