Term Extraction and Disambiguation for Semantic Knowledge Enrichment: A Case Study on Initial Public Offering (IPO) Prospectus Corpus

Document Type

Conference Proceeding

Publication Date

2015

Abstract

Domain knowledge bases are a basis for advanced knowledge-based systems, but manually creating a formal knowledge base for a given domain is both resource-consuming and non-trivial. In this paper, we propose an approach that supports extracting, selecting, and disambiguating terms embedded in domain-specific documents. The extracted terms are later used to enrich existing ontologies/taxonomies, as well as to bridge a domain-specific knowledge base with a generic knowledge base such as WordNet. The proposed approach addresses two major issues in term extraction, namely quality and efficiency. It also adopts a feature-based method that assists in topic extraction and integration with existing ontologies in the given domain. The proposed approach is realized in a research prototype, and a case study is conducted to illustrate the feasibility and efficiency of the proposed method in the finance domain. A preliminary empirical validation by domain experts is also conducted to determine the accuracy of the proposed approach. The results from the case study indicate the advantages and potential of the proposed approach.

Comments

© 2015 IEEE

A link to full text has been provided for authorized subscribers.

Publication Title

2015 48th Hawaii International Conference on System Sciences (HICSS)

Published Citation

Tao, Jie, Omar F. El-Gayar, Amit V. Deokar, and Yenling Chang. "Term Extraction and Disambiguation for Semantic Knowledge Enrichment: A Case Study on Initial Public Offering (IPO) Prospectus Corpus." 2015 48th Hawaii International Conference on System Sciences (HICSS). IEEE, 2015. pp. 3719-3728. doi:10.1109/HICSS.2015.448

DOI

10.1109/HICSS.2015.448

Peer Reviewed
