On Embedding Machine-Processable Semantics into Documents

Research output: Contribution to journal › Article › peer-review

Abstract

Most Web and legacy paper-based documents are available in human-comprehensible text form that computer programs cannot readily access or understand. Here, we investigate an approach to amalgamate XML technology with programming languages for representational purposes that can enhance traceability, thereby facilitating semiautomatic extraction and update. Specifically, we propose a modular technique to embed machine-processable semantics into a text document with tabular data via annotations, which sometimes results in ill-formed XML fragments, and evaluate this technique vis-à-vis document querying, manipulation, and integration. The ultimate aim is to be able to author and extract the human-readable and machine-comprehensible parts of a document hand in hand and keep them side by side.
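
To make the idea of inline annotations concrete, here is a minimal sketch in Python. It is not the technique or syntax defined in the paper: the tag names, the `unit` attribute, the sample sentence, and the helper functions `extract` and `strip_annotations` are all hypothetical. A regular expression is used instead of an XML parser only because the abstract notes that annotated text may not be well-formed XML.

```python
import re

# Hypothetical annotated sentence: tag and attribute names are
# illustrative assumptions, not taken from the paper.
ANNOTATED = (
    'The alloy has a <tensileStrength unit="MPa">450</tensileStrength> rating '
    'and a <hardness unit="HB">120</hardness> hardness.'
)

# Matches <tag attr="v" ...>value</tag> with no nesting of the same tag.
TAG = re.compile(r'<(\w+)((?:\s+\w+="[^"]*")*)\s*>(.*?)</\1>', re.S)
ATTR = re.compile(r'(\w+)="([^"]*)"')
ANY_TAG = re.compile(r'</?\w+(?:\s+\w+="[^"]*")*\s*>')


def extract(text):
    """Return (tag, attributes, value) for each annotated span, in order."""
    return [
        (m.group(1), dict(ATTR.findall(m.group(2))), m.group(3))
        for m in TAG.finditer(text)
    ]


def strip_annotations(text):
    """Recover the human-readable text by dropping the markup."""
    return ANY_TAG.sub("", text)


if __name__ == "__main__":
    for tag, attrs, value in extract(ANNOTATED):
        print(tag, attrs, value)
    print(strip_annotations(ANNOTATED))
```

In this sketch the same string serves both audiences: `extract` yields the machine-processable values, while `strip_annotations` returns the prose a human reader would see, so the two views stay side by side in one document.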

Original language: American English
Pages (from-to): 1014-1018
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 17
Issue number: 7
State: Published - Jul 2005

Keywords

  • Structured data and knowledge representation
  • XML-based programming language
  • Semantic Web

Disciplines

  • Databases and Information Systems
  • OS and Networks
  • Science and Technology Studies
  • Cataloging and Metadata
