Integrated Retrieval from Web of Documents and Data

Krishnaprasad Thirunarayan, Trivikram Immaneni

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

The Semantic Web is evolving into a property-linked web of data, conceptually different from, but contained in, the Web of hyperlinked documents. Data Retrieval techniques are typically used to retrieve data from the Semantic Web, while Information Retrieval techniques are used to retrieve documents from the Hypertext Web. We present a Unified Web model that integrates the two webs and formalizes the connection between them. We then present an approach to retrieving documents and data that captures the best of both worlds. Specifically, it improves recall for legacy documents and provides keyword-based search capability for the Semantic Web. We specify the Hybrid Query Language that embodies this approach, and the prototype system SITAR that implements it. We conclude with areas of future work.

Original language: English
Title of host publication: Advances in Data Management
Editors: Zbigniew Ras, Agnieszka Dardzinska
Publisher: Springer Berlin Heidelberg
Pages: 25-48
Number of pages: 24
ISBN (Electronic): 978-3-642-02190-9
ISBN (Print): 978-3-642-02189-3
State: Published - 2009

Publication series

Name: Studies in Computational Intelligence
Volume: 223
ISSN (Print): 1860-949X

ASJC Scopus Subject Areas

  • Artificial Intelligence

Keywords

  • Data Retrieval
  • Hybrid Query Language
  • Hypertext Web
  • Information Retrieval
  • Semantic Web
  • Unified Web

Disciplines

  • Bioinformatics
  • Communication Technology and New Media
  • Databases and Information Systems
  • OS and Networks
  • Science and Technology Studies
