Deep web search: an overview and roadmap

Research output: Book/Report > Report > Professional

Abstract

We review the state of the art in deep web search and propose a novel classification scheme for comparing deep web search systems. The current binary classification (surfacing versus virtual integration) hides a number of implicit decisions that a developer must make. We make these decisions explicit by distinguishing seven system aspects that describe a system in terms of its functionality (what it can and cannot do) and in terms of its solution to a specific problem. We then motivate the need for a search system with a single-field free-text query interface that supports real-time structured search over multiple sources. To this end, we discuss two possible federated architectures and state the scientific challenges. Finally, we present the findings of our ongoing project and briefly outline related work on free-text interfaces over structured data.
Original language: Undefined
Place of Publication: Enschede
Publisher: Centre for Telematics and Information Technology (CTIT)
Number of pages: 18
Publication status: Published - Oct 2011

Publication series

Name: CTIT Technical Report Series
Publisher: Centre for Telematics and Information Technology, University of Twente
No.: TR-CTIT-12-32
ISSN (Print): 1381-3625

Keywords

  • Review
  • Interfaces
  • surfacing
  • Deep Web
  • EWI-22746
  • OneBox
  • free text
  • IR-84377
  • METIS-293268
  • Survey
  • deep web search
  • natural language

Cite this

Tjin-Kam-Jet, K., Trieschnigg, R. B., & Hiemstra, D. (2011). Deep web search: an overview and roadmap. (CTIT Technical Report Series; No. TR-CTIT-12-32). Enschede: Centre for Telematics and Information Technology (CTIT).
@book{8e2d4dbb7f444bdd9812bda4b0710393,
title = "Deep web search: an overview and roadmap",
abstract = "We review the state-of-the-art in deep web search and propose a novel classification scheme to better compare deep web search systems. The current binary classification (surfacing versus virtual integration) hides a number of implicit decisions that must be made by a developer. We make these decisions explicit by distinguishing 7 system aspects that describe a system in terms of its functionality (what it can, and what it cannot do) and in terms of its solution to a specific problem. We then motivate the need for a search system which has a single-field free-text query interface that supports real-time structured search over multiple sources. To this end, we discuss two possible federated architectures and state the scientific challenges. Finally, we present the findings of our ongoing project and briefly outline related work to free-text interfaces over structured data.",
keywords = "Review, Interfaces, surfacing, Deep Web, EWI-22746, OneBox, free text, IR-84377, METIS-293268, Survey, deep web search, natural language",
author = "Kien Tjin-Kam-Jet and Trieschnigg, {Rudolf Berend} and Djoerd Hiemstra",
year = "2011",
month = "10",
language = "Undefined",
series = "CTIT Technical Report Series",
publisher = "Centre for Telematics and Information Technology (CTIT)",
number = "TR-CTIT-12-32",
address = "Netherlands",

}
