Abstract
Access to information is essential to decision making in many domains, so broadening the range of information available to decision makers is of vital importance. In such an information-thirsty environment, access to every source of information is considered highly valuable. Today, the most common way to find and access information sources is to submit queries to general search engines such as Google, Yahoo, or Bing. However, these search engines do not cover all the data available on the Web. Besides failing to index every existing webpage, they miss the data behind web search forms. This data, known as the hidden web or deep web, is not accessible through search engines. The deep web is estimated to contain several times more data than the surface web, the part of the Web that search engines can reach [9, 6]. Although deep web information can be accessed through each source's own interface, finding and querying all the potentially useful sources is difficult, time-consuming, and tiring. Given the huge amount of information that may be relevant to a person's information needs, it may even be impossible for one person to cover all the deep web sources of interest. There is therefore great demand for applications that facilitate access to the large amount of data locked behind web search forms. Realizing approaches that meet this demand is one of the main issues targeted in this PhD project. Once access to deep web data has been provided, different techniques can be applied to give users additional value from this data, such as analyzing the data and finding patterns and relationships among data items and data sources. In this research, however, the focus is on monitoring entities that exist in deep web sources.
Original language | Undefined |
---|---|
Title of host publication | Proceedings of the 22nd international conference on World Wide Web companion, WWW 2013 |
Place of Publication | Republic and Canton of Geneva, Switzerland |
Publisher | International World Wide Web Conferences Steering Committee |
Pages | 377-382 |
Number of pages | 5 |
ISBN (Print) | 978-1-4503-2038-2 |
Publication status | Published - May 2013 |
Event | 22nd International World Wide Web Conference, WWW 2013 - Rio de Janeiro, Brazil. Duration: 13 May 2013 → 17 May 2013. Conference number: 22. http://www2013.wwwconference.org/ |
Conference
Conference | 22nd International World Wide Web Conference, WWW 2013 |
---|---|
Abbreviated title | WWW |
Country | Brazil |
City | Rio de Janeiro |
Period | 13/05/13 → 17/05/13 |
Internet address | http://www2013.wwwconference.org/ |
Keywords
- crawling
- web harvesting
- EWI-23496
- METIS-297724
- IR-86796
- Deep Web
- entity monitoring