Results 1 - 3 of 3
Assessing Relevance and Trust of the Deep Web Sources and Results Based on Inter-Source Agreement
"... Deep web search engines face the formidable challenge of retrieving high quality results from the vast collection of searchable databases. Deep web search is a two step process of selecting the high quality sources and ranking the results from the selected sources. Though there are existing methods ..."
Abstract
Deep web search engines face the formidable challenge of retrieving high-quality results from the vast collection of searchable databases. Deep web search is a two-step process of selecting the high-quality sources and ranking the results from the selected sources. Though there are existing methods for both steps, they assess the relevance of the sources and the results using query-result similarity. When applied to the deep web, these methods have two deficiencies. First, they are agnostic to the correctness (trustworthiness) of the results. Second, the query-based relevance does not consider the importance of the results and sources. These two considerations are essential for the deep web and open collections in general. Since a number of deep web sources provide answers to any query, we conjecture that the agreement between these answers is helpful in assessing the importance and the trustworthiness of the sources and the results. For assessing source quality, we compute the agreement between sources as the agreement of the answers they return. While computing the agreement, we also measure and compensate for possible collusion between the sources. This adjusted agreement is modeled as a graph with the sources as vertices. On this agreement graph, a quality score of a source, which we call SourceRank, is calculated as the stationary visit probability of a random walk. For ranking results, we analyze the second-order agreement between the results. Further extending SourceRank to multi-domain search, we propose a source ranking sensitive to the …
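Since this abstract centers on a random walk over an agreement graph, a minimal sketch may help make the idea concrete. The Python snippet below computes a SourceRank-style score as the stationary visit probability of a power-iterated random walk; the agreement matrix values, the damping factor, and the function name are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def source_rank(A, damping=0.85, iters=100):
    """Stationary visit probability of a random walk on an agreement graph,
    computed by power iteration with a PageRank-style damping factor
    (the damping value and iteration count are assumptions, not from the paper)."""
    n = A.shape[0]
    P = A.astype(float).copy()
    row_sums = P.sum(axis=1)
    for i in range(n):
        if row_sums[i] > 0:
            P[i] /= row_sums[i]        # row-normalize into transition probabilities
        else:
            P[i] = 1.0 / n             # source with no outgoing agreement: jump uniformly
    r = np.full(n, 1.0 / n)            # start from the uniform distribution
    for _ in range(iters):
        r = (1 - damping) / n + damping * (r @ P)
    return r / r.sum()

# Illustrative (collusion-adjusted) agreement matrix: entry [i, j] is the weight
# of the edge from source i to source j in the agreement graph. Values are made up.
agreement = np.array([
    [0.0, 0.6, 0.2],
    [0.5, 0.0, 0.3],
    [0.1, 0.4, 0.0],
])
print(source_rank(agreement))  # higher score = source more widely agreed with
```

The damping term plays the same role as in PageRank-style walks: it keeps the chain irreducible, so the stationary distribution exists and is unique.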
Assessing Relevance and Trust of the Deep Web Sources and Results Based on Inter-Source Agreement
"... Deep web search engines face the formidable challenge of retrieving high quality results in the vast collection of searchable databases. Deep web search is a two step process consisting of: (i) selecting the high quality sources (ii) ranking the results from these selected sources. Though there are ..."
Abstract
Deep web search engines face the formidable challenge of retrieving high-quality results from the vast collection of searchable databases. Deep web search is a two-step process consisting of (i) selecting the high-quality sources and (ii) ranking the results from these selected sources. Though there are existing methods for both of these steps, they assess the relevance of the sources and the results using query-result similarity. When applied to the deep web, these methods have two deficiencies. First, they are agnostic to the correctness (trustworthiness) of the results. Second, the query-based relevance does not consider the importance of the results and sources. These two considerations are essential for open collections like the deep web. Since a number of deep web sources provide answers to any query, we conjecture that the agreement between these answers is likely to be helpful in assessing the importance and the trustworthiness of the sources as well as the results. For the first step of assessing source quality, we compute the agreement between sources as the agreement of the answers they return. While computing the agreement, we also measure and compensate for possible collusion between the sources. This adjusted agreement is modeled as a graph with the sources as vertices. On this agreement graph, a quality score of a source, which we call SourceRank, is calculated as the stationary visit probability of a random walk. To extend the agreement analysis to the second step of ranking results, we base our analysis on the second-order agreement between the results.
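As a companion to the sketch above, the following rough Python example shows one way the pairwise agreement feeding such a graph might be estimated: the fraction of one source's results that some result from the other source matches, with a token-level Jaccard similarity and a fixed threshold standing in for the paper's soft record matching. The threshold, the helper names, and the sample records are hypothetical.

```python
def record_similarity(r1, r2):
    """Token-level Jaccard similarity between two result records
    (a crude stand-in for soft record matching)."""
    t1, t2 = set(r1.lower().split()), set(r2.lower().split())
    return len(t1 & t2) / len(t1 | t2) if (t1 | t2) else 0.0

def agreement(results_a, results_b, threshold=0.6):
    """Fraction of source A's results that at least one of source B's results
    agrees with (the threshold is an illustrative assumption)."""
    if not results_a:
        return 0.0
    agreed = sum(
        1 for ra in results_a
        if any(record_similarity(ra, rb) >= threshold for rb in results_b)
    )
    return agreed / len(results_a)

# Toy result lists from two hypothetical book-search sources.
a = ["godel escher bach hofstadter", "the mind's i hofstadter dennett"]
b = ["godel escher bach by hofstadter", "i am a strange loop hofstadter"]
print(agreement(a, b))  # 0.5: one of A's two results is matched by B
```

Scores like this, computed in both directions for every pair of sources, would populate the weighted agreement graph that the random-walk sketch above consumes.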