New Re-ranking Approach in Merging Search Results

Hung Trung Vo

Abstract


When merging query results from multiple information sources or from different search engines, popular methods rely on the available document scores or on the rank order in the returned lists. These methods ensure a fast response, but their results are often inconsistent. Another approach downloads the contents of the top documents, then re-indexes and re-ranks them to create the final ranked result list. This method guarantees better quality but is resource-consuming. In this paper, we compare two methods of merging search results: a) applying formulas that re-evaluate documents based on different combinations of returned rank order, document titles and snippets; b) the Top-Down Re-ranking algorithm (TDR), which gradually downloads top documents from each source, calculates their scores and adds them to the final list. We also propose a new way to re-rank search results based on genetic programming and re-ranking learning. Experimental results show that the proposed method is better than traditional methods in terms of both quality and time.
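As an illustration of method (a), the following minimal sketch merges result lists by scoring each document from its returned rank, title and snippet. It is not the formula used in the paper: the `Hit` record, the `overlap` term-matching measure, the linear combination and the weights `w_rank`, `w_title` and `w_snippet` are all assumptions chosen only to show the general idea.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One returned result (hypothetical record, not from the paper)."""
    source: str   # name of the search engine or collection
    rank: int     # 1-based position in that source's returned list
    title: str
    snippet: str


def overlap(query: str, text: str) -> float:
    """Fraction of distinct query terms found in text (assumed measure)."""
    q_terms = set(query.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & set(text.lower().split())) / len(q_terms)


def score(query: str, hit: Hit, list_len: int,
          w_rank: float = 0.4, w_title: float = 0.4,
          w_snippet: float = 0.2) -> float:
    """Linear combination of rank, title and snippet evidence (assumed weights)."""
    # Rank 1 maps to 1.0; later ranks decrease linearly within the list.
    rank_score = 1.0 - (hit.rank - 1) / list_len
    return (w_rank * rank_score
            + w_title * overlap(query, hit.title)
            + w_snippet * overlap(query, hit.snippet))


def merge(query: str, result_lists: list[list[Hit]]) -> list[Hit]:
    """Score every hit from every source and return one list, best first."""
    scored = [(score(query, hit, len(hits)), hit)
              for hits in result_lists for hit in hits]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored]


engine_a = [Hit("A", 1, "weather news", "today"),
            Hit("A", 2, "merging search results",
                "re-ranking of merged search results")]
engine_b = [Hit("B", 1, "search results merging", "how to merge"),
            Hit("B", 2, "cooking", "recipes")]
merged = merge("merging search results", [engine_a, engine_b])
```

Because this sketch uses only the ranks, titles and snippets already present in the returned lists, it needs no document downloads, which is the source of the fast response mentioned above. A TDR-style method would instead fetch the content of the top documents before scoring them.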





DOI: https://doi.org/10.31449/inf.v43i2.2132

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.