DESIGN OF AN IMPROVED MODEL FOR INFORMATION RETRIEVAL USING HIERARCHICAL FCM CLUSTERING AND RERANKING

Authors

  • R. D. Bhoyar, Dr. D. N. Satange

Abstract

The exponential growth of web content calls for more accurate and efficient methods of managing and retrieving relevant information. Traditional information retrieval systems often struggle to process large volumes of data, which leads to suboptimal precision, accuracy, and recall. These systems typically rely on simplistic clustering techniques and basic ranking algorithms that fail to capture the intricate relationships within the data, resulting in low-quality search outcomes. To address these limitations, we propose a novel framework for information retrieval that applies advanced web content mining techniques, specifically the integration of Hierarchical Fuzzy C-Means (Hierarchical FCM) clustering with reranking strategies. The preprocessing phase uses Natural Language Processing (NLP) to remove stop words from the input query, thereby refining the query and reducing noise. Subsequently, Hierarchical FCM, a fusion of hierarchical clustering and fuzzy c-means, is employed to form clusters that reflect different types of search results. This method combines the hierarchical structure's ability to handle varying levels of granularity with the adaptability of fuzzy clustering, allowing for more nuanced and flexible cluster formation. The clustered results are then reranked based on a combination of similarity scores and click scores. Similarity scores quantify the relevance of the results to the query, while click scores incorporate user interaction data, so the ranking also reflects actual user preferences. This dual reranking mechanism prioritizes the most relevant and user-preferred results, improving overall retrieval performance. Our model demonstrates significant improvements over existing methods, achieving 8.5% higher precision, 5.9% higher accuracy, and 8.3% higher recall.
These enhancements underscore the effectiveness of the proposed framework in delivering more accurate and relevant search results. The integration of advanced clustering and reranking techniques addresses the shortcomings of traditional methods, providing a robust solution for managing the complexity of web content. This work has significant implications for the field of information retrieval, offering a scalable and efficient approach to meeting the growing demand for precise and relevant information in the digital age.
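
The preprocessing and dual reranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the bag-of-words cosine similarity, the click normalization, and the weight alpha are all assumptions made for illustration, since the abstract does not specify how similarity and click scores are computed or combined. The Hierarchical FCM clustering stage is omitted here.

```python
import math
from collections import Counter

# Assumption: a tiny illustrative stop-word list; the abstract does not name one.
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "to", "and", "is"}


def remove_stop_words(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query, results, alpha=0.7):
    """Order results by alpha * similarity + (1 - alpha) * normalized clicks.

    `results` is a list of (text, click_count) pairs.
    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    query_tokens = remove_stop_words(query)
    # Scale click counts to [0, 1]; fall back to 1 to avoid division by zero.
    max_clicks = max((clicks for _, clicks in results), default=0) or 1
    scored = []
    for text, clicks in results:
        similarity = cosine_similarity(query_tokens, remove_stop_words(text))
        score = alpha * similarity + (1 - alpha) * clicks / max_clicks
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    candidates = [
        ("fuzzy clustering of web pages", 2),
        ("cooking recipes", 10),
        ("the fuzzy clustering tutorial", 8),
    ]
    for score, text in rerank("the fuzzy clustering", candidates):
        print(f"{score:.3f}  {text}")
```

In this sketch a heavily clicked but irrelevant result ("cooking recipes") still ranks below results that match the query, because alpha gives similarity the larger weight. Lowering alpha would shift the balance toward user click behavior.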

Published

2024-10-19

Section

Articles

How to Cite

DESIGN OF AN IMPROVED MODEL FOR INFORMATION RETRIEVAL USING HIERARCHICAL FCM CLUSTERING AND RERANKING. (2024). Machine Intelligence Research, 18(2), 245-260. http://machineintelligenceresearchs.com/index.php/mir/article/view/160