Focus Web Crawler on Drug Herbs Interaction Patterns

Fatini Nadhirah Mohd Nain, Nurul Hashimah Ahamed Hassain Malim, J. Joshua Thomas, Mei Lan Tan


Pharmaceutical products include cosmetics and drugs. Some pharmaceutical products combine drugs and herbs without considering their interaction effects. Drug-herb interactions (DHIs) are interactions between conventional drugs and herbal medicines. However, the available information on DHIs is scattered across heterogeneous databases and websites, and some of it is held in paid or subscription-only databases. Easy access to information on DHIs would allow researchers to explore the topic further. Therefore, this study proposes improvements to the focused web crawler so that it collects DHI information from heterogeneous resources on the Internet, assigns a priority level to each resource link based on its anchor text and URL, and traverses links with the aid of their depth. The improved focused crawler was tested with two algorithms, namely Breadth-First Search (BFS) and PageRank. The focused web crawler collected DHI information on 4,744 herbs. The accuracy values for Chinese Med Digital Projects and MedlinePlus were 98% for PageRank and 71% for BFS. Additionally, a focused web crawler can gather more relevant web pages than a wide crawler in the same amount of time. Hence, the proposed crawler can successfully gather DHI information on the web in response to user queries.
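
The idea of ranking links by anchor text and URL and limiting traversal by depth can be illustrated with a minimal sketch. This is not the authors' implementation, and it is neither the BFS nor the PageRank variant evaluated in the study; it is a simple best-first frontier under stated assumptions. The keyword list, the scoring weights, the function names (link_priority, focused_crawl, toy_links) and the in-memory TOY_WEB graph are all hypothetical and stand in for real page fetching and parsing.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical topic keywords; the study's actual vocabulary is not given.
KEYWORDS = ("herb", "drug", "interaction")


def link_priority(anchor_text, url, keywords=KEYWORDS):
    """Score a link by keyword hits in its anchor text and URL path.

    Assumption: anchor-text hits count double. The weights are illustrative.
    """
    text = anchor_text.lower()
    path = urlparse(url).path.lower()
    anchor_hits = sum(1 for k in keywords if k in text)
    url_hits = sum(1 for k in keywords if k in path)
    return 2 * anchor_hits + url_hits


def focused_crawl(seed, get_links, max_depth=2, max_pages=50):
    """Visit pages in descending link priority, never expanding past max_depth.

    get_links(url) must return (anchor_text, target_url) pairs; here it is a
    stand-in for downloading and parsing a page.
    """
    # Heap entries: (-priority, depth, insertion order, url).
    frontier = [(0, 0, 0, seed)]
    seen = {seed}
    visited = []
    counter = 1
    while frontier and len(visited) < max_pages:
        _, depth, _, url = heapq.heappop(frontier)
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for anchor, link in get_links(url):
            if link in seen:
                continue
            seen.add(link)
            score = link_priority(anchor, link)
            if score == 0:
                continue  # no topic keyword in anchor or URL: skip the link
            heapq.heappush(frontier, (-score, depth + 1, counter, link))
            counter += 1
    return visited


# Toy link graph used instead of real HTTP requests (hypothetical URLs).
TOY_WEB = {
    "https://example.org/": [
        ("About us", "https://example.org/about"),
        ("Herb-drug interactions", "https://example.org/herb-drug-interactions"),
        ("Herbs A-Z", "https://example.org/herbs"),
    ],
    "https://example.org/herbs": [
        ("Ginkgo", "https://example.org/herbs/ginkgo"),
    ],
    "https://example.org/herb-drug-interactions": [
        ("Warfarin interaction", "https://example.org/drug/warfarin"),
    ],
}


def toy_links(url):
    return TOY_WEB.get(url, [])


if __name__ == "__main__":
    for page in focused_crawl("https://example.org/", toy_links):
        print(page)
```

In this toy run the off-topic "About us" link is never visited, and the link whose anchor text and URL both match the topic is visited before the weaker matches. A real crawler would replace toy_links with page fetching and HTML parsing, and would also need politeness controls such as robots.txt handling and rate limiting.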

