Focus Web Crawler on Drug Herbs Interaction Patterns

Fatini Nadhirah Mohd Nain, Nurul Hashimah Ahamed Hassain Malim, J. Joshua Thomas, Mei Lan Tan


Pharmaceutical products include cosmetics and drugs. Some of these products combine drugs and herbs without considering their interaction effects. Drug-herb interactions (DHIs) are interactions between conventional drugs and herbal medicines. However, the available information on DHIs is scattered across heterogeneous databases and websites, and some of the databases require payment or a subscription. Easier access to DHI information would allow researchers to explore the topic further. Therefore, this study proposes improvements to the focused web crawler so that it collects DHI information from heterogeneous resources on the Internet, assigns priority levels to resource links based on their anchor text and URLs, and traverses links guided by depth. The improved focused crawler was tested with two algorithms, Breadth-First Search (BFS) and PageRank. The focused web crawler gathered DHI information covering 4,744 herbals. The accuracy values for Chinese Med Digital Projects and MedlinePlus were 98% for PageRank and 71% for BFS. Additionally, a focused web crawler may gather more relevant web pages than a wide crawler in the same amount of time. Hence, the proposed crawler may successfully gather DHI information from the web in response to user queries.
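
The link-prioritisation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list, the scoring rule (a simple count of keyword hits in the anchor text and URL path), the names `link_priority` and `focused_crawl`, and the toy link graph are all assumptions made for illustration, and no real pages are fetched.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical topic keywords; the source does not list the terms it used.
KEYWORDS = ("herb", "drug", "interaction")


def link_priority(url, anchor_text):
    """Count topic keywords found in the anchor text and in the URL path."""
    path = urlparse(url).path.lower()
    text = anchor_text.lower()
    return sum(k in text for k in KEYWORDS) + sum(k in path for k in KEYWORDS)


def focused_crawl(seed, get_links, max_depth=2):
    """Visit pages in order of link priority, never deeper than max_depth.

    get_links(url) must return a list of (target_url, anchor_text) pairs.
    """
    # heapq is a min-heap, so priorities are negated to pop the best link first.
    frontier = [(0, 0, seed)]  # (negated priority, depth, url)
    seen = {seed}
    order = []
    while frontier:
        _, depth, url = heapq.heappop(frontier)
        order.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not expand this page
        for target, anchor in get_links(url):
            priority = link_priority(target, anchor)
            if target in seen or priority == 0:
                continue  # skip already queued links and off-topic links
            seen.add(target)
            heapq.heappush(frontier, (-priority, depth + 1, target))
    return order


# Toy in-memory link graph standing in for real web pages (illustrative only).
TOY_WEB = {
    "https://example.org/": [
        ("https://example.org/herb-drug-interactions", "Herb-drug interactions"),
        ("https://example.org/about", "About us"),
        ("https://example.org/herbs", "Herbs A-Z"),
    ],
    "https://example.org/herb-drug-interactions": [
        ("https://example.org/herbs/ginkgo", "Ginkgo drug interaction"),
    ],
    "https://example.org/herbs": [
        ("https://example.org/herbs/ginkgo", "Ginkgo"),
    ],
}


def toy_links(url):
    """Return the outgoing (url, anchor text) pairs of a toy page."""
    return TOY_WEB.get(url, [])


visited = focused_crawl("https://example.org/", toy_links, max_depth=2)
```

In this toy run the crawler follows the on-topic links first and never queues the "About us" page, because neither its anchor text nor its URL contains a topic keyword. A plain BFS crawler would instead visit every link level by level regardless of relevance.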



This work is licensed under a Creative Commons Attribution 3.0 License.