Article Title [English]
Objective: This study aimed to explain how tagging a text corpus can be used to disambiguate the senses of specialized homographs and to increase retrieval precision for scientific texts containing such homographs.
Method: This experimental study used a supervised method, one of the three methods of word sense disambiguation. The research sample consisted of 442 scientific articles in two groups: a control group of 221 full-text articles without tags, and an experimental group of the same 221 articles with tags. Both groups were tested in an information retrieval system to measure the effectiveness of tagging in disambiguating specialized homographs.
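The control-versus-experimental comparison described in the Method can be illustrated with a toy sketch. Everything in it is assumed for illustration and is not taken from the study: the three miniature documents, the homograph "cell", the `term/TAG` tag format, and the helper names `retrieve` and `precision`.

```python
# Toy illustration only: documents, tag format, and query terms are hypothetical.

def retrieve(corpus, term):
    """Return the ids of documents whose token list contains the term."""
    return {doc_id for doc_id, tokens in corpus.items() if term in tokens}


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant (0.0 if none retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Control corpus: the homograph "cell" is untagged in every document.
control = {
    "d1": ["the", "cell", "membrane", "regulates", "transport"],
    "d2": ["each", "battery", "cell", "stores", "charge"],
    "d3": ["the", "cell", "nucleus", "contains", "dna"],
}

# Experimental corpus: the same texts, with each occurrence of the
# homograph carrying a (hypothetical) sense tag.
experimental = {
    "d1": ["the", "cell/BIO", "membrane", "regulates", "transport"],
    "d2": ["each", "battery", "cell/ELEC", "stores", "charge"],
    "d3": ["the", "cell/BIO", "nucleus", "contains", "dna"],
}

# Documents that are relevant to a query about the biological sense.
relevant = {"d1", "d3"}

control_precision = precision(retrieve(control, "cell"), relevant)
experimental_precision = precision(retrieve(experimental, "cell/BIO"), relevant)

print(control_precision, experimental_precision)
```

In this sketch the untagged query also returns the battery document (a false drop), while the tagged query returns only documents in the intended sense.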
Findings: Retrieval in the control group suffered from false drops and reduced precision because of the sense ambiguity of specialized homographs, whereas tagging specialized homographs in the full text of the articles in the experimental group had a direct effect on disambiguating them. It is possible to retrieve the occurrences of a specialized homograph associated with each tag, whereas this is not possible in the control group. The Wilcoxon signed-rank test (P = 0.0001, Z = -5.909) shows that the precision of retrieval results for specialized homographs differs significantly after the tagged text corpus is used in the information retrieval system. Examination of the negative and positive ranks shows that precision increased significantly after the tagged text corpus was used, reaching its maximum value of 1.
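For readers unfamiliar with the test reported in the Findings, the following is a minimal sketch of a Wilcoxon signed-rank Z statistic using the normal approximation. The paired precision values are invented for illustration and are not the study's data. The function name `wilcoxon_z`, the choice to drop zero differences, the tie correction, and the convention of computing Z from the smaller rank sum are all assumptions about one common way the statistic is calculated, not a description of the authors' software.

```python
import math


def wilcoxon_z(before, after):
    """Wilcoxon signed-rank test via the normal approximation.

    Zero differences are dropped, tied absolute differences share their
    average rank, and the variance includes a tie correction. No
    continuity correction is applied. Returns (z, w_plus, w_minus).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Z is computed from the smaller rank sum, so it is never positive.
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(variance)
    return z, w_plus, w_minus


# Hypothetical per-query precision before and after tagging.
before = [0.5, 0.25, 0.75, 0.5, 0.25, 0.5]
after = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

z, w_plus, w_minus = wilcoxon_z(before, after)
print(z, w_plus, w_minus)
```

When every pair improves, as in this invented data, all ranks are positive and the negative rank sum is zero, which is the pattern the examination of negative and positive ranks refers to.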
Conclusion: The precision achieved in retrieving scientific texts is evidence that tagging is acceptably effective in disambiguating specialized homographs and plays an effective role in optimizing the information retrieval system. If retrieval system designers focus on optimizing retrieval formulas for searches on specialized homographs and enable retrieval systems to find related documents, researchers with any physiological, experimental, and knowledge characteristics will be able to access related documents and meet their information needs in a short time. This study revealed the value of the text corpus as a rich knowledge base for the information retrieval system in distinguishing the semantic roles of specialized homographs. Although the research was conducted on a limited corpus, the researcher believes that, because this corpus was designed in a principled way and its texts were deliberately selected, the findings can be generalized to all scientific texts in various fields.