Article Type: Research Article
Article Title [English]
Author [English]
Objective: This study aims to extract an optimal model for the implementation, management, and publication of Iran's General Administrative Thesaurus based on semantic web standards (SKOS/RDF) using open-source software.
Methodology: This research used the action research method within the framework of Lewin’s three-phase model (planning, execution, and evaluation). The research population is the dataset of Iran's General Administrative Thesaurus. In the planning phase, the initial state of the thesaurus data was analyzed, and three models for data preparation were identified and evaluated. Based on this analysis, an Excel template compatible with SKOS-Play was developed. In the execution phase, the data were organized in a spreadsheet and converted into RDF format. To ensure data validity, the SKOS-Play validation rules were reviewed, and the errors identified were corrected. The thesaurus was then uploaded into VocBench, processed, and exported in Turtle format.
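The spreadsheet-to-RDF step described above can be illustrated with a minimal sketch. It is not the SKOS-Play converter or the study's template: the column names (`id`, `prefLabel`, `lang`, `broader`), the base URI `http://example.org/thesaurus/`, and the sample rows are all assumptions made for illustration. Only the SKOS namespace and property names come from the W3C standard.

```python
import csv
import io

# Hypothetical base URI and column layout; the study's actual SKOS-Play
# template and URI scheme are not given in the abstract.
BASE = "http://example.org/thesaurus/"

# Hypothetical sample rows standing in for the prepared spreadsheet.
SAMPLE = """id,prefLabel,lang,broader
c1,Administration,en,
c2,Personnel management,en,c1
"""


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_turtle(csv_text):
    """Convert spreadsheet rows into a SKOS document serialized as Turtle."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"
        label = escape_literal(row["prefLabel"])
        statements = [
            "a skos:Concept",
            f'skos:prefLabel "{label}"@{row["lang"]}',
        ]
        # A broader link is emitted only when the cell is filled in.
        if row["broader"]:
            statements.append(f"skos:broader <{BASE}{row['broader']}>")
        lines.append(f"{subject} " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines) + "\n"


turtle = rows_to_turtle(SAMPLE)
print(turtle)
```

In practice a dedicated converter and validator would also handle collections, alternative labels, and integrity checks; the sketch only shows how one row maps to a handful of triples.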
Findings: The study yielded a six-stage model for managing and publishing Iran's General Administrative Thesaurus. The model consists of: (1) Data preparation, (2) Conversion of the thesaurus dataset to RDF, (3) Transfer of RDF data to VocBench, (4) Serialization of the thesaurus dataset in Turtle format, (5) Publication of the thesaurus dataset in Skosmos, and (6) Provision of access and retrieval services. The thesaurus consists of 564 concepts, five main collections, and 18 sub-thesauri, comprising 3,136 RDF triples, for an average density of 5.56 triples per concept. Three primary access methods were implemented: (1) Browsing via a web-based system, (2) Using a RESTful API, and (3) Executing semantic queries through SPARQL. In addition, standard data formats, including RDF, Turtle, N-Triples, and N-Quads, were provided for data retrieval and integration into other systems.
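The reported density follows directly from the two counts given above, as the short sketch below shows. The SPARQL string is an assumed example of the kind of query the endpoint could answer (counting resources typed as `skos:Concept`); it is not taken from the study, and no endpoint address is assumed.

```python
# Counts reported in the findings.
CONCEPTS = 564
TRIPLES = 3136

# Average number of RDF triples per concept.
density = TRIPLES / CONCEPTS
print(f"{density:.2f}")

# Illustrative SPARQL query (an assumption, not the study's own query)
# that counts the concepts in a SKOS vocabulary.
COUNT_CONCEPTS_QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT (COUNT(DISTINCT ?c) AS ?concepts)
WHERE { ?c a skos:Concept . }
"""
```

Run against the published dataset, a query of this form would be expected to return the concept count that serves as the denominator of the density figure.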
Conclusion: The model developed in this research is a comprehensive, process-driven approach that can be generalized to other thesauri. The average density of 5.56 RDF triples per concept indicates well-structured relationships among concepts, which supports semantic search and information retrieval on the semantic web. Moreover, the hierarchical structure, the assignment of globally unique identifiers (URIs), and the resolution of technical challenges related to Persian language processing significantly improved the accuracy and efficiency of the thesaurus compared to similar projects. The availability of multiple data formats (RDF, Turtle, N-Triples, N-Quads) and access via REST API and SPARQL also facilitates the integration of thesaurus data into knowledge management systems. A key application of this study is the integration of the thesaurus into administrative automation systems across the country, enabling interoperability and the gradual standardization of terminology within governmental organizations. The study shows that adopting semantic web standards and open-source tools provides a sustainable, operational model for managing and publishing national thesauri, and that this model can serve as a framework for future national projects in this domain.
Keywords [English]