An approach to manage semantic heterogeneity in unstructured P2P information retrieval systems

In unstructured P2P information retrieval systems, semantic heterogeneity arises from the use of different ontologies. Semantic interoperability refers to the ability of peers to communicate with each other. We treat these two notions separately, as they raise two different problems, and propose two independent and complementary solutions. The GoOD-TA protocol aims at reducing heterogeneity through ontology-driven topology adaptation. DiQuESh is a top-k algorithm for distributed information retrieval that is intended to ensure interoperability. This distinction makes it possible to highlight their respective benefits on IR performance and leads to a modular architecture. For our experiments we obtained a set of actively used real-world ontologies through the NCBO BioPortal. We implemented GoOD-TA and DiQuESh in Java and used the PeerSim simulator. We first show that GoOD-TA reduces the semantic heterogeneity related to the system topology, handles the evolution of peers’ descriptors, and is suitable for dynamic systems. Then, GoOD-TA and DiQuESh are run simultaneously, with a significant increase in precision and recall. This makes it possible to identify the indirect contribution of the heterogeneity reduction obtained with GoOD-TA to improving interoperability.