Conference paper

Ontology-based approach for unsupervised and adaptive focused crawling

Abstract: Information from the web is a key resource in the domain of competitive intelligence. Web sources represent large volumes of information to process every day. As the amount of available information grows rapidly, this process becomes overwhelming for experts. To address this challenge, this paper presents a novel approach that processes such sources and extracts only the most valuable pieces of information. The approach is based on an unsupervised and adaptive ontology-learning process. The resulting ontology is used to enhance the performance of a focused crawler. The combination of Big Data and Semantic Web technologies makes it possible to classify information precisely according to domain knowledge while maintaining optimal performance. The approach and its implementation are described, and an evaluation presents the feasibility and performance of the approach.
Contributor: Le2i - Université de Bourgogne
Submitted on: Tuesday, July 18, 2017 - 15:55:07
Last modified on: Wednesday, November 3, 2021 - 08:57:28



Hassan Thomas, Christophe Cruz, Aurélie Bertaux. Ontology-based approach for unsupervised and adaptive focused crawling. MOD International Conference on Management of Data, May 2017, Chicago, United States. ⟨10.1145/3066911.3066912⟩. ⟨hal-01564206⟩


