Conference paper. Year: 2017

Ontology-based approach for unsupervised and adaptive focused crawling

Abstract

Information from the web is a key resource in the domain of competitive intelligence. Web sources represent large volumes of information that must be processed every day. As the amount of available information grows rapidly, this processing becomes overwhelming for experts. To address this challenge, this paper presents a novel approach for processing such sources and extracting only the most valuable pieces of information. The approach is based on an unsupervised and adaptive ontology-learning process. The resulting ontology is used to enhance the performance of a focused crawler. The combination of Big Data and Semantic Web technologies makes it possible to classify information precisely according to domain knowledge while maintaining optimal performance. The approach and its implementation are described, and the feasibility and performance of the approach are presented.
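To make the idea of an ontology-guided focused crawler concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed toy ontology (`ONTOLOGY`) and an in-memory link graph (`PAGES`), both hypothetical, and it scores pages by simple term overlap; the unsupervised, adaptive ontology-learning step described in the abstract is not reproduced here.

```python
import heapq
import re

# Hypothetical toy ontology: concept -> lexical terms (illustrative only).
ONTOLOGY = {
    "Merger": {"merger", "acquisition", "takeover"},
    "Product": {"launch", "product", "release"},
}

# Hypothetical in-memory "web": url -> (page text, outgoing links).
PAGES = {
    "a": ("Company news and product launch", ["b", "c"]),
    "b": ("Merger and acquisition announced after takeover bid", ["d"]),
    "c": ("Recipes for the weekend", ["e"]),
    "d": ("New product release follows merger", []),
    "e": ("More recipes", []),
}


def relevance(text, ontology=ONTOLOGY):
    """Return the fraction of tokens in `text` that match an ontology term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    terms = set().union(*ontology.values())
    return sum(token in terms for token in tokens) / len(tokens)


def focused_crawl(seed, pages=PAGES, threshold=0.2, max_pages=10):
    """Best-first crawl: each link inherits the score of the page linking to it."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # unknown URL: nothing to fetch in this toy setting
        visited += 1
        text, links = pages[url]
        score = relevance(text)
        if score >= threshold:
            relevant.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return relevant


if __name__ == "__main__":
    print(focused_crawl("a"))
```

With a budget of `max_pages=3`, the crawler in this sketch spends all three fetches on pages reached through on-topic links and never visits the off-topic branch, which is the behavior that distinguishes a focused crawler from a breadth-first one.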

Domains

Web
File not deposited

Dates and versions

hal-01564206, version 1 (18-07-2017)

Cite

Hassan Thomas, Christophe Cruz, Aurélie Bertaux. Ontology-based approach for unsupervised and adaptive focused crawling. SIGMOD International Conference on Management of Data, May 2017, Chicago, United States. ⟨10.1145/3066911.3066912⟩. ⟨hal-01564206⟩