Logo do repositório
 
Miniatura indisponível
Publicação

Automated integration of transport timetable information

Utilize este identificador para referenciar este registo.
Nome:Descrição:Tamanho:Formato: 
DM_JoaoWesterberg_2020_MEI.pdf4.69 MBAdobe PDF Ver/Abrir

Resumo(s)

The ever-growing Web contains a large amount of data. This large amount of data is useful when combined with applications that can refine it and use it to improve its users’ lives. However, using the data available is not an easy task since most of the information is not represented in machine-friendly formats. Instead, this information is represented in formats ideal for human users, resulting in an additional effort for having machines interpreting, extracting, and integrating it, while at the same time ensuring the consistency of information from different sources. In this project, a solution using an ontology-based integration combined with web robots’ extraction automates the process required for updating information regarding schedules of public transports. An already existing application receives that information and uses it to calculate efficient routes for commuters. The proposed solution can extract information from multiple online sources and transform it into different formats. It can extract and transform the information from PDFs and HTML. The system provides a web service for the exportation of these formats by a route optimization system. This document contains the detailed process of the design and construction of the integration system. It describes the alternatives and selections that lead to the application created. Lastly, it evaluates the solution by performing extraction from several sources relevant to the project’s domain.

Descrição

Palavras-chave

Information Retrieval Web crawling Information Integration Ontology PDF extraction

Contexto Educativo

Citação

Projetos de investigação

Unidades organizacionais

Fascículo

Editora

Licença CC