Scalable data analytics using crowdsourced repositories and streams

Veloso, Bruno; Leal, Fátima; González-Vélez, Horacio; Malheiro, Benedita; Burguillo, Juan Carlos

http://hdl.handle.net/10400.22/11910

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
ART_LSA_BeneditaMaleiro_ 2018.pdf		927.99 KB	Adobe PDF

Abstract

The scalable analysis of crowdsourced data repositories and streams has quickly become a critical experimental asset in multiple fields. It enables the systematic aggregation of otherwise dispersed data sources and their efficient processing using significant amounts of computational resources. However, the considerable amount of crowdsourced social data and the numerous criteria to observe can limit analytical off-line and on-line processing due to the intrinsic computational complexity. This paper demonstrates the efficient parallelisation of profiling and recommendation algorithms using tourism crowdsourced data repositories and streams. Using the Yelp data set for restaurants, we have explored two different profiling approaches, entity-based and feature-based, using ratings, comments, and location. Concerning recommendation, we use a collaborative recommendation filter employing singular value decomposition with stochastic gradient descent (SVD-SGD). To accurately compute the final recommendations, we have applied post-recommendation filters based on venue suitability, value for money, and sentiment. Additionally, we have built a social graph for enrichment. Our master–worker implementation shows super-linear scalability for 10, 20, 30, 40, 50, and 60 concurrent instances.
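As an illustration of the SVD-SGD collaborative filter named in the abstract, the following is a minimal, single-process sketch of biased matrix factorisation trained with stochastic gradient descent. It is not the authors' implementation and does not cover the master–worker parallelisation, the profiling step, or the post-recommendation filters. The function names (`train_svd_sgd`, `predict`, `rmse`), the hyperparameters, and the toy ratings are all assumptions introduced here for illustration only.

```python
import math
import random


def train_svd_sgd(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit a biased matrix factorisation model by SGD.

    ratings: list of (user, item, rating) triples.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = {u: 0.0 for u in users}                       # user biases
    bi = {i: 0.0 for i in items}                       # item biases
    # Small random latent factors for every user and item.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    data = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(data)  # visit the observed ratings in random order
        for u, i, r in data:
            pred = mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - pred
            # Gradient steps with L2 regularisation.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, bu, bi, p, q


def predict(model, user, item):
    """Predicted rating; unknown users or items fall back to the known terms."""
    mu, bu, bi, p, q = model
    score = mu + bu.get(user, 0.0) + bi.get(item, 0.0)
    if user in p and item in q:
        score += sum(a * b for a, b in zip(p[user], q[item]))
    return score


def rmse(model, ratings):
    """Root mean squared error of the model on the given triples."""
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


# Toy, made-up ratings (user, restaurant, stars); not taken from the Yelp data set.
toy_ratings = [
    ("ana", "r1", 5), ("ana", "r2", 3), ("ana", "r3", 4),
    ("bob", "r1", 4), ("bob", "r2", 2), ("bob", "r4", 1),
    ("eva", "r2", 3), ("eva", "r3", 5), ("eva", "r4", 2),
]
model = train_svd_sgd(toy_ratings)
print("training RMSE:", rmse(model, toy_ratings))
print("predicted rating for (ana, r4):", predict(model, "ana", "r4"))
```

In a master–worker setting such as the one the abstract describes, one plausible arrangement would have each worker run this kind of routine over its own partition of the data, but how the paper actually distributes the work is not stated in this record.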

Keywords

High performance computing; Crowdsourcing; Recommender systems; Big data; Data analytics; Parallel processing; Distributed computing; Smart tourism

Publisher

Elsevier

DOI

10.1016/j.jpdc.2018.06.013

Collections

ISEP – LSA – Artigos