Publication

Anomaly Detection on Natural Language Processing to Improve Predictions on Tourist Preferences

dc.contributor.author: Meira, Jorge
dc.contributor.author: Carneiro, João
dc.contributor.author: Bolón-Canedo, Verónica
dc.contributor.author: Alonso-Betanzos, Amparo
dc.contributor.author: Novais, Paulo
dc.contributor.author: Marreiros, Goreti
dc.date.accessioned: 2023-02-01T09:50:40Z
dc.date.available: 2023-02-01T09:50:40Z
dc.date.issued: 2022
dc.description.abstract: Argumentation-based dialogue models have been shown to be appropriate for decision contexts in which the aim is to overcome the lack of interaction between decision-makers, whether because they are dispersed, too numerous, or simply unknown. However, to support decision processes with argumentation-based dialogue models, it is necessary to know certain aspects that are specific to each decision-maker, such as preferences, interests, and limitations. Failure to obtain this knowledge could ruin the model’s success. In this work, we sought to facilitate the information acquisition process by studying strategies to automatically predict tourists’ preferences (ratings) for points of interest based on their reviews. We explored different Machine Learning methods to predict users’ ratings. We used Natural Language Processing strategies to predict whether a review is positive or negative and the rating assigned by users on a scale of 1 to 5. We then applied supervised methods such as Logistic Regression, Random Forest, Decision Trees, K-Nearest Neighbors, and Recurrent Neural Networks to determine whether a tourist likes or dislikes a given point of interest. We also took an approach that is distinctive in this field, applying unsupervised anomaly detection techniques. The goal was to improve the supervised model so that it identifies only those tourists who truly like or dislike a particular point of interest; the main objective is not to identify every such tourist but, above all, to avoid mistakes about those who are identified. The experiments showed that the developed models could predict with high accuracy whether a review is positive or negative but had some difficulty in accurately predicting the rating assigned by users. The unsupervised Local Outlier Factor method improved the results, reducing the false positives of Logistic Regression at the cost of an increase in false negatives. [pt_PT]
dc.description.sponsorship: This work was supported by the GrouPlanner Project under the European Regional Development Fund POCI-01-0145-FEDER-29178 and by National Funds through the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within the Projects UIDB/00319/2020 and UIDP/00760/2020. [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.doi: 10.3390/electronics11050779 [pt_PT]
dc.identifier.uri: http://hdl.handle.net/10400.22/22044
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.publisher: MDPI [pt_PT]
dc.relation: POCI-01-0145-FEDER-29178 [pt_PT]
dc.relation: ALGORITMI Research Center
dc.relation: Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development
dc.relation.publisherversion: https://www.mdpi.com/2079-9292/11/5/779 [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [pt_PT]
dc.subject: Machine Learning [pt_PT]
dc.subject: Natural Language Processing [pt_PT]
dc.subject: Sentiment analysis [pt_PT]
dc.subject: Argumentation-based dialogues [pt_PT]
dc.subject: Tourism [pt_PT]
dc.subject: TripAdvisor [pt_PT]
dc.title: Anomaly Detection on Natural Language Processing to Improve Predictions on Tourist Preferences [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.awardTitle: ALGORITMI Research Center
oaire.awardTitle: Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F00319%2F2020/PT
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F00760%2F2020/PT
oaire.citation.issue: 5 [pt_PT]
oaire.citation.startPage: 779 [pt_PT]
oaire.citation.title: Electronics [pt_PT]
oaire.citation.volume: 11 [pt_PT]
oaire.fundingStream: 6817 - DCRRNI ID
oaire.fundingStream: 6817 - DCRRNI ID
person.familyName: Meira
person.familyName: Carneiro
person.familyName: Marreiros
person.givenName: Jorge
person.givenName: João
person.givenName: Goreti
person.identifier.ciencia-id: 5013-AE4F-F111
person.identifier.ciencia-id: AE10-74A4-9DC6
person.identifier.ciencia-id: A412-138E-2389
person.identifier.orcid: 0000-0002-1502-780X
person.identifier.orcid: 0000-0003-1430-5465
person.identifier.orcid: 0000-0003-4417-8401
person.identifier.rid: M-4583-2013
person.identifier.scopus-author-id: 9332465700
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.rights: openAccess [pt_PT]
rcaap.type: article [pt_PT]
relation.isAuthorOfPublication: 1e842d5b-b0fe-4c09-bc2a-f44540b539d2
relation.isAuthorOfPublication: e6c3294b-e8d7-45ac-9303-f763b9257745
relation.isAuthorOfPublication: f084569f-09f5-4d00-b759-aa4a5802f051
relation.isAuthorOfPublication.latestForDiscovery: 1e842d5b-b0fe-4c09-bc2a-f44540b539d2
relation.isProjectOfPublication: c9f04b82-b1b1-46ea-b7f6-87f447dd79f0
relation.isProjectOfPublication: 6eb94c83-adf9-4d9d-a75c-be95f44e3ca5
relation.isProjectOfPublication.latestForDiscovery: c9f04b82-b1b1-46ea-b7f6-87f447dd79f0
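
Illustrative sketch (not part of the repository record, and not the authors' code): the abstract reports that Local Outlier Factor reduced the false positives of Logistic Regression at the cost of more false negatives. One way such a trade-off can arise is to keep a positive prediction only when an LOF model fitted on positive training reviews treats the new review as an inlier. The record does not describe the actual pipeline, so everything below is an assumption: the TF-IDF features, the scikit-learn estimators, the function names `fit_gated_classifier` and `predict_gated`, the `n_neighbors` value, and the made-up toy reviews.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import LocalOutlierFactor


def fit_gated_classifier(texts, labels, n_neighbors=2):
    """Fit TF-IDF + Logistic Regression, plus an LOF novelty detector
    trained only on the positive (label 1) reviews. Hypothetical helper."""
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(texts).toarray()  # dense is fine for toy data
    y = np.asarray(labels)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    # novelty=True allows calling predict() on unseen samples later.
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, novelty=True)
    lof.fit(X[y == 1])
    return vectorizer, clf, lof


def predict_gated(vectorizer, clf, lof, texts):
    """Return (base, gated) predictions. A gated positive requires both a
    positive classifier output and an LOF inlier verdict (+1)."""
    X = vectorizer.transform(texts).toarray()
    base = clf.predict(X)
    inlier = lof.predict(X) == 1  # LOF returns +1 for inliers, -1 for outliers
    gated = np.where((base == 1) & inlier, 1, 0)
    return base, gated


# Made-up toy reviews, for illustration only (1 = likes, 0 = dislikes).
train_texts = [
    "wonderful museum with friendly staff",
    "beautiful view and great atmosphere",
    "lovely garden, really enjoyed the visit",
    "amazing tour, great guide",
    "dirty rooms and rude staff",
    "terrible queue and overpriced tickets",
    "boring exhibition, waste of time",
    "awful food and noisy place",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

vectorizer, clf, lof = fit_gated_classifier(train_texts, train_labels)
new_texts = [
    "great museum and friendly guide",
    "rude staff and terrible food",
    "great if you enjoy queues and noise",
]
base, gated = predict_gated(vectorizer, clf, lof, new_texts)
print("logistic regression:", base)
print("after LOF gate:     ", gated)
```

By construction the gate can only turn positives into negatives, so it can remove false positives but can also discard true positives, which matches the direction of the trade-off stated in the abstract.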

Files

Original bundle
Name: ART5_GECAD_MGT_2022.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission