Publication

Machine learning in tumor classification in breast cancer

dc.contributor.authorLima, Ana Sofia
dc.contributor.authorCoutinho, Carolina
dc.contributor.authorMachado, Raquel
dc.contributor.authorOliveira, Alexandra Alves
dc.contributor.authorFaria, Brígida Mónica
dc.contributor.authorFaria, Brigida Monica
dc.contributor.authorOliveira, Alexandra
dc.date.accessioned2025-11-18T12:00:29Z
dc.date.available2025-11-18T12:00:29Z
dc.date.issued2025-04-14
dc.description.abstractBreast cancer is the primary cause of mortality among women worldwide (1). The disease shows discernible patterns, which makes it a good candidate for machine learning (ML), an approach that has produced effective results in screening and diagnosis. Four ML algorithms were tested (Decision Tree, Deep Learning (DL), k-Nearest Neighbors (k-NN) and Naïve Bayes) to construct a predictive model for the early classification of a breast tumor as benign or malignant, which could avoid the need for a more invasive technique. The ML models were constructed in RapidMiner Studio and applied to a database of 201 individuals with breast cancer and descriptive attributes (e.g. age, tumor size, presence of invasive nodes) (2). The models were evaluated by their accuracy, true negative rate (TNR), true positive rate (TPR), ROC (Receiver Operating Characteristic) curves and AUC (Area Under the Curve). During a first exploratory phase, four clusters were detected: one with smaller tumor sizes, younger patients and a benign diagnosis; one with older patients, larger tumor sizes and a malignant diagnosis; and two more with the opposite characteristics. These characteristics were later found to be important factors in the construction of the Decision Tree. When comparing the models' accuracy, the best model was Naïve Bayes (91.04%), followed by the Decision Tree (90.55%), DL (90.02%) and k-NN (86.32%). The difference in performance is statistically significant for every pair of models (p<0.05) except DL and the Decision Tree. Naïve Bayes presented the highest TPR (98.21%), while DL presented the highest TNR (83.15%). The Decision Tree model presented the highest AUC (0.976), followed by Naïve Bayes (0.961). The Decision Tree model best achieved our goal: it had the highest AUC, which denotes an exceptional sensitivity rate, surpassing Naïve Bayes while maintaining a similar accuracy and TNR.por
dc.identifier.citationLima, A. S., Coutinho, C., Machado, R., Oliveira, A. A., & Faria, B. M. (2024). Machine learning in tumor classification in breast cancer. Proceedings of the 1st Symposium on Biostatistics and Bioinformatics Applied to Health, 19–20. https://recipp.ipp.pt/entities/publication/a634fd4f-6053-47fa-8145-4f876572cba7
dc.identifier.isbn978-989-9045-35-4
dc.identifier.urihttp://hdl.handle.net/10400.22/30955
dc.language.isoeng
dc.peerreviewedn/a
dc.publisherESS | P. PORTO Edições
dc.relation.hasversionhttps://recipp.ipp.pt/entities/publication/a634fd4f-6053-47fa-8145-4f876572cba7
dc.rights.uriN/A
dc.subjectMachine learning
dc.subjectPredictive models
dc.subjectBreast cancer
dc.titleMachine learning in tumor classification in breast cancerpor
dc.typeconference poster
dspace.entity.typePublication
oaire.citation.conferenceDate2024-05-03
oaire.citation.conferencePlacePorto
oaire.citation.endPage20
oaire.citation.startPage19
oaire.citation.titleProceedings of the 1st Symposium on Biostatistics and Bioinformatics Applied to Health
oaire.versionhttp://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyNameFaria
person.familyNameOliveira
person.givenNameBrigida Monica
person.givenNameAlexandra
person.identifierR-000-T1F
person.identifier.ciencia-id0D1F-FB5E-55E4
person.identifier.ciencia-id161A-55D9-C256
person.identifier.orcid0000-0003-2102-3407
person.identifier.orcid0000-0001-5872-5504
person.identifier.ridC-6649-2012
person.identifier.scopus-author-id6506476517
person.identifier.scopus-author-id56340903500
relation.isAuthorOfPublication85832a40-7ef9-431a-be0c-78b45ebbae86
relation.isAuthorOfPublicationd6f940a1-3dba-41d2-9a5e-dc1f313eec07
relation.isAuthorOfPublication.latestForDiscovery85832a40-7ef9-431a-be0c-78b45ebbae86
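
The abstract above compares models by accuracy, TPR, TNR and AUC. As a rough illustration of how these four figures are defined, the following Python sketch computes them on a small invented example. It is not the authors' workflow (the study used RapidMiner Studio), and the function names, the toy labels and scores, and the choice of malignant as the positive class (encoded as 1) are all assumptions made here for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summary_rates(y_true: Sequence[int], y_pred: Sequence[int],
                  positive: int = 1) -> Dict[str, float]:
    """Accuracy, true positive rate (sensitivity), true negative rate (specificity)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float],
            positive: int = 1) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a pair."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = malignant (assumed positive class), 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(summary_rates(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Accuracy, TPR and TNR depend on a single decision threshold, whereas AUC summarizes ranking quality across all thresholds, so a model can lead on one measure and trail on another, as the reported results show.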

Files

Original bundle
Name:
POSTER_Raquel Machado.pdf
Size:
300.88 KB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
4.03 KB
Format:
Item-specific license agreed upon to submission