Publication

Machine learning in tumor classification in breast cancer

File: POSTER_Raquel Machado.pdf (300.88 KB, Adobe PDF)

Abstract(s)

Breast cancer is the leading cause of cancer mortality among women worldwide (1). The disease exhibits discernible patterns, which presents an opportunity to apply machine learning (ML), an approach that has produced effective results in screening and diagnosis. Four ML algorithms, namely Decision Tree, Deep Learning (DL), k-Nearest Neighbors (k-NN) and Naïve Bayes, were tested to construct a predictive model that allows the early classification of a breast tumor as benign or malignant, potentially avoiding the need for a more invasive technique. The ML models were built in RapidMiner Studio and applied to a dataset of 201 individuals with breast cancer, described by attributes such as age, tumor size and presence of invasive nodes (2). The models were evaluated by their accuracy, true negative rate (TNR), true positive rate (TPR), ROC (Receiver Operating Characteristic) curves and AUC (Area Under the Curve). During a first exploratory phase, four clusters were detected: one with smaller tumors, younger patients and a benign diagnosis; one with older patients, larger tumors and a malignant diagnosis; and two more with the opposite characteristics. These characteristics were later found to be important factors in the construction of the Decision Tree. When comparing the models' accuracy, the best model was Naïve Bayes (91.04%), followed by the Decision Tree (90.55%), DL (90.02%) and k-NN (86.32%). The differences in performance were statistically significant for every pair of models (p < 0.05) except DL and the Decision Tree. Naïve Bayes had the highest TPR (98.21%), while DL had the highest TNR (83.15%). The Decision Tree model had the highest AUC (0.976), followed by Naïve Bayes (0.961). The Decision Tree model best achieved our goal: it had the highest AUC, indicating excellent discrimination between benign and malignant tumors, and surpassed Naïve Bayes on this measure while maintaining a similar accuracy and TNR.
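
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' RapidMiner workflow: the labels and scores are invented toy values, malignant is assumed to be the positive class (1), and the function names and the 0.5 decision threshold are hypothetical choices made only for this example. AUC is computed as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score, with ties counted as half.

```python
from typing import Dict, Sequence


def rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, TPR (sensitivity) and TNR (specificity) for binary labels.

    Assumes 1 = malignant (positive) and 0 = benign (negative), and that
    both classes occur at least once in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study's dataset).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # predicted P(malignant)
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

metrics = rates(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

Accuracy, TPR and TNR depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can lead on AUC without leading on accuracy.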

Keywords

Machine learning; Predictive models; Breast cancer

Citation

Lima, A. S., Coutinho, C., Machado, R., Oliveira, A. A., & Faria, B. M. (2024). Machine learning in tumor classification in breast cancer. Proceedings of the 1st Symposium on Biostatistics and Bioinformatics Applied to Health, 19–20. https://recipp.ipp.pt/entities/publication/a634fd4f-6053-47fa-8145-4f876572cba7

Publisher

ESS | P. PORTO Edições

CC License

No CC license