ISEP - DM – Engenharia de Inteligência Artificial
Recent Submissions
- Ensemble AI Solutions for Personalized Sleep Monitoring Using Wrist-worn Wearables
  Publication. SILVA, VASCO ANTÓNIO PORTILHO CARVALHO DA; Conceição, Luis Manuel Silva
  Sleep disorders, including insomnia and sleep apnoea, affect a significant proportion of the global population and are closely linked to cardiovascular, metabolic, and mental health conditions. Accurate and long-term monitoring of sleep is therefore a public health priority, as early detection and personalised management can substantially improve quality of life and reduce healthcare costs. This dissertation explores how wrist-worn wearable devices, combined with advanced machine learning and explainable artificial intelligence (XAI) techniques, can enhance the monitoring and analysis of sleep. While polysomnography (PSG) remains the clinical gold standard for sleep assessment, its cost, intrusiveness, and limited scalability restrict its long-term and widespread applicability. To address these limitations, this work proposes an integrated framework that leverages multimodal data, including photoplethysmography (PPG) and accelerometry, for automatic sleep stage classification and the detection of sleep apnoea. The system incorporates ensemble machine learning models to generate high-quality, personalised insights into sleep quality. Furthermore, explainability is ensured through the application of XAI methods, namely SHAP and LIME, enabling healthcare professionals and end-users to understand and trust model predictions. Experimental validation was conducted using multiple publicly available datasets, demonstrating the system’s robustness and generalisability across heterogeneous populations. Ultimately, this research contributes to the development of transparent, non-invasive, and scalable sleep monitoring solutions. It lays the groundwork for real-world applications in personalised healthcare and the early detection of sleep disorders, promoting better clinical decision-making and long-term well-being.
- Sistema Conversacional Especializado em Laudos de Honorários e Deontologia Médica com Recurso a Grafos de Conhecimento
  Publication. FARIA, RICARDO MIGUEL PEIXOTO; Faria, Luiz Felipe Rocha de
  This document presents the development and evaluation of a conversational system specialized in the domain of fee opinions (laudos de honorários) and medical deontology, built on the LightRAG framework, which combines information retrieval with knowledge graphs to mitigate hallucinations. Given a complex, normative, and sensitive domain, the aim was to ensure responses that are factually grounded in institutional documents. The implemented architecture aligns the user's question with passages retrieved from the corpus and with entities and relations from the graph, which encourages evidence-anchored generation. The evaluation of the conversational system relied on semantic metrics and showed good topical coverage (84%) and high context and entity recall (93% and 92%, respectively), but low retrieval precision and only partial use of the context (24% and 58%, respectively). These results are consistent with the use of small local models for embeddings and generation and with a sparse graph of fifty (50) nodes and five (5) relations, which corresponds to an average degree of 0.2 and a density of 0.00408. The conclusion is that this combination is promising for critical domains, but that its efficiency depends on the quality of the graph, the selectivity of the retriever, and the generative capacity. As future work, the document proposes moving to larger embedding models and LLMs and curating the graph continuously, with the aim of higher precision, better use of context, and a lower probability of hallucination.
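The graph figures quoted in the abstract above (50 nodes, 5 relations, average degree 0.2, density 0.00408) can be reproduced with the usual formulas for a simple undirected graph. The sketch below is only an illustration: the abstract does not say which convention was used, so the undirected formulas and the function name `graph_stats` are assumptions.

```python
def graph_stats(num_nodes: int, num_edges: int) -> tuple[float, float]:
    """Return (average degree, density) for a simple undirected graph.

    Assumption: each relation is one undirected edge, so it adds 1 to the
    degree of two nodes, and the maximum edge count is n * (n - 1) / 2.
    """
    if num_nodes < 2:
        raise ValueError("at least two nodes are required")
    avg_degree = 2 * num_edges / num_nodes
    density = 2 * num_edges / (num_nodes * (num_nodes - 1))
    return avg_degree, density


# Node and relation counts as reported in the abstract.
avg_degree, density = graph_stats(num_nodes=50, num_edges=5)
print(f"average degree = {avg_degree:.1f}, density = {density:.5f}")
```

Under these assumptions the computed values match the reported ones, which suggests that the graph was treated as undirected.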
- Um Estudo Comparativo entre CNNs e Vision Transformers para Reconhecimento Facial em Sistemas de Autenticação
  Publication. FERREIRA, GUSTAVO LEVI VIEIRA; Ramos, Carlos Fernando da Silva
  Facial recognition has established itself as one of the most promising solutions for authentication systems, combining practicality and speed with no explicit interaction required from the user. However, its use in real environments raises critical challenges, especially in balancing productivity and security. Spoofing attacks and fraudulent login attempts pose significant threats that can compromise the reliability and security of these systems. Therefore, this thesis proposes a facial recognition-based authentication mechanism capable of combining performance and resilience in the face of multiple attack attempts. To this end, Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures were explored and compared, evaluating their behavior in terms of speed, performance, and accuracy. The results highlight the advantages and limitations of each approach and point to possible architectures and development directions for a solution that is both useful and secure.
- ARMS: Augmented Reasoning Multi-Agent System
  Publication. OLIVEIRA, FRANCISCO LUÍS PEREIRA; Gomes, Luís Filipe de Oliveira
  The developments brought by the transformer architecture have sparked a technological revolution, creating a wide range of use cases in which these models act as individuals responsible for handling a diverse array of tasks, from chatbots to more deterministic ones such as API calls and control. However, due to the novelty of these models, there has been a lack of standardization when developing proper, controlled real-world implementations. The present time can be regarded as an alchemy-like stage of large-language model usage, with many different innovations born almost every day all around the world. Great investments are also being made in the field, and there has never been a better time to dedicate efforts to discovering and exploring the limits and capacities of this technology of the future. Multi-agent systems belong to a domain of artificial intelligence that has been in development for many years, resulting in refined and mature architectures, communication protocols, and implementation paradigms. However, implementation can be difficult due to the overhead required in orchestrating proper communication protocols, decision engines, and agent architecture. Furthermore, agent-to-human communication is not always seamless, since most agents use a programmatic machine language that may not be easy to interact with for actors who lack context or are not technically inclined. This dissertation proposes a system that aims to fuse the capabilities of large-language models to communicate through natural language and rationalize inputs with the capabilities that distributed multi-agent systems offer to resolve tasks that might be present in industrial and smart-building scenarios. Moreover, through the implementation of specific pieces of hardware, referred to below as tools, the proposed system tries to increase the degree of impact that decisions made by large-language models have on the environment around them. The proposed system, named “Augmented Reasoning Multi-Agent System” (ARMS), also allows users to communicate directly with agents through natural language conversations, facilitating the exchange of information and desires. Agent-to-agent communication is also investigated in depth and controlled using specific techniques to manage communication flow and objective-oriented exchanges. Besides a review of the state of the art on topics related to the solution, which culminates in a discussion of large-language model-powered agents versus traditional agents, this thesis includes five different case studies that test the solution: basic task delegation, interconnected agents, user registration system, vacation system, and building control. These case studies were built incrementally: the most basic core principles were tested in the first use cases, culminating in a final one that integrated multiple previously tested components at a large scale. The case studies yielded positive results in achieving a multi-agent system that can manipulate the world around it and establish human communication as needed, leveraging the capabilities of large-language models for decision-making as well as inter-connection.
- Machine Unlearning Approaches applied to Tree-Based Models with Tabular Data
  Publication. MAGALHÃES, DIANA CATARINA PINTO; Pereira, Isabel Cecília Correia da Silva Praça Gomes; Maia, Eva Catarina Gomes
  The growing need for compliance with data protection regulations, such as the GDPR’s Article 17 “right to be forgotten”, has intensified research efforts in Machine Unlearning (MU), which is the ability of machine learning models to forget specific training data instances without requiring full model retraining. While most prior work has focused on deep learning and image classification, the applicability of MU to traditional models and tabular data is still underexplored. This thesis investigates the integration of MU approaches into tree-based models trained on tabular datasets. For this purpose, an MU framework called Machine Unlearning Framework for Tree-based models (MUFT) was developed, encapsulating two exact unlearning approaches, SISA and DaRE, with SISA being adapted to work with the XGBoost model. The experimental evaluation was conducted using the binary classification version of two datasets, IoT-23 and GeNIS, and included several evaluation metrics to measure model utility, unlearning efficiency, and forgetting quality under removal ratios of 0.1% and 10%. The results showed that SISA and DaRE can achieve effective instance removal with substantially reduced computational costs. Performance, however, varied across datasets and removal ratios. Importantly, the evaluation exposed limitations in existing metrics, which in some cases were not able to fully capture unlearning success and which highlight the need for improved evaluation metrics. Overall, this work demonstrates how MU approaches can be used and adapted to ensure compliance and improve trust in tree-based models.
- Otimização Inteligente na Gestão de Armazéns: Aplicação de Machine Learning para Previsão de Quantidades e Alocação Eficiente
  Publication. FRANCO, ÁLVARO JOSÉ FERNANDES; Faria, Luiz Felipe Rocha de
  This thesis proposes an integrated artificial intelligence framework to optimize warehouse operations through two complementary tasks: forecasting daily product outflows and solving the 3D Bin Packing Problem for space-efficient storage allocation. In the first stage, machine learning techniques are applied to historical warehouse movement data to predict the total quantity of products expected to be transferred each day. In the second stage, the predicted volumes are packed into constrained three-dimensional storage bins using reinforcement learning-based methods. A custom OpenAI Gym environment is developed to simulate realistic packing conditions, including box rotation, collision detection, stacking constraints, and compactness rewards. The agent learns packing strategies through interaction with the environment and is evaluated against traditional heuristic baselines. The main contributions of this work include the development of a reinforcement learning-based environment, carefully designed reward functions that encourage efficient packing behavior, and the integration of product forecasting with spatial decision-making. Together, these elements form a complete pipeline that turns historical warehouse data into smart, automated decisions for daily storage planning.
- Modelos híbridos para previsão de resultados de jogos da Premier League usando machine learning e análise de sentimento
  Publication. NASCIMENTO, RUBENS FABRÍCIO DO ROSÁRIO SOARES; Ramos, Carlos Fernando da Silva
  This study explores whether combining structured match statistics with pre-match tweet sentiment can enhance probabilistic forecasting of football results. Focusing on English Premier League fixtures, it aligns social signals with each game and compares three families of models: those based solely on statistics, those relying only on tweets, and hybrid approaches that integrate both. The evaluation respects the chronological order of matches, employing sequential training and validation together with a strict 2024/25 holdout. In terms of assessment, Log Loss serves as the primary metric, complemented by calibration measures (ECE, Brier, RPS) as well as accuracy. When comparing different families of models, statistical learners provide the strongest foundation. Within this group, an RBF-SVM delivers a holdout Log Loss of 0.9066 with 58.16% accuracy, while a regularised Logistic Regression remains competitive, suggesting that engineered features capture a substantial linear signal. By contrast, tweet-only models offer useful but weaker contributions. The best-performing configuration, a Linear SVM applied to SBERT-MPNet embeddings, records a Log Loss of 1.0313 and an accuracy of 47.89%, yet generalises consistently across both validation and test. Across the different model families, hybrid approaches provide the most consistent improvements. In particular, Early Fusion with Logistic Regression, which combines sentiment with structured inputs, delivers 59.74% accuracy and a Log Loss of 0.8954 on the holdout, together with a Brier Score of 0.1758 and an RPS of 0.1171. Moreover, Residual Stacking extends these gains by further reducing both Log Loss and Expected Calibration Error compared with the statistical baseline, with the benefits especially clear in lower-confidence fixtures and in predicting draws. The main improvements come from modest probability refinements that reduce error penalties without frequent class flips, while also enhancing calibration. At the same time, certain limitations remain, including the focus on a single league, the risk of temporal drift in team performance, and the presence of noise, ambiguity, and attention bias in social text. Taken together, the findings demonstrate that combining structured match data with curated sentiment yields robust and well-calibrated forecasts, particularly valuable in uncertain fixtures and in outcomes that are traditionally harder to predict.
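For readers unfamiliar with the scoring rules named in the abstract above, the following minimal sketch computes Log Loss, a Brier score, and the ranked probability score (RPS) for a single three-outcome forecast. The forecast values are hypothetical, and the normalisations used here (Brier averaged over classes, RPS divided by the number of outcomes minus one) are common conventions that the abstract does not confirm.

```python
import math


def log_loss(probs, outcome):
    """Negative log of the probability assigned to the observed outcome."""
    return -math.log(max(probs[outcome], 1e-15))


def brier_score(probs, outcome):
    """Squared error against the one-hot outcome, averaged over classes
    (an assumed normalisation; some definitions sum instead)."""
    errors = [(p - (1.0 if i == outcome else 0.0)) ** 2
              for i, p in enumerate(probs)]
    return sum(errors) / len(probs)


def ranked_probability_score(probs, outcome):
    """RPS for ordered outcomes: mean squared difference between the
    cumulative forecast and the cumulative one-hot outcome."""
    cum_forecast = 0.0
    cum_observed = 0.0
    total = 0.0
    for i in range(len(probs) - 1):
        cum_forecast += probs[i]
        cum_observed += 1.0 if i == outcome else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (len(probs) - 1)


# Hypothetical forecast for one fixture, ordered (home win, draw, away win).
forecast = [0.5, 0.3, 0.2]
observed = 1  # the match ended in a draw

ll = log_loss(forecast, observed)
brier = brier_score(forecast, observed)
rps = ranked_probability_score(forecast, observed)
print(f"log loss = {ll:.4f}, Brier = {brier:.4f}, RPS = {rps:.4f}")
```

All three scores are lower for better forecasts. RPS differs from the other two in that it respects the ordering of outcomes, so probability placed on a draw is penalised less than probability placed on an away win when the home side wins.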
- Evaluation of explainability AI (XAI) techniques for mitigating ethical and legal challenges
  Publication. ESQUIÇATO, RAFAEL PORCIDONIO FERNANDES; Marreiros, Maria Goreti Carvalho
  The integration of Artificial Intelligence (AI) into healthcare systems raises significant ethical and legal concerns. This study investigates how Explainable AI (XAI) techniques can enhance the transparency and trustworthiness of medical image classification systems. Through a systematic literature review of 860 papers and experiments using COVID-19 radiography and skin lesion datasets, the research identifies and evaluates XAI methods such as Grad-CAM, SHAP, and ABELE. These methods were assessed for their ability to clarify decision-making processes, improve model accountability, and support regulatory compliance. The study proposes an explainability module that combines different techniques to provide human-readable explanations, aiming to bridge the gap between AI predictions and clinical trust. Findings indicate that XAI not only addresses transparency and bias issues but can also improve diagnostic performance and decision support in critical applications.
- Design and implementation of a low-cost computer vision pipeline for amateur football analysis
  Publication. ALVES, RAFAEL NUNO DE SOUSA; Matos, Paulo Sérgio dos Santos; Martins, António Constantino Lopes
  The advancement of computer vision and artificial intelligence has opened new possibilities for sports analytics, particularly in football. This dissertation explores the development of an AI-powered multi-platform application designed to track and analyze amateur football matches without the need for wearable sensors. By leveraging computer vision techniques such as object detection, multi-object tracking, and real-time analytics, this research aims to provide an accessible and cost-effective solution for performance analysis in amateur football. The work presents a systematic review of existing methodologies, identifying key challenges such as occlusion, motion blur, and real-time computational constraints. A methodological framework based on the Design Science Research (DSR) approach guides the investigation, ensuring iterative development, validation, and refinement of the proposed system. The findings of this study lay the groundwork for the future implementation of a fully functional AI-based tracking system. Over the next six months, the research will transition into a practical phase, involving model training, system deployment, and real-world testing. By addressing the identified challenges and leveraging recent advancements in AI and computer vision, this project aims to bridge the gap between professional and amateur sports analytics.
- Ai-driven emotion recognition for mental health diagnoses: Assessing mental health through emotional state evaluation
  Publication. PRETO, PEDRO MIGUEL PERES; Conceição, Luís Manuel Silva; Figueiredo, Ana Maria Neves Almeida Baptista
  Mental health conditions remain a concerning challenge across the globe, requiring timely and reliable approaches to support accurate diagnoses and effective interventions. Traditional assessment methods often rely on subjective self-reports and clinical interviews, which may not always capture the full spectrum of an individual’s emotional state. In this context, computational techniques for emotion analysis provide a complementary perspective by identifying patterns in facial expressions, speech, and language. This dissertation evaluates the potential of multimodal emotional state analysis and its contribution to mental health assessment, through the development of a computational application. A systematic review was conducted to evaluate existing methodologies and highlight their strengths, limitations, and applicability in clinical contexts. Building on this review, the present work explores an integration of visual, vocal, and textual patterns, assessing how their combination can improve the consistency and depth of emotional interpretation. An analysis centered on methodological design was conducted by applying techniques such as preprocessing, fine-tuning, and data augmentation on the datasets to enhance the model’s capacity. Ethical and security considerations were also incorporated to strengthen system robustness and ensure responsible deployment in the market. The proposed solution consists of an artificial intelligence-based multimodal system that integrates the analysis of emotions present in facial expressions, voice, and text patterns to provide a comprehensive assessment of the user’s emotional state. The application’s modular architecture enables real-time processing and the generation of clinical reports. The experimental validation of the system revealed promising results across several domains of the DSM-5, the clinical reference manual that defines diagnostic criteria for mental disorders. High F1-scores were recorded in domains such as Anger (0.84) and Personality Functioning (0.87), while more subtle domains, such as Dissociation (0.43) and Repetitive Behaviors (0.52), revealed more modest performance. The overall analysis resulted in an observed agreement level of 71.9% and a Cohen’s Kappa of 0.42, indicating moderate agreement with the DSM-5. The findings underline the promise of computational emotion analysis as a supplementary tool for mental health professionals, while also emphasizing the importance of critical evaluation of its limitations and careful integration into clinical practice.
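The gap between the 71.9% observed agreement and the Cohen's Kappa of 0.42 reported in the emotion recognition abstract above follows from how kappa discounts agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below illustrates this with hypothetical labels; the label lists, the function names, and the back-calculated chance agreement are illustrative assumptions, not data or results from the dissertation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for p_e."""
    return (p_o - kappa) / (1 - kappa)


# Hypothetical ratings from a system and a reference assessment.
system = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg"]
reference = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg"]
kappa = cohens_kappa(system, reference)

# Chance agreement implied by the figures quoted in the abstract,
# derived here by arithmetic only.
p_e = implied_chance_agreement(p_o=0.719, kappa=0.42)
print(f"example kappa = {kappa:.3f}, implied chance agreement = {p_e:.3f}")
```

Read this way, an observed agreement of 71.9% with kappa 0.42 implies that roughly half of the agreement would be expected by chance alone, which is why the result is described as moderate rather than strong.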
