ISEP - DM – Engenharia de Inteligência Artificial
Browsing ISEP - DM – Engenharia de Inteligência Artificial by Scientific and Technological Domain (FOS) "Engineering and Technology"
- Adversarial agent for synthetic data generation for phishing detection
  Publication. CARDOSO, FRANCISCO FONSECA FERREIRA; Pereira, Isabel Cecília Correia da Silva Praça Gomes; Maia, Eva Catarina Gomes
  Phishing attacks continue to be a significant security challenge, causing financial and reputational damage to organizations and individuals, with email as the primary attack vector. While modern defenses continue to rely on phishing detection systems, their effectiveness is being challenged by the evolution of these attacks. Attackers are moving from generic emails to highly personalised and context-specific messages, which conventional models struggle to detect. The performance of these systems is mostly limited by the scarcity of specialised, domain-specific training data needed to recognise such threats. This thesis addresses this gap by introducing CANDACE, a modular framework designed to generate context-aware synthetic email messages to train and improve these detection systems. The main innovation of CANDACE is its dual Knowledge Graph (KG) architecture, which gives the generation process a contextual foundation. The first KG maps external, real-world information about an organization, while the second models its internal structure, such as employees and projects. A Small Language Model (SLM) then uses the information in these KGs, together with other important components such as URLs, to generate an email message that is contextually relevant to the domain of the organization. The contributions of this work include the complete design, end-to-end implementation, and validation of the CANDACE pipeline. A case study in the Public Administration sector demonstrates the framework’s ability to produce convincing, context-aware synthetic messages. The findings confirm that contextual grounding is essential for creating better and more focused training data.
  This research shows the need to move beyond generic email datasets in order to build more resilient detection systems capable of detecting more sophisticated and personalised phishing attacks.
- AI-based synthesis of bacterial colony evolution images
  Publication. SILVA, MIGUEL ÂNGELO FERRAZ DA; Martinho, Diogo Emanuel Pereira; Marreiros, Maria Goreti Carvalho
  The growing demand for safety and efficiency in healthcare highlights the importance of optimising sterilisation procedures, where delays or errors can compromise patient outcomes. In this context, microbiological analysis of agar plates is a fundamental step, as it allows the identification of microbial growth that may compromise sterilisation quality. However, traditional inspection methods are time-consuming and rely heavily on manual observation, which limits their scalability in clinical environments. Meanwhile, Artificial Intelligence has demonstrated strong potential in image analysis and forecasting, offering opportunities to enhance microbiological analysis and support decision-making in healthcare workflows. This dissertation addresses the problem of detecting and predicting the growth of bacterial colonies on agar plates. Anticipating how colonies evolve is essential to evaluate contamination levels, yet this task remains challenging due to the natural variability of growth patterns, the occurrence of overlapping colonies, and the diversity of experimental conditions that affect microbial behaviour. To tackle this problem, an integrated application was developed and structured into three main modules. The first is a detection module that applies the YOLO object detection architecture to identify bacterial colonies from agar plate images. The second is a synthetic forecasting module based on convolutional autoencoders capable of predicting future colony states from early observations. The third is a contamination analysis module that translates predictions into interpretable indicators such as colony count, average size, growth rate, and coverage. Together, these modules form a complete pipeline designed to combine visual fidelity with biological relevance.
  The results show that the system can detect colonies with high accuracy, achieving a Precision of 99.1%, a Recall of 91.7%, and an F1 score of 95.3%. In addition, the forecasting module generated realistic predictions of colony growth, and the contamination analysis provided meaningful metrics across different experimental conditions. The exploration of different temporal intervals revealed complementary trade-offs between predictive detail and biological plausibility, reinforcing the flexibility of the proposed methodology. The main conclusion of this dissertation is that Artificial Intelligence can be effectively applied to predict microbial growth in laboratory settings. By integrating detection, forecasting, and contamination analysis within a single framework, this work establishes a technological foundation that supports the transition to more intelligent sterilisation workflows and contributes to the broader vision of safe, efficient, and smart healthcare environments.
- Ai-driven emotion recognition for mental health diagnoses: Assessing mental health through emotional state evaluation
  Publication. PRETO, PEDRO MIGUEL PERES; Conceição, Luís Manuel Silva; Figueiredo, Ana Maria Neves Almeida Baptista
  Mental health conditions remain a concerning challenge across the globe, requiring timely and reliable approaches to support accurate diagnoses and effective interventions. Traditional assessment methods often rely on subjective self-reports and clinical interviews, which may not always capture the full spectrum of an individual’s emotional state. In this context, computational techniques for emotion analysis provide a complementary perspective by identifying patterns in facial expressions, speech, and language. This dissertation evaluates the potential of multimodal emotional state analysis and its contribution to mental health assessment, through the development of a computational application. A systematic review was conducted to evaluate existing methodologies and highlight their strengths, limitations, and applicability in clinical contexts. Building on this review, the present work explores the integration of visual, vocal, and textual patterns, assessing how their combination improves the consistency and depth of emotional interpretation. An analysis centered on methodological design was conducted by applying techniques such as preprocessing, fine-tuning, and data augmentation to the datasets to enhance the model’s capacity. Ethical and security considerations were also incorporated to strengthen system robustness and ensure responsible deployment in the market. The proposed solution consists of an artificial-intelligence-based multimodal system that integrates the analysis of emotions present in facial expressions, voice, and text patterns to provide a comprehensive assessment of the user’s emotional state.
  The application’s modular architecture enables real-time processing and the generation of clinical reports. The experimental validation of the system revealed promising results across several domains of the DSM-5, the clinical reference manual that defines diagnostic criteria for mental disorders. High F1-scores were recorded in domains such as Anger (0.84) and Personality Functioning (0.87), while more subtle domains, such as Dissociation (0.43) and Repetitive Behaviors (0.52), revealed more modest performance. The overall analysis resulted in an observed agreement level of 71.9% and a Cohen’s Kappa of 0.42, indicating moderate agreement with the DSM-5. The findings underline the promise of computational emotion analysis as a supplementary tool for mental health professionals, while also emphasizing the importance of critical evaluation of its limitations and careful integration into clinical practice.
- AI-driven information retrieval system for candidate screening
  Publication. Silva, Vasco Reid Ferreira da; Conceição, Luís Manuel da Silva
  Efficient screening and evaluation in the recruitment process are tasks that demand substantial time and effort from Human Resources professionals. These processes often suffer from long waiting periods, inconsistent candidate evaluation, and the potential to overlook qualified candidates. In this context, leveraging state-of-the-art natural language processing architectures, specifically large language models (LLMs), holds significant promise. LLMs can generate evaluations using advanced prompt techniques to improve the accuracy and reliability of the output. This thesis investigates the feasibility of employing 7-billion-parameter LLMs in candidate screening to reduce response times, decrease workload, and improve evaluation consistency. The study involves a comparative analysis of various state-of-the-art large language models to identify those most suitable for this application. Additionally, it examines different prompt engineering techniques to optimize the performance of these models. A comprehensive analysis of the results is conducted to determine the most effective combinations of LLMs and prompt engineering techniques. This includes a two-way validation process, utilizing both the state-of-the-art GPT-4 model and manual human resources validation, to ensure the robustness and reliability of the findings. The outcomes of this thesis aim to enhance the quality of candidate screening by integrating LLMs into the process. Furthermore, this work aspires to provide valuable insights into the capabilities of 7-billion-parameter large language models in the field of human resources and their application in real-world scenarios.
- Ajuste dinâmico de dificuldade em videojogos usando aprendizagem automática [Dynamic difficulty adjustment in video games using machine learning]
  Publication. Felício, Jorge Emanuel Coelho Mendonça de Anciães; Faria, Luiz Felipe Rocha de
  In the constantly evolving field of video games, traditional difficulty settings fail to accommodate the wide range of skill levels among players. The resulting mismatch between the player’s skill and the game’s challenge can make the game boring for skilled players or frustrating for less experienced ones, negatively affecting player engagement. Dynamic Difficulty Adjustment (DDA) seeks to resolve this issue by adapting the game’s difficulty in real time in response to the player’s performance. While advancements in artificial intelligence (AI), particularly machine learning (ML), have enabled more adaptive DDA systems, the full potential of certain advanced techniques or tools has yet to be explored. This thesis thus explores possible innovations in the integration of AI in DDA systems for video games. The research begins by reviewing the techniques used for DDA, focusing on methodologies such as player modeling, rule-based systems, and ML. Based on this review, potential areas for innovation were identified, and the application of Deep Reinforcement Learning (DRL) in the Unity game development platform through the ML-Agents toolkit was chosen as a promising approach. Using this methodology, this research aims to implement a DDA system that adjusts a game’s difficulty based on the player’s skills, enhancing their engagement and maintaining a consistent challenge. The project has several critical phases of development: the creation of a game prototype, data collection for model training, development and integration of the DDA system into the game prototype, and an experiment comparing the prototype with DDA integrated against a version of the prototype that used traditional static difficulty scaling.
  The experiment was conducted with 20 participants of varying skill levels and used a combination of collected gameplay metrics and a modified Game Experience Questionnaire (GEQ) survey to evaluate the DDA system’s effectiveness. The results showed that the DDA system produced a statistically significant increase in the player engagement component and appropriately adjusted the difficulty to be harder for participants of higher skill. However, the system sometimes made drastic adjustments in difficulty between levels, which led to a slightly lower Post-Game positive experience score compared to the static difficulty scaling system. Despite these fluctuations, the proposed system demonstrates the potential of the ML-Agents toolkit for implementing DDA with DRL in games made on the Unity platform. By identifying underexplored areas in the current literature and applying advanced techniques like DRL, this thesis aims to contribute to both academic research and game development regarding the approach to DDA in video games.
- An intelligent hybrid recommender system improved with Association Rules
  Publication. Moreira, João Filipe Coelho; Santos, Joaquim Filipe Peixoto dos
  With the popularization of the Internet and the maturation of associated technologies, the digital environment has evolved into a global marketplace facilitating the exchange of goods and services, commonly referred to as e-commerce. This market has experienced substantial growth due to the expansion of product catalogues and the rising demand for effective recommender systems that enhance user experience and boost the competitiveness of companies. This dissertation examines the current landscape of e-commerce recommender systems, analysing the techniques currently in use, their limitations, and evaluation methods. It also proposes a hybrid approach that integrates recommendation techniques with association rules derived from historical purchase data, assigning weights to balance the influence of each technique. The primary goal is to provide users with personalised and effective recommendations, leveraging the combination of established recommendation methods with association rules, to mitigate existing limitations. The effectiveness of the components in this hybrid approach is evaluated using standard metrics, supplemented by feedback from test users, which aids in adjusting the weights and analysing the relevance of the recommendations. The findings of this approach contribute to increased user satisfaction on e-commerce platforms, although the creation of meaningful association rules requires substantial amounts of data.
- Animal route prediction using artificial intelligence
  Publication. Azevedo, Catarina Peniche Brandão; Ramos, Carlos Fernando da Silva
  The conservation of wildlife is becoming increasingly critical, especially for endangered species, which face threats from habitat destruction and human interference. This dissertation explores the application of artificial intelligence to predict animal migration routes, an important aspect of species conservation. By using historical GPS tracking data, this study seeks to improve the understanding of the movement patterns of migratory animals. This work starts by addressing several research questions that culminate in the main question, ‘How can artificial intelligence be used in predicting animal migration routes?’. These questions focus on the primary techniques and algorithms applied in these cases, the main tracking mechanisms used to gather animal movement information, and the societal implications of the use of AI in this context. Following the systematic review, a feedforward neural network model was designed and developed for animal route prediction. The choice of this model reflects the need for a computationally efficient solution capable of handling the complex data derived from the GPS tracking of African elephants. The model’s performance was improved with hyperparameter tuning, and metrics such as mean squared error (MSE) and R-squared were used, demonstrating promising predictive accuracy. By combining AI techniques with wildlife conservation efforts, this work aims to contribute towards mitigating the adverse impacts of human intrusion on migration corridors and enhance efforts to protect endangered species.
- Anomaly behavior detection in web
  Publication. David, Gabriel Henrique Ribeiro; Marreiros, Maria Goreti Carvalho
  In the domain of web application development, JavaScript plays an important role in enhancing the productivity and interactivity of web applications. However, its flexibility and dynamic nature also introduce potential security risks. Attackers can exploit vulnerabilities in JavaScript to perform various malicious activities, such as data theft, injection attacks, and unauthorized web modifications, including data tampering. This work introduces a novel approach to enhancing the security of web applications by focusing on malicious behavior executed through client-side JavaScript. The core objective of this research is to develop a model capable of identifying anomalous behaviors caused by third-party scripts on web pages. To this end, the research conducts a comparative analysis of four distinct models: One-class SVM, Isolation Forest, Local Outlier Factor, and Autoencoders. To identify the most effective solution, these models are evaluated on specific performance metrics, including Area Under the Curve (AUC) and F-score. The selected model is used to pinpoint irregularities indicative of potential security breaches or malicious activities. This research advances the field of web application security by providing actionable insights to enhance real-time response capabilities. By addressing the growing threat posed by malicious JavaScript, this work contributes to the development of more robust security measures. The dissertation employs a multi-faceted methodology to ensure a comprehensive approach. Initially, a systematic review methodology is used for a structured and unbiased literature analysis, providing a thorough understanding of the current state of the art. The CRISP-DM framework is adopted for the development phase, facilitating continuous adaptation in response to evolving insights.
  A Comparative Analysis methodology rigorously evaluates different anomaly detection algorithms, assessing their practical applicability in real-world scenarios. The findings demonstrate that the chosen model can effectively identify anomalies with high accuracy and minimal false positives. This research highlights the importance of integrating anomaly detection with existing Data Loss Prevention (DLP) solutions to monitor and protect sensitive data against cyber-attacks.
- Aplicação técnicas aprendizagem automática no cancro da mama [Application of machine learning techniques to breast cancer]
  Publication. Santos, José Carlos Cordeiro Andrade; Marreiros, Maria Goreti Carvalho
  Breast cancer remains a major public health problem both internationally and nationally, so the question of how to approach it remains of great interest. In Portugal, about 7,000 new cases of breast cancer are detected each year, and 1,800 women die of the disease. According to the Direção-Geral da Saúde (Directorate-General of Health) standard for imaging of the female breast, all asymptomatic women aged between 50 and 69 should have a screening mammogram every two years. In the presence of morphological changes, or in women at moderate to high risk of breast cancer, the attending physician may suggest bringing the mammogram forward and complementing the diagnostic investigation with whatever methods they deem necessary. If the cancer is detected early, the probability of effective and successful treatment is much higher. Magnetic resonance imaging (MRI) is an examination with high sensitivity and moderate specificity, suggested for young patients with a substantially increased risk, i.e., those with a genetic predisposition or a family history of the disease. This examination uses radiofrequency waves in a strong magnetic field to obtain more detailed images of the internal tissues of the breast. However, its use is limited by its lower (immediate) availability compared with other examinations and by its cost, and it is contraindicated in people with claustrophobia, metallic devices such as pacemakers or prostheses, or reactions to the contrast medium.
  This thesis therefore aims to develop a machine learning tool based on Cycle-Consistent Generative Adversarial Networks, capable of converting a mammography image into one resembling the output of an MRI scan, in order to provide a better perception of the surgical field and increase health gains. The dataset was provided by the Centro Hospitalar Universitário de São João and contained volumes of successive cross-sectional slices of breasts. In this case, the slice with the maximum cross-sectional area was the only one of interest for the study, so we extracted all slice locations to obtain the respective medial slices of the breasts. Generative Adversarial Networks are pairs of Artificial Intelligence systems trained to create content and perform tasks more quickly than a single system. In this thesis, they translate one single, unpaired image into another: an image resembling the output of an MRI scan is produced from a mammogram for which no corresponding MRI image exists. The Structural Similarity Index Measure and Peak Signal-to-Noise Ratio metrics were used to assess the quality of the synthesised image relative to the real image. The structural similarity index value obtained, 0.69667, indicates high similarity between the generated image and the reference image. The peak signal-to-noise ratio obtained, 31.805 dB, a metric used to quantify the quality of an image reconstructed from an original image that has undergone compression, lies within the range of typical values. Although the metrics provide a quantitative measure of performance, the best answer we obtained was visual.
  The synthetic images obtained have a visually realistic appearance, although some artefacts can be detected in them, owing to the different ways in which the different examinations capture images and to the lower resolution of the original examinations used as a basis compared with MRI. In conclusion, from a dataset of 57 images obtained by mammography in craniocaudal view, it was possible to generate mammography-based synthetic images of the breast structure resembling the output of an MRI scan, by implementing and testing generative adversarial network models on unpaired data, as demonstrated by the various metrics and graphical checks.
- Application of active learning on medical images to enhance machine learning models
  Publication. Santos, Maria Inês Salvador dos; Marreiros, Maria Goreti Carvalho
  Artificial intelligence has made major advances in the healthcare field, particularly in medical imaging. However, data and annotations in this area are often scarce and expensive to obtain. Labeling images, although essential for machine learning models, is a tedious and time-consuming task. Active learning addresses this challenge by selecting informative samples: it builds a subset of unlabeled data whose labels the model is likely to have more difficulty predicting, and these samples are then given to experts to annotate. The goal is to use less annotated data while still achieving good model performance. Breast cancer is one of the most common cancers in women. The proposed solution uses the PatchCamelyon dataset, a variation of the Camelyon16 dataset with patches from histopathologic scans of sentinel lymph node sections for the detection of metastatic tissue in breast cancer patients. This work proposes an active learning approach in which the unlabeled data is divided into clusters, which are then classified by their level of informativeness (based on Shannon Entropy). Then, from each cluster, several samples are selected according to the previously defined informativeness level, and each sample is scored using a formula that includes both entropy and Euclidean distance to the cluster centroid. Finally, samples with the lowest uncertainty score are added to the training dataset with the model’s prediction. The proposed method thus accounts for both model uncertainty and data distribution. The solution showed promising results when compared with a random sampling approach.
  To evaluate the proposed solution, greyscale and Macenko normalization techniques were used in all approaches (the random sampling approach, a variation of the proposed solution with no pseudo-label task, and the proposed solution). In some iterations, the difference in F1 score between the proposed active learning solution and random sampling was more than 0.20. With the application of this method, experts can spend less time annotating images while still achieving a high-performance model.
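
The scoring step described in the active learning entry above (Shannon entropy combined with Euclidean distance to the cluster centroid) can be sketched roughly as follows. The abstract does not state the actual formula, so the weighted sum, the `alpha` weight, the distance normalisation, and every function name below are assumptions made for illustration only, not the author's method.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in bits) of one predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of the feature vectors in one cluster."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def score_samples(features, probs, alpha=0.5):
    """Assumed score: alpha * entropy + (1 - alpha) * normalised distance.

    `features` are the feature vectors of one cluster and `probs` the
    model's predicted class probabilities for the same samples.
    """
    c = centroid(features)
    dists = [euclidean(f, c) for f in features]
    max_d = max(dists) or 1.0  # avoid dividing by zero in a degenerate cluster
    return [
        alpha * shannon_entropy(p) + (1 - alpha) * d / max_d
        for p, d in zip(probs, dists)
    ]


def split_by_score(scores, k):
    """Return (indices of the k highest scores, indices of the k lowest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[-k:][::-1], order[:k]


# Toy cluster: three samples with 2-D features and binary predictions.
features = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
probs = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
scores = score_samples(features, probs)
to_annotate, to_pseudo_label = split_by_score(scores, k=1)
print("send to experts:", to_annotate)
print("pseudo-label with model prediction:", to_pseudo_label)
```

In this sketch the highest-scoring samples would go to experts and the lowest-scoring ones would receive the model's own prediction as a pseudo-label, mirroring the two uses of the score mentioned in the abstract. How the thesis actually weights entropy against distance is not stated in the abstract.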
