| Name | Description | Size | Format |
|---|---|---|---|
| | | 7.43 MB | Adobe PDF |
Abstract(s)
Platforms such as Reddit host thousands of public discussions on political and social issues every day. However, the informality of the language, the scale of the conversations, and the redundancy of arguments make it difficult to identify the main points of view defended by participants. This dissertation addresses the challenge of simplifying these debates by proposing a solution that uses Large Language Models (LLMs) and Knowledge Graphs to structure, analyze, classify, and synthesize the arguments used, automatically and accessibly. The approach is based on a modular pipeline that extracts and processes all relevant data from a discussion, using generative models across its phases and tasks. The solution seeks to identify the most relevant arguments, classifying each one by its stance (in favor, against, or neutral) toward the topic of the debate, and grouping them semantically and visually in a knowledge graph. The prototype also supports the generation of expository summaries, detailed analyses, and quantitative and qualitative evaluations of the system's performance.
The project pays particular attention to political and social discussions, taking into account how ideological arguments are formulated, disseminated, and countered in digital spaces. This perspective makes it possible both to test the robustness of the system and to explore the discursive dynamics and polarization associated with politically sensitive topics. A further goal is to contribute to improving the quality of public and political discourse in digital environments, reducing misinformation and promoting more informed decisions.
This work was developed following the Action Research methodology and includes an experimental phase in which the system is evaluated using the LLMs themselves for automatic validation. The ethical guidelines and standards defined by the GDPR and the AI Act were respected, with special attention to data anonymization and the mitigation of potential model biases. The results demonstrate the interpretative and generative potential of LLMs combined with Knowledge Graphs to promote a clear, structured, and critical understanding of public and political debates taking place on social networks.
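As a rough illustration of the stance classification and graph grouping described in the abstract, the following minimal sketch shows one way such a structure could be laid out. It is not the dissertation's implementation: the function names, the node layout, and in particular `keyword_stance`, a trivial keyword rule standing in for the LLM-based classifier, are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set, Tuple

# Stance labels taken from the abstract: in favor, against, or neutral.
STANCES = ("favor", "against", "neutral")

# A node is a (kind, value) pair, e.g. ("stance", "favor").
Node = Tuple[str, str]


def keyword_stance(comment: str) -> str:
    """Hypothetical stand-in for an LLM stance classifier (assumption)."""
    lowered = comment.lower()
    # Check negative cues first, because "disagree" contains "agree".
    if "oppose" in lowered or "disagree" in lowered:
        return "against"
    if "support" in lowered or "agree" in lowered:
        return "favor"
    return "neutral"


def build_graph(
    topic: str,
    comments: Iterable[str],
    classify: Callable[[str], str] = keyword_stance,
) -> Dict[Node, Set[Node]]:
    """Group arguments under stance nodes attached to a topic node."""
    graph: Dict[Node, Set[Node]] = defaultdict(set)
    for comment in comments:
        stance = classify(comment)
        if stance not in STANCES:
            raise ValueError(f"unknown stance: {stance!r}")
        # Edge from the topic to the stance, then from the stance to the argument.
        graph[("topic", topic)].add(("stance", stance))
        graph[("stance", stance)].add(("argument", comment))
    return dict(graph)


if __name__ == "__main__":
    graph = build_graph(
        "carbon tax",
        [
            "I support this policy",
            "I disagree, it hurts workers",
            "What is the rate?",
        ],
    )
    for node, neighbours in graph.items():
        print(node, "->", sorted(neighbours))
```

The classifier is passed in as a parameter so that the keyword rule can be swapped for a call to a generative model without changing the graph-building step.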
Keywords
Argument Analysis; Argument Summarization; Social Media; Large Language Models; Knowledge Graphs
