Publication
Implementação DataMesh Sifox
| datacite.subject.fos | Engenharia e Tecnologia | |
| datacite.subject.sdg | 09:Indústria, Inovação e Infraestruturas | |
| dc.contributor.advisor | Reis, Rosa Maria do Nascimento da Silva | |
| dc.contributor.author | VINHAS, FILIPE MIGUEL PEREIRA | |
| dc.date.accessioned | 2025-12-16T15:19:19Z | |
| dc.date.available | 2025-12-16T15:19:19Z | |
| dc.date.issued | 2025-10-28 | |
| dc.description.abstract | The growing need to integrate transactional data scattered across modular systems raises challenges in data governance and in making data available in real time for analysis and reporting. This work addresses the implementation of a DataMesh architecture in the Sifox system, which is being rewritten following the principles of Domain-Driven Design (DDD). The project aims to consolidate and relate data from transactional modules (OLTP) in an analytical layer (OLAP) using Change Data Capture (CDC) mechanisms, enabling near-real-time integration. The DataMesh architecture promotes the creation of reusable and accessible Data Products, decentralizing data governance and facilitating ad hoc consumption through tools such as Power BI and APIs. In addition, the project explores the use of Data Products for predictive analytics using Jupyter Notebooks. The study also defines governance guidelines and explores the benefits and challenges of adopting DataMesh, comparing it with traditional data management approaches such as Data Warehouses and Data Lakes. | por |
| dc.description.abstract | The demand for integrating transactional data from modular systems is growing, bringing significant challenges in data governance and in ensuring real-time availability for analytics and reporting. This thesis explores the implementation of a Data Mesh architecture in the Sifox system, a solution undergoing a rewrite based on Domain-Driven Design (DDD) principles. The primary objective is to consolidate and relate data from transactional modules (OLTP) into an analytical layer (OLAP) using Change Data Capture (CDC) mechanisms. This enables near-real-time integration while maintaining the modularity and autonomy of the system's components. By adopting the Data Mesh paradigm, the project introduces reusable and accessible Data Products, decentralizing governance and enabling ad hoc data consumption through tools like Power BI and APIs. A key focus of this research is the use of Data Products for predictive analytics, leveraging machine learning techniques such as Federated Learning (FL). FL methodologies, including Horizontal, Vertical, and Split Learning, are explored for training models across decentralized domains. These approaches prioritize privacy by keeping raw data localized, supporting tasks like fraud detection and personalized recommendations while addressing the challenges of data heterogeneity across domains. Split Learning is particularly emphasized for its ability to balance data privacy and computational efficiency. The thesis also evaluates the core principles of Data Mesh: Data as a Product, Domain Ownership, Federated Governance, and Self-Serve Data Platform. These principles are compared with traditional centralized architectures like Data Warehouses and Data Lakes, highlighting differences in scalability, interoperability, and governance. The research further investigates CDC strategies for synchronizing OLTP and OLAP systems, emphasizing the role of modular input/output ports, Service Level Objectives (SLOs), and automated data contracts in enhancing data connectivity and reusability within the mesh. Despite its advantages, the adoption of Data Mesh is not without challenges. Predictive analytics in this decentralized setup can be limited by the complexity of coordinating data from independent domains, ensuring consistency, and maintaining compliance with global governance policies. This work presents a detailed analysis of these limitations and proposes strategies to overcome them through robust infrastructure and policy enforcement. By bridging theoretical insights with practical implementation guidelines, this research aims to provide a roadmap for organizations seeking to adopt Data Mesh architectures, addressing both immediate integration needs and long-term scalability. | por |
| dc.identifier.tid | 204067154 | |
| dc.identifier.uri | http://hdl.handle.net/10400.22/31219 | |
| dc.language.iso | por | |
| dc.rights.uri | N/A | |
| dc.subject | DataMesh | |
| dc.subject | Change Data Capture (CDC) | |
| dc.subject | Data Products | |
| dc.subject | Data Science | |
| dc.subject | Data Decentralization | |
| dc.subject | Data Management | |
| dc.title | Implementação DataMesh Sifox | |
| dc.title.alternative | Sifox Data Mesh Implementation | eng |
| dc.type | master thesis | |
| dspace.entity.type | Publication | |
| thesis.degree.name | Mestrado em Engenharia Informática | |
