ESS - DM - Bioestatística e Bioinformática Aplicadas à Saúde
Browsing ESS - DM - Bioestatística e Bioinformática Aplicadas à Saúde by Title
Now showing 1 - 10 of 23
- Application of machine learning techniques for a recommendation system in pharmacy
  Publication. Torres, Beatriz Freitas; Oliveira, Alexandra Alves; Faria, Brígida Mónica; Alves, Sandra Maria Ferreira
  Community Pharmacy (CP) plays a crucial role in the population, improving patients’ quality of life and minimising medication risks. In Portugal, CPs dispense prescription and non-prescription products. Pharmacy professionals have an added responsibility when advising on non-prescription products and should pay attention to self-medication and possible interactions. A product recommendation system (RS) that incorporates relevant information about the products therefore supports a more informed recommendation by the professional. Studies on medication RS exist but are still scarce and, to the best of our knowledge, no medication RS is applied in community pharmacies in Portugal. This work aims to develop a conceptual pharmaceutical product recommendation framework and to identify relevant groups of products according to their characteristics and experts’ opinions. The specific objectives are to describe recommendation systems in pharmacy, to define and compare distance functions capable of creating groups of similar and clinically relevant products for pharmaceutical counselling, to apply and compare machine learning techniques, and to communicate the results. For this purpose, the background of counselling on pharmaceutical products without a prescription was analysed. Public databases were selected for inclusion in the conceptual framework, and the data obtained was processed. The result was a database of 1426 products (over-the-counter medication, homoeopathic medication, and dermocosmetics) with their clinical and scientific information.
  To identify relevant groups of products, seven clustering techniques were applied and evaluated: six hierarchical (single, complete, average, median, centroid, and Ward linkage) and one non-hierarchical (K-means). Dendrograms, the Calinski-Harabasz score, the silhouette score, the Davies-Bouldin score, and the inflexion point method were used to determine the ideal number of clusters for each technique and to evaluate its validity. An expert consultation was performed to define a distance function aligned with pharmaceutical counselling. This consultation identified the importance of each variable in the definition of the distance function. The resulting data was analysed in Microsoft Excel, SPSS, and Python with the libraries Pandas, Natural Language Toolkit (NLTK), Unidecode, Plotly, Matplotlib, NumPy, SciPy, and Scikit-learn, using the Spyder IDE. Twenty-two groups of similar products were formed with K-means, the most effective clustering approach for forming pharmacologically homogeneous groups. However, the clusters obtained were not clinically relevant enough to support professionals during counselling. Consequently, a new distance function was defined, giving more weight to the pharmacotherapeutic group of the products and aligned with the results of the expert consultation. With it, twenty-four groups of similar products were formed with K-means, which was once again the technique that produced pharmacologically homogeneous groups, based mainly on safe use during pregnancy and breastfeeding and on pharmacotherapeutic group. The remaining clustering techniques, all hierarchical, did not produce pharmacologically homogeneous groups with either distance function.
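
The abstract above uses the silhouette score, among other indices, to choose a number of clusters. As a rough illustration of what that index measures, the sketch below computes the mean silhouette coefficient in plain Python. It is not the thesis code (which used Scikit-learn); the function name and the toy points are assumptions made for this example.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention: a point alone in its cluster contributes 0.
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) is 0, so including p in the sum is harmless).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) == 0:
            continue  # all relevant points coincide; contributes 0
        total += (b - a) / max(a, b)
    return total / len(points)
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest points assigned to the wrong cluster. Comparing the score across candidate numbers of clusters is one common way to pick among them.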
- Applications of knowledge discovery for COVID-19 pandemic study
  Publication. Correia, Cláudio Lima; Faria, Brígida Mónica; Fernandes, Rúben
  The outbreak of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the cause of COVID-19, has brought global insecurity and fear to our society. Countries around the world are joining efforts to fight the spread of this deadly disease. Among the standard models for predicting the global COVID-19 pandemic, simple epidemiological and statistical models have received the most attention from authorities and are popular in the media. Officials around the world are using several COVID-19 outbreak prediction models to make informed decisions and enforce relevant control measures. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. This work presents an exploratory data analysis of COVID-19 worldwide to understand the real threats and support the subsequent planning of containment and mitigation actions. Machine learning models were used to study the day-to-day exponential behaviour of COVID-19 across nations, using real-time information from Johns Hopkins University and, for Portugal in particular, real-time information from the Portuguese Ministry of Health, in order to predict future reach. Different algorithms were modelled and their performance evaluated. For the worldwide data, these were Polynomial Regression and Support Vector Regression. For the Portuguese dataset, the algorithms modelled and evaluated were Linear Regression, Polynomial Regression, Support Vector Regression, Multilayer Perceptron, and Poly-MultilayerPerceptron. This work also compares three different but very similar countries: Portugal, Spain, and Italy. In the particular case of Portugal, the containment and mitigation actions used by the Portuguese government were explored.
  A comparative analysis of Portugal, Spain, and Italy was also carried out, covering the two months following the first reported case in each country. The effectiveness of the mitigation measures defined by the Portuguese government and carried out by the health authorities and citizens was studied as well. In the worldwide prediction of the first wave of COVID-19, the best model was Polynomial Regression (R-squared 0.787, MAE 540.39, RMSE 782.14, execution time 0.16 s); in the second wave, the best model was Support Vector Regression (R-squared 0.996, MAE 17.41, RMSE 18.98, execution time 0.35 s). In the Portuguese predictions of COVID-19 (different waves), the best models were Polynomial Regression, Multilayer Perceptron, and Poly-MultilayerPerceptron. In the comparison of the three countries, Portugal had the best performance in testing and mitigation policies. The study of the mitigation measures defined by the Portuguese government indicates that the sooner such measures are implemented, the more effective they are at mitigating the disease.
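
The models in the abstract above are compared using R-squared, MAE, and RMSE. For readers unfamiliar with these metrics, here is a minimal sketch of how they are usually computed from observed and predicted values. The function name is hypothetical and no data from the thesis is reproduced.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R-squared, MAE and RMSE for paired observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = sqrt(ss_res / n)                        # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination

    return {"r2": r2, "mae": mae, "rmse": rmse}
```

MAE and RMSE are in the units of the predicted quantity, so they are only comparable between models fitted to the same series, whereas R-squared is unitless.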
- Assessing functional activity of astrocytes by calcium imaging: how do astrocytes respond to the electrophysiological microenvironment
  Publication. Silva, Sara Cristina da Costa e; Aroso, Miguel; Aguiar, Paulo de Castro; Faria, Brígida Mónica
  Although they cannot produce action potentials, astrocytes are known to be part of synapses: they detect and respond to external stimuli with spatiotemporally complex calcium dynamics and may modulate synaptic transmission. The aim of this project is to evaluate the calcium dynamics of astrocytes by modulating their electrophysiological microenvironment. To this end, astrocyte cultures were stimulated using ThinMEAs© while calcium activity was monitored. The results showed that astrocytes respond to stimuli of ±600 mV or ±800 mV, generating a calcium wave that propagates to neighbouring cells. The amplitude, rise time, and propagation speed of the calcium wave depend on the stimulus: a higher-amplitude stimulus produces a higher-amplitude response that takes longer to reach its peak but reaches longer distances. Although preliminary, these results indicate that astrocytes can detect and respond to external electrical changes. Astrocytes are therefore electrically excitable cells, possibly through the following mechanism: stimulation leads to the voltage-dependent opening of voltage-gated calcium channels, which sensitises the endoplasmic reticulum and results in a cascade of calcium release, generating a calcium wave that propagates through gap junctions or vesicular gliotransmission.
- Automatic FoodEx2 classification system for food description
  Publication. Fonseca, João Emanuel Sousa; Faria, Brígida Mónica; Reis, Luís Paulo; Pimenta, Rui
  Food is an important factor in human health. Food security protects consumers by offering a safety net that lets them trust the quality of the product. In Europe, entities such as the European Food Safety Authority (EFSA) are risk assessors. They provide information used to shape laws around food security. To collect data regarding food safety, EFSA developed a comprehensive food classification and description system called FoodEx2. The FoodEx2 coding system relies on a manual process to map food descriptions to FoodEx2 codes. The motivation for this work is the time that could be saved by using an algorithm to automate code generation. Knowledge Discovery in Databases is already known to be a fundamental area for automatically producing patterns from large quantities of data. The main objective of this project is to explore automatic approaches to classifying food descriptions with FoodEx2 codes. In this work, several classic classifiers are compared in the prediction of FoodEx2 base codes, a multiclass classification task. Performance was explored on distinct datasets and at different levels of text preprocessing, using the exact match ratio and the F1-score as metrics and a Bag-of-Words document representation with TF-IDF weighting. All the datasets have imbalanced class distributions. The documents are short texts describing ingredients, dishes, and animal sample details. Performance varied mainly between datasets and classifiers. The best-performing classifiers were Random Forests, Decision Trees, and Linear Support Vector Machines. The results show that building an automatic classifier depends on further exploration of the available data.
- Bayesian pharmacokinetics: Pharmacodynamics modeling & simulation
  Publication. Mendonça, Verónica Maria Marques; Oliveira, Carla; Silva, Nuno Elvas
  For a drug to enter the pharmaceutical market, studies must demonstrate its efficacy and safety. These studies make it possible to examine interactions and the influence of certain physiological factors (age, diet) and pathological factors (e.g. liver problems) on the absorption of the drug in the human body. Pharmacokinetics (PK) studies the kinetic process of drug absorption, as well as biotransformation and elimination, that is, what the body does to the drug, using PK parameters to measure the extent of the active component from the absorption phase to the site of effect. Pharmacodynamics (PD), on the other hand, studies what the drug does to the body (e.g. after the drug binds to the receptor), that is, the effect produced. Individual analysis, in contrast to population analysis, does not take into account factors that may explain some of the variability observed in the drug. The population approach is therefore often useful for identifying and quantifying factors that influence behaviour or explain variability (inter-subject variability) in a given population of interest. It makes it possible to estimate the drug effect free of other interfering factors and thus to estimate the appropriate dosage for a specific subpopulation, such as children or the elderly. For the PK/PD analysis, fixed-effects and non-linear mixed-effects models were used with Bayesian modelling. Bayesian modelling is particularly attractive from a biological point of view, since it allows informative prior distributions to be incorporated. It is also preferable from an estimation point of view, because it can handle a large number of parameters and the non-linearity of kinetic processes.
  The Bayesian modelling relied on Markov chain Monte Carlo methods: algorithms that generate samples whose distribution takes shape and stabilises around a distribution as the number of simulations increases, from which the parameters of interest are estimated with a given degree of credibility. The internship at BlueClinical aimed to characterise and understand individual and population pharmacokinetic/pharmacodynamic analysis in order to assess the safety and efficacy of an innovative drug in a fictional phase I and phase II clinical trial study (content accessed through Metrum Institute). It was thus possible to explore different models and to identify and validate the model that best explains the study in question. The study concluded that the model that best explains the PK/PD is the two-compartment model; the semi-mechanistic model of Friberg and Karlsson was used to model the side effects.
- Biostatistics in a business context (Bioestatística em contexto empresarial)
  Publication. Coelho, Heitor Rafael Teixeira; Alves, Sandra; Albuquerque, João
  Biostatistics plays a crucial role in clinical trials, ensuring scientific integrity and data reliability from planning through to the reporting of results. This document describes the activities carried out and the knowledge acquired after joining the Clinical Data Programming and Statistics department of BlueClinical, an Astrum Company. To illustrate the activities undertaken, including the preparation of randomisation plans and lists, the programming of tables, and the descriptive analysis of datasets, a parallel, prospective, randomised study with simulated data is presented, evaluating the safety and efficacy of a drug for chronic migraine. One hundred individuals were simulated, of whom 60 were randomised to receive the drug or placebo. Regarding efficacy, the group treated with the drug showed a significant reduction in the number of migraines compared with placebo. However, there were no significant improvements in quality of life or participant satisfaction. Regarding safety, the drug was considered safe, with no relevant increase in adverse events relative to placebo. These activities enabled the development of technical skills such as programming in R and a deeper grounding in applied statistical analysis.
- Comparative study of adjustment methods for confounding variables
  Publication. Silva, Inês Fortuna Alves da; Antunes, Luís; Faria, Brígida Mónica
  Observational studies provide relevant evidence; however, the distribution of baseline variables is inherently unbalanced between the study groups, making it difficult to understand the real treatment effect. There are many methods to balance the confounders. Traditional covariate adjustment is the most widely used; however, techniques based on the propensity score (PS) are now also common. One of them is Inverse Probability of Treatment Weighting (IPTW). Applying IPTW involves comparing two groups of samples weighted by the inverse probability of treatment. The main advantage of IPTW over other PS techniques is that all patient data are preserved; compared to classical adjustment methods, it allows the confounders to be balanced, and that balance to be evaluated, before the outcome is assessed. In this study, the effect of two neoadjuvant treatments for HER2-positive breast cancer was analysed. The treatments differed by four additional cycles of pertuzumab. Two methods of balancing the distribution of variables were applied: IPTW and traditional regression adjustment. The results of both techniques led to the conclusion that the therapy with double-block anti-HER2 seems more favourable. This treatment also resulted in a greater number of patients with pathologic complete response (pCR) and a reduction in the number of radical mastectomies. Although there were statistically significant differences in the type of surgery between the study groups, the difference in pCR was not significant (p > 0.05). This work had some limitations, such as the low number of patients with certain characteristics, among other factors, which hindered a clear reading of the results.
  It would therefore be useful to expand this study to include more patients with heterogeneous features, allowing more robust conclusions to be drawn.
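
To make the weighting step described in the abstract above concrete, the sketch below computes IPTW weights from propensity scores that are assumed to have been estimated already (for example with a logistic regression, which is not shown). It uses the common unstabilised weights, 1/PS for treated and 1/(1 - PS) for untreated patients; the function names and this specific weighting variant are assumptions, not details taken from the thesis.

```python
def iptw_weights(treated, propensity):
    """Inverse probability of treatment weights (unstabilised).

    treated    -- iterable of 0/1 flags, 1 meaning the patient received treatment
    propensity -- estimated probability of receiving treatment for each patient
    """
    weights = []
    for flag, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Treated patients are weighted by 1/PS, untreated by 1/(1 - PS).
        weights.append(1.0 / ps if flag else 1.0 / (1.0 - ps))
    return weights


def weighted_mean(values, weights):
    """Weighted mean, e.g. of a baseline covariate or a binary outcome."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

Comparing `weighted_mean` of each baseline covariate between the two weighted groups is one simple way to check balance before looking at the outcome, which is the ordering the abstract highlights as an advantage of IPTW.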
- Comparing time series forecasting models for health indicators: A clustering analysis approach
  Publication. Cruz, Cláudia Beatriz Silva; Oliveira, Alexandra; Faria, Brígida Mónica; Pimenta, Rui
  A time series can be defined as a sequence of observations ordered at equal time intervals, which makes it fundamental for addressing questions of causality, trends, and forecasting. Temporal data and its analysis can be applied to several areas, such as engineering, finance, and health. The ongoing study of time series raises several problems, one of which is clustering, which aims to identify similarities between series. This is particularly relevant when time series are modeled by Autoregressive Integrated Moving Average (ARIMA) models, which makes understanding their parameters essential for the analysis. In public health and biomedicine, the main applications of time series have been epidemiological studies of infectious and chronic diseases, studies predicting demand for health services, and studies assessing health outcomes through mortality and morbidity data. These indicators are direct measures of health care needs, reflecting the global burden of disease in the population, and are therefore crucial for the study and surveillance of public health and for the organization and intervention processes of health services. The sum of mortality and morbidity is referred to as the “Burden of Disease” and can be measured by a metric called Disability-Adjusted Life Years (DALYs). Analysing this type of data is essential for identifying geographic patterns, which allows a better understanding of health disparities in the population.
  The main objectives of this dissertation are to model health indicators through Moving Average (MA), Autoregressive Moving Average (ARMA), or ARIMA processes; to evaluate the quality of fit of the models to the data; and to compare distances between processes regarding their effectiveness in identifying natural groups. The study begins by exploring the temporal characteristics of DALYs for five non-communicable diseases (cardiovascular diseases, chronic respiratory diseases, neurological disorders, chronic kidney diseases, and diabetes), highlighting underlying patterns and trends. Then, using an automated algorithm, ARIMA models are fitted to represent and describe the time series. The fit of the models was assessed with forecast accuracy metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE). The Piccolo, Maharaj, and LPC distance measures were applied to this representation of the time series so that clustering techniques could identify clusters. Six different hierarchical clustering methods were used: Ward, Complete, Average, Single, Median, and Centroid linkage. The performance of the clustering was assessed with evaluation metrics such as the Silhouette score, C-Index, McClain Index, and Dunn Index. The results on non-communicable disease DALY data for 48 European countries show that the choice of distance measure greatly influences clustering outcomes and the number of clusters formed. While certain methods revealed geographic patterns, other factors, such as cultural or economic similarities, can also influence cluster formation.
  Furthermore, some countries were frequently isolated in their own cluster across clustering methods and distance measures, suggesting that their ARIMA model was significantly different from the rest. One example is Latvia, which formed isolated clusters for cardiovascular diseases. Other countries, such as Albania, Belarus, Lithuania, and Sweden, were grouped into the same cluster across various clustering methods when the Piccolo distance was applied to neurological disorders. For chronic respiratory diseases, 15 clusters were formed with the LPC distance, between 8 and 15 clusters with the Piccolo distance, and between 9 and 15 clusters with the Maharaj distance. These insights contribute to advancing public health surveillance and intervention, ultimately aiming to alleviate the global burden of disease, and also to our understanding of clustering ARIMA models and of how the choice of distance measure influences clustering outcomes.
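
The Piccolo distance mentioned in the abstract above compares two fitted models through the coefficients of their AR(∞) representations. The following sketch illustrates the idea for the ARMA part of a model only; multiplying a differencing operator into the AR polynomial is left to the caller. The sign convention for the MA polynomial (1 + θ₁B + …), the truncation at 50 terms, and the function names are assumptions for this example, not details from the dissertation.

```python
from math import sqrt


def ar_infinity_coefficients(ar, ma, n_terms=50):
    """Coefficients c_1..c_n of phi(B) / theta(B).

    Assumed conventions:
      phi(B)   = 1 - ar[0]*B - ar[1]*B**2 - ...
      theta(B) = 1 + ma[0]*B + ma[1]*B**2 + ...
    The usual pi-weights are the negatives of these coefficients; the sign
    cancels when two models are compared, so it is not applied here.
    """
    a = [1.0] + [-x for x in ar]
    b = [1.0] + list(ma)
    c = [1.0]
    for j in range(1, n_terms + 1):
        a_j = a[j] if j < len(a) else 0.0
        # Match powers of B in c(B) * theta(B) = phi(B).
        conv = sum(b[i] * c[j - i] for i in range(1, min(j, len(b) - 1) + 1))
        c.append(a_j - conv)
    return c[1:]


def piccolo_distance(model_x, model_y, n_terms=50):
    """Euclidean distance between truncated AR(infinity) coefficient sequences.

    Each model is a pair (ar_coefficients, ma_coefficients).
    """
    cx = ar_infinity_coefficients(*model_x, n_terms=n_terms)
    cy = ar_infinity_coefficients(*model_y, n_terms=n_terms)
    return sqrt(sum((x - y) ** 2 for x, y in zip(cx, cy)))
```

A matrix of such pairwise distances between countries' fitted models is the kind of input a hierarchical clustering method can then work on. The truncation is only reasonable for invertible models, whose coefficients decay.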
- Contributions for the validation of the Portuguese version of the Vascular Quality of Life-6 questionnaire in peripheral artery disease patients
  Publication. Oliveira, Rafaela Monteiro; Silva, Ivone; Pedras, Susana; Pimenta, Rui
  Peripheral Arterial Disease (PAD) is an occlusive atherosclerotic disease that affects more than 230 million people worldwide. The most common symptom is intermittent claudication (IC), which leads to lower quality of life (QoL). This study therefore aimed to contribute to the validation of the VascuQol-6 questionnaire for the Portuguese population, to obtain a quick, sensitive, and easy-to-use way to assess QoL in PAD. The VascuQol-6 was translated and adapted into European Portuguese. The study included 115 patients with a mean age of 65 years and PAD with IC stable for more than 3 months. Reliability, construct validity (through convergent and discriminant validity), known-group validity, and responsiveness were tested. The Average Variance Extracted for the latent construct was 0.40 and the Composite Reliability was 0.79, indicating strong internal consistency. VascuQol-6 was positively associated with SF-36 Physical Component Summary and Mental Component Summary scores (r = .64, p < .01 and r = .42, p < .01, respectively). In turn, there was no significant correlation between VascuQol-6 scores and the PADKP or IPAQ. A statistically significant difference between groups according to IC severity was found (F(2,47) = 8.35, p < 0.001). A paired samples t-test showed differences between VascuQol-6 scores before a walking program (M = 15.65, SD = 3.09) and after a walking program (M = 17.41, SD = 2.71), t(67) = 3.94, p < .001. The VascuQol-6 is a 6-item instrument to assess the QoL associated with PAD, with good psychometric properties and convergent and discriminant validity with SF-36, PADKQ, and IPAQ. The instrument proved to have known-group validity and responsiveness.
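
The abstract above reports an Average Variance Extracted (AVE) and a Composite Reliability (CR) for the latent construct. As a hedged illustration of how these two statistics are commonly derived from standardised factor loadings, the sketch below uses the usual formulas, AVE = Σλ²/n and CR = (Σλ)² / ((Σλ)² + Σ(1 - λ²)). The function names are hypothetical and the thesis's own loadings are not reproduced.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardised loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardised loadings.

    The error variance of each item is taken as 1 - loading**2, which
    assumes standardised loadings and uncorrelated errors.
    """
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)
```

Because CR depends on the squared sum of the loadings while AVE depends on the mean of their squares, a scale with several moderate loadings can reach an acceptable CR even when its AVE is modest.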
- Epidemiological study of cancer incidence and survival in patients from the Northern Region of Portugal and IPO-Porto (Estudo epidemiológico da incidência e sobrevivência do cancro em pacientes da Região Norte de Portugal e IPO-Porto)
  Publication. Silva, Soraia Alexandra Cardoso da; Antunes, Luís; Oliveira, Alexandra
  Cancer is a worldwide public health problem. Incidence and survival statistics are an important tool for monitoring progress in cancer control and for drawing attention to areas where intervention is needed. The aim of this work was to assist the Registo Oncológico Regional do Norte (RORENO) in producing the survival and incidence publications for which this institution is responsible, using data from cancer patients in the northern region of Portugal and from IPO patients. For the incidence publications, Excel was used both for the calculations and for creating the necessary tables and charts. For the publications on cancer survival, R was used for the survival analysis and LaTeX for producing the final format of the publications. The incidence publications, as the name suggests, present the number of new cancer cases for the year under study, both for patients registered in the Northern region and for patients diagnosed only at IPO-Porto. For each of the survival publications, results are presented for observed survival and net survival. The Kaplan-Meier estimator was used for observed survival and a recent estimator, known as Pohar-Perme, for net survival. Survival stratified by several variables is also presented: sex, age group, district of residence, extent of disease, and histological type.
  In addition, standardised survival was calculated for certain cancer types, and comparisons were made between survival curves and between biennia using a log-rank type test for net survival. By the end of the internship, two incidence publications had been produced, one on cancer incidence in the North of Portugal in 2012 and the other on cancer incidence at IPO-Porto in 2017. Two publications on cancer survival were also produced, one for the Northern region of Portugal and the other for IPO-Porto, both for the 2011-2012 biennium.
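
Observed survival in the abstract above is estimated with the Kaplan-Meier estimator. The sketch below shows the product-limit calculation in plain Python, as an illustration only; the internship itself used R, and the function name and input layout here are assumptions. Net survival (Pohar-Perme) additionally requires population life tables and is not covered.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time of each patient
    events -- 1 if the event (death) was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve
```

Censored patients never lower the curve directly; they only shrink the risk set, which makes later drops larger.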