
ESS - DM - Bioestatística e Bioinformática Aplicadas à Saúde

Recent Submissions

Now showing 1 - 10 of 17
  • Contributions for the validation of the portuguese version of the vascular quallity of Life-6 questionnaire in peripheral artery disease patients
    Publication . Oliveira, Rafaela Monteiro; Silva, Ivone; Pedras, Susana; Pimenta, Rui
    Peripheral Arterial Disease (PAD) is an occlusive atherosclerotic disease that affects more than 230 million people worldwide. The most common symptom is intermittent claudication (IC), which leads to lower quality of life (QoL). Thus, this study aimed to contribute to the validation of the VascuQol-6 questionnaire for the Portuguese population, to obtain a quick, sensitive, and easy-to-use way to assess QoL in PAD. The VascuQol-6 was adapted and translated into European Portuguese. A total of 115 patients were included, with a mean age of 65 years and with PAD with IC stable for more than 3 months. Reliability, construct validity (through convergent and discriminant validity), known-group validity, and responsiveness were tested. The Average Variance Extracted for the latent construct was 0.40 and the Composite Reliability was 0.79, indicating strong internal consistency. VascuQol-6 was positively associated with SF-36 Physical Component Summary and Mental Component Summary scores (r = .64, p < .01 and r = .42, p < .01, respectively). In turn, there was no significant correlation between VascuQol-6 scores and the PADKP or IPAQ. A statistically significant difference between groups according to IC severity (F(2,47) = 8.35, p < 0.001) was found. A paired-samples t-test showed differences between VascuQol-6 scores before a walking program (M = 15.65, SD = 3.09) and after a walking program (M = 17.41, SD = 2.71), t(67) = 3.94, p < .001. The VascuQol-6 is a 6-item instrument to assess the QoL associated with PAD with good psychometric properties, and convergent and discriminant validity with SF-36, PADKQ and IPAQ. The instrument proved to have known-group validity and responsiveness.
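The two reliability statistics quoted in this abstract can be illustrated with a minimal sketch. It assumes the usual formulas based on standardized factor loadings with uncorrelated errors; the loadings below are hypothetical and are not taken from the study.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings (errors assumed uncorrelated)."""
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)  # residual variance of each item
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings for a 6-item scale (illustration only).
loadings = [0.70, 0.65, 0.60, 0.62, 0.58, 0.66]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.2f}, AVE = {ave:.2f}")
```

With loadings of this size the composite reliability can be acceptable even when the average variance extracted stays below 0.50, which is the kind of pattern the abstract reports.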
  • Application of machine learning techniques for a recommendation system in pharmacy
    Publication . Torres, Beatriz Freitas; Oliveira, Alexandra Alves; Faria, Brígida Mónica; Alves, Sandra Maria Ferreira
    Community Pharmacy (CP) plays a crucial role in the population, improving patients’ quality of life and minimising medication risks. In Portugal, CPs dispense prescription and non-prescription products. Pharmacy professionals have an added responsibility when advising on non-prescription products and should pay attention to self-medication and possible interactions. Therefore, a product recommendation system (RS) that incorporates relevant information about the products supports a more informed recommendation by the professional. Studies in the area of medication RS are still scarce and, to the best of our knowledge, no medication RS is applied in community pharmacies in Portugal. This work aims to develop a conceptual pharmaceutical product recommendation framework and identify relevant groups of products according to their characteristics and experts’ opinions. The specific objectives consist of describing recommendation systems in pharmacy, defining and comparing distance functions capable of creating groups of similar and clinically relevant products for pharmaceutical counselling, applying machine learning techniques and comparing them, and communicating the results. For this purpose, the background of counselling on pharmaceutical products without a prescription was analysed. Public databases were selected to be included in the conceptual framework, and the data obtained was processed. As a result, a database was obtained with 1426 products (over-the-counter medication, homoeopathic medication, and dermocosmetics) and their clinical and scientific information. In order to identify relevant groups of products, seven clustering techniques were applied and evaluated: six hierarchical (single, complete, average, median, centroid, and Ward linkage) and one non-hierarchical (K-means).
Dendrograms, the Calinski-Harabasz score, silhouette score, Davies-Bouldin score and the inflection point method were used to determine the ideal number of clusters for each technique and evaluate its validity. An expert consultation was performed to define a distance function aligned with pharmaceutical counselling. This consultation allowed the identification of the importance of the variables in the distance function definition. The resulting data was analysed in Microsoft Excel, SPSS and Python with the libraries Pandas, Natural Language Toolkit (NLTK), Unidecode, Plotly, Matplotlib, NumPy, SciPy, and Scikit-learn, using Spyder IDE. Twenty-two groups of similar products were formed with K-means, the most effective clustering approach for forming pharmacologically homogeneous groups. However, the obtained clusters did not present enough clinical relevance to support professionals during counselling. Consequently, a new distance function was defined, enhancing the importance of the pharmacotherapeutic group of the products and aligned with the results obtained in the experts’ consultation. Twenty-four groups of similar products were formed with K-means, which was once again the technique that presented pharmacologically homogeneous groups, based mainly on safe use during pregnancy and breastfeeding and on pharmacotherapeutic group. The remaining clustering techniques, the hierarchical ones, did not present pharmacologically homogeneous groups with any of the distance functions.
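The idea of a distance function that gives extra importance to the pharmacotherapeutic group can be sketched as a weighted mismatch distance over categorical attributes. This is only an illustration: the attribute names, the weights, and the products are hypothetical, and the abstract does not state the actual form of the distance function used.

```python
def weighted_mismatch_distance(a, b, weights):
    """Weighted share of attributes on which two products differ (0 = identical, 1 = all differ)."""
    total = sum(weights.values())
    differing = sum(w for key, w in weights.items() if a[key] != b[key])
    return differing / total


# Hypothetical weights: the pharmacotherapeutic group counts three times as much.
weights = {"pharmacotherapeutic_group": 3.0, "safe_in_pregnancy": 1.0, "dosage_form": 1.0}

# Hypothetical products described by categorical attributes.
p1 = {"pharmacotherapeutic_group": "analgesic", "safe_in_pregnancy": True, "dosage_form": "tablet"}
p2 = {"pharmacotherapeutic_group": "analgesic", "safe_in_pregnancy": False, "dosage_form": "tablet"}
p3 = {"pharmacotherapeutic_group": "antihistamine", "safe_in_pregnancy": True, "dosage_form": "tablet"}

d12 = weighted_mismatch_distance(p1, p2, weights)  # same group, one minor difference
d13 = weighted_mismatch_distance(p1, p3, weights)  # different group
print(d12, d13)
```

Under these assumed weights, two products from the same pharmacotherapeutic group stay close even when a secondary attribute differs, while a change of group alone moves them far apart.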
  • In silico dessection of the immunomodulatory effects of cholesterol on colorectal cancer
    Publication . Machado, Ana Luísa Marinho da Cunha; Fernandes, Verónica; Velho, Sérgia; Antunes, Luís
    Cholesterol plays a pivotal role in the progression of tumors, serving as a crucial component for cell membrane formation and the generation of specific proteins and enzymes that stimulate the growth and dissemination of tumor cells. Additionally, cholesterol levels within the tumor microenvironment influence immune responses by hindering the activity of vital components such as T-cells and NK-cells, which are indispensable for effective anti-cancer immunity. The primary objective of this research is to investigate whether it is possible to categorize colon cancer tumors based on disparities in cholesterol-related characteristics and whether these groupings correlate with distinct immune profiles. The Cancer Genome Atlas (TCGA) project is an open-access catalog that aims to comprehensively understand the genomic alterations responsible for various cancer types by encompassing a vast array of molecular data from thousands of patient samples. One of the main advantages of using TCGA data lies in its scale and diversity. By integrating genomic, transcriptomic, proteomic, and clinical data from a multitude of patients, researchers can identify patterns, mutations, and biomarkers associated with specific cancers. Taking advantage of this catalog, we selected the TCGA RNA-seq dataset from patients with colorectal cancer (480 colon tumor samples and 167 rectum tumor samples). First, we used Gene Set Enrichment Analysis (GSEA), a tool widely used in bioinformatics and computational biology, to determine the gene sets and pathways that reached statistical significance. Upon comparing these samples with their corresponding normal adjacent tissues, notable disparities in lipid metabolism were discerned.
While cholesterol-related pathways did not rank as the top differentially regulated pathways, we observed an upregulation of lipid-related pathways in normal adjacent tissue compared with tumor tissue exclusively within the colon samples. Subsequently, we conducted in-depth analyses to determine whether colon tumors can be stratified based on differences in cholesterol metabolism and whether these variations correlate with disparities in the tumor microenvironment. Using the ssGSEA scores of the pathways related to cholesterol metabolism, we employed the k-means method to cluster the samples. Remarkably, colon tumor samples naturally segregated into two distinct groups: one characterized by low expression of cholesterol-related genes and the other exhibiting increased expression. Notably, these groupings exhibited disparities in colon sample staging and the prevalence of molecular subtypes within each category. The group displaying enhanced cholesterol metabolism showed reduced proliferation, underscoring the significance of tumor microenvironment remodeling. The top enriched pathways included pathways associated with modified antigen presentation, cytotoxic immune responses, and remodeling of the extracellular matrix. These observations were consistent with increased infiltration of immune cells driven by the activation of cholesterol metabolism. However, despite the higher quantity of these immune cells, their activation levels were lower in tumors characterized by upregulated cholesterol metabolism. Comparison of signaling pathways between these groups revealed significant differences in pathways linked to tumor malignancy. In summary, these findings underscore the role of cholesterol metabolism alterations in driving substantial adaptations within the tumor microenvironment.
Stratifying colon tumors based on cholesterol metabolism presents a promising avenue, potentially benefiting patients through immunotherapy and cholesterol modulation as adjuvant therapy.
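The clustering step described in this abstract, k-means applied to per-sample pathway scores, can be sketched in plain Python. The scores below are hypothetical stand-ins for ssGSEA scores of two cholesterol-related pathways, and the initial centroids are chosen by hand for reproducibility; neither comes from the study.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each sample goes to its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical scores for two pathways per sample (illustration only).
samples = [(0.10, 0.15), (0.12, 0.11), (0.09, 0.14),
           (0.55, 0.60), (0.58, 0.52), (0.61, 0.57)]
labels, centroids = kmeans(samples, centroids=[samples[0], samples[3]])
print(labels)
```

In practice the number of clusters and the initialization would be chosen with validity indices and repeated random starts rather than fixed by hand as here.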
  • Automatic FoodEx2 classification system for food description
    Publication . Fonseca, João Emanuel Sousa; Faria, Brígida Mónica; Reis, Luís Paulo; Pimenta, Rui
    Food is an important factor in human health. Food safety protects consumers by offering a safety net through which they can trust the quality of the product. In Europe, entities such as the European Food Safety Authority (EFSA) act as risk assessors. They provide information used to shape laws around food safety. To collect data regarding food safety, the EFSA developed a comprehensive food classification and description system called FoodEx2. The FoodEx2 coding system relies on a manual process to map food descriptions to FoodEx2 codes. The motivation for this work comes from the time that could be saved by using an algorithm to automate the code generation. Knowledge Discovery in Databases is a fundamental area for automatically extracting patterns from large quantities of data. The main objective of this project is to explore automatic approaches to classify food descriptions with FoodEx2 codes. In this work, several classic classifiers are compared in the prediction of FoodEx2 base codes, a multiclass classification task. Performance was explored on distinct datasets with different levels of text preprocessing, using the exact match ratio and the F1-score as metrics and a Bag-of-Words document representation with TF-IDF weighting. All the datasets contain imbalanced data distributions. The documents are composed of short texts describing ingredients, dishes, and animal sample details. Performance varied mainly between datasets and classifiers. The best performing classifiers were Random Forests, Decision Trees, and Linear Support Vector Machines. The results show that the creation of an automatic classifier depends on further exploration of the available data.
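The two evaluation metrics named in this abstract can be sketched for a single-label multiclass task, where the exact match ratio reduces to plain accuracy. The macro-averaged F1 shown here is an assumption (the abstract does not say which averaging was used), and the labels are placeholders, not real FoodEx2 codes.

```python
def exact_match_ratio(y_true, y_pred):
    """Share of predictions that match the true label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Placeholder class labels (hypothetical, not actual FoodEx2 base codes).
y_true = ["CODE_A", "CODE_A", "CODE_B", "CODE_B", "CODE_C", "CODE_C"]
y_pred = ["CODE_A", "CODE_B", "CODE_B", "CODE_B", "CODE_C", "CODE_A"]

emr = exact_match_ratio(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"exact match ratio = {emr:.3f}, macro F1 = {f1:.3f}")
```

Macro averaging weights every class equally, which is one common choice when class distributions are imbalanced, as the abstract says they are.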
  • Comparative study of adjustement methods for confounding variables
    Publication . Silva, Inês Fortuna Alves da; Antunes, Luís; Faria, Brígida Mónica
    Observational studies provide relevant evidence; however, they inherently lack balance in the distribution of baseline variables between the study groups, making it difficult to understand the real treatment effect. There are many methods to balance the confounders. Traditional covariate adjustment is the most widely used; however, it is now also common to apply techniques based on the propensity score (PS). One of them is Inverse Probability of Treatment Weighting (IPTW). The application of IPTW involves comparing two groups of samples weighted by the inverse probability of treatment. The main advantage of IPTW over other PS techniques is that it allows all patient data to be preserved; compared to classical adjustment methods, it allows the confounders to be balanced, and that balance evaluated, before assessing the outcome. In this study, the effect of two neoadjuvant treatments for HER2-positive breast cancer was analysed. The treatments differed in four additional cycles of pertuzumab. Two methods of balancing the distribution of variables were applied: IPTW and traditional regression adjustment. The results obtained with both techniques led to the conclusion that therapy with double anti-HER2 blockade seems more favourable. In addition, this treatment resulted in a greater number of patients with pathologic complete response (pCR). It also allowed a reduction in the number of radical mastectomies. Although there were statistically significant differences in the type of surgery between the study groups, the difference in pCR was not significant (p > 0.05). This work had some limitations, such as the low number of patients with certain characteristics, among other factors, which hindered a clear interpretation of the results. It will therefore be useful to expand this study to include more patients with heterogeneous features, allowing more robust conclusions to be drawn.
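The weighting idea behind IPTW can be sketched as follows. The propensity scores are hypothetical and given directly (in practice they would be estimated, for example with a logistic regression on the baseline covariates), and the binary outcome stands in for something like pCR; none of the numbers come from the study.

```python
def iptw_weights(treated, propensity):
    """Inverse probability of treatment weights: 1/ps if treated, 1/(1 - ps) otherwise."""
    return [1 / ps if t else 1 / (1 - ps) for t, ps in zip(treated, propensity)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: treatment indicator, estimated propensity score, binary outcome.
treated = [1, 1, 1, 0, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.5, 0.2, 0.2]
outcome = [1, 1, 0, 1, 0, 0]

weights = iptw_weights(treated, propensity)

# Weighted outcome rate in each arm, then their difference as the effect estimate.
rate_treated = weighted_mean(
    [y for y, t in zip(outcome, treated) if t],
    [w for w, t in zip(weights, treated) if t],
)
rate_control = weighted_mean(
    [y for y, t in zip(outcome, treated) if not t],
    [w for w, t in zip(weights, treated) if not t],
)
effect = rate_treated - rate_control
print(f"weighted risk difference = {effect:.3f}")
```

Because every patient keeps a weight instead of being dropped, no observations are lost, which is the advantage over other propensity score techniques that the abstract highlights.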
  • Speckle tracking echocardiography in detecting myocardial deformation and the influence of cardiovascular factors
    Publication . Cardoso, Helena Jesus Loureiro; Faria, Brígida; Amaral, Rita; Martins, Adélio
    The transthoracic echocardiogram studies the mobility of the left ventricular (LV) wall through the ejection fraction (EF). Speckle tracking is a recent modality that assesses function through the relative changes in the extent/thickness of the myocardium with respect to its original shape (strains). This dissertation is divided into a systematic review and meta-analysis, and a research study. The systematic review aims to compare patients with cardiac conditions and healthy individuals in the diagnosis of LV myocardial deformation through EF values and strains. The main objective of the research study is to compare EF and strains in the diagnosis of LV myocardial deformation in patients with electrocardiographic changes or a clinical history of cardiac conditions. Secondary objectives include (1) verifying differences between the alterations in ventricular repolarization, according to EF and strains; (2) analysing differences between LV segments with hypokinesia, according to EF and strains; (3) verifying the correlation between age and strains; (4) describing the distribution of strains according to gender; (5) evaluating the influence of cardiovascular risk factors on strain values. Observational studies were searched in three electronic databases (PubMed, B-on, MDPI). Reviewers independently extracted data and assessed the quality of evidence with GRADE. Pooled studies were analyzed using a random effects model and results were presented as standardized mean differences. The research included 22 patients who, due to electrocardiographic changes, complaints, or clinical history, were evaluated using the speckle tracking method. The statistical techniques applied included summary statistical measures and graphics within the scope of descriptive statistics, normality tests, homogeneity tests, independent t-tests, Mann-Whitney, Kruskal-Wallis, and ANOVA. Fifty-six articles met the inclusion and review criteria; 10 articles were grouped in the meta-analysis.
The combined mean of ejection fraction and strains, including global longitudinal strain (GLS) and global radial strain (GRS), was significantly lower in patients with cardiomyopathy (EF -1.47 [-2.55; -0.39]; GLS -2.18 [-3.01; -1.34]; GRS -1.56 [-2.48; -0.64]), myocardial infarction (EF -2.63 [-3.64; -1.61]; GLS -2.80 [-4.10; -1.51]), and coronary artery disease (EF -0.25 [-0.49; -0.01]; GLS -0.95 [-1.65; -0.25]) when compared to control groups. In the 22 patients studied, the ejection fraction values did not show degrees of severity concordant with the strain values. Non-specific changes in ventricular repolarization and non-recent necrosis showed a difference in the GLS (p < 0.05). The middle third of the septal segment and the latero-apical segment also presented a difference in the EF (p < 0.05). When assessing the influence of cardiovascular risk factors, no evidence of a difference was found in strain values. In conclusion, speckle tracking is a recent method that can be used to assess left ventricular myocardial function.
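Pooling standardized mean differences under a random effects model can be sketched with the DerSimonian-Laird estimator. The choice of that estimator is an assumption (the abstract only says "random effects model"), and the effect sizes and variances below are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooled estimate, 95% confidence interval, and tau^2."""
    if len(effects) < 2:
        raise ValueError("at least two studies are needed")
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical standardized mean differences and their variances (illustration only).
smd = [-1.2, -2.0, -1.6]
var = [0.10, 0.15, 0.12]
pooled, ci, tau2 = random_effects_pool(smd, var)
print(f"SMD = {pooled:.2f} [{ci[0]:.2f}; {ci[1]:.2f}], tau^2 = {tau2:.3f}")
```

The output follows the same "estimate [lower; upper]" layout as the pooled results reported in the abstract.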
  • Risk assessement for food safety and public health: a machine learning approach
    Publication . Silva, Maria Clara Ferreira e; Faria, Brígida Mónica; Reis, Luís Paulo
    Foodborne diseases continue to spread widely in the 21st century. In Portugal, the Economic and Food Safety Authority (ASAE) aims to monitor and prevent non-compliance with regulatory legislation and food safety, regulate the conduct of economic activities in the food and non-food sectors, and assess and communicate risks in the food chain. In this work, the global risk was evaluated considering three risk factors provided by ASAE (non-compliance rate, product or service risk, and consumption volume). The risk prediction performance of four classification models (Decision Tree, Naive Bayes, K-Nearest Neighbor, and Artificial Neural Network) was also compared before and after feature selection and hyperparameter tuning. Our principal findings revealed that food and beverage service providers and retail were the activity sectors in the dataset with the highest associated global risk. The Artificial Neural Network classifier presented the best results (60%); however, it was the model that took longest to train. The Chi-square feature selection method provided better results than the ANOVA F-test. Data balancing using the SMOTE method led to a performance increase of 90% with the Decision Tree and K-Nearest Neighbor models. This work allowed us to conclude that machine learning can be helpful in risk assessment related to food safety and public health. It was also concluded that the areas with the highest global risk are the ones most frequented by the Portuguese population and require more thorough inspections. Thus, relying on risk assessment using machine learning can have a positive influence on the prevention of economic crime related to food safety, as well as on public health.
  • Mathematical modeling for pharmacological approaches directed to metabolic pathways in "diabetic paradox" in prostate cancer
    Publication . Santos, Inês Ribeiro da Silva de Lima; Fernandes, Rúben; Alves, Marco; Baylina, Pilar
    Obesity and diabetes are two metabolic risk factors for cancer. However, there is a metabolic paradox in prostate cancer, in which diabetes appears to protect the patient from this type of cancer. The current study aims to develop explanatory models of this contradiction using bioinformatics methods, contrasting the prostate cancer cell lines PC3 and LNCaP with the metabolism of normal prostate cells (HPEpiC). Two of the major routes of prostate metabolism, glycolysis and gluconeogenesis, were modeled mathematically in this study. This mathematical model has novel implications for personalized medicine, since it predicts the effect, therapeutic dose, and efficacy of medications under varied conditions of the tumor microenvironment and the patient’s metabolism. As an illustration of the model’s usefulness, a novel anti-tumor drug in the clinical trials phase, 3-bromopyruvate, which has the modeled metabolic pathways as a therapeutic target, was employed. The efficacy of 3-bromopyruvate was investigated, and the IC50 was found to be capable of significantly inhibiting tumor cell lines. When compared to basal metabolism, its IC50 delayed glycolytic metabolism by 12 minutes. As a result, the diabetic environment has a slowing effect on glycolytic metabolism. The obese environment showed no significant differences in this form of cancer compared to the healthy environment. The value of mathematical modeling is clear, as the effect of a new drug on metabolism may be evaluated computationally and used as a novel tool to provide a tailored approach to each patient.
  • Sleep quality of drivers: a study based on self-perceived and sleep companions feedback
    Publication . Lopes, Tatiana Donai; Faria, Brígida Mónica; Oliveira, Alexandra Alves; Pimenta, Rui
    Sleep is a crucial biological need for all individuals, being reparative on a physical and mental level. Driving heavy vehicles is a task that requires constant attention and vigilance, and sleep deprivation leads to behavioral and physiological changes that can develop into sleep disorders and put lives at risk. The main objective of this study is to analyse the sleep quality, excessive daytime sleepiness, circadian preference, and risk of obstructive sleep apnea in a population of Portuguese drivers. It is also of interest to analyse and compare the answers given by the drivers’ companions to the complementary questionnaire of the Pittsburgh Sleep Index, in order to better understand self-perceived sleep. To fulfill these objectives, the following questionnaires, validated and translated into Portuguese, were applied: the Epworth Sleepiness Scale, the Morningness-Eveningness Questionnaire, the Stop-Bang Questionnaire, the Pittsburgh Sleep Index, and the Satisfaction Alertness Timing Efficiency Duration scale. These questionnaires were answered by 43 Portuguese drivers between 23 and 63 years old. The results indicated that older drivers tend to experience excessive daytime sleepiness (8±3; p=0.003). Regarding sleep quality, the majority of the drivers were classified as having poor sleep quality (69.76%). These results allowed an association to be inferred between the drivers’ self-perceived sleep and their sleep partners’ perception of the drivers’ sleep and snoring habits (α=0.10; p=0.09) and of the excessive sleepiness that the drivers have while driving (α=0.05; p=0.05). Regarding circadian preferences, it was possible to conclude that the drivers in this sample have an indifferent circadian preference. Regarding perceived sleep health, the results showed that the drivers who drive short routes have a worse self-perception of sleep health (22.28±5.17; p<0.01).
Therefore, it is important to implement prevention programs that cover the basic rules for better sleep quality, as well as the treatment of sleep disorders, in order to minimize the consequences that they entail.
  • Applications of knowledge discovery for COVID-19 pandemic study
    Publication . Correia, Cláudio Lima; Faria, Brígida Mónica; Fernandes, Rúben
    The outbreak of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has brought global insecurity and fear to our society. Countries around the world are joining efforts to fight the spread of this deadly disease. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities and are popular in the media. Officials around the world are using several outbreak prediction models for COVID-19 to make informed decisions and enforce relevant control measures. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. This work aims to present an exploratory data analysis of COVID-19 worldwide to understand the real threats and the subsequent planning of containment/mitigation actions. Machine learning models were used to study and understand the everyday exponential behavior of COVID-19 across nations, using real-time information from Johns Hopkins University and, for Portugal in particular, real-time information from the Portuguese Ministry of Health, to predict future reachability. In this work, different algorithms were modeled and their performance evaluated. These algorithms are Polynomial Regression and Support Vector Regression. For the Portuguese dataset, we modeled and evaluated the performance of the following algorithms: Linear Regression, Polynomial Regression, Support Vector Regression, Multilayer Perceptron, and Poly-MultilayerPerceptron. This work also compares three different but very similar countries (Portugal, Spain, and Italy). In the particular case of Portugal, the containment/mitigation actions used by the Portuguese government were explored. A comparative analysis was also carried out between Portugal, Spain, and Italy, covering the two months following the first reported case in each country.
We also studied the effectiveness of the mitigation measures defined by the Portuguese government and carried out by the health authorities and citizens. In the worldwide prediction of the first wave of COVID-19, the best model is Polynomial Regression (R-squared = 0.787, MAE = 540.39, RMSE = 782.14, execution time 0.16 s), and in the second wave, the best model is Support Vector Regression (R-squared = 0.996, MAE = 17.41, RMSE = 18.98, execution time 0.35 s). For the Portuguese predictions of COVID-19 (different waves), the best models are Polynomial Regression, Multilayer Perceptron, and Poly-MultilayerPerceptron. In the comparison of the three countries (Portugal, Spain, and Italy), Portugal had the best performance in testing and mitigation policies. In the study of the effectiveness of the mitigation measures defined by the Portuguese government, the sooner the measures were implemented, the more effective they were in mitigating the disease.
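The three scores used to compare the prediction models (R-squared, MAE, and RMSE) can be computed as in the sketch below. The observed and predicted case counts are hypothetical and only illustrate the calculation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, mean absolute error, and root mean squared error."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical daily case counts and model predictions (illustration only).
observed = [100, 150, 230, 340, 500]
predicted = [110, 140, 240, 330, 510]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE are expressed in the units of the target (cases), while R-squared is unitless, which is why the abstract reports all three side by side.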