Browsing by Author "Quelhas, Dulce"
- Analysis and visualization of chromosome information
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  This paper analyzes the DNA code of several species from the perspective of information content. For that purpose, several concepts and mathematical tools are selected to establish a quantitative method without a priori distorting the alphabet represented by the sequence of DNA bases. The synergies of associating Gray code, histogram characterization and multidimensional scaling visualization lead to a collection of plots with a categorical representation of species and chromosomes.
- Can power laws help us understand gene and proteome information?
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
- Entropy analysis of DNA code dynamics in human chromosomes
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  Deoxyribonucleic acid, or DNA, is the most fundamental aspect of life, but present-day scientific knowledge has merely scratched the surface of the problem posed by its decoding. While experimental methods provide insightful clues, the adoption of analysis tools supported by the formalism of mathematics will lead to a systematic and solid build-up of knowledge. This paper studies human DNA from the perspective of system dynamics. By associating entropy and the Fourier transform, several global properties of the code are revealed. The fractional order characteristics emerge as a natural consequence of the information content. These properties constitute a small piece of scientific knowledge that will support further efforts towards the final aim of establishing a comprehensive theory of the phenomena involved in life.
- Fractional dynamics in DNA
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  This paper addresses DNA code analysis from the perspective of dynamics and fractional calculus. Several mathematical tools are selected to establish a quantitative method without distorting the alphabet represented by the sequence of DNA bases. The association of Gray code, Fourier transform and fractional calculus leads to a categorical representation of species and chromosomes.
- Histogram-based DNA analysis for the visualization of chromosome, genome and species information
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  We describe a novel approach to explore DNA nucleotide sequence data, aiming to produce high-level categorical and structural information about the underlying chromosomes, genomes and species. The article starts by analyzing chromosomal data through histograms using fixed length DNA sequences. After creating the DNA-related histograms, a correlation between pairs of histograms is computed, producing a global correlation matrix. These data are then used as input to several data processing methods for information extraction and tabular/graphical output generation. A set of 18 species is processed and the extensive results reveal that the proposed method is able to generate significant and diversified outputs, in good accordance with current scientific knowledge in domains such as genomics and phylogenetics.
- Multi-dimensional scaling applied to histogram-based DNA analysis
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  This paper aims to study the relationships between chromosomal DNA sequences of twenty species. We propose a methodology combining DNA-based word frequency histograms, correlation methods, and an MDS technique to visualize structural information underlying chromosomes (CRs) and species. Four statistical measures are tested (Minkowski, Cosine, Pearson product-moment, and Kendall τ rank correlations) to analyze the information content of 421 nuclear CRs from twenty species. The proposed methodology is built on mathematical tools and allows the analysis and visualization of very large amounts of stream data, like DNA sequences, with almost no assumptions other than the predefined DNA “word length.” This methodology is able to produce comprehensible three-dimensional visualizations of CR clustering and related spatial and structural patterns. The results of the four test correlation scenarios show that the high-level information clusterings produced by the MDS tool are qualitatively similar, with small variations due to the characteristics of each correlation method, and that the clusterings are a consequence of the input data and not artifacts of the method.
- On the DNA of eleven mammals
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  This paper studies the DNA code of eleven mammals from the perspective of fractional dynamics. The application of the Fourier transform and power law trendlines leads to a categorical representation of species and chromosomes. The DNA information reveals long-range memory characteristics.
- Shannon, Rényie and Tsallis entropy analysis of DNA using phase plane
  Publication. Costa, António Cardoso; Machado, J. A. Tenreiro; Quelhas, Dulce
  This paper analyzes DNA information using entropy and phase plane concepts. First, the DNA code is converted into a numerical format by means of histograms that capture DNA sequence lengths ranging from one up to ten bases. This strategy measures dynamical evolutions from 4 up to 4^10 signal states. The resulting histograms are analyzed using three distinct entropy formulations, namely the Shannon, Rényi and Tsallis definitions. Charts of entropy versus sequence length are applied to a set of twenty-four species, characterizing 486 chromosomes. The information is synthesized and visualized by adapting phase plane concepts, leading to a categorical representation of chromosomes and species.
- Wavelet analysis of human DNA
  Publication. Costa, António Cardoso; Tenreiro Machado, J. A.; Quelhas, Dulce
  This paper studies human DNA from the perspective of signal processing. Six wavelets are tested for analyzing the information content of human DNA. By adopting the real Shannon wavelet, several fundamental properties of the code are revealed. A quantitative comparison of the chromosomes and visualization through multidimensional scaling and dendrograms is developed.
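
Several of the abstracts above rely on the same building blocks: relative-frequency histograms of fixed-length DNA words, a correlation between pairs of histograms, and the Shannon, Rényi and Tsallis entropies of a histogram. The following is a minimal Python sketch of those ideas, not the authors' implementation. The function names, the use of overlapping words, the natural logarithm, and the skipping of words that contain symbols other than A, C, G, T are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def word_histogram(seq, k):
    """Relative frequencies of length-k words over the alphabet ACGT.

    Assumption: words are read with a sliding (overlapping) window, and
    windows containing any other symbol (e.g. N) are ignored.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]  # 4**k words
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid length-k words")
    return [counts[w] / total for w in words]


def shannon(p):
    """Shannon entropy, natural logarithm (zero-probability bins skipped)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def renyi(p, q):
    """Renyi entropy of order q (q != 1)."""
    if q == 1:
        raise ValueError("q must differ from 1")
    return math.log(sum(x ** q for x in p if x > 0)) / (1 - q)


def tsallis(p, q):
    """Tsallis entropy of index q (q != 1)."""
    if q == 1:
        raise ValueError("q must differ from 1")
    return (1 - sum(x ** q for x in p if x > 0)) / (q - 1)


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy sequences, purely illustrative (not real chromosome data).
    h_a = word_histogram("ACGTACGTTTGACA", 2)
    h_b = word_histogram("ACGGACGTATGACC", 2)
    print(shannon(h_a), renyi(h_a, 2), tsallis(h_a, 2))
    print(pearson(h_a, h_b))
```

For a uniform histogram such as the one-base histogram of "ACGTACGT", the Shannon and order-2 Rényi entropies both equal ln 4, which is a quick sanity check. A histogram for word length k has 4^k bins, so the largest case mentioned above (k = 10) already needs about a million entries per chromosome.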