Seminari Generali

Statistical Mechanics of long-range correlations in DNA sequences

by Prof. Michele Caselle

Aula Conversi (Dipartimento di Fisica - Ed. G. Marconi)

One of the most surprising features of the genomes of higher eukaryotes is the presence of long-range correlations in the composition of the DNA sequence. These correlations were discovered more than 20 years ago, when the first long continuous DNA sequences became available. In the past few years, thanks to next-generation sequencing, an impressive number of whole-genome sequences have been published, and the composition of genomic DNA can now be studied systematically over a wide range of scales and organisms. This now makes it possible to assess the various models proposed to describe these correlations. After a review of these models and a few methodological remarks about applying metaphors and models from statistical mechanics to biological problems, we shall present a specific application of a one-dimensional model. In statistical mechanics, long-range correlations are the distinctive feature of critical points; however, it is well known that no such point can exist in one-dimensional models with local interactions. We therefore discuss an alternative possibility involving non-local interactions: we model DNA correlations in the human genome using the long-range one-dimensional Ising model. For distances between $10^3$ and $10^6$ base pairs, the correlations show a universal behaviour and may be described by the non-mean-field limit of the model. This allows us to formulate some testable hypotheses on the evolutionary mechanisms, and on the nature of the interactions between distant portions of the DNA chain, that led to the formation of the DNA correlations we observe today in higher eukaryotes.
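
For orientation, the long-range one-dimensional Ising model mentioned above is usually written in the following textbook form. The symbols $s_i$, $J$ and $\sigma$ are notation assumed here and are not taken from the announcement, and the mapping from nucleotides to spins $s_i = \pm 1$ is left unspecified because the abstract does not state it:

$$
H = -J \sum_{i<j} \frac{s_i \, s_j}{|i-j|^{1+\sigma}}, \qquad s_i = \pm 1, \quad J > 0 .
$$

With this convention, a finite-temperature critical point exists for $0 < \sigma < 1$, the critical behaviour is mean-field for $0 < \sigma \le 1/2$ and non-mean-field for $1/2 < \sigma < 1$, and at criticality the two-point correlation is expected to decay as a power law, $\langle s_i s_{i+r} \rangle \sim r^{-(1-\sigma)}$.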