SHORT COURSES
Gregory Lawler (Cornell and Duke Universities - USA)
Many models in statistical
physics have been conjectured to have conformally
invariant scaling limits ``at criticality''. Such limits
are now being understood by mathematicians.
The chief new tool is the stochastic Loewner
evolution (SLE), introduced by Oded Schramm. This,
combined with ideas of universality, has led to a number
of rigorous results. This course will be an
introduction to conformally invariant processes and
SLE with applications to intersection
properties of Brownian motion, percolation,
loop-erased walks, and self-avoiding walks. Much of what
I will discuss will be joint work of Schramm,
Wendelin Werner, and myself.
If time allows, I will also discuss the situation
in three dimensions, where mathematicians are still very far
from understanding things rigorously.
Bernard Prum (Génopole Evry - France)
Biological sequences consist essentially of DNA chains, the
chromosomes which transmit information from one generation to the
next, and protein chains, proteins being the essential
component of all phenomena in living cells. The former are written in a
4-letter alphabet {a, c, g, t}, while the latter use 20
letters, the amino acids.
Every day, more than 20 million newly deciphered letters
arrive in the data banks, and a challenge for statisticians is to help
biologists find the relevant information in this huge amount of
data.
A first topic we are interested in is the search for
words whose frequency is too high to be explained by pure
randomness. For example, bacterial genomes contain a signal (called
CHI) which takes part in their defenses and must therefore be
sufficiently frequent to be effective. CHI's role is thus unrelated to
the usual genetic code, but it has a different importance for the organism.
To search for these "exceptional" words, we look for a
model that is both satisfactory for the biologist and
tractable for the mathematician. One has to take into account the
frequencies of the letters, of the 2-letter words, 3-letter words,
etc., hence to work conditionally on the sufficient statistics of a
Markov chain model. In these models, for each word W, using a conditional
approach, we compute the expectation and the variance of the number of
occurrences and give results about its (asymptotic) law.
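As a minimal sketch of the simplest case, assume a first-order Markov
model (the abstract does not fix the order). Conditioning on the observed
letter counts and 2-letter word counts gives the usual plug-in estimate of
the expected count of W = w1...wh: the product of the counts of its h-1
overlapping 2-letter words, divided by the product of the counts of its
h-2 interior letters. The function names are illustrative only, and the
variance and the asymptotic law are not reproduced here.

```python
def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, overlapping ones included."""
    last_start = len(seq) - len(word)
    return sum(1 for i in range(last_start + 1) if seq.startswith(word, i))


def expected_count_m1(seq, word):
    """Plug-in estimate of the expected count of `word` (length >= 3)
    under a first-order Markov model fitted to `seq` itself.

    Assumption: order 1 is chosen here only for illustration.
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    numerator = 1.0
    for i in range(len(word) - 1):
        # counts of the overlapping 2-letter words w_i w_{i+1}
        numerator *= count_overlapping(seq, word[i:i + 2])
    denominator = 1.0
    for i in range(1, len(word) - 1):
        # counts of the interior letters w_2 ... w_{h-1}
        denominator *= count_overlapping(seq, word[i])
    # An absent interior letter forces the numerator to 0 as well.
    return numerator / denominator if denominator else 0.0
```

A word would then be flagged as a candidate "exceptional" word when its
observed count, count_overlapping(seq, W), lies far above this estimate
relative to the variance.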
A very relevant criticism of this modelling is
that it assumes the homogeneity of the sequence, a hypothesis
that biologists accept less and less as they deal with larger
and larger sequences. One way to answer this criticism is
to allow several Markov models to coexist,
and this led us to work with Hidden Markov Models (HMM). These models
quickly turn out to be statistical tools permitting much more than the
separate analysis of regions chosen to be homogeneous. The fact that, at
the beginning of the algorithm, we need fix neither the Markov transitions in
each state nor the positions of the various states implies that fitting
an HMM to a sequence produces its "segmentation", allocating a common
characteristic to all the segments related to the same state.
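The segmentation step alone can be sketched with the standard Viterbi
algorithm. This is a deliberately simplified illustration, not the
procedure of the course: the parameters are fixed by hand instead of being
estimated from the sequence, and letters are emitted independently within
each state rather than by a Markov chain. The function name, the `stay`
probability and the emission tables are hypothetical.

```python
import math


def viterbi_segment(seq, emit, stay=0.9):
    """Most probable hidden-state path for an HMM with 2 or more states.

    emit[s][letter] is the emission probability of `letter` in state s.
    `stay` is the probability of remaining in the current state; the
    remaining mass is shared equally among the other states.
    All parameter values are hand-picked assumptions.
    """
    n_states = len(emit)
    switch = (1.0 - stay) / (n_states - 1)
    log_trans = [[math.log(stay if r == s else switch)
                  for s in range(n_states)] for r in range(n_states)]
    # Uniform initial distribution over the hidden states.
    score = [math.log(1.0 / n_states) + math.log(emit[s][seq[0]])
             for s in range(n_states)]
    back = []
    for letter in seq[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at this position.
            best = max(range(n_states),
                       key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best)
            new_score.append(score[best] + log_trans[best][s]
                             + math.log(emit[s][letter]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With one AT-rich and one GC-rich state, for instance, the returned path
labels each position with a state, and maximal runs of equal labels are
the segments.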
An important drawback of the 'classical' modelling by HMM
is that it implies that the areas corresponding to the same state must
have lengths distributed according to an exponential (geometric) law, and
this does not hold at all in real genomes. Semi-Markov models
solve this difficulty: they allow any law for the length of the
various areas.
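To make the constraint concrete: in a discrete HMM a state that is kept
with probability p at each position is occupied for exactly k positions
with probability p^(k-1) (1-p), the discrete form of the exponential law.
The sketch below only evaluates this formula; the function names and the
value of p are illustrative assumptions.

```python
def sojourn_pmf(p_stay, k):
    """P(segment length == k) for an HMM state with self-transition
    probability p_stay: (k - 1) stays followed by one exit."""
    if k < 1:
        raise ValueError("segment length must be at least 1")
    return p_stay ** (k - 1) * (1.0 - p_stay)


def mean_sojourn(p_stay):
    """Mean segment length of the geometric law above."""
    return 1.0 / (1.0 - p_stay)
```

Whatever the value of p, the most probable length is 1 and the
probabilities decrease monotonically, which is why this law cannot
reproduce regions with a typical length. A semi-Markov model replaces
sojourn_pmf by an arbitrary length distribution.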
Combined with the use of characteristics of the biological
context, these methods should significantly improve the performance of
predictions of homogeneous regions. We will present a few applications,
such as the search for "horizontal transfers" and "annotation".
For some 10 years, it has been accepted that besides
vertical transmission (from parents to offspring), a phenomenon of
horizontal transmission of genetic information plays an important role
in the evolution of life.
For example, some viruses may copy a part of the genome of
one individual, transport it, and incorporate it into the genome of another
individual, possibly of another species. The potential benefit of this
phenomenon is obvious: through such transposons, a new beneficial gene can spread
across a great number of species. As it is well known that each species
leads to a different fit of a Markov model (word frequencies
change from one species to another), modelling with HMM is well
suited to searching for transposons.
The aim of "annotation" is to contribute to an automatic
search in DNA sequences for coding parts, and within these for exons and
introns (in "eukaryotes" - essentially every species except bacteria -
genes contain two kinds of regions: the exon message is ultimately translated
into proteins, while introns disappear during the 'maturation'
process). HMM is also a successful approach to this problem.
Contact
Contact us at: epb6@ime.usp.br.