
Re: RE: [ABE-L]: Provocation



Since I currently work on applications of statistics in medicine, I have stopped thinking about the philosophical side of the foundations of statistics. That is discussed only by the more experienced statisticians. It is a pity that our list has so few people discussing the subject, and that when someone throws out a provocation few, if any, know how to respond. To justify my incompetence, I say that philosophy is a thing for old people, for those who no longer do the "thing" and so only discuss how it should be done.
My contribution is merely historical.
It was mentioned that Savage took part in a conference in London in 1962. In fact, it was the Friday Seminar in Statistics of the University of London. From the 1930s, the time of Fisher, Neyman, Pearson, etc., until the end of the 1970s (Prof. Adrian Smith, during a three-month visit in 1980 to my former department at IM/UFRJ, told me that these meetings were coming to an end), all the statistics departments of the university shared this activity. Each semester there were seminars by prominent statisticians visiting Europe, each followed by a short course lasting a few Fridays. All the department members gathered at this seminar. You sat next to your idols!!! That was the period of intense discussions on the foundations of statistics. In particular, I attended short courses (Smith and Lindley on de Finetti's books, Barnard on pivotal inference, Barndorff-Nielsen on plausibility inference, among others), as well as seminars by Le Cam, Dempster, Anscombe, Blackwell, Birnbaum, Granger, Karlin, Cox, Barndorff-Nielsen, Herbert Robbins, etc. I remember the lectures of Prof. Barnard, who predicted in 1975 that the future of statistics depended on numerical analysis, which is indeed what is happening with MCMC, the bootstrap, cross-validation and computationally intensive methods.
Perhaps the non-Bayesians on the list can tell me why the Pereira brothers (for an application see Clinics 2009, and for a simulation the work of the COPPE doctoral student James Dean) achieved a better balance of prognostic variables using distances than using randomization in clinical trials.
Basilio

Julio Stern wrote:

Taking advantage of Teacher's Day, I accept the provocation of my great master in matters of Bayesian statistics (I learned from him that the adjective is pleonastic) and subjective probability (the objective part is another story), Carlos Alberto de Bragança Pereira, continuing on the ABE-L list our discussions on the subject of randomization. In the attached article,
- Decoupling, Sparsity, Randomization and Objective Bayesian Inference. Cybernetics and Human Knowing, Vol. 15, no. 2, pp. 49-68,
I try to respond to some of my master's jabs.
Best regards to all,
--- Julio Stern
Date: Fri, 16 Oct 2009 02:00:53 -0300
From: cpereira@ime.usp.br
To: abe-l@ime.usp.br
Subject: [ABE-L]: Provocation
Taking advantage of Teacher's Day, and considering that ESAMP is near, I am posting
here a lecture by de Finetti on randomization.
Enjoy
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Induction and Sample Randomization
Lecture XIII (Friday 27 April, 1979)
Exchangeability and Convergence to the Observed Frequency
I would like to discuss the relation between the concepts of random experiment
and exchangeable experiment. After all, there is only a lexical difference
between the two notions, which can be summarized as follows: the expression
"equally probable events with unknown but constant probability," used by the
objectivists does not make any sense from the subjectivist point of view, simply
because there is no such thing as an unknown probability (the probability
being that which a certain person assigns at a certain time).
However, what is typical of these cases is exchangeability: those cases in which
one speaks of independent events with unknown but constant probability are, in
fact, all cases of exchangeability. However, behind this terminological
difference lies a conceptual difference concerning the problem of inductive
inference. The objectivists do not answer this question satisfactorily and in
fact, they almost completely neglect it. Their argument goes as follows: since,
in the long run, frequency coincides with probability, in order to determine the
probability it is sufficient to observe a somewhat large number of experiments.
From the subjectivist point of view, this argument is unacceptable. Indeed, for
us subjectivists, probability cannot be determined empirically but it is
evaluated by everyone, at any instant, on the basis of one's own experience.
Probability, in fact, changes with every new experience.
Suppose we are drawing from an urn containing white and black balls in unknown
proportions. Suppose, however, that we know that the percentage of white balls
is one of the following: 30%, 50%, 70%, 80%. I shall call the four possible
hypotheses about the percentage of white balls H1, H2, H3 and H4, respectively.
Suppose that an initial probability is assigned to each of the hypotheses Hi
respectively. As we continue the draws, those probabilities change according to
Bayes' theorem. In fact, the probability of the hypothesis that is closest to
the observed frequency undergoes an increase. And it is probable that certain
sequences obtain such that, in the long run, the probability of one of the
hypotheses Hi will get really close to 1. And the probability relative to a
single shot would be very probably very close to the observed frequency.
However, we must always bear in mind the influence of the initial probabilities
assigned to the hypotheses Hi.[1]
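
The updating described in the urn example can be sketched numerically. This is only an illustration and not part of the lecture: it assumes a uniform initial probability over H1 to H4 and draws with replacement (so that the draws are exchangeable). The function name `update` and the observed counts are hypothetical.

```python
from fractions import Fraction

# Proportions of white balls under H1..H4, as stated in the lecture.
PROPORTIONS = [Fraction(3, 10), Fraction(5, 10), Fraction(7, 10), Fraction(8, 10)]


def update(prior, white, black):
    """Posterior over H1..H4 after observing `white` white and `black` black
    balls, by Bayes' theorem. Assumes draws with replacement."""
    weights = [p * q**white * (1 - q)**black for p, q in zip(prior, PROPORTIONS)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative uniform prior (an assumption; the lecture fixes no numbers).
prior = [Fraction(1, 4)] * 4

# Hypothetical data: 7 white balls in 10 draws, then 70 white in 100 draws.
post_10 = update(prior, white=7, black=3)
post_100 = update(prior, white=70, black=30)

print([round(float(p), 3) for p in post_10])
print([round(float(p), 3) for p in post_100])
```

With an observed frequency of 70%, the probability of H3 (70% white) grows as the number of draws grows, while the nearby hypothesis H4 (80%) remains its closest competitor. A different prior would change the numbers, which is the influence of the initial probabilities mentioned above.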
ALPHA: However, the subjective differences are always tempered by this
convergence. Therefore, the Bayesian method, provided that the condition of
exchangeability is satisfied, is in some sense a self-corrective method (to use
Reichenbach's term).[2]
DE FINETTI: Yes, it is. Who uses this term?
ALPHA: Reichenbach who, however, referred to the estimation of frequencies
rather than subjective probabilities. According to him an estimation rule is
self-corrective when the limit of the difference between the estimate obtained
with that rule and the observed frequency is 0.
BETA: Hence, the subjective probability of one of the hypotheses converges to
the value 1 as the number of experiments grows.
DE FINETTI: Yes, provided that it is borne in mind that all this does not hold
necessarily but depends on the premises (exchangeability).[3]
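
The tempering of subjective differences can be illustrated with two observers who start from different initial probabilities but use the same likelihood. This is a sketch under stated assumptions, not part of the lecture: the two priors, the data (exactly half of the draws white) and the names `update` and `disagreement` are hypothetical, and every hypothesis is given a nonzero initial probability.

```python
# Proportions of white balls under H1..H4, as in the urn example.
PROPORTIONS = (0.3, 0.5, 0.7, 0.8)


def update(prior, white, black):
    """Posterior over H1..H4 given the counts, by Bayes' theorem.
    Assumes exchangeable draws (here: draws with replacement)."""
    weights = [p * q**white * (1 - q)**black for p, q in zip(prior, PROPORTIONS)]
    total = sum(weights)
    return [w / total for w in weights]


def disagreement(a, b):
    """Largest difference between two observers' probabilities."""
    return max(abs(x - y) for x, y in zip(a, b))


# Two hypothetical observers with very different initial opinions.
observer_a = [0.05, 0.05, 0.10, 0.80]
observer_b = [0.70, 0.20, 0.05, 0.05]

gaps = []
for n in (0, 10, 50, 200):
    white = n // 2  # hypothetical data: half of the draws are white
    post_a = update(observer_a, white, n - white)
    post_b = update(observer_b, white, n - white)
    gaps.append(disagreement(post_a, post_b))
    print(n, round(gaps[-1], 6))
```

In this example the gap between the two observers shrinks as the number of draws grows, and both end up concentrating on H2 (50%). The sketch relies on both observers sharing the same exchangeable model, which is exactly the premise the convergence depends on.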
BETA: Let us suppose that there are three urns: the first one containing only
black balls, the second one only white balls and the third one half white and
half black.
DE FINETTI: This is a very simple case. In fact, as soon as two balls of
distinct colors were drawn, it would be known with certainty which urn is being
used for the draws. If, on the other hand, only white or only black balls were
drawn, then, as the number of shots grows, the probability that the draws are
being made using the first or the second urn would rapidly increase.
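
The three-urn case can be worked out exactly. The sketch below is an illustration and not part of the lecture: it assumes equal initial probabilities for the three urns and draws with replacement. The function name `posterior` and the draw sequences are hypothetical.

```python
from fractions import Fraction

# P(white) for urn 1 (only black), urn 2 (only white), urn 3 (half and half).
P_WHITE = [Fraction(0), Fraction(1), Fraction(1, 2)]


def posterior(prior, draws):
    """Posterior over the three urns after a sequence of draws,
    given as a string of 'W' (white) and 'B' (black)."""
    weights = list(prior)
    for colour in draws:
        weights = [
            w * (q if colour == "W" else 1 - q)
            for w, q in zip(weights, P_WHITE)
        ]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative equal initial probabilities (an assumption).
prior = [Fraction(1, 3)] * 3

mixed = posterior(prior, "WB")          # two balls of distinct colours
five_white = posterior(prior, "WWWWW")  # only white balls so far

print(mixed)
print(five_white)
```

Under these assumptions, one white and one black ball leave only the third urn possible, while after n white balls in a row the probability of the all-white urn is 2^n / (2^n + 1), so it approaches 1 quickly without ever reaching it.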
BETA: At the beginning, the probability reflects the personal state of mind of
whoever makes the evaluation. But then, as new draws are carried out,
differences among people's opinions tend to disappear. Therefore, the growth of
knowledge leads the opinions to converge.
DE FINETTI: Yes, the differences in the initial opinions have no other
consequence than delaying the preponderance of the observed frequency over the
initial opinion itself.

Bayesian Statistics and Sample Randomization
ALPHA: Let us now tackle the problem of the methods of random selection of
statistical samples. Savage, in this booklet, which you might be familiar with ...
DE FINETTI: What is the title?
ALPHA: The Foundations of Statistical Inference[4] (Barnard and Cox, 1962). It is a
short summary of the course that Savage taught for the International
Mathematical Summer Centre in Italy (Savage, 1959). Immediately after that
course, as explained in the book, Savage went to London.
DE FINETTI: OK, I understand: it is the report that Savage presented at the
conference in London.
ALPHA: As Savage writes: "the problem of analyzing the idea of randomization is
more acute and at present more baffling, for subjectivists than for objectivists,
more baffling because an ideal subjectivist would not need randomization at all"
(Savage, 1962, p. 34). Perhaps Savage intended to say that the subjectivist,
since he should not neglect any piece of information, would have no reason to
resort to randomization by means of which some of the information available is
actually excluded. But, Savage continues, "[t]he need for randomization
presumably lies in the imperfection of actual people and, perhaps, in the fact
that more than one person is ordinarily concerned with an investigation."
(ibid.) This sentence suggests a new argument supporting the rationality of the
randomization of statistical samples: thanks to the randomization, the
likelihood can be computed more inter-subjectively. In fact, the Bayesian method
produces the convergence when the likelihood is the same for everyone.[5] But if
the draws are not randomized, then the likelihood varies, in general, from
person to person and this might preclude convergence. What is your opinion about
this justification of the use of randomization in the formation of statistical
samples?
DE FINETTI: I seem to agree with this. But I should think more carefully about it.
ALPHA: Savage adds: "the imperfections of real people with respect to subjective
probability are vagueness and temptation to self-deception ... and randomization
properly employed may perhaps alleviate both these defects." (ibid.) Do you
believe that Savage's analysis is correct or do you believe that there could be
other reasons that make rational the use of the randomization of samples? It
seems to me that the practice of randomization could be justified by means of
the need for the inter-subjectivity of science. A scientific community, in fact,
accepts a result when the majority of its members recognize its value. Is it
possible to use the method of randomization in order to facilitate the agreement
of many people's judgments?
DE FINETTI: The problem of the randomization of the samples has a mixed
character, as it does not have a probabilistic nature only. Randomization is a
measure that guards us from the instinctive tendency, which is often followed
bona fide, to fiddle the results. This can be done in many ways. For instance,
it can happen that a researcher excludes some abnormal piece of data thinking
that it might be the consequence of a typo or it might be due to a faulty
measurement. This would be legitimate if it turned out, for instance, that a
certain individual's height is 170 meters: it would be reasonable to assume that
in reality the value of the height is 170 centimeters. But in other cases there
could be a tendency to alter the real data because it is considered unreliable.
Or there could be a tendency to round off. If many people in a sample turn out
to have a height of exactly 170 centimeters and very few people a height of 169
or 171 centimeters, then it would be natural to suspect that a rounding off of
the data has taken place. Randomization is a procedure that guards the data from
some forms of manipulation and in particular, a biased selection of the sample.
ALPHA: An observation that occurred to me at this moment is as follows. The
randomization of the sample makes it easier to determine the state of
information. Taking into account all the information that one possesses would be
a lot more complicated if the choice was not random. When the sampling is
random, the influence of many relevant pieces of information present on the
state of information of the single individuals is eliminated.
DE FINETTI: Also those considerations need to be made cautiously. Suppose, for
instance, that despite the fact that the selection has been done correctly from
the point of view of precautions (re-shuffling, etc.), the sample turns out to
be decidedly skewed towards heights that are clearly too big. The suspicion
could then arise that this might be due to a systematic tendency to choose tall
people. In any case, the problem of the random selection of statistical samples
is a very complicated problem and I have never managed to find a completely
satisfactory solution to it.
ALPHA: The problem consists in this: strictly speaking one should try to
maximize the quantity of empirical information, whereas with the random
selection, one intentionally deprives oneself of some information that could
turn out to be relevant. If it were known that one individual satisfies some
relevant property, this information should also be taken into account rather
than neglected because that individual does not belong to the randomly selected
sample.[6]
Editor's Notes
1. For precise details see Chapter 8.
2. "The inductive procedure, therefore, has the character of a method of trial
and error so devised that, for sequences having a limit of the frequency, it
will automatically lead to success in a finite number of steps. It may be
called a self corrective method (or an asymptotic method)" (Reichenbach, 1949,
p. 446). Reichenbach points out (ibid., note 1) that C. S. Peirce had already
stressed in 1878, without however explaining the reason for it, the "constant
tendency of the inductive process to correct itself" (Hartshorne and Weiss,
1960, vol. 2, p. 456).
3. Important observation, often neglected: Bayes's theorem alone is not
sufficient to guarantee the convergence.
4. The book contains a contribution by Savage (1962).
5. To put the matters in more Definettian terms, all random samples are
exchangeable and all stratified random samples are partially exchangeable.
6. The problem of random samples has been addressed by many Bayesian authors.
See, for example, the following authors: Stone (1969); Rubin (1978); Swijtink
(1982); Kadane and Seidenfeld (1990); Spiegelhalter, Freedman, and Parmar
(1994); Papineau (1994); Berry and Kadane (1997); Frangakis, Rubin, and Zhou
(2002); Kyburg and Teng (2002); Berry (2004); Localio, Berlin, and Have (2005);
Worrall (2007).

Carlos Alberto de Bragança Pereira <cpereira@ime.usp.br>


Basilio de Bragança Pereira
*Titular Professor of Biostatistics and of Applied Statistics
*FM-School of Medicine and COPPE-Postgraduate School of Engineering and
HUCFF-University Hospital Clementino Fraga Filho.
*UFRJ-Federal University of Rio de Janeiro
*Tel: (55 21) 2562-2594 or /2558/7045
www.po.ufrj.br/basilio/
*MailAddress:
COPPE/UFRJ
Caixa Postal 68507
CEP 21941-972 Rio de Janeiro,RJ
Brasil