Probability Question

CraigD · January 21, 2010

I'm not sure I can write a general expression for the probability of a particular team out of an arbitrary number of teams finishing in an arbitrary place using ordinary math notation. Writing a procedure in some programming language shouldn't be too hard, but not too easy, either. One for the road (that is, for the bus) I guess :)

After a bit longer than a bus ride, here’s MUMPS code that calculates probabilities for this problem in general:

x XRX
n (XX,P,M,PS) x XX(1),XX(2) s PS=0 f C=0:1:C-1 x XX(3) s PS=PP+PS ;XX: hypography thread 22087
s N=$o(P(""),-1),C=1 f I=N-M+1:1:N-1 s C=I*C ;XX(1)
k S0 f I=2:1:N s S0(I-1)=I ;XX(2)
s A=1,T=1,PP=P(1) m S=S0 f I=1:1:M-1 x XX(4) s PI=P(S(S)),A=N-I*A,T=T-PI,PP=PI/T*PP k S(S) ;XX(3)
s S=0 f J=0:1:C\A#(N-I) s S=$o(S(S)) ;XX(4)

Here's the output from applying it to the original given probabilities:

s P(1)=.4,P(2)=.25,P(3)=.25,P(4)=.1
s N=$o(P(""),-1) f I=1:1:N s J=P(1),P(1)=P(I),P(I)=J,PA=0 f M=1:1:N x XX s PA=PA+PS w M,?8,PS,?32,PA,!
1       .4                      .4
2       .311111111111111111     .711111111111111111
3       .2085470085470085469    .9196581196581196579
4       .08034188034188034186   1
1       .25                     .25
2       .2777777777777777778    .5277777777777777778
3       .2933455433455433455    .8211233211233211233
4       .1788766788766788766    1
1       .25                     .25
2       .2777777777777777778    .5277777777777777778
3       .2933455433455433455    .8211233211233211233
4       .1788766788766788766    1
1       .1                      .1
2       .1333333333333333333    .2333333333333333333
3       .204761904761904762     .4380952380952380953
4       .5619047619047619048    1
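The table above can be cross-checked with a short Python sketch. This is not a translation of the MUMPS code; it is a brute-force version of the same idea, assuming the raffle-style model in which each remaining team's chance of taking the next place is its given probability divided by the total probability of the teams not yet placed. The function name `place_probability` and the 0-based team index are illustrative choices, not anything from the thread.

```python
from fractions import Fraction
from itertools import permutations

def place_probability(p, team, place):
    """Probability that `team` (0-based index) finishes in `place` (1-based).

    Assumes every entry of p is greater than zero and that they sum to 1.
    """
    others = [i for i in range(len(p)) if i != team]
    total = Fraction(0)
    # Sum over every ordered choice of the teams that finish ahead of `team`.
    for ahead in permutations(others, place - 1):
        remaining = Fraction(1)  # probability mass of teams not yet placed
        prob = Fraction(1)
        for i in ahead:
            prob *= p[i] / remaining
            remaining -= p[i]
        total += prob * p[team] / remaining
    return total

# The original given probabilities: .4, .25, .25, .1
p = [Fraction(2, 5), Fraction(1, 4), Fraction(1, 4), Fraction(1, 10)]
table = [[place_probability(p, t, m) for m in range(1, 5)] for t in range(4)]
for t, row in enumerate(table):
    print(t + 1, [round(float(x), 6) for x in row])
```

Exact fractions are used so that each team's four probabilities sum to exactly 1, which avoids the rounding seen in the last digits of the floating-point output.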

Here’s some output from applying it to a 6-team case, where the middle 4 teams each have a .125 probability of winning:

1       .4                      .4
2       .2730158730158730157    .6730158730158730157
3       .1724526369687660012    .8454685099846390169
4       .0968545433061562092    .942323053290795226
5       .04448855803694513354   .98681161132774036
6       .01318838867225963998   1
1       .125                    .125
2       .1507936507936507936    .2757936507936507936
3       .170170048777858625     .4459636995715094186
4       .1830795972561677143    .6290432968276771329
5       .1888068496302791719    .8178501464579563048
6       .1821498535420436956    1
...(3 repeats edited out)…
1       .1                      .1
2       .1238095238095238095    .2238095238095238095
3       .1468671679197994991    .3706766917293233086
4       .1708270676691729323    .5415037593984962409
5       .200284043441938178     .7417878028404344189
6       .2582121971595655804    .999999999999999999

With a general algorithm in hand, I think we’ve exhausted the answer to this thread’s question with the assumption

For [imath]p_{T_1 \,\mbox{wins}} +p_{T_2 \,\mbox{wins}} \dots +p_{T_n \,\mbox{wins}} = 1[/imath]

[math]p_{T_1 \,\mbox{wins when}\, T_2 \,\mbox{not playing}} = \frac{p_{T_1 \,\mbox{wins}}}{1-p_{T_2 \,\mbox{wins}}}[/math]
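
As a quick worked instance of this assumption (a sketch using the given probabilities from the 4-team example, with [imath]T_1[/imath] at .4 and [imath]T_2[/imath] at .25):

[math]p_{T_1 \,\mbox{wins when}\, T_2 \,\mbox{not playing}} = \frac{.4}{1-.25} = \frac{.4}{.75} \approx .5333[/math]

Multiplying by the .25 chance that [imath]T_2[/imath] takes first place gives about .1333, one of the three terms that add up to the .3111 second-place figure in the first output table.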

As I’ve mentioned in previous posts, this assumption is perfect for games like a raffle (everybody throws the tickets they’ve bought into a box and one is drawn at random), but, I suspect, not very good for games like soccer tournaments, and thus not much good for, say, running a profitable gambling operation. I started thread 22371 to explore ways of calculating probabilities via assumptions more realistic for sports tournaments.

Pyrotex · January 21, 2010

Fantastic job, Craig! :night_moon:

"This case is closed!" :phones:

A23 · January 21, 2010

The formulas are correct. Did you think about how to prove their unicity ?

Pyrotex · January 21, 2010

unicity ??

A23 · January 22, 2010

Unicity means it's the only possible formula. Given the criteria. It seems to me the formula should contain some degrees of freedom :

When you write this :

Again, note that this approach makes some likely unrealistic assumptions about real-world sports outcomes. For example, , even though Ghana might have a special advantage vs. Spain that makes this probability lower, or a special weakness that makes it higher.

-) this means that some cases are possible, and there should exist a way to get out all the possible cases with choosable parameters.

-) averaging over all possible parameters (or weakness/strength assumptions), will give an expression without free parameters.

I took a simpler example with 3 teams A,B,C, with probabilities to be first A1, B1, C1.

Then for the probability of being second, I calculate a passing formula :

[math]A_2=\frac{1-A_1}{1-A_1+\frac{A_1}{B_1}(1-B_1)+\frac{A_1}{C_1}(1-C_1)}[/math]

and try to interpret it ?

In this case, it's restrained to the cases where A,B,C are not 1st, relatively to who is supposed to be first inside this remaining group ?

The problem is that I wasn't able to have a global reasoning on all the possible cases (2nd,3rd,4th).

Pyrotex · January 22, 2010

Hmmm... For the case of only 3 contenders, I get

[math]A_2 = \left(B \cdot \frac{A}{A+C} + C \cdot \frac{A}{A+B} \right)[/math]

This simplifies to

[math]A_2 = A \cdot \left(\frac{B}{1-B} + \frac{C}{1-C} \right)[/math]

if you remember that A+B+C = 1, and therefore, A+C = 1-B, etc.

So, the probability that A comes in first or second,

[math]A_{1,2} = A + A \cdot \left(\frac{B}{1-B} + \frac{C}{1-C} \right)[/math]
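
A small Python sketch can sanity-check the simplified formula. The values A = 1/2, B = 1/3, C = 1/6 are made up for illustration and do not come from the thread; the check is that the three second-place probabilities add up to 1, as they must if exactly one team finishes second.

```python
from fractions import Fraction

def second_place(a, b, c):
    """Three-contender formula: chance that the team with win probability `a`
    finishes second, given the other two teams' win probabilities b and c."""
    return a * (b / (1 - b) + c / (1 - c))

# Hypothetical first-place probabilities, chosen only so that A + B + C = 1.
A, B, C = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)

A2 = second_place(A, B, C)
B2 = second_place(B, A, C)
C2 = second_place(C, A, B)

print("A2, B2, C2 =", A2, B2, C2)
print("sum of second-place probabilities =", A2 + B2 + C2)
print("A first or second =", A + A2)
```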

A23 · January 22, 2010

In first approach, I would get, with minimal assumption :

[math] 0<A_2<1-A [/math]

[math] 0<B_2<1-B [/math]

[math] C<A_2+B_2<1[/math]

and [math] C_2=1-A_2-B_2 [/math].

Why among this infinity of cases, only 1 were "possible" (not in the sense "probable") ?

From your formula, can you decide if [math] C_2>C_3[/math] supposing [math] A>B>C [/math] ?

I took an example from the WEB, to check : prob. in percent

(Advanced NFL Stats: Team Playoff Probabilities - Week 11)

[math] \begin{array}{ccccc}
& 1st & 2nd & 3rd & 4th\\
CIN & 73 & 27 & 0 & 0 \\
PIT & 27 & 64 & 9 & 0 \\
BAL & 0 & 9 & 90 & 0 \\
CLE & 0 & 0 & 0 & 100 \\
\end{array}[/math]

then : I take the given formula : [math]p(BAL_{2nd})=0*(.73/(1-.73)+.27/(1-.27))=0[/math] which is different from the table ?

The question was : deduce other columns from the first one.

Pyrotex · January 22, 2010

In first approach, I would get...:

A23:: [math] 0<A_2<1-A [/math] -- So you assume that the probability of A coming in second must be less than the chance of it not winning first place? Okay, I agree, because the chance of it not winning first place INCLUDES the chance of it coming in second.

A23:: [math] 0<B_2<1-B [/math] -- So I agree with this, too.

A23:: [math] C<A_2+B_2<1[/math] -- So you assume that the probability of C coming in first must be less than (A coming in second OR B coming in second). This I disagree with. Even if it is correct, you're going to have to prove this one. Suppose that C is the all-time favorite team, with a 51% chance of winning first place. Then there is no way your statement can be true.

A23:: ...Why among this infinity of cases, only 1 were "possible" (not in the sense "probable") ?

This is a bewildering question. :confused: There aren't an "infinite number of cases" here. What cases? Are you saying the solution given by Craig is a "case"? And that there could be an infinite number of equations that claim to be solutions? And how do we pick which one is the "real solution"?

CraigD · January 22, 2010

Unicity means it's the only possible formula.

A more common term for “unicity” in this context is “uniqueness”, but we know what you mean now.

Since one can always transform equations like the above algebraically, we can prove that none are unique.

Given the criteria. It seems to me the formula should contain some degrees of freedom

The values of these equations, and the given probabilities they contain, are probabilities, so can be thought of as each having the single degree of freedom of a uniform random variable.

The equations themselves, however, don’t in any ordinary sense I know have any “freedom” or “randomness”. They’re just ordinary mathematical expressions.

It’s possible for mathematical expressions of probabilities to themselves be determined by random outcomes described by probabilities, something like this:

For [imath]0 \le A, B, p < 1[/imath],

[math] C = \begin{cases}
A & 0 \le p \le .5 \\
1-A & .5 < p \le .875 \\
\frac{A}{1+B} & .875 < p
\end{cases}[/math]

but the formulas describing the answer to the problem in this thread aren’t like this.

I took a simpler example with 3 teams A,B,C, with probabilities to be first A1, B1, C1.

Then for the probability of being second, I calculate a passing formula :

[math]A_2=\frac{1-A_1}{1-A_1+\frac{A_1}{B_1}(1-B_1)+\frac{A_1}{C_1}(1-C_1)}[/math]

Hmmm... For the case of only 3 contenders, I get

[math]A_2 = \left(B \cdot \frac{A}{A+C} + C \cdot \frac{A}{A+B} \right)[/math]

Pyro’s formula is the same as the one used by my program.

I began trying to show algebraically whether A23’s and Pyro’s formulas were the same, then thought again, and simply tried them with the sample probabilities A1=.5, B1=.25, C1=.25, and found they gave different values for A2: Pyro’s and my program give .33..., A23’s gives .142857142857...
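
The comparison is easy to reproduce. The sketch below evaluates both formulas exactly as they are written in this thread, using the same sample probabilities; the function names `a2_pyrotex` and `a2_a23` are just labels for this example.

```python
from fractions import Fraction

def a2_pyrotex(a, b, c):
    """Pyrotex's formula: A finishes second behind B, or behind C."""
    return b * a / (a + c) + c * a / (a + b)

def a2_a23(a, b, c):
    """A23's formula, transcribed term by term from the post."""
    return (1 - a) / (1 - a + a / b * (1 - b) + a / c * (1 - c))

# Sample probabilities from the post: A1=.5, B1=.25, C1=.25
a, b, c = Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)

print("Pyrotex:", a2_pyrotex(a, b, c), float(a2_pyrotex(a, b, c)))
print("A23:    ", a2_a23(a, b, c), float(a2_a23(a, b, c)))
```

With exact fractions the two results come out as 1/3 and 1/7, so the formulas cannot be algebraically equivalent.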

A23, I think you made a mistake in finding your formula. :(

A23 · January 22, 2010

I gave a case above, but your post came in between.

If BAL was never first..it cannot be 2nd (with your formula, since p(2nd)=p(1st)*something) :

A2=A1*(B/(1-B)+C/(1-C))

so if A1=0, A2=0, and hence A cannot be 2nd.

In some NFL stats however (see above), there are such cases, where a team was never 1st, but still could achieve to be 2nd sometimes.

CraigD · January 22, 2010

If BAL was never first..it cannot be 2nd (with your formula, since p(2nd)=p(1st)*something)

I think I understand what you’re saying, A23.

This thread’s question assumes that the probability of each team winning first place in the tournament is greater than zero. If this isn’t the case, our equations won’t work, because at least one of the (1-A- …) terms would be zero, resulting in a division by zero and an indeterminate value.

As you conclude, a team with prob 0 of winning 1st has a 0 prob of winning 2nd, or any other place above last. You can work around this if you have 1 team with a 0 prob by finding the probabilities for the other teams, then assigning a 100% probability of that team finishing last. If you have two or more 0 probs, however, the problem can’t be solved, as you have no way of determining the probability of those teams finishing last, next to last, etc. Mathematically, these probabilities are given by expressions like [imath]\frac{0}{0+0}[/imath], which are indeterminate.
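
The workaround for a single zero-probability team can be sketched in Python as follows. This is one possible reading of the paragraph above, not code from the thread: `place_table` is a brute-force helper for the all-positive case, `place_table_with_one_zero` is a hypothetical wrapper name, and the sample probabilities are invented for illustration.

```python
from fractions import Fraction
from itertools import permutations

def place_table(p):
    """table[t][m] = probability that team t finishes in place m+1.
    Requires every p[t] > 0 and sum(p) == 1."""
    n = len(p)
    table = [[Fraction(0)] * n for _ in range(n)]
    for order in permutations(range(n)):
        remaining, prob = Fraction(1), Fraction(1)
        for t in order:
            prob *= p[t] / remaining   # chance t takes the next open place
            remaining -= p[t]
        for place, t in enumerate(order):
            table[t][place] += prob
    return table

def place_table_with_one_zero(p):
    """Solve for the positive-probability teams, then put the single
    zero-probability team (if any) in last place with probability 1."""
    zeros = [t for t, x in enumerate(p) if x == 0]
    if len(zeros) > 1:
        raise ValueError("order among zero-probability teams is indeterminate")
    positive = [t for t, x in enumerate(p) if x > 0]
    sub = place_table([p[t] for t in positive])
    n = len(p)
    table = [[Fraction(0)] * n for _ in range(n)]
    for i, t in enumerate(positive):
        for m, value in enumerate(sub[i]):
            table[t][m] = value
    for t in zeros:
        table[t][n - 1] = Fraction(1)
    return table

# Hypothetical example: three live teams and one team that can never win.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6), Fraction(0)]
table = place_table_with_one_zero(p)
for t, row in enumerate(table):
    print(t + 1, [str(x) for x in row])
```

Raising an error for two or more zero-probability teams mirrors the point that their relative order cannot be determined from the given numbers.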

In some NFL stats, there are such cases, where a team was never 1st, but still could achieve to be 2nd sometimes.

You’ve got to be careful with this sort of reasoning, as it confuses observed statistics with probability. Although observed statistics are a valid, obvious, and common way to estimate probabilities, they’re not the same as probabilities.

Consider the thread’s original example. Perhaps, over the many tournaments that have been played with these 4 teams, Ghana has never finished first. Its statistic for incidence of first place finishes, therefore, is 0. Its probability of finishing first, however, remains the given .1.

The distinction between observed statistics and probabilities is a subtle but important one.

Many philosophers and mathematicians of my acquaintance consider this distinction, and the more general question “what is a probability?” to be philosophically deep. :daydreaming: :scratchchin:

A23 · January 22, 2010

The distinction between observed statistics and probabilities is a subtle but important one.

Many philosophers and mathematicians of my acquaintance consider this distinction, and the more general question “what is a probability?” to be philosophically deep.

Sometimes I read that [math] p(a)=\lim_{N \to \infty}\frac{N(a)}{N}[/math].

N(a) being number of times the result was a, N the total number of trials. The probability would be defined as a statistic concerning an infinite number of trials, if this limit converges.

But most of the time the probabilities are axiomatic, and look more like "possibilities"..like for a die p(n)=1/6..for n=1,..6...without making any throw sample. (assumption of a "fair" die).

Pyrotex · January 22, 2010

Sometimes I read that [math] p(a)=\lim_{N \to \infty}\frac{N(a)}{N}[/math]. ...

That is a good example of an empirical probability, also called a "point estimate". It's calculated from actual empirical data.

In my job, we don't have enough empirical data to calculate a point estimate for probabilities. What we CAN do is determine the theoretical "distribution" for probabilities. You did this when you talked about a "fair die" -- you've never rolled it, but by assuming it's "fair" you can assume that the "distribution" of probabilities for 1,2,3,4,5,6 will be uniform -- (any outcome is equally likely).

So, we often determine via theory what the "distribution" should be for a particular event, say the failure of a motorized ball valve under typical loads. That "distribution" will then yield a "mean value" for the probability of failure -- it is NOT based on actual experience, but on the shape of the "distribution" itself. This is what we call a probability.

If we rolled that die 60 times and got:

1 -- 10 rolls

2 -- 15 rolls

3 -- 8 rolls

4 -- 11 rolls

5 -- 7 rolls

6 -- 9 rolls

Then we would say: "The point estimate expectation for rolling a 2 is 25%, given our experience; but the probability of rolling a 2 is 1 out of 6, or 16.67%, given the expected distribution."
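
The arithmetic behind that statement can be laid out in a few lines of Python. The counts are the ones listed above; the variable names and the side-by-side printout are only an illustration of the point estimate versus the fair-die value.

```python
from fractions import Fraction

# Observed counts from the 60 rolls listed above: face -> number of rolls
counts = {1: 10, 2: 15, 3: 8, 4: 11, 5: 7, 6: 9}
total = sum(counts.values())

# Point estimate: observed frequency of each face
point_estimates = {face: Fraction(n, total) for face, n in counts.items()}

# Probability under the assumed uniform ("fair die") distribution
fair = Fraction(1, 6)

for face, estimate in point_estimates.items():
    print(face, f"estimate {float(estimate):.4f}", f"fair {float(fair):.4f}")
```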

A23 · January 23, 2010

Well.

Now let's suppose we throw a die 10 times, and the only thing we know is how many times 1 came up, let's say 0.

Is this sufficient to conclude that 2 never came up either ?

(maybe some jokes are allowed? In fact it is not a joke, in life it is like that : if somebody is not 1st, then this person is nothing).

A23 · January 23, 2010

The formula you gave is a 'tour de force' of reasoning.
