Science Forums


Posted

Hi, first post so please be gentle!

 

I have been running a probability problem over in my head and can't get it to work as I would wish.

 

Think of a sports competition like the FIFA world cup; concentrating on the group stages where each group has 4 teams. I am attempting to derive probabilities on each team's finishing position in the group.

 

Let's say the probabilities to finish at the top of an imaginary group consisting of 4 teams are as follows:

 

Spain 40%

England 25%

Holland 25%

Ghana 10%

 

It's a given that the probabilities for each team to finish in the first 4 are

Spain 100%

England 100%

Holland 100%

Ghana 100%

 

But what of the other possible outcomes? How do I calculate the probability for each team to finish in the top 2, or the top 3 positions?

 

Spain to finish 1st (40%)

Spain to finish 1st or 2nd (40% + x)

Spain to finish 1st or 2nd or 3rd (40% + x + y)

Spain to finish in top 4 (100% = z)

 

z = 100%

x + y = z - 40%

x + y = 60%

 

but how to calculate the individual values of x & y ?

 

Any help would be appreciated and educational!

Posted

You can calculate the probability of a team finishing first or second pretty easily, using conditional probability, as follows:

 

Let’s abbreviate the various probabilities of each team finishing first by the first letter of each team’s name: S, E, H, and G.

 

Then the probability, for example, of Spain finishing first or second is

[math]S_{1 or 2} = S +(1-S) \left( \frac{\frac{ES}{S+H+G} +\frac{HS}{S+E+G} +\frac{GS}{S+E+H}}{E+H+G} \right)[/math]

Using the given values S=.4, E=.25, H=.25, G=.1, this gives

[math]S_{1 or 2} = \frac{32}{45} \dot= .71111111[/math]

 

In words, the above reads something like “the probability of Spain finishing first or second is equal to the probability of it finishing first, plus the probability of it finishing not first times the sum of the probability of each of the other teams finishing first times the probability of Spain finishing first among the remaining 3 teams.”
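
As a rough check of the arithmetic, the formula above can be evaluated with exact fractions. This is only a sketch under the same assumption the formula makes (once a team takes first place, the others keep their relative first-place chances); the function name `first_or_second` and the dictionary layout are illustrative choices, not something from the original post.

```python
from fractions import Fraction

def first_or_second(team, first_place_probs):
    """Probability that `team` finishes first or second.

    Assumes the thread's model: once a team takes first place, the
    remaining teams keep their relative first-place chances.
    """
    s = first_place_probs[team]
    others = [p for name, p in first_place_probs.items() if name != team]
    # For each other team X finishing first (probability X), `team` is
    # first among the rest with probability S / (1 - X).
    inner = sum(x * s / (1 - x) for x in others)
    # The (1 - S) factor and the division by (E + H + G) = (1 - S) cancel,
    # but both are kept so the code mirrors the formula as written.
    return s + (1 - s) * (inner / sum(others))

group = {
    "Spain": Fraction(2, 5),
    "England": Fraction(1, 4),
    "Holland": Fraction(1, 4),
    "Ghana": Fraction(1, 10),
}
print(first_or_second("Spain", group))  # 32/45
```

Using `Fraction` rather than floats keeps the result exact, so it can be compared directly with 32/45.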

 

It’s much harder to use this approach to calculate [imath]S_{1 or 2 or 3}[/imath] – so is left, of course, as an exercise for the reader. ;) (unless I figure out how to calculate it first :))

 

Calculating the [imath]S_{1 or 2 or 3 or 4}[/imath] is even harder, though we know that our result must equal 1.

Posted

Here are the conditional probabilities for each of the teams finishing first, second, third, or last in the series, and their accumulations (place, probability, running total):

Spain (.4):
1: .4                         .4
2: .311111111111111111        .711111111111111111
3: .2085470085470085469       .9196581196581196579
4: .08034188034188034188      1

England or Holland (.25 each):
1: .25                        .25
2: .2777777777777777778       .5277777777777777778
3: .2933455433455433455       .8211233211233211233
4: .1788766788766788766       1

Ghana (.1):
1: .1                         .1
2: .1333333333333333333       .2333333333333333333
3: .2047619047619047619       .4380952380952380952
4: .5619047619047619048       1

Writing this in traditional mathematical expressions is arduous. Here’s the MUMPS code that generated it:

s A=.4,B=.25,C=.25,D=.1
s (J,K)=A w "1: ",J,?30,K,! s J=(B*(A/(1-B)))+(C*(A/(1-C)))+(D*(A/(1-D))),K=J+K w "2: ",J,?30,K,! s J=(B*(C/(1-B))*(A/(1-B-C)))+(B*(D/(1-B))*(A/(1-B-D)))+(C*(B/(1-C))*(A/(1-C-B)))+(C*(D/(1-C))*(A/(1-C-D)))+(D*(B/(1-D))*(A/(1-D-B)))+(D*(C/(1-D))*(A/(1-D-C))),K=J+K w "3: ",J,?30,K,! s J=(B*(C/(1-B))*(D/(1-B-C)))+(B*(D/(1-B))*(C/(1-B-D)))+(C*(B/(1-C))*(D/(1-C-B)))+(C*(D/(1-C))*(B/(1-C-D)))+(D*(B/(1-D))*(C/(1-D-B)))+(D*(C/(1-D))*(B/(1-D-C))) ,K=J+K w "4: ",J,?30,K,!

This code always generates the probabilities for the team whose probability of finishing first is A, so to generate all 3 cases, you must re-execute the second line after

s A=.25,B=.4,C=.25,D=.1

s A=.1,B=.4,C=.25,D=.25

or a similar manipulation.

 

If you’re able to read this code or follow my previous post’s reasoning, and are observant enough, you’ll note that this approach makes a key assumption. It assumes that if the probability of teams A, B, C, or D finishing first are A, B, C, and D, the probability of a team (say B) finishing second if another (say A) finished first is [imath]\frac{B}{B+C+D}[/imath], or alternately, [imath]\frac{B}{1-A}[/imath]. This would be true if the game in question were a simple one, such as drawing a card at random from a deck where each team has a number of cards in the deck corresponding to its given probability of winning. In an actual soccer tournament, I doubt it would be, as psychology, injury, and complicated factors involving the relative strengths and weaknesses of the teams are involved.

Posted
...Then the probability, for example, of Spain finishing first or second is

[math]S_{1 or 2} = S +(1-S) \left( \frac{\frac{ES}{S+H+G} +\frac{HS}{S+E+G} +\frac{GS}{S+E+H}}{E+H+G} \right)[/math]

...“the probability of Spain finishing first or second is equal to

the probability of it finishing first,

plus the probability of it finishing not first

times the sum of

the probability of each of the other teams finishing first

times the probability of Spain finishing first among the remaining 3 teams.”

...

So, from this, I deduce that the probability of Spain finishing first among the 3 remaining teams, given that England was first overall, is:

 

[math]

(E)S_1^3 = \frac{S}{(S+H+G) (E+H+G)} [/math]

 

Now, (S+H+G) = (1-E) = probability that England is not first.

And, (E+H+G) = (1-S) = probability that Spain is not first.

 

[math] (E)S_1^3 = \frac{S}{(1-E)(1-S)} [/math]

Likewise, we can then say in similar fashion:

[math](G)S_1^3 = \frac{S}{(1-G)(1-S)} [/math] and,

[math] (H)S_1^3 = \frac{S}{(1-H)(1-S)} [/math]

 

Is this correct? If so, how did you derive these? :shrug:

 

I understand that the probability of either Spain or England coming in first place overall is:

[math] S or E = 1 - (1-S)(1-E) [/math]

 

So, it looks like if we negate both sides, we have the probability that NEITHER Spain NOR England come in first place overall is:

 

[math] 1 - (S or E) = (1-S)(1-E) [/math], or,

[math] (Snot)(Enot) = (1-S)(1-E) [/math]

Therefore,

[math] (E)S_1^3 = \frac{S}{(Snot)(Enot)} [/math]

 

becomes equivalent to: given that England came in first overall,

the probability that Spain comes in first among the three remaining teams is:

the probability that Spain comes in first overall,

divided by the probability that neither Spain nor England come in first overall.

 

It's these divisions by a probability that confuse me.

And I cannot see how you derived this bit of logic.

Can you help me understand this?

 

We then have:

[math] (E)S_1^3 = \frac {S}{(Snot)(Enot)} = \frac {.4}{(.6)(.75)} = 8/9[/math]

[math] (H)S_1^3 = \frac {S}{(Snot)(Hnot)} = \frac {.4}{(.6)(.75)} = 8/9[/math]

[math] (G)S_1^3 = \frac {S}{(Snot)(Gnot)} = \frac {.4}{(.6)(.9)} = 20/27[/math]

 

which seems reasonable / plausible.

 

In fact, if I am correct in my retro-derivations from your formula,

then the three equations above could easily be generalized to any number n+1 of teams.

 

[math] (X)S_1^n = \frac {S}{(Snot)(Xnot)} [/math]
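
On the division by a probability: one way to read it, under the card-drawing assumption CraigD states earlier in the thread, is as a renormalisation. If England is known to be first, only Spain, Holland and Ghana remain, and their first-place chances are rescaled so that they sum to 1 again. In CraigD's formula the outer factor (1-S) and the inner denominator (E+H+G) cancel, so on that reading only the (1-E) survives in the conditional probability. A worked version, offered as a sketch rather than a definitive derivation:

[math]P(\text{Spain 2nd} \mid \text{England 1st}) = \frac{S}{S+H+G} = \frac{S}{1-E} = \frac{.4}{.75} = \frac{8}{15}[/math]

The same reading gives [imath]\frac{S}{1-H} = \frac{8}{15}[/imath] for Holland first and [imath]\frac{S}{1-G} = \frac{4}{9}[/imath] for Ghana first.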

Posted

I don't understand the formula. Why can't the table be filled in without calculation, for example? Each column and each row should sum to 1; beyond that, isn't it a free choice?

 

I add 2 rows, each summing to 1.

Then I complete each column so that the last row brings its sum to 1?

 

.4 .25 .25 .1

.2 .3 .15 .35

.1 .2 .5 .2

.3 .25 .1 .35

 

?

Posted

Maybe, to choose the numbers in a row, start with the column having the biggest sum above it... otherwise it gets stuck.

 

Similar to magic squares.

 

S H E 1-S-H-E

a1 a2 a3 1-(a1+a2+a3)

b1 b2 b3 1-(b1+b2+b3)

1-S-a1-b1 1-H-a2-b2 1-E-a3-b3 X

 

with X=1-(1-S-a1-b1)-(1-H-a2-b2)-(1-E-a3-b3)

 

and domains : a1<0.6 (a2<.75 & a2<1-a1), (a3<0.75 & a3<1-a1-a2), etc...

 

another example :

 

.4 .25 .25 .1

.1 .6 .2 .1

.3 .1 .4 .2

.2 .05 .15 .6

 

 

I don't understand how you can fix the unknown, because I don't understand your formula.
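
The two example tables can be checked mechanically. The sketch below only tests the constraint discussed here (every row and every column sums to 1); the helper name `is_doubly_stochastic` and the tolerance are illustrative assumptions.

```python
def is_doubly_stochastic(table, tol=1e-9):
    """True if every row and every column of `table` sums to 1."""
    rows_ok = all(abs(sum(row) - 1) < tol for row in table)
    cols_ok = all(abs(sum(col) - 1) < tol for col in zip(*table))
    return rows_ok and cols_ok

# Rows are finishing places 1..4, columns are the four teams.
example_1 = [
    [0.40, 0.25, 0.25, 0.10],
    [0.20, 0.30, 0.15, 0.35],
    [0.10, 0.20, 0.50, 0.20],
    [0.30, 0.25, 0.10, 0.35],
]
example_2 = [
    [0.40, 0.25, 0.25, 0.10],
    [0.10, 0.60, 0.20, 0.10],
    [0.30, 0.10, 0.40, 0.20],
    [0.20, 0.05, 0.15, 0.60],
]
print(is_doubly_stochastic(example_1), is_doubly_stochastic(example_2))
```

Both tables pass, which illustrates the point of the question: the sum constraints alone leave many tables possible, so any single answer has to come from an extra assumption about how the teams behave once first place is decided.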

Posted

Hello, A23!

I think we may have to wait for CraigD to answer our questions. He seems to be the expert in this.

I should know something about this, because I just changed careers over to probabilistic risk analysis (PRA), and we have to solve problems similar to this 4-team problem.

I worked backwards from Craig's formula (which I also do not understand).

 

However, my formula (the probability that Spain comes in first among remaining 3) should be similar to yours (the probability that Spain comes in second overall).

Posted

Okay, I just can't leave this alone. There's ego at stake here. :)

 

Obviously, the prob that Spain will come in either 1st or 2nd is

 

[math]

S_{1,2} = S_1 + S_2

[/math]

 

The hard part is figuring the prob that Spain will come in second. It boils down to one of three options: Spain does NOT come in first AND either England, Holland or Ghana DO come in first. So we have:

 

[math]

S_2 = (1-S) \left( E (1-H) (1-G) + H(1-G)(1-E) + G(1-E)(1-H) \right)

[/math]

 

This is far easier to understand than Craig's formula.

 

S_2 turns out to be 23.625%, and therefore,

 

S[1 or 2] is 63.625%, which is quite a bit lower than Craig's answer.
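
One quick sanity check for any candidate second-place formula: exactly one team finishes second, so the second-place probabilities of the four teams should add up to 1. The sketch below applies the formula from this post to every team in turn. Extending it to the other teams by symmetry is an assumption, and the function name `second_place_candidate` is illustrative.

```python
from math import prod

def second_place_candidate(team, p):
    """Candidate formula from this post, applied to any team by symmetry."""
    others = [name for name in p if name != team]
    total = 0.0
    for leader in others:
        rest = [name for name in others if name != leader]
        # `leader` comes first, and each remaining team is "not first"
        total += p[leader] * prod(1 - p[name] for name in rest)
    return (1 - p[team]) * total

p = {"Spain": 0.4, "England": 0.25, "Holland": 0.25, "Ghana": 0.1}
values = {team: second_place_candidate(team, p) for team in p}
print(values["Spain"])       # about 0.23625
print(sum(values.values()))  # would be 1 for a consistent set of probabilities
```

With the thread's numbers the total comes out above 1, which hints that the "not first" factors are being treated as independent when they are not.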

Posted
Hello, A23!

I think we may have to wait for CraigD to answer our questions. He seems to be the expert in this.

I should know something about this, because I just changed careers over to probabilistic risk analysis (PRA), and we have to solve problems similar to this 4-team problem.

I worked backwards from Craig's formula (which I also do not understand).

Here’s an explanation of the long line of MUMPS code in post #5,

s A=.4,B=.25,C=.25,D=.1
s (J,K)=A w "1: ",J,?30,K,! s J=(B*(A/(1-B)))+(C*(A/(1-C)))+(D*(A/(1-D))),K=J+K w "2: ",J,?30,K,! s J=(B*(C/(1-B))*(A/(1-B-C)))+(B*(D/(1-B))*(A/(1-B-D)))+(C*(B/(1-C))*(A/(1-C-B)))+(C*(D/(1-C))*(A/(1-C-D)))+(D*(B/(1-D))*(A/(1-D-B)))+(D*(C/(1-D))*(A/(1-D-C))),K=J+K w "3: ",J,?30,K,! s J=(B*(C/(1-B))*(D/(1-B-C)))+(B*(D/(1-B))*(C/(1-B-D)))+(C*(B/(1-C))*(D/(1-C-B)))+(C*(D/(1-C))*(B/(1-C-D)))+(D*(B/(1-D))*(C/(1-D-B)))+(D*(C/(1-D))*(B/(1-D-C))) ,K=J+K w "4: ",J,?30,K,!

 

Though it makes the expressions a little longer, I think it’s clearer if we use addition rather than subtraction in our denominators, and do away with the “1”s, rather than how I wrote it in the code.

 

The probability of team A winning 1st place is given.

 

The prob of A winning 2nd place is the sum of the prob of every way this can happen – that is, the sum of the prob of each of the other teams winning first place times the prob of A winning 1st place among the remaining 3 teams, including itself, that didn’t win 1st place.

 

The prob of A winning 3rd place is the sum of the product of the probs of every possible way 2 other teams can win 1st and 2nd place, times the prob of A winning 1st among the remaining 2 teams.

 

And so on.

 

So the probability of A 1st is:

[imath]\frac{A}{A+B+C+D}[/imath].

 

Note that [imath]A+B+C+D = 1[/imath], so is written only for clarity.

 

2nd:

[imath]\frac{B}{A+B+C+D} \cdot \frac{A}{A+C+D}[/imath]

[imath] + \frac{C}{A+B+C+D} \cdot \frac{A}{A+B+D}[/imath]

[imath] + \frac{D}{A+B+C+D} \cdot \frac{A}{A+B+C} [/imath]

 

3rd:

[imath]\frac{B}{A+B+C+D} \cdot \frac{C}{A+C+D} \cdot \frac{A}{A+D} [/imath]

[imath] + \frac{B}{A+B+C+D} \cdot \frac{D}{A+C+D} \cdot \frac{A}{A+C} [/imath]

[imath] + \frac{C}{A+B+C+D} \cdot \frac{B}{A+B+D} \cdot \frac{A}{A+D} [/imath]

[imath] + \frac{C}{A+B+C+D} \cdot \frac{D}{A+B+D} \cdot \frac{A}{A+B} [/imath]

[imath] + \frac{D}{A+B+C+D} \cdot \frac{B}{A+B+C} \cdot \frac{A}{A+C} [/imath]

[imath] + \frac{D}{A+B+C+D} \cdot \frac{C}{A+B+C} \cdot \frac{A}{A+B} [/imath]

 

4th:

[imath]\frac{B}{A+B+C+D} \cdot \frac{C}{A+C+D} \cdot \frac{D}{A+D} \cdot \frac{A}{A}[/imath]

[imath] + \frac{B}{A+B+C+D} \cdot \frac{D}{A+C+D} \cdot \frac{C}{A+C} \cdot \frac{A}{A} [/imath]

[imath] + \frac{C}{A+B+C+D} \cdot \frac{B}{A+B+D} \cdot \frac{D}{A+D} \cdot \frac{A}{A} [/imath]

[imath] + \frac{C}{A+B+C+D} \cdot \frac{D}{A+B+D} \cdot \frac{B}{A+B} \cdot \frac{A}{A} [/imath]

[imath] + \frac{D}{A+B+C+D} \cdot \frac{B}{A+B+C} \cdot \frac{C}{A+C} \cdot \frac{A}{A} [/imath]

[imath] + \frac{D}{A+B+C+D} \cdot \frac{C}{A+B+C} \cdot \frac{B}{A+B} \cdot \frac{A}{A} [/imath]

 

Note that [imath] \frac{A}{A} = 1[/imath], so it is written only for clarity. Also, since we know the sum of the probabilities of A finishing 1st, 2nd, 3rd, or 4th is 1, we could calculate the probability of finishing 4th by subtracting the previous 3 probabilities from 1, which would be easier than the above.

 

Writing a mathematical expression for this in general, for any number of player/teams (with summations and products), would be a bit trickier, though not too hard.
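
One possible way to write the general case, as a sketch: the symbols [imath]p_i[/imath] (the given first-place probability of team i), [imath]a[/imath] (the team of interest) and [imath]k[/imath] (the finishing place) are notation introduced here, not taken from the posts above. The sum runs over every ordered choice [imath](i_1, \dots, i_{k-1})[/imath] of distinct teams other than [imath]a[/imath] that fill places 1 to k-1:

[math]P(a \text{ in place } k) = \sum_{(i_1,\dots,i_{k-1})} \left( \prod_{j=1}^{k-1} \frac{p_{i_j}}{1 - \sum_{m<j} p_{i_m}} \right) \cdot \frac{p_a}{1 - \sum_{m<k} p_{i_m}}[/math]

For k = 1 the tuple is empty and this reduces to [imath]p_a[/imath]; for k = 2 it reduces to the three-term sum given above for 2nd place, with the denominators written in the "1 minus" form used in the code.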

 

Again, note that this approach makes some likely unrealistic assumptions about real-world sports outcomes. For example, [imath]\frac{A}{A+D} = \frac{.4}{.4+.1} = .8[/imath], even though Ghana might have a special advantage vs. Spain that makes this probability lower, or a special weakness that makes it higher.

 

S_2 turns out to be 23.625%, and therefore,

 

S[1 or 2] is 63.625%, which is quite a bit lower than Craig's answer.

You think I actually trusted my own answer without having a little program randomly play the tournament thousands of times, and seeing that the observed outcomes approached the calculated probabilities?! :eek:

 

I did, and they matched the result I got. :)
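
The simulation program itself is not shown in the thread. The following is a hypothetical reconstruction of what such a check could look like, assuming the same card-drawing model: places are filled one at a time, each time picking among the teams still left with probability proportional to their first-place chances. The names `simulate_order` and `estimate_top_two`, the trial count and the seed are all illustrative.

```python
import random

def simulate_order(p, rng):
    """Draw one finishing order: repeatedly pick the next-best team with
    probability proportional to its first-place probability."""
    remaining = dict(p)
    order = []
    while remaining:
        names = list(remaining)
        weights = [remaining[n] for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order

def estimate_top_two(team, p, trials=20000, seed=0):
    """Fraction of simulated tournaments in which `team` is 1st or 2nd."""
    rng = random.Random(seed)
    hits = sum(team in simulate_order(p, rng)[:2] for _ in range(trials))
    return hits / trials

p = {"Spain": 0.4, "England": 0.25, "Holland": 0.25, "Ghana": 0.1}
print(estimate_top_two("Spain", p))  # should land near 32/45, i.e. about 0.711
```

A simulation like this only confirms that the formulas follow from the model; it says nothing about whether the model fits real football.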

Posted

Yes, it is complicated.

 

How can it be that we can deduce whether Ghana has more chance of being 3rd than 4th, for example? (In other words, why are your formulas different from Pyrotex's?)

 

If we could know this, betting on the other ranks could be a means to earn money on average, against a uniform assumption. $$$

Posted
...(in other words why are your formulas different from Pyrotex ?)...
Well, I would say because his formulas are correct and mine are wrong. :hihi:

thanks, Craig! :)

 

Having said that, his original formula can be simplified tremendously. Rather than that monstrosity he presented on the first page, it can be boiled down to:

 

[imath]S_2 = E \cdot \frac{S}{S+H+G} + H \cdot \frac{S}{S+E+G} + G \cdot \frac{S}{S+E+H}[/imath]

 

In words, the prob of Spain coming in exactly 2nd place is the sum of:

- - prob of England coming in 1st times the prob that Spain is 1st among the remaining three;

- - prob of Holland coming in 1st times the prob that Spain is 1st among the remaining three;

- - prob of Ghana coming in 1st times the prob that Spain is 1st among the remaining three.

 

This is equivalent to:

 

[imath]S_2 = S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]

 

And therefore:

 

[imath]S_{1 or 2} = S + S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]

 

which comes out to 71.11% :)
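
Spelled out with the thread's values S=.4, E=.25, H=.25, G=.1, the arithmetic runs as follows:

[math]S_2 = .4 \cdot \left( \frac{.25}{.75} + \frac{.25}{.75} + \frac{.1}{.9} \right) = .4 \cdot \frac{7}{9} = \frac{14}{45} \approx .3111[/math]

[math]S_{1 or 2} = \frac{18}{45} + \frac{14}{45} = \frac{32}{45} \approx .7111[/math]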

Posted
I found that probability Spain being 2nd were : [math]S_2=S(1-S+3EGH-2(GE+GH+EH))=S(\frac{E}{1-E}+\frac{G}{1-G}+\frac{H}{1-H})[/math] by simplifying, is this true ?
Correct!!!!!!!!!! :hihi: :lol: :hihi:
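
A quick numeric check of the quoted identity, using the thread's values, is sketched below. The variable names `polynomial` and `ratio` are illustrative labels for the two sides.

```python
S, E, H, G = 0.4, 0.25, 0.25, 0.1

# Polynomial form from the quoted question
polynomial = S * (1 - S + 3 * E * G * H - 2 * (G * E + G * H + E * H))
# Ratio form from the simplified second-place formula
ratio = S * (E / (1 - E) + G / (1 - G) + H / (1 - H))

print(polynomial)  # about 0.1575
print(ratio)       # about 0.3111
```

For these inputs the two sides come out different, which suggests the polynomial form is not a simplification of the ratio form as written. The bracket in the polynomial form equals E(1-H)(1-G) + H(1-G)(1-E) + G(1-E)(1-H), the sum used in the earlier 23.625% attempt.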
Posted
Having said that, his original formula can be simplified tremendously. Rather than that monstrosity he presented on the first page, it can be boiled down to:

...

[imath]S_2 = S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]

My first equation for [imath]S_2[/imath] was pretty monstrous. :embarassed:

 

Even simplified, [imath]S_3[/imath] (which I’ll write [imath]A_3[/imath], as I find A, B, C, and D easier to keep track of than S, E, H, and G) gets a bit monstrous, too:

 

[math]A_3 = A \cdot \left( \frac{B}{1-B} \cdot \left( \frac{C}{1-B-C} +\frac{D}{1-B-D}\right) +\frac{C}{1-C} \cdot \left( \frac{B}{1-C-B} +\frac{D}{1-C-D}\right) +\frac{D}{1-D} \cdot \left( \frac{B}{1-D-B} +\frac{C}{1-D-C}\right) \right)[/math]
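
To see what the nested expression gives for Spain, it can be transcribed term by term. This is a sketch with exact fractions; the function name `third_place` is an illustrative choice.

```python
from fractions import Fraction

def third_place(a, b, c, d):
    """Evaluate the nested A_3 expression for first-place probabilities
    a (the team of interest) and b, c, d (the other three)."""
    return a * (
        b / (1 - b) * (c / (1 - b - c) + d / (1 - b - d))
        + c / (1 - c) * (b / (1 - c - b) + d / (1 - c - d))
        + d / (1 - d) * (b / (1 - d - b) + c / (1 - d - c))
    )

spain = third_place(Fraction(2, 5), Fraction(1, 4), Fraction(1, 4), Fraction(1, 10))
print(spain, float(spain))  # 122/585, about 0.2085
```

The decimal value agrees with the third-place entry for Spain in the table posted earlier in the thread.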

 

I'm not sure I can write a general expression for the probability of a particular team out of an arbitrary number of teams finishing in an arbitrary place using ordinary math notation. A procedural program in some programming language shouldn't be too hard, but not too easy, either. One for the road (that is, for the bus) I guess :)
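
One possible shape for such a program is a short recursion over "who takes the next place". This is a sketch under the thread's assumption that the remaining teams keep their relative first-place chances after each place is decided. The name `place_distribution` is hypothetical, and the recursion walks through every ordering, so it is only practical for small groups.

```python
from fractions import Fraction

def place_distribution(team, p):
    """Return [P(1st), P(2nd), ...] for `team`.

    `p` maps team names to first-place probabilities summing to 1.
    Assumes the thread's model: after each place is decided, the remaining
    teams keep their relative first-place chances.
    """
    result = [Fraction(0)] * len(p)

    def recurse(remaining, place, weight):
        # `weight` is the probability of the places decided so far
        total = sum(p[n] for n in remaining)
        for name in remaining:
            prob = weight * p[name] / total
            if name == team:
                result[place] += prob
            else:
                recurse([n for n in remaining if n != name], place + 1, prob)

    recurse(list(p), 0, Fraction(1))
    return result

group = {
    "Spain": Fraction(2, 5),
    "England": Fraction(1, 4),
    "Holland": Fraction(1, 4),
    "Ghana": Fraction(1, 10),
}
print(place_distribution("Spain", group))
```

Calling it once per team fills in the whole table, and nothing in the function depends on there being exactly four teams.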
