bandontherun Posted January 13, 2010

Hi, first post so please be gentle! I have been running a probability problem over in my head and can't get it to work as I would wish.

Think of a sports competition like the FIFA World Cup, concentrating on the group stage, where each group has 4 teams. I am attempting to derive probabilities for each team's finishing position in the group. Let's say the probabilities of finishing at the top of an imaginary group of 4 teams are as follows:

Spain 40%
England 25%
Holland 25%
Ghana 10%

It's a given that the probabilities for each team to finish in the top 4 are:

Spain 100%
England 100%
Holland 100%
Ghana 100%

But what of the other possible outcomes? How do I calculate the probability for each team to finish in the top 2, or the top 3 positions?

Spain to finish 1st (40%)
Spain to finish 1st or 2nd (40% + x)
Spain to finish 1st or 2nd or 3rd (40% + x + y)
Spain to finish in top 4 (100% = z)

z = 100%
x + y = z - 40%
x + y = 60%

But how do I calculate the individual values of x and y? Any help would be appreciated and educational!
A23 Posted January 14, 2010

Isn't it rather: p(Spain not 1st) = p(Spain 2nd, 3rd or 4th) = 1 - p(Spain 1st) = 60%? But I don't know how to find x and y.
Turtle Posted January 14, 2010

"Hi, first post so please be gentle! ... z = 100%, x + y = z - 40%, x + y = 60%, but how to calculate the individual values of x & y? Any help would be appreciated and educational!"

Two gentle words, I think: linear programming. :shrug: Or so it appears to me. :naughty: :D

Linear programming - Wikipedia, the free encyclopedia
CraigD Posted January 16, 2010

You can calculate the probability of a team finishing first or second pretty easily, using conditional probability, as follows:

Let's abbreviate the probability of each team finishing first by the first letter of the team's name: S, E, H, and G. Then the probability, for example, of Spain finishing first or second is

[math]S_{1 or 2} = S +(1-S) \left( \frac{\frac{ES}{S+H+G} +\frac{HS}{S+E+G} +\frac{GS}{S+E+H}}{E+H+G} \right)[/math]

Using the given values S=.4, E=.25, H=.25, G=.1, this gives

[math]S_{1 or 2} = \frac{32}{45} \dot= .71111111[/math]

In words, the above reads something like "the probability of Spain finishing first or second is equal to the probability of it finishing first, plus the probability of it not finishing first times the conditional probability that it then finishes second: the sum, over each of the other teams, of the probability of that team finishing first times the probability of Spain finishing first among the remaining 3 teams, divided by the probability that one of the other teams finishes first."

It's much harder to use this approach to calculate [imath]S_{1 or 2 or 3}[/imath], so it is left, of course, as an exercise for the reader. ;) (unless I figure out how to calculate it first :)) Calculating [imath]S_{1 or 2 or 3 or 4}[/imath] is even harder, though we know that our result must equal 1.
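As a quick check of the 32/45 figure, here is a minimal Python sketch using exact fractions. It is not from the thread; the variable names `s_second` and `s_first_or_second` are illustrative, and it assumes, as the post does, that the chance of being first among the remaining teams is proportional to the first-place probabilities.

```python
from fractions import Fraction

# First-place probabilities from the original post, as exact fractions.
S, E, H, G = Fraction(2, 5), Fraction(1, 4), Fraction(1, 4), Fraction(1, 10)

# Spain finishes second when another team wins (probability E, H or G) and
# Spain is then first among the remaining three teams.
s_second = E * S / (S + H + G) + H * S / (S + E + G) + G * S / (S + E + H)

# The form used in the post: (1 - S) times the conditional chance of second
# place given "not first". The (1 - S) and (E + H + G) factors cancel.
s_first_or_second = S + (1 - S) * (s_second / (E + H + G))

print(s_first_or_second)         # 32/45
print(float(s_first_or_second))  # roughly 0.7111
```

Because E + H + G equals 1 - S, the outer factor and the inner denominator cancel, so the result is simply S plus the three product terms.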
CraigD Posted January 17, 2010

Here are the conditional probabilities for each of the teams finishing first, second, third, or last in the series, and their accumulations:

Team with first-place probability .4 (Spain)
Place   Probability              Cumulative
1:      .4                       .4
2:      .311111111111111111      .711111111111111111
3:      .2085470085470085469     .9196581196581196579
4:      .08034188034188034188    1

Team with first-place probability .25 (England or Holland)
Place   Probability              Cumulative
1:      .25                      .25
2:      .2777777777777777778     .5277777777777777778
3:      .2933455433455433455     .8211233211233211233
4:      .1788766788766788766     1

Team with first-place probability .1 (Ghana)
Place   Probability              Cumulative
1:      .1                       .1
2:      .1333333333333333333     .2333333333333333333
3:      .2047619047619047619     .4380952380952380952
4:      .5619047619047619048     1

Writing this in traditional mathematical expressions is arduous. Here's the MUMPS code that generated it:

s A=.4,B=.25,C=.25,D=.1
s (J,K)=A w "1: ",J,?30,K,! s J=(B*(A/(1-B)))+(C*(A/(1-C)))+(D*(A/(1-D))),K=J+K w "2: ",J,?30,K,! s J=(B*(C/(1-B))*(A/(1-B-C)))+(B*(D/(1-B))*(A/(1-B-D)))+(C*(B/(1-C))*(A/(1-C-B)))+(C*(D/(1-C))*(A/(1-C-D)))+(D*(B/(1-D))*(A/(1-D-B)))+(D*(C/(1-D))*(A/(1-D-C))),K=J+K w "3: ",J,?30,K,! s J=(B*(C/(1-B))*(D/(1-B-C)))+(B*(D/(1-B))*(C/(1-B-D)))+(C*(B/(1-C))*(D/(1-C-B)))+(C*(D/(1-C))*(B/(1-C-D)))+(D*(B/(1-D))*(C/(1-D-B)))+(D*(C/(1-D))*(B/(1-D-C))),K=J+K w "4: ",J,?30,K,!

This code always generates the probabilities for the team whose probability of finishing first is A, so to generate all 3 cases, you must execute the second line following

s A=.25,B=.4,C=.25,D=.1
s A=.1,B=.4,C=.25,D=.25

or a similar manipulation.

If you're able to read this code or follow my previous post's reasoning, and are observant enough, you'll note that this approach makes a key assumption. It assumes that if the probabilities of teams A, B, C, and D finishing first are A, B, C, and D, the probability of a team (say B) finishing second if another (say A) finished first is [imath]\frac{B}{B+C+D}[/imath], or alternately, [imath]\frac{B}{1-A}[/imath].
This would be true if the game in question were a simple one, such as drawing a card at random from a deck where each team has a number of cards in the deck corresponding to its given probability of winning. In an actual soccer tournament, I doubt it would be, as psychology, injury, and complicated factors involving the relative strengths and weaknesses of the teams are involved.
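For readers who don't have a MUMPS interpreter handy, the same sums can be sketched in Python. This is a loose translation rather than the code used in the thread: the function name `finishing_distribution` and its argument layout are made up for illustration, and it keeps the post's assumption that each later place is decided among the remaining teams in proportion to their first-place probabilities.

```python
from itertools import permutations

def finishing_distribution(a, others):
    """Return [P(1st), P(2nd), P(3rd), P(4th)] for the team whose
    first-place probability is `a`; `others` holds the other three
    teams' first-place probabilities (all four should sum to 1)."""
    first = a
    # One other team wins, then our team is first among the remaining three.
    second = sum(b * a / (1 - b) for b in others)
    # Two other teams take 1st and 2nd (in either order), then our team.
    third = sum(b * c / (1 - b) * a / (1 - b - c)
                for b, c in permutations(others, 2))
    # All three other teams finish ahead; the final A/A factor is 1.
    fourth = sum(b * c / (1 - b) * d / (1 - b - c)
                 for b, c, d in permutations(others, 3))
    return [first, second, third, fourth]

for name, a, others in [("Spain", 0.4, (0.25, 0.25, 0.1)),
                        ("England/Holland", 0.25, (0.4, 0.25, 0.1)),
                        ("Ghana", 0.1, (0.4, 0.25, 0.25))]:
    running = 0.0
    print(name)
    for place, p in enumerate(finishing_distribution(a, others), start=1):
        running += p
        print(f"  {place}: {p:.6f}  {running:.6f}")
```

`permutations` treats the two .25 entries as distinct positions, so the six orderings in the 3rd- and 4th-place sums match the six terms written out in the MUMPS line.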
Pyrotex Posted January 19, 2010

"...Then the probability, for example, of Spain finishing first or second is

[math]S_{1 or 2} = S +(1-S) \left( \frac{\frac{ES}{S+H+G} +\frac{HS}{S+E+G} +\frac{GS}{S+E+H}}{E+H+G} \right)[/math]

...'the probability of Spain finishing first or second is equal to the probability of it finishing first, plus the probability of it finishing not first times the sum of the probability of each of the other teams finishing first times the probability of Spain finishing first among the remaining 3 teams.'..."

So, from this, I deduce that the probability of Spain finishing first among the 3 remaining teams, given that England was first overall, is:

[math](E)S_1^3 = \frac{S}{(S+H+G) (E+H+G)} [/math]

Now, (S+H+G) = (1-E) = probability that England is not first.
And (E+H+G) = (1-S) = probability that Spain is not first.

[math] (E)S_1^3 = \frac{S}{(1-E)(1-S)} [/math]

Likewise, we can then say in similar fashion:

[math](G)S_1^3 = \frac{S}{(1-G)(1-S)} [/math] and [math] (H)S_1^3 = \frac{S}{(1-H)(1-S)} [/math]

Is this correct? If so, how did you derive these? :shrug:

I understand that the probability of either Spain or England coming in first place overall is:

[math] S or E = 1 - (1-S)(1-E) [/math]

So, it looks like if we negate both sides, the probability that NEITHER Spain NOR England comes in first place overall is:

[math] 1 - (S or E) = (1-S)(1-E) [/math], or, [math] (Snot)(Enot) = (1-S)(1-E) [/math]

Therefore, [math] (E)S_1^3 = \frac{S}{(Snot)(Enot)} [/math] becomes equivalent to: given that England came in first overall, the probability that Spain comes in first among the three remaining teams is the probability that Spain comes in first overall, divided by the probability that neither Spain nor England comes in first overall.

It's these divisions by a probability that confuse me, and I cannot see how you derived this bit of logic. Can you help me understand this?
We then have:

[math] (E)S_1^3 = \frac {S}{(Snot)(Enot)} = \frac {.4}{(.6)(.75)} = 8/9[/math]
[math] (H)S_1^3 = \frac {S}{(Snot)(Hnot)} = \frac {.4}{(.6)(.75)} = 8/9[/math]
[math] (G)S_1^3 = \frac {S}{(Snot)(Gnot)} = \frac {.4}{(.6)(.9)} = 20/27[/math]

which seems reasonable / plausible. In fact, if I am correct in my retro-derivations from your formula, then the three equations above could easily be generalized to any number n+1 of teams:

[math] (X)S_1^n = \frac {S}{(Snot)(Xnot)} [/math]
A23 Posted January 20, 2010

I don't understand the formula. Why couldn't it be done without calculation, for example? Each row and each column should sum to 1, and beyond that the entries are a free choice? I add 2 rows, each summing to 1, and then complete each column so that it sums to 1, which gives the last row?

.4   .25   .25   .1
.2   .3    .15   .35
.1   .2    .5    .2
.3   .25   .1    .35

?
A23 Posted January 20, 2010

Maybe, to choose the numbers in a row, start with the column having the biggest sum above it; otherwise it gets stuck. Similar to magic squares.

S           H           E           1-S-H-E
a1          a2          a3          1-(a1+a2+a3)
b1          b2          b3          1-(b1+b2+b3)
1-S-a1-b1   1-H-a2-b2   1-E-a3-b3   X

with X = 1-(1-S-a1-b1)-(1-H-a2-b2)-(1-E-a3-b3) and domains: a1 < 0.6, (a2 < .75 & a2 < 1-a1), (a3 < 0.75 & a3 < 1-a1-a2), etc.

Another example:

.4   .25   .25   .1
.1   .6    .2    .1
.3   .1    .4    .2
.2   .05   .15   .6

I don't understand how you can fix the unknowns, because I don't understand your formula.
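The point of these two tables can be made concrete with a small check. The sketch below is illustrative only (the helper name `is_valid_rank_table` is invented here): it confirms that both example tables have non-negative entries with every row and every column summing to 1, which is why the first-place probabilities alone do not pin down the other ranks without a further modelling assumption.

```python
def is_valid_rank_table(table, tol=1e-9):
    """True if every entry is non-negative and each row (one finishing
    place) and each column (one team) sums to 1 within `tol`."""
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in table)
    cols_ok = all(abs(sum(col) - 1.0) <= tol for col in zip(*table))
    non_negative = all(x >= 0 for row in table for x in row)
    return rows_ok and cols_ok and non_negative

# Rows are places 1st..4th; columns are Spain, England, Holland, Ghana.
example_1 = [[.4, .25, .25, .1],
             [.2, .3, .15, .35],
             [.1, .2, .5, .2],
             [.3, .25, .1, .35]]
example_2 = [[.4, .25, .25, .1],
             [.1, .6, .2, .1],
             [.3, .1, .4, .2],
             [.2, .05, .15, .6]]

print(is_valid_rank_table(example_1), is_valid_rank_table(example_2))
```

Both tables share the same first row yet differ everywhere else, so some extra rule is needed to pick one of them.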
A23 Posted January 20, 2010

By simplifying, I found that the probability of Spain being 2nd is:

[math]S_2=\frac{S(1-S+3EGH-2(GE+GH+EH))}{(1-E)(1-G)(1-H)}=S(\frac{E}{1-E}+\frac{G}{1-G}+\frac{H}{1-H})[/math]

Is this true?
A23 Posted January 20, 2010

Can I just iterate the formula to find the probability for the next rank?
Pyrotex Posted January 20, 2010

Hello, A23!

I think we may have to wait for CraigD to answer our questions. He seems to be the expert in this. I should know something about this, because I just changed careers over to probabilistic risk analysis (PRA), and we have to solve problems similar to this 4-team problem.

I worked backwards from Craig's formula (which I also do not understand). However, my formula (the probability that Spain comes in first among the remaining 3) should be similar to yours (the probability that Spain comes in second overall).
Pyrotex Posted January 20, 2010

Okay, I just can't leave this alone. There's ego at stake here. :)

Obviously, the prob that Spain will come in either 1st or 2nd is

[math]S_{1,2} = S_1 + S_2[/math]

The hard part is figuring the prob that Spain will come in second. It boils down to one of three options: Spain does NOT come in first AND either England, Holland or Ghana DOES come in first. So we have:

[math]S_2 = (1-S) \left( E (1-H) (1-G) + H(1-G)(1-E) + G(1-E)(1-H) \right)[/math]

This is far easier to understand than Craig's formula. S_2 turns out to be 23.625%, and therefore S[1 or 2] is 63.625%, which is quite a bit lower than Craig's answer.
CraigD Posted January 20, 2010

"Hello, A23! I think we may have to wait for CraigD to answer our questions. He seems to be the expert in this. I should know something about this, because I just changed careers over to probabilistic risk analysis (PRA), and we have to solve problems similar to this 4-team problem. I worked backwards from Craig's formula (which I also do not understand)."

Here's an explanation of the long line of MUMPS code in post #5,

s A=.4,B=.25,C=.25,D=.1
s (J,K)=A w "1: ",J,?30,K,! s J=(B*(A/(1-B)))+(C*(A/(1-C)))+(D*(A/(1-D))),K=J+K w "2: ",J,?30,K,! s J=(B*(C/(1-B))*(A/(1-B-C)))+(B*(D/(1-B))*(A/(1-B-D)))+(C*(B/(1-C))*(A/(1-C-B)))+(C*(D/(1-C))*(A/(1-C-D)))+(D*(B/(1-D))*(A/(1-D-B)))+(D*(C/(1-D))*(A/(1-D-C))),K=J+K w "3: ",J,?30,K,! s J=(B*(C/(1-B))*(D/(1-B-C)))+(B*(D/(1-B))*(C/(1-B-D)))+(C*(B/(1-C))*(D/(1-C-B)))+(C*(D/(1-C))*(B/(1-C-D)))+(D*(B/(1-D))*(C/(1-D-B)))+(D*(C/(1-D))*(B/(1-D-C))),K=J+K w "4: ",J,?30,K,!

Though it makes the expressions a little longer, I think it's clearer if we use addition rather than subtraction in our denominators, and do away with the "1"s, rather than how I wrote it in the code.

The probability of team A winning 1st place is given. The prob of A winning 2nd place is the sum of the prob of every way this can happen, that is, the sum of the prob of each of the other teams winning first place times the prob of A winning 1st place among the remaining 3 teams, including itself, that didn't win 1st place. The prob of A winning 3rd place is the sum of the product of the probs of every possible way 2 other teams can win 1st and 2nd place, times the prob of A winning 1st among the remaining 2 teams. And so on.

So the probability of A 1st is: [imath]\frac{A}{A+B+C+D}[/imath]. Note that [imath]A+B+C+D = 1[/imath], so the denominator is written only for clarity.
2nd:
[imath]\frac{B}{A+B+C+D} \cdot \frac{A}{A+C+D}[/imath]
[imath] + \frac{C}{A+B+C+D} \cdot \frac{A}{A+B+D}[/imath]
[imath] + \frac{D}{A+B+C+D} \cdot \frac{A}{A+B+C} [/imath]

3rd:
[imath]\frac{B}{A+B+C+D} \cdot \frac{C}{A+C+D} \cdot \frac{A}{A+D} [/imath]
[imath] + \frac{B}{A+B+C+D} \cdot \frac{D}{A+C+D} \cdot \frac{A}{A+C} [/imath]
[imath] + \frac{C}{A+B+C+D} \cdot \frac{B}{A+B+D} \cdot \frac{A}{A+D} [/imath]
[imath] + \frac{C}{A+B+C+D} \cdot \frac{D}{A+B+D} \cdot \frac{A}{A+B} [/imath]
[imath] + \frac{D}{A+B+C+D} \cdot \frac{B}{A+B+C} \cdot \frac{A}{A+C} [/imath]
[imath] + \frac{D}{A+B+C+D} \cdot \frac{C}{A+B+C} \cdot \frac{A}{A+B} [/imath]

4th:
[imath]\frac{B}{A+B+C+D} \cdot \frac{C}{A+C+D} \cdot \frac{D}{A+D} \cdot \frac{A}{A}[/imath]
[imath] + \frac{B}{A+B+C+D} \cdot \frac{D}{A+C+D} \cdot \frac{C}{A+C} \cdot \frac{A}{A} [/imath]
[imath] + \frac{C}{A+B+C+D} \cdot \frac{B}{A+B+D} \cdot \frac{D}{A+D} \cdot \frac{A}{A} [/imath]
[imath] + \frac{C}{A+B+C+D} \cdot \frac{D}{A+B+D} \cdot \frac{B}{A+B} \cdot \frac{A}{A} [/imath]
[imath] + \frac{D}{A+B+C+D} \cdot \frac{B}{A+B+C} \cdot \frac{C}{A+C} \cdot \frac{A}{A} [/imath]
[imath] + \frac{D}{A+B+C+D} \cdot \frac{C}{A+B+C} \cdot \frac{B}{A+B} \cdot \frac{A}{A} [/imath]

Note that [imath] \frac{A}{A} = 1[/imath], so it is written only for clarity. Also, since we know the sum of the probabilities of A finishing 1st, 2nd, 3rd, or 4th is 1, we could calculate the probability of finishing 4th by subtracting the previous 3 probabilities from 1, which would be easier than the above.

Writing a mathematical expression for this in general, for any number of players/teams (with summations and products), would be a bit trickier, though not too hard.

Again, note that this approach makes some likely unrealistic assumptions about real-world sports outcomes. For example, [imath]\frac{A}{A+D} = \frac{.4}{.4+.1} = .8[/imath], even though Ghana might have a special advantage vs. Spain that makes this probability lower, or a special weakness that makes it higher.

"S_2 turns out to be 23.625%, and therefore, S[1 or 2] is 63.625%, which is quite a bit lower than Craig's answer."

You think I actually trusted my own answer without having a little program randomly play the tournament thousands of times, and seeing that the observed outcomes approached the calculated probabilities?! :eek: I ran one, and the observed frequencies matched the result I got. :)
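The simulation program mentioned above isn't shown in the thread. A minimal Python sketch of what such a check might look like follows; the function name `simulate`, the trial count, and the seed are all assumptions. Note that it plays out the same "draw the next finisher in proportion to first-place probability" model, so agreement confirms the algebra, not the realism of the model.

```python
import random

def simulate(first_place, trials=50_000, seed=1):
    """Estimate each team's finishing-place frequencies by repeatedly
    drawing 1st, 2nd, ... from the teams still unplaced, weighted by
    their first-place probabilities."""
    rng = random.Random(seed)
    teams = list(first_place)
    counts = {t: [0] * len(teams) for t in teams}
    for _ in range(trials):
        pool = teams[:]
        for place in range(len(teams)):
            weights = [first_place[t] for t in pool]
            finisher = rng.choices(pool, weights=weights, k=1)[0]
            counts[finisher][place] += 1
            pool.remove(finisher)
    return {t: [c / trials for c in counts[t]] for t in teams}

group = {"Spain": 0.4, "England": 0.25, "Holland": 0.25, "Ghana": 0.1}
freq = simulate(group)
# Should land near 32/45 (about 0.711), up to sampling noise.
print("Spain 1st or 2nd:", freq["Spain"][0] + freq["Spain"][1])
```

With 50,000 trials the sampling error on that figure is roughly 0.002, enough to separate 0.711 from the competing 0.636 estimate.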
A23 Posted January 20, 2010

Yes, it is complicated. How can it be that we can deduce whether Ghana has more chance of being 3rd than 4th, for example? (In other words, why are your formulas different from Pyrotex's?) If we could know this, betting on the other ranks could be a means of earning money on average, against a uniform assumption. $$$
Pyrotex Posted January 20, 2010

"...(in other words why are your formulas different from Pyrotex ?)..."

Well, I would say because his formulas are correct and mine are wrong. :hihi: Thanks, Craig! :)

Having said that, his original formula can be simplified tremendously. Rather than that monstrosity he presented on the first page, it can be boiled down to:

[imath]S_2 = E \cdot \frac{S}{S+H+G} + H \cdot \frac{S}{S+E+G} + G \cdot \frac{S}{S+E+H}[/imath]

In words, the prob of Spain coming in exactly 2nd place is the sum of:
- prob of England coming in 1st times the prob that Spain is 1st among the remaining three;
- prob of Holland coming in 1st times the prob that Spain is 1st among the remaining three;
- prob of Ghana coming in 1st times the prob that Spain is 1st among the remaining three.

This is equivalent to:

[imath]S_2 = S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]

And therefore:

[imath]S_{1 or 2} = S + S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]

which comes out to 71.11% :)
Pyrotex Posted January 20, 2010

"I found that probability Spain being 2nd were: [math]S_2=\frac{S(1-S+3EGH-2(GE+GH+EH))}{(1-E)(1-G)(1-H)}=S(\frac{E}{1-E}+\frac{G}{1-G}+\frac{H}{1-H})[/math] by simplifying, is this true?"

Correct!!!!!!!!!! :hihi: :lol: :hihi:
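For anyone wanting to see why the polynomial form and the sum of fractions agree, here is the common-denominator step worked out, using only the thread's symbols and the fact that S + E + H + G = 1. The polynomial is the numerator; it needs the product (1-E)(1-G)(1-H) underneath it to equal the sum of fractions.

```latex
\begin{align*}
\frac{E}{1-E}+\frac{G}{1-G}+\frac{H}{1-H}
  &= \frac{E(1-G)(1-H)+G(1-E)(1-H)+H(1-E)(1-G)}{(1-E)(1-G)(1-H)} \\
  &= \frac{(E+G+H)-2(GE+GH+EH)+3EGH}{(1-E)(1-G)(1-H)} \\
  &= \frac{1-S+3EGH-2(GE+GH+EH)}{(1-E)(1-G)(1-H)}
  \qquad \text{since } S+E+H+G=1 .
\end{align*}

% Numerical check with S=.4, E=H=.25, G=.1:
\[
S_2 = 0.4\cdot\frac{0.39375}{0.50625} = 0.4\cdot\frac{7}{9} = \frac{14}{45}\approx 0.3111 .
\]
```

The numerator alone is 0.39375; multiplying it by (1 - S) = 0.6 instead of dividing by the denominator gives 0.23625, which is the 23.625% figure that appeared earlier in the thread.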
CraigD Posted January 21, 2010

"Having said that, his original formula can be simplified tremendously. Rather than that monstrosity he presented on the first page, it can be boiled down to: ... [imath]S_2 = S \cdot \left( \frac{E}{1-E} + \frac{H}{1-H} + \frac{G}{1-G} \right)[/imath]"

My first equation for [imath]S_2[/imath] was pretty monstrous. Even simplified, [imath]S_3[/imath] (which I'll write [imath]A_3[/imath], as I find A, B, C, and D easier to keep track of than S, E, H, and G) gets a bit monstrous, too:

[math]A_3 = A \cdot \left( \frac{B}{1-B} \cdot \left( \frac{C}{1-B-C} +\frac{D}{1-B-D}\right) +\frac{C}{1-C} \cdot \left( \frac{B}{1-C-B} +\frac{D}{1-C-D}\right) +\frac{D}{1-D} \cdot \left( \frac{B}{1-D-B} +\frac{C}{1-D-C}\right) \right)[/math]

I'm not sure I can write a general expression for the probability of a particular team out of an arbitrary number of teams finishing in an arbitrary place using ordinary math notation. A procedural program in some programming language shouldn't be too hard to write, but not too easy, either. One for the road (that is, for the bus), I guess. :)
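One way such a general procedure might look is sketched below in Python. It is not the program alluded to in the post: `place_probability` is a hypothetical name, and the recursion simply applies the thread's rule (the next finisher is drawn from the remaining teams in proportion to first-place probability) for any number of teams and any place.

```python
def place_probability(weights, team, place):
    """Probability that `team` finishes in `place` (1-based), where
    `weights` maps every team to its first-place probability."""
    def recurse(remaining, k):
        total = sum(weights[t] for t in remaining)
        if k == 1:
            # `team` must be the next finisher among those remaining.
            return weights[team] / total
        # Otherwise some other team finishes next, and `team` must take
        # place k - 1 among the teams left after removing that one.
        return sum(weights[other] / total * recurse(remaining - {other}, k - 1)
                   for other in remaining if other != team)
    return recurse(frozenset(weights), place)

group = {"Spain": 0.4, "England": 0.25, "Holland": 0.25, "Ghana": 0.1}
for k in range(1, 5):
    print(k, round(place_probability(group, "Spain", k), 6))
```

The work grows factorially with the number of teams, which is fine for a four-team group but would want memoisation or a different approach for large fields.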