Tests for a double-dummy solver designing evaluation scheme?
#1
Posted 2005-September-06, 12:50
Suppose south has a particular distribution. We can generate thousands of random hands under this constraint. What is the probability that N/S can make a game?
This sort of test will give us some idea of the value of shape. One of the main points ZAR makes is that 6-3-3-1 shape is more powerful than 6-3-2-2 (14 vs. 13 ZAR points, ignoring high cards). This test will tell us whether this is true, and how large the effect is.
I would expect some interesting results. For example, it wouldn't surprise me if 6-3-3-1 (in that order) was better than 6-3-2-2, but 3-3-1-6 was not substantially better than 3-2-2-6. The reasoning being that the singleton is useful when you play in a suit, but the hands with a long minor are more likely to play in notrump where a singleton or doubleton makes little difference. In addition, I wouldn't be surprised to see that 4-4-3-2 is substantially better than 4-2-3-4, due to the increased chances of a major suit fit.
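A minimal sketch of the sampling step described above, assuming suits are listed in spades-hearts-diamonds-clubs order. The `makes_game` callback is a hypothetical stand-in for a double-dummy solver call; no real solver API is implied, and the function names are made up for illustration.

```python
import random

SUITS = "SHDC"
RANKS = "AKQJT98765432"

def deal_with_south_shape(shape, rng):
    """Deal 52 cards so South holds exactly `shape` cards in S-H-D-C order."""
    if len(shape) != 4 or sum(shape) != 13:
        raise ValueError("shape must be four suit lengths summing to 13")
    south, rest = [], []
    for suit, length in zip(SUITS, shape):
        cards = [suit + rank for rank in RANKS]
        rng.shuffle(cards)
        south.extend(cards[:length])   # South's random holding in this suit
        rest.extend(cards[length:])    # everything else goes to the other three
    rng.shuffle(rest)
    return {"S": south, "W": rest[:13], "N": rest[13:26], "E": rest[26:]}

def estimate_game_probability(shape, makes_game, trials, seed=0):
    """Fraction of random deals on which `makes_game(deal)` is true.

    `makes_game` is a hypothetical callback: it should return True when
    N/S can make some game double-dummy on the given deal.
    """
    rng = random.Random(seed)
    hits = sum(bool(makes_game(deal_with_south_shape(shape, rng)))
               for _ in range(trials))
    return hits / trials
```

Comparing `estimate_game_probability((6, 3, 3, 1), ...)` with `estimate_game_probability((3, 3, 1, 6), ...)` would be one way to test the major/minor asymmetry guessed at here.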
a.k.a. Appeal Without Merit
#2
Posted 2005-September-06, 14:39
- hrothgar
#3
Posted 2005-September-07, 04:53
#4
Posted 2005-September-07, 10:21
4-3-3-3   7.80
4-4-3-2   8.09
5-3-3-2   8.14
5-4-2-2   8.41
6-3-2-2   8.51
4-4-4-1   8.62
5-4-3-1   8.69
6-3-3-1   8.78
7-2-2-2   8.91
6-4-2-1   9.02
5-5-2-1   9.03
7-3-2-1   9.14
5-4-4-0   9.38
5-5-3-0   9.51
6-4-3-0   9.51
8-2-2-1   9.57
6-5-1-1   9.61
7-3-3-0   9.65
7-4-1-1   9.67
8-3-1-1   9.83
6-5-2-0   9.88
7-4-2-0   9.89
But the suggestion you make is an interesting one. I think I could whip it up and take a look at the %game statistic. That would differentiate between majors and minors and show how much a 1444 hand is worth compared to a 4441 hand. Good idea.
#5
Posted 2005-September-07, 12:05
Shape     Game?
2-2=5-4   24%
3-2=4-4   24%
3-3=4-3   25%
4-3=3-3   25%
3-2=5-3   26%
4-2=4-3   26%
3-3=5-2   27%
4-3=4-2   27%
2-2=6-3   27%
4-4=3-2   28%
3-2=6-2   28%
4-2=5-2   28%
5-2=3-3   28%
5-3=3-2   28%
5-2=4-2   29%
3-1=5-4   29%
2-1=6-4   30%
2-1=5-5   30%
4-1=4-4   31%
3-1=6-3   31%
5-4=2-2   31%
4-1=5-3   32%
3-3=6-1   32%
6-2=3-2   32%
6-3=2-2   33%
4-3=5-1   33%
5-1=4-3   34%
4-4=4-1   34%
5-3=4-1   35%
4-2=6-1   35%
4-1=6-2   35%
5-4=3-1   37%
6-1=3-3   37%
6-3=3-1   38%
5-2=5-1   38%
5-1=5-2   39%
6-1=4-2   40%
6-2=4-1   41%
4-0=5-4   42%
6-4=2-1   43%
5-5=2-1   43%
5-0=4-4   44%
5-4=4-0   49%
Edit: A more complete list is now available on the 2nd page of this thread.
Because of what we're looking for, S&H are equivalent, as are D&C. So I'm going to use the notation A-B=C-D to be A&B in the majors and C&D in the minors (either way for both).
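The A-B=C-D notation can be sketched as a small helper. The function name is made up for illustration, and putting the longer suit first within each pair is an assumption inferred from how the table entries are written.

```python
def canonical_shape(spades, hearts, diamonds, clubs):
    """Return a shape in A-B=C-D form: majors before '=', minors after.

    Within each pair the longer suit is written first (an assumed
    convention, inferred from the table), so 2-2-4-5 and 2-2-5-4 both
    map to "2-2=5-4".
    """
    if spades + hearts + diamonds + clubs != 13:
        raise ValueError("suit lengths must sum to 13")
    hi_major, lo_major = sorted((spades, hearts), reverse=True)
    hi_minor, lo_minor = sorted((diamonds, clubs), reverse=True)
    return f"{hi_major}-{lo_major}={hi_minor}-{lo_minor}"
```

With this, a 4-4-4-1 hand (short clubs) and a 1-4-4-4 hand (short spades) land in different rows, `4-4=4-1` and `4-1=4-4`, which is exactly the major/minor split the table is meant to expose.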
So maybe 2245 is the worst distribution in bridge and not 3334!
Okay, what can we make of this? It's a very loose scale, but 1 point of "normal" distribution corresponds to about 3.8% game increase. On Zar's scale, it's about 1 point per 2.3%.
So that means that the difference between 4-4=4-1 and 4-1=4-4 is about 1.0 points or 1.6 Zar.
To address awm's original guesses, the difference between 6-3=3-1 and 6-3=2-2 is about the same as the difference between 3-3=6-1 and 3-2=6-2 (about 1.2 points or 2.0 Zar in both cases). And the difference between 4-4=3-2 and 4-2=4-3 is only 0.5 points or 0.8 Zar.
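The conversions above are simple divisions, sketched here using the two scale factors quoted in this post (about 3.8% of game per ordinary distribution point, about 2.3% per Zar point). The names are made up for illustration. Note that the table shows percentages rounded to whole numbers, so plugging table values in will not reproduce the quoted figures exactly; those presumably come from unrounded data.

```python
PCT_PER_POINT = 3.8  # % game chance per "normal" distribution point (from the post)
PCT_PER_ZAR = 2.3    # % game chance per Zar point (from the post)

def gap_in_points(pct_a, pct_b):
    """Convert a game-percentage gap into (ordinary points, Zar points)."""
    gap = pct_a - pct_b
    return gap / PCT_PER_POINT, gap / PCT_PER_ZAR

# Rounded table values for 4-4=4-1 (34%) and 4-1=4-4 (31%).
points, zar = gap_in_points(34, 31)
print(f"{points:.1f} points, {zar:.1f} Zar")
```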
There are some other interesting ramifications here, but I don't want to hog the stage. Anyone want to comment?
Tysen
#6
Posted 2005-September-07, 12:14
COOL!!!!
#7 Guest_Jlall_*
Posted 2005-September-07, 12:16
#8
Posted 2005-September-07, 12:17
I'm missing 5-0-4-4, that should do worse than 5-4-4-0.
#9
Posted 2005-September-07, 12:21
Jlall, on Sep 7 2005, 01:16 PM, said:
Shocking indeed, and notice the following:
6-4-2-1: 15 Zar points, and 4 5-3-1 points.
5-5-2-1: 14 Zar points, and 4 5-3-1 points.
5-4-4-0: 14 Zar points, and 5 5-3-1 points.
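Both counts are easy to reproduce. This sketch assumes the commonly cited Zar distribution formula (sum of the two longest suits, plus longest minus shortest) and a 5-3-1 count of 5 for a void, 3 for a singleton and 1 for a doubleton; both agree with the figures listed above. The function names are made up for illustration.

```python
def zar_distribution_points(shape):
    """Zar distribution points: (a + b) + (a - d), where a and b are the
    two longest suits and d is the shortest (assumed formula)."""
    a, b, _, d = sorted(shape, reverse=True)
    return (a + b) + (a - d)

def short_suit_points(shape):
    """5-3-1 count: 5 per void, 3 per singleton, 1 per doubleton."""
    values = {0: 5, 1: 3, 2: 1}
    return sum(values.get(length, 0) for length in shape)

for shape in [(6, 4, 2, 1), (5, 5, 2, 1), (5, 4, 4, 0)]:
    print(shape, zar_distribution_points(shape), short_suit_points(shape))
```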
#10
Posted 2005-September-07, 12:29
Many thanks for this, Tysen - it's amazing to be able to look through that table.
#11
Posted 2005-September-07, 12:30
I think so, at least for shapely hands. The difference between 6-3-3-1 and 1-3-3-6 is 6%, the same as the difference between 6-3-2-2 and 2-2-3-6. Does this mean that we should have almost 2 HCP or 3 Zar points more to open with these shapes when the long suit is a minor?
I guess this argument is a bit too simplistic, but it sure is interesting.
#12
Posted 2005-September-07, 12:38
A question for Tysen though is how are these probabilities calculated?
Is it Prob(10+ tricks in a major U 11+ tricks in a minor U 9+ tricks in NT | my hand shape = X) = Prob(... | my hand shape = X AND my hcp = 0) × Prob(my hcp = 0 | my hand shape = X) + Prob(... | my hand shape = X AND my hcp = 1) × Prob(my hcp = 1 | my hand shape = X) + ... etc.?
For this conditioning argument, are the conditional high card point frequencies used? That is to say, we know that shape and hcp are correlated (since with freak distributions there are fewer empty spaces to hold high cards).
Perhaps this is all intrinsic when you simulate enough hands?
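A small sketch of why the conditioning comes for free in a simulation: counting successes over all deals of one shape gives the same number as weighting each HCP bucket by its observed frequency for that shape. The records below are made-up toy values, not results from any database, and the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def overall_vs_conditioned(records):
    """records: (hcp, made_game) pairs for deals that all share one shape.

    Returns the plain success frequency and the same quantity rebuilt as
    sum over hcp of freq(hcp | shape) * freq(game | shape, hcp).
    """
    n = len(records)
    overall = Fraction(sum(1 for _, made in records if made), n)
    by_hcp = defaultdict(list)
    for hcp, made in records:
        by_hcp[hcp].append(made)
    conditioned = sum(
        Fraction(len(results), n) * Fraction(sum(results), len(results))
        for results in by_hcp.values()
    )
    return overall, conditioned

# Toy data only (hypothetical): five deals with their HCP and outcome.
toy = [(10, True), (10, False), (12, True), (8, False), (12, True)]
overall, conditioned = overall_vs_conditioned(toy)
assert overall == conditioned
```

Because the weights are the HCP frequencies observed within that shape, any correlation between shape and HCP is already baked into the plain count.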
#13 Guest_Jlall_*
Posted 2005-September-07, 12:40
#14
Posted 2005-September-07, 13:36
This seems to support the idea that it's good to open more aggressively with length in the majors. Neither ZAR nor binky nor any scheme I've seen really deals with this well -- for example 2254 and 5422 would evaluate the same under most schemes but yet the chance of game on the second hand is substantially greater. It seems very reasonable to open lighter with 5422 shape than with 2254, without even considering the obstructive value of a 1♠ opening bid.
Another point here is the power of voids. Methods like the "rule of 20" for opening tend to consider 5521 shape as "stronger" than 5440. ZAR distribution points (before fit is known) also agree, giving 15 points for the first and 14 for the second. But in fact it seems that 5440 shape is more likely to produce a game. This suggests that the 5/3/1 scheme has something going for it, but it's hard to see from the percentages how points and shape interact.
Perhaps a reasonable experiment would be to say, given a particular distribution, how many bumrap points (or points+controls as they are roughly the same) do you need such that your probability of making game is at least P (say P=40% as a reasonable threshold to start with)? This might give us a grasp on the "value" of these different shapes, which could eventually lead to something better than a 5/3/1 type approach.
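One possible reading of that experiment, as a sketch: for a single shape, bucket the simulated deals by point count and report the lowest count whose observed game frequency reaches P. The function name, the `min_samples` guard and the toy records are all hypothetical; "points" could be bumrap points or any other count.

```python
from collections import defaultdict

def min_points_for_game(records, threshold=0.40, min_samples=1):
    """records: (points, made_game) pairs for deals sharing one shape.

    Returns the smallest point count whose game frequency is at least
    `threshold`, ignoring buckets with fewer than `min_samples` deals.
    Returns None if no bucket qualifies.
    """
    made = defaultdict(int)
    total = defaultdict(int)
    for points, ok in records:
        total[points] += 1
        made[points] += 1 if ok else 0
    for points in sorted(total):
        if total[points] >= min_samples and made[points] / total[points] >= threshold:
            return points
    return None

# Toy data only (hypothetical), to show the call shape.
toy = [(9, False), (9, False), (9, True), (10, True), (10, False), (11, True)]
print(min_points_for_game(toy, threshold=0.40))
```

Running this per shape and comparing the thresholds would put the shapes on a common point scale. In practice `min_samples` would need to be large, since sparse buckets give noisy frequencies.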
#15
Posted 2005-September-07, 13:48
awm, on Sep 7 2005, 02:36 PM, said:
This seems to support the idea that it's good to open more aggressively with length in the majors. Neither ZAR nor binky nor any scheme I've seen really deals with this well -- for example 2254 and 5422 would evaluate the same under most schemes but yet the chance of game on the second hand is substantially greater. It seems very reasonable to open lighter with 5422 shape than with 2254, without even considering the obstructive value of a 1♠ opening bid.
Another point here is the power of voids. Methods like the "rule of 20" for opening tend to consider 5521 shape as "stronger" than 5440. ZAR distribution points (before fit is known) also agree, giving 15 points for the first and 14 for the second. But in fact it seems that 5440 shape is more likely to produce a game. This suggests that the 5/3/1 scheme has something going for it, but it's hard to see from the percentages how points and shape interact.
Perhaps a reasonable experiment would be to say, given a particular distribution, how many bumrap points (or points+controls as they are roughly the same) do you need such that your probability of making game is at least P (say P=40% as a reasonable threshold to start with)? This might give us a grasp on the "value" of these different shapes, which could eventually lead to something better than a 5/3/1 type approach.
Ok, how do you conclude this? Why is it better to open aggressively with length in majors compared to minors? I assume there are just as many advantages to opening aggressively in the minors, so how does this refute that in favor of the majors? We all know it takes one less trick in the majors, but so what? The chance for game in an 8- or 9-card major fit is almost always greater than in a minor fit with the same number of cards, so what?
I see the data but I do not see this conclusion, can anyone help?
#16
Posted 2005-September-07, 14:08
1) 1M is more preemptive than 1m. We knew this already.
2) Hands with major length are more likely to have a game on with the same HCP strength. We also knew this already (as a matter of common sense), but these data help to quantify the effect. So passing on the borderline minor hands has less risk of missing a game than on the same hands with majors instead.
#17
Posted 2005-September-07, 14:23
Hannie, on Sep 7 2005, 10:17 AM, said:
I'm missing 5-0-4-4, that should do worse than 5-4-4-0.
5-0=4-4 was cut off from the list. It is 44%. I'll edit the original post to include it.
I cut off from the table everything that occurred fewer than 5000 times in my database. So 5-0=4-4 just happened to be below that while the other 5440 patterns were above. At a count of 5000, the error is on the order of 0.5%, so I didn't want to include rarer patterns. Let me know if any other patterns are missing.
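The error estimate can be checked with the usual standard error of a binomial proportion, sqrt(p(1-p)/n). This is a sketch assuming independent deals; the function name is made up for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of an observed proportion p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

# 5-0=4-4 at 44% with a cutoff of 5000 deals.
print(f"{100 * standard_error(0.44, 5000):.2f} percentage points")
```

For p = 0.44 and n = 5000 this comes to roughly 0.7 percentage points, the same order of magnitude as the figure quoted above.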
Tysen
#18
Posted 2005-September-07, 14:28
Echognome, on Sep 7 2005, 10:38 AM, said:
Yes. And while HCP & Shape are slightly correlated, when you add up every single HCP possibility, weighted by how often it occurs for that shape, you recover the overall figure for the shape.
#19
Posted 2005-September-07, 14:39
Blofeld, on Sep 7 2005, 03:08 PM, said:
1) 1M is more preemptive than 1m. We knew this already.
2) Hands with major length are more likely to have a game on with the same HCP strength. We also knew this already (as a matter of common sense), but these data help to quantify the effect. So passing on the borderline minor hands has less risk of missing a game than on the same hands with majors instead.
But is this not a function of majors being higher ranked and needing only 10 tricks? Again, I see no evidence that an aggressive one-level minor-suit opening cannot have a very effective preemptive effect, virtually as much as, if not the same as, a major. I see no reason not to open one-level minors aggressively based on any of this data. Can someone clue me in? Thanks in advance.
#20
Posted 2005-September-07, 14:46
mike777, on Sep 7 2005, 02:48 PM, said:
I see the data but I do not see this conclusion, can anyone help?
There are some underlying assumptions here I may not have mentioned. Perhaps the main question is: why should I open the bidding, as opposed to passing? Some reasons:
(1) If I think we might have a game, I should bid so we can reach that game.
(2) In order to get in the opponents' way, to make it harder for them to find a contract.
(3) If I think we can make a partscore, perhaps I should open so we can get there.
(4) In order to help partner on defense, to find the right lead, count my pattern, etc.
All of these are perfectly fair reasons for bidding. But assuming fairly "constructive" methods, it seems like (1) is the major reason for opening at the one level. Keep in mind that one-level bids don't steal a huge amount of space from the opponents. Of course, things change a little bit in 3rd seat (where 3 and 4 become bigger concerns) and in 4th.
Note that many systems seem to base an opening bid on "I have half what we need for game." We see this with Goren (26 hcp for game, 13 to open), with LTC (14 losers for major suit game, 7 losers to open), and with ZAR (52 for game, 26 to open). All of these seem to be working on the assumption that (1) is the major reason to open.
So if we're willing to assume that the main reason for opening is to find our games, it seems like an opening should announce that game is reasonably likely given opener's hand. This is really what the methods above are going for isn't it? So that seems to support opening lighter with major suit length.
To give a simple example, suppose I am deciding whether to open a balanced eleven count in first seat. Since I'm balanced, directing a lead from partner (condition 4) isn't a big deal. Since I play fairly standard methods, my opening on any (4432) pattern will be one of a minor, which doesn't really take any space from the opponents. So the only real concern here is, what do I think are our chances at game? It seems from Tysen's data that it might be reasonable for me to open a 4-4-2-3 eleven count, but that with a 2-3-4-4 eleven I should probably pass.