Evaluating ZAR points


Evaluating ZAR points: a simulation

#1 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-19, 13:23

Hi!

I'm just programming a simulation to test ZAR points. I'm using a double-dummy solver to check the results. If you think I should modify something, speak up now.
Here is what I'll do:
Dealer is always N. If the north hand is a ZAR opening and south holds a response, I'll analyse the hand; otherwise it will be counted as a dropout.

To analyse it, I look for the longest fit available. If it is 8+ cards, the hands will be reevaluated using the fit information. The sum of both players' points is used to find the proper level. Then the double-dummy solver will try to make the contract. If the contract is made, I increment the good results; otherwise I increment the bad ones.

Sections are levels 2,3,4 and 5.
There is an extra section for 6/7, which is satisfied when 11 tricks can be made. 11 is enough, because I expect any decent partnership to check for missing keycards before bidding a slam. To do this the 5 level must be safe.

If the best fit has only 7 cards, the contract will be:
2 of some suit, if the level is 2.
NT if the level is 3+. Since 3NT needs the same points as game in a major, I put (n NT) at the (n+1) level. So if the level says 3, the contract will be 2NT.

Every contract is played by south!

(1) Some strange settings are due to the fact that I don't want to write a bidding engine. People have invested much more brain and time in doing that than I'm willing to put into this project. And I am not convinced by most of the results.
(2) If I implemented a bidding engine, the results would depend on the given bidding system.
(3) EW cards or possible bids are not taken into account.
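
The contract-selection rules described in this post can be sketched roughly as below. This is only an illustration, not the actual program: the function names are made up here, and two points are assumptions the post does not spell out, namely that level 1 is played in the suit and that the 11-trick rule for the 6/7 section also applies to misfit hands.

```python
def choose_contract(level, fit_len, suit):
    """Pick (contract_level, strain) from the ZAR level and the best fit.

    level   -- level suggested by the combined points
    fit_len -- combined length of the longest fit (always 7 or more)
    suit    -- suit of that fit, e.g. "S"
    """
    if fit_len >= 8 or level <= 2:
        # With an 8+ fit, or at a low level, play in the suit.
        # (Treating level 1 like level 2 is an assumption.)
        return level, suit
    # 7-card fit at level 3+: play NT one level lower (level 3 -> 2NT).
    return level - 1, "NT"


def tricks_needed(contract_level, section):
    """Tricks the double-dummy solver must take for a 'good' result."""
    if section >= 6:
        # Slam section: 11 tricks are treated as enough.
        return 11
    return 6 + contract_level


def is_good(dd_tricks, level, fit_len, suit):
    """True if the double-dummy trick count fulfils the chosen contract."""
    contract_level, _strain = choose_contract(level, fit_len, suit)
    return dd_tricks >= tricks_needed(contract_level, level)
```

For example, `choose_contract(3, 7, "H")` gives `(2, "NT")`, matching the "level 3 means 2NT" rule above.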

#2 hrothgar

Group: Advanced Members
Posts: 15,395
Joined: 2003-February-13
Gender:Male
Location:Natick, MA
Interests:Travel
Cooking
Brewing
Hiking

Posted 2005-February-19, 14:10

Hi HotShot...

If you're willing to go to this much trouble, then I strongly suggest that you look at some of the earlier posts in which Tysen and I suggested various methodologies for testing the accuracy of different hand evaluation metrics. This is a complex subject, and if you design an inappropriate test then there is a very real possibility that you'll waste a lot of time...

As I noted in the past, I'd recommend an approach like the following:

1. Generate 1000 hands using any one of a variety of Dealer programs
2. Define a set of 13 buckets. Each bucket defines the maximum number of tricks that can be taken on a double dummy basis.
3. Sort the hands into buckets
4. For each hand in a given bucket, Let X = the sum of the Zar points for Declarer and Dummy
5. Calculate the Mean and Standard Deviation

The relative accuracy of different metrics can be determined from these two statistics, so a "complete" analysis would need to compare Zar Points to an alternative schema like Bum Rap.

If you prefer, you could invert this entire procedure. Your initial buckets would measure the combined Zar Points of the two hands. You could then calculate the average number of tricks taken for two hands with X combined Zar points.
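
A minimal sketch of the bucketing approach, in both directions, might look like this. The deal data is invented purely for illustration, the helper name `stats_by_bucket` is hypothetical, and the population standard deviation (`pstdev`) is used so that a bucket with a single deal still produces a value.

```python
from collections import defaultdict
from statistics import mean, pstdev


def stats_by_bucket(records):
    """Group (key, value) pairs by key; return {key: (mean, std dev, count)}."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[key].append(value)
    return {k: (mean(v), pstdev(v), len(v)) for k, v in sorted(buckets.items())}


# Hypothetical (double-dummy tricks, combined Zar points) pairs.
deals = [(10, 52), (10, 54), (9, 48), (9, 50), (9, 46), (11, 58)]

# Steps 2-5: bucket by tricks, then summarise the Zar points in each bucket.
zar_by_tricks = stats_by_bucket(deals)

# Inverted procedure: bucket by Zar points, then summarise the tricks taken.
tricks_by_zar = stats_by_bucket((zar, tricks) for tricks, zar in deals)
```

Running the same function over a second metric on the same deals would give the per-bucket spreads needed for a comparison.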

Alderaan delenda est

#3 Guest_Jlall_*

Group: Guests

Posted 2005-February-19, 16:03

Using a double dummy analyzer wouldn't help too much.

#4 inquiry

Group: Admin
Posts: 14,566
Joined: 2003-February-13
Gender:Male
Location:Amelia Island, FL
Interests:Bridge, what else?

Posted 2005-February-19, 17:17

I would suggest that before you undertake this, you state here how you are going to count ZAR points, Zar fit points, and anti-Zar points (points off for short support, and for singleton honors), so others can offer suggestions to help you get Zar points correct before running whatever test you decide to try.

--Ben--

#5 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-19, 19:23

Well here is my first impression.

I have made 2 test runs so far. One counted K, KQ, Qx, Jxx, QJ with their full load.
There were more bads than goods, as expected. In the second run I set all of those to 0 (except KQ = 3+1).
This is the result of the second run.

Dropped: 207
Level: 1 Good: 1 Bad: 0
Level: 2 Good: 13 Bad: 3
Level: 3 Good: 22 Bad: 10
Level: 4 Good: 20 Bad: 3
Level: 5 Good: 17 Bad: 3
Level: 6 Good: 6 Bad: 0
Level: 7 Good: 2 Bad: 0

There are problems with non-fit hands; most of the bad 3-level contracts are misfit NTs. Although I treat the misfit NTs as one level lower, they still go down.

Up to now I only use HCP + ControlPoints + 2*longest suit + 2nd longest suit - shortest suit.
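
As a rough sketch, the basic count in that formula could be written as follows. The point values are not spelled out in the post; the usual ones are assumed here (A=4, K=3, Q=2, J=1 for HCP; A=2, K=1 for controls), and the function name and hand format are made up for the example.

```python
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
CONTROLS = {"A": 2, "K": 1}


def zar_points(hand):
    """Basic Zar count for one hand, before any upgrades or downgrades.

    hand -- dict mapping each of the four suits to its cards,
            e.g. {"S": "Axxx", "H": "Axx", "D": "Axx", "C": "Axx"};
            a void is an empty string.
    """
    cards = "".join(hand.values())
    hcp = sum(HCP.get(c, 0) for c in cards)
    controls = sum(CONTROLS.get(c, 0) for c in cards)
    lengths = sorted((len(s) for s in hand.values()), reverse=True)
    # 2 * longest + second longest - shortest
    distribution = 2 * lengths[0] + lengths[1] - lengths[-1]
    return hcp + controls + distribution
```

For the hand Axxx Axx Axx Axx this gives 16 HCP + 8 controls + 8 for distribution = 32.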

I'm going to implement the following extras:
+1 if 15+ hcp concentrated in 3 suits or +1 if 12+ hcp in 2 suits
KQ, QJ, K, Q, J each -1 for unsafe honors

For the fit reevaluation I intend to implement:
+1 for each trump honor (incl. T) with a maximum of 2 (both sides ?)
I'll look for the combined shortest suit and downgrade honors by one

Since I have no bidding taking place, I'm still thinking about the second suit.
So I'm not sure if and how I will implement the extra points for the second suit.
Additionally, I don't know "how many trumps" were promised, because I counted the combined length, so I must decide when to add the 3 HC for additional trump length.

Since it's middle of the night here, I'll take a break now.

#6 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-20, 06:54

Jlall, on Feb 19 2005, 10:03 PM, said:

Using a double dummy analyzer wouldn't help too much.

Maybe so, but it can analyse a thousand boards much faster than I could.

You will usually not play that well, but on the other hand you won't get the perfect defence either.

#7 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-21, 05:30

I've been working on the downgrade of "disability combination" of honors.
Here's my list:
-4 K
-3 QJ
-2 AQ, AJ, KJ, Q, Qx
-1 A, AKJ, KQJ, Jx, Jxx

Upgrades:
11-14 HCP with more than 11 in 2 suits +1
15+ HCP with more than 15 in 3 suits +1

So I think I have the pre-bidding evaluation done.
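
One way to apply the downgrade list above is a plain lookup keyed by the exact holding in a suit. The sketch below assumes the holdings are meant as the complete suit, that cards are given from high to low, and that the ten counts as a small card; none of this is confirmed in the post, and the function names are hypothetical.

```python
# Adjustments copied from the list above, keyed by the exact holding in a suit.
DOWNGRADES = {
    "K": -4,
    "QJ": -3,
    "AQ": -2, "AJ": -2, "KJ": -2, "Q": -2, "Qx": -2,
    "A": -1, "AKJ": -1, "KQJ": -1, "Jx": -1, "Jxx": -1,
}


def normalise(holding):
    """Replace every card below the jack with 'x', e.g. 'J73' -> 'Jxx'."""
    return "".join(c if c in "AKQJ" else "x" for c in holding)


def downgrade(holding):
    """Adjustment for one suit; holdings not in the list score 0."""
    return DOWNGRADES.get(normalise(holding), 0)
```

For example, `downgrade("J73")` looks up `Jxx` and returns -1, while a holding such as `AK5` is not listed and returns 0.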

Anyone interested can get CSV files, readable with Excel or OpenOffice, containing a list of deals, the Zar points for each hand, the selected fit and the number of tricks the double dummy solver made.

#8 Gerben47

Group: Full Members
Posts: 428
Joined: 2003-October-27
Location:Tübingen, Germany

Posted 2005-February-21, 05:53

So what question is it you are answering?

Is it: "If I add the Zar points of the hands and select a contract, the contract will make" ?

I'm very interested in these results, if you could send me the files I'd be very grateful.
Email: gerben AT t-online DOT de

To save time you might want to use the deals from the GIB Double Dummy library (see the GIB research page).

Gerben

Two wrongs don't make a right, but three lefts do!

#9 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-21, 07:56

Gerben47, on Feb 21 2005, 11:53 AM, said:

So what question is it you are answering?

Is it: "If I add the Zar points of the hands and select a contract, the contract will make" ?

This is one of the questions, the others are:

How good is the prediction between 3/4M?
As we know, vul @ imps you start gaining if your game/down ratio is better than 38%.
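
The 38% figure can be checked with a short calculation. The sketch below assumes an undoubled vulnerable major-suit game that either makes exactly or goes one down, compared against another table that stops in 3M; the IMP conversion uses the standard scale.

```python
from bisect import bisect_right

# Lower bound of each step of the standard IMP scale (20-40 = 1 IMP, 50-80 = 2, ...).
IMP_STEPS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
             3500, 4000]


def imps(score_difference):
    """Convert a score difference into IMPs (the sign is ignored)."""
    return bisect_right(IMP_STEPS, abs(score_difference))


# Vulnerable major-suit game, compared with stopping in 3M at the other table.
gain = imps(620 - 170)     # 10 tricks: 4M making vs. 3M with an overtrick
loss = imps(-100 - 140)    # 9 tricks: 4M down one vs. 3M making

# Bidding game breaks even when p * gain == (1 - p) * loss.
break_even = loss / (gain + loss)
```

Under these assumptions the gain is 10 IMPs and the loss 6 IMPs, so the break-even point is 6 / 16 = 37.5%, in line with the 38% quoted.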

How good is the prediction for 3m? If it is accurate, 5m may be a good defence.

Weak Zar openings need controls to open, are they worth 2 defence tricks?

If I find time again, I'll try other evaluation methods, too.

#10 hrothgar

Group: Advanced Members
Posts: 15,395
Joined: 2003-February-13
Gender:Male
Location:Natick, MA
Interests:Travel
Cooking
Brewing
Hiking

Posted 2005-February-21, 07:56

Jlall, on Feb 20 2005, 01:03 AM, said:

Using a double dummy analyzer wouldn't help too much.

I'd be very interested to know what this assertion is based on.

"Everyone" knows that double dummy analyzers do not provide a perfect approximation of single dummy play, let alone the behaviour of "fallible" wetware systems like the human brain.

With this said and done, double dummy solvers are orders of magnitude faster than alternative approaches, and there is an awful lot to be said for substituting brute force and massive numbers of repetitions for elegance. As an analogy, consider the way that high end pharmaceutical scales are now developed. The circuits built into high end scales are actually quite inaccurate. The scales themselves achieve their accuracy by weighing a sample tens of thousands of times and then averaging the results. Since the "noise" is randomly distributed, it will cancel itself out.
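
The averaging argument is easy to try out. The following sketch uses made-up numbers (a true value of 1.000 and a single-reading error of 0.05) and assumes the noise has zero mean, which is exactly the condition the analogy relies on.

```python
import random
from statistics import mean

random.seed(2005)  # fixed seed so the run is repeatable

TRUE_WEIGHT = 1.000   # hypothetical true value
NOISE_SD = 0.05       # hypothetical error of a single reading


def reading():
    """One inaccurate measurement: the true value plus zero-mean noise."""
    return random.gauss(TRUE_WEIGHT, NOISE_SD)


single_error = abs(reading() - TRUE_WEIGHT)
averaged_error = abs(mean(reading() for _ in range(20_000)) - TRUE_WEIGHT)
```

With 20,000 readings the expected error of the average is smaller than that of a single reading by a factor of about 140 (the square root of 20,000).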

From my perspective, a similar approach is more than appropriate in measuring the accuracy of hand evaluation systems.

It should be noted that there can be problems with this approach. Most notably, if the double dummy analyzer introduces systemic bias, there could be problems. For example, assume that the double dummy analyzer was biased in favor of declarer and that this bias was a function of the algorithm being evaluated... In this case it would be extremely difficult to differentiate the two error sources.

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.

Alderaan delenda est

#11 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-21, 08:47

2000 Boards:

Dropped: 492 = no opening at N or S
Misfit: 224 = no 8+ Fit (might be source of bad results)

Level: 1 Good: 73 Bad: 65
Level: 2 Good: 153 Bad: 125 42-46 ZAR
Level: 3 Good: 283 Bad: 168 47-51 ZAR
Level: 4 Good: 247 Bad: 130 52-56 ZAR
Level: 5 Good: 134 Bad: 52 57-61 ZAR
Level: 6 Good: 40 Bad: 28 62-66 ZAR
Level: 7 Good: 5 Bad: 5 67+ ZAR

NT contracts are shifted one level e.g.: 3NT = 52-56.
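
For anyone who wants to work with these numbers, here is a small sketch that turns the table into success rates and maps a combined ZAR total to a level. The counts and ranges are copied from the post; the names are made up, and since no range is given for level 1, totals below 42 simply return `None`.

```python
# (good, bad) counts per level, copied from the 2000-board run above.
RESULTS = {1: (73, 65), 2: (153, 125), 3: (283, 168), 4: (247, 130),
           5: (134, 52), 6: (40, 28), 7: (5, 5)}
DROPPED = 492

# Lowest combined ZAR total for each level, from the ranges in the table.
LEVEL_FLOOR = {2: 42, 3: 47, 4: 52, 5: 57, 6: 62, 7: 67}


def level_for(zar_total):
    """Highest level whose range starts at or below zar_total (None below 42)."""
    levels = [lvl for lvl, floor in LEVEL_FLOOR.items() if zar_total >= floor]
    return max(levels) if levels else None


success_rate = {lvl: good / (good + bad) for lvl, (good, bad) in RESULTS.items()}
analysed = sum(good + bad for good, bad in RESULTS.values())

for lvl, rate in success_rate.items():
    print(f"Level {lvl}: {rate:.0%} of {sum(RESULTS[lvl])} contracts made")
```

The analysed boards plus the dropped ones add up to exactly 2000, which suggests the 224 misfit boards are counted inside the per-level results rather than separately.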

#12 cherdano

5555

Group: Advanced Members
Posts: 9,516
Joined: 2003-September-04
Gender:Male

Posted 2005-February-21, 08:52

hrothgar, on Feb 21 2005, 01:56 PM, said:

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.

I think I have seen a statistical analysis of World Championship hands that showed that declarers there would on average get more tricks than they should on a double dummy basis. The deviation was something like a third or half a trick.

Sounds plausible to me, given how many tricks are lost on the opening lead alone.

Arend

The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#13 mikestar

Group: Full Members
Posts: 913
Joined: 2003-August-18
Location:California, USA

Posted 2005-February-21, 10:50

Double dummy solvers would tend to have a bias that varies with level. Take a grand slam that depends on a two-way finesse for the Queen of trumps. The DD declarer will never get it wrong and the DD defender gains no benefit whatever. On the other hand, DD defense will always find the opening lead ruff to set a grand. But on the whole, the stronger the declaring side's hands, the more likely it is that DD information won't help the defense because they have little or no control of the play.

My guess is that this pro-declarer bias at higher levels helps bring DD results closer to table results. DD may be a bit unfair to Zarpoints at the partscore level--when the strength is fairly equally divided, DD info will be useful to both sides and that will be a gain for the defense vs. table results.

By the way, Zar points could be quite useful for suit contracts while being worthless for NT (compare the LTC) so the NT results will be of limited utility.

#14 hrothgar

Group: Advanced Members
Posts: 15,395
Joined: 2003-February-13
Gender:Male
Location:Natick, MA
Interests:Travel
Cooking
Brewing
Hiking

Posted 2005-February-21, 10:51

cherdano, on Feb 21 2005, 05:52 PM, said:

hrothgar, on Feb 21 2005, 01:56 PM, said:

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.

Thanks for the data point. One additional "quick" comment.

It's still unclear to what extent any such bias would impact the analysis in question.

Assume for the moment that Single Dummy play is .3456 tricks "better" than double dummy play. Furthermore, assume that this bias is the same regardless of the relative strength of the hands in question.

In this case, the bias would adjust the mean number of tricks taken but would NOT affect the relative variance. And, since the accuracy of the hand evaluation technique depends on the variance, this really doesn't affect the methodology...
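
The point about a constant bias can be checked numerically. In this small sketch the trick counts are invented, and the offset is the .3456 figure assumed in the post.

```python
from statistics import mean, pvariance

# Hypothetical double-dummy trick counts for deals with the same point total.
dd_tricks = [8, 9, 9, 10, 10, 10, 11, 12]

BIAS = 0.3456  # constant offset assumed in the post above
table_tricks = [t + BIAS for t in dd_tricks]

mean_shift = mean(table_tricks) - mean(dd_tricks)
variance_change = pvariance(table_tricks) - pvariance(dd_tricks)
```

Here `mean_shift` comes out at the bias itself and `variance_change` at zero, up to floating-point rounding. If the bias varied with the strength of the hands, the variance would no longer be untouched.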

Alderaan delenda est

#15 tysen2k

Group: Full Members
Posts: 406
Joined: 2004-March-25

Posted 2005-February-22, 13:37

As was said earlier, it's probably best for you to review what has already been done. Here and here are the best places to start. No point in reinventing the wheel.

Also about the accuracy of DD data compared to real world declarers. Peter Cheung did an extensive study of 383,000 okbridge hands (25 million plays) and found that on average there is only 0.1 tricks difference. A DD declarer has the advantage in slam contracts, but the DD defenders have the advantage at partscores. Around game, DD is very accurate.

Tysen

A bit of blatant self-pimping - I've got a new poker book that's getting good reviews.

#16 hotShot

Axxx Axx Axx Axx

Group: Advanced Members
Posts: 2,976
Joined: 2003-August-31
Gender:Male

Posted 2005-February-22, 15:34

Thanks Tysen,

those are interesting links.
But there is something about wheels: some have spokes, some have rims, and if they don't match in form or size, you need to get your own.

hotShot
