Artificial intelligence for bridge bidding
#21
Posted 2019-March-03, 12:54
I recommend you join Bridge Winners (https://bridgewinners.com) to connect with knowledgeable people. I know that some years ago a French researcher who runs a big project in the same area posted there.
Best of luck in your endeavors
#22
Posted 2019-March-03, 13:38
madvorak, on 2019-March-03, 06:41, said:
I think seven bananas and 6-9 HCP would do. Of course, most pairs have more complex criteria, but for the opps' bidding it doesn't have to be 100% realistic. If you were to optimize the responses to the preempt, it would be different.
#23
Posted 2019-March-03, 21:07
madvorak, on 2019-March-03, 06:41, said:
Hi
I hesitate to write anything that could be regarded as criticising the methodology of a project I know little about. And of course you have many constraints on time and resources and need to take some shortcuts in developing your methods, your engine and your training. I also do not know how you are using the DD hands in training, so my following comment may not be valid.
However, my preference in any training is to use real data. It is available out there, though it may be hard to get. If you can get thousands of hands from a club or site, that would be my preference for training sets. I do understand you are trying to get your algorithm to develop a system rather than learn from existing play, so I may be on the wrong track. As you know, real-world data is much better for training, since it has correct distributions and less biased variances than artificial simulated data. You can end up overfitting, and fitting to the wrong problem. DD results do not represent true bridge scores; they are already an overfit with too little variance, and they do not represent real bridge bidding and play.
However, I appreciate this is a master's thesis in Computer Science, so it is the principle of what you are doing that matters. And if you can create an engine that learns from DD results, you can then try training it with real data (of course you would need to know the standard of the humans generating that real data). But please beware the risks of fitting to anything artificial - they are easy traps to fall into.
regards P
#24
Posted 2019-March-04, 01:08
kellonius, on 2019-March-03, 11:16, said:
Out of curiosity what happens when this is treated as a supervised learning problem, e.g. learn to predict what the experts bid?
I am not going to do this. I would need tons of training data from real deals played by experts who all use exactly the same bidding system. I am afraid I would not be able to get such data.
#25
Posted 2019-March-04, 01:17
thepossum, on 2019-March-03, 21:07, said:
I hesitate to write anything that could be regarded as criticising the methodology of a project I know little about. And of course you have many constraints on time and resources and need to take some shortcuts in developing your methods, your engine and your training. I also do not know how you are using the DD hands in training, so my following comment may not be valid.
However, my preference in any training is to use real data. It is available out there, though it may be hard to get. If you can get thousands of hands from a club or site, that would be my preference for training sets. I do understand you are trying to get your algorithm to develop a system rather than learn from existing play, so I may be on the wrong track. As you know, real-world data is much better for training, since it has correct distributions and less biased variances than artificial simulated data. You can end up overfitting, and fitting to the wrong problem. DD results do not represent true bridge scores; they are already an overfit with too little variance, and they do not represent real bridge bidding and play.
However, I appreciate this is a master's thesis in Computer Science, so it is the principle of what you are doing that matters. And if you can create an engine that learns from DD results, you can then try training it with real data (of course you would need to know the standard of the humans generating that real data). But please beware the risks of fitting to anything artificial - they are easy traps to fall into.
regards P
Thanks for your comment! You are right: DD will cause a systematic bias in my learning algorithm. For example, 3NT is often made in reality even when DD says the result is 3NT-1.
On the other hand, I can generate far more DD results than I can download real tournament results. With unlimited training data, I am better able to fight overfitting, which is very likely to happen. What's worse, with real tournament data, my AI can end up in a contract that nobody played on that deal in the tournament, yet I still need to evaluate the contract. And if I used real tournament data for some contracts but DD data for others (which would most of the time be bad contracts), I would end up with a worse bias than DD causes by itself.
#26
Posted 2019-March-04, 01:22
helene_t, on 2019-March-03, 13:38, said:
Good idea! I can use some rough criteria (like HCP and suit length) to quickly filter which deals start with a preemptive bid. Because generating random deals is fast, and filtering by simple criteria is too (the "slow" part is getting the DD results, which I don't need for the rest of the deals), I may develop a fast way to learn something that's difficult for human players.
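A rough sketch of such a filter. The concrete preempt rule here - a seven-card suit with 6-9 HCP - is just a placeholder criterion, not the definition I will actually use:

```python
import random

RANKS = "23456789TJQKA"
SUITS = "SHDC"
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def random_hand():
    """Deal one 13-card hand from a fresh 52-card deck."""
    deck = [r + s for r in RANKS for s in SUITS]
    return random.sample(deck, 13)

def hcp(hand):
    """High-card points: A=4, K=3, Q=2, J=1."""
    return sum(HCP.get(card[0], 0) for card in hand)

def longest_suit(hand):
    """Length of the hand's longest suit."""
    return max(sum(1 for card in hand if card[1] == s) for s in SUITS)

def opens_with_preempt(hand):
    """Placeholder filter: a seven-card suit with 6-9 HCP."""
    return longest_suit(hand) >= 7 and 6 <= hcp(hand) <= 9

# Keep only deals where the simulated opponent would preempt;
# expensive DD solving is reserved for these survivors.
preempt_hands = [h for h in (random_hand() for _ in range(10000))
                 if opens_with_preempt(h)]
```

Generating and filtering ten thousand hands like this takes a fraction of a second, which is the point: the DD solver only ever sees the small surviving subset.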
#28
Posted 2019-March-04, 02:11
thepossum, on 2019-March-03, 21:07, said:
One simple way to (partially) address this is to use the DD result given the most "natural" lead, or given a random choice among all natural leads, instead of (as one would normally do) using DD given the "best" (i.e. most successful) lead.
For low-level contracts, this would probably give results closer to real bridge, as declarer's advantage is mostly due to defenders not finding the best lead. It would also reward bidding systems that avoid leaking information to opps and thereby make it more difficult to find the best lead.
For slam hands it is different, because here declarer in practice tends to make fewer tricks than he should according to DD. So using blind leads would make the bias worse and cause the AI to bid too many slams.
So maybe the way forward is to use calibrated DD results. You can probably find all the calibration factors you need on Richard Pavlicek's website, but otherwise it would be relatively easy to construct them using either BrBr, the Vugraph archive, or Phil King's database.
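One way to picture such a calibration: add a per-category correction to the raw DD trick count before scoring. The offsets below are made-up placeholders for illustration, not Pavlicek's actual figures, and the category names are my own:

```python
# Hypothetical corrections, in tricks: positive means declarer tends
# to take more tricks at the table than double dummy predicts.
CALIBRATION = {
    "partscore": +0.10,  # defenders often miss the killing lead
    "game":      +0.05,
    "slam":      -0.15,  # declarer tends to underperform DD in slams
}

def contract_category(level, strain):
    """Classify a contract as partscore, game, or slam."""
    if level >= 6:
        return "slam"
    if (strain == "NT" and level >= 3) or (strain in "SH" and level >= 4) \
            or (strain in "DC" and level >= 5):
        return "game"
    return "partscore"

def calibrated_tricks(dd_tricks, level, strain):
    """Expected real-world tricks = DD tricks plus the correction."""
    return dd_tricks + CALIBRATION[contract_category(level, strain)]
```

The AI would then be trained on scores derived from `calibrated_tricks` rather than the raw DD numbers, which removes at least the systematic part of the bias.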
#29
Posted 2019-March-04, 02:28
Thinning of a random bidding system:
Start by clustering the hands the two partners can have. For example, in the follow-up to the (3♠)-3NT problem, overcaller could have eight different hand types: 3-4 hearts vs fewer, 19 HCP or less vs more, and balanced vs a long minor (2 × 2 × 2 = 8). Similarly, advancer's hands could be divided into some 20 clusters.
Then you decide that the auction can only terminate in three categories: 3NT, 4♥ and 4NT+. You assign a score (expected IMP loss) to each combination of hand clusters and final contract, i.e. 8 × 20 × 3 = 480 different scores.
Now you start with a random bidding system: in response to 3NT, advancer will, for each of its 20 clusters, select each of the five possible actions (pass, 4♣, 4♦, 4♥, bypass) with 20% probability.
And finally, you iteratively remove calls from the system. For example, while the initial system would let an overcaller holding a balanced maximum respond to a 4♦ advance with 33% pass, 33% 4♥ and 33% bypass, you could remove pass, changing it to 50% 4♥ and 50% bypass. Which call to remove is chosen on the basis of the improvement in the system's total expected IMPs.
Repeat until the system has become deterministic.
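A minimal sketch of that pruning loop, assuming the score table has already been computed. The table layout, cluster names and call names are illustrative placeholders, and each action is scored directly as if it were the final contract:

```python
def system_score(policy, score_table):
    """Expected IMP loss of a stochastic system: each hand cluster
    picks uniformly among its remaining calls."""
    total = 0.0
    for cluster, actions in policy.items():
        p = 1.0 / len(actions)
        total += sum(p * score_table[cluster][a] for a in actions)
    return total

def thin(policy, score_table):
    """Greedily delete one call at a time - always the deletion that
    most improves the system's expected score - until every cluster
    has exactly one call left."""
    while any(len(actions) > 1 for actions in policy.values()):
        best = None  # (score after removal, cluster, call to remove)
        for cluster, actions in policy.items():
            if len(actions) == 1:
                continue
            for call in actions:
                trial = dict(policy)
                trial[cluster] = [a for a in actions if a != call]
                score = system_score(trial, score_table)
                if best is None or score < best[0]:
                    best = (score, cluster, call)
        _, cluster, call = best
        policy[cluster] = [a for a in policy[cluster] if a != call]
    return policy

# Toy example with two advancer clusters and two calls each:
losses = {"weak":   {"pass": 2.0, "4H": 5.0},
          "strong": {"pass": 6.0, "4H": 1.0}}
system = {"weak": ["pass", "4H"], "strong": ["pass", "4H"]}
final = thin(system, losses)  # now deterministic
```

With the real 480-entry table, `score_table[cluster][action]` would be the expected IMP loss of the contract that the action eventually leads to for that cluster pair, and the loop would run over all clusters and all legal calls.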
#30
Posted 2019-March-04, 08:21
madvorak, on 2019-March-02, 02:59, said:
I am a student of Computer Science at Charles University in Prague. I am also an enthusiastic bridge player. You can probably guess where this combination of interests leads...
My master's thesis deals with artificial intelligence for bridge. Specifically, my goal is to let my computer design its own bidding system without any prior knowledge. I perform experiments with evolutionary algorithms (involving genetic programming and learning classifier systems) for this purpose.
Would you be interested in watching my progress? I think that posting information about my research could increase my motivation to progress faster.
I would love to see your progress!
#31
Posted 2019-March-04, 11:17
I have nothing helpful to add, but if you want a humorous take on AI and bridge, I recommend The Principle of Restricted Talent by Danny Kleinman.
#32
Posted 2019-March-04, 11:29
madvorak, on 2019-March-02, 02:59, said:
I am a student of Computer Science at Charles University in Prague. I am also an enthusiastic bridge player. You can probably guess where this combination of interests leads...
My master's thesis deals with artificial intelligence for bridge. Specifically, my goal is to let my computer design its own bidding system without any prior knowledge. I perform experiments with evolutionary algorithms (involving genetic programming and learning classifier systems) for this purpose.
Would you be interested in watching my progress? I think that posting information about my research could increase my motivation to progress faster.
I think you have a very interesting project and I'd love to be kept informed of your progress.
Good luck!
#33
Posted 2019-March-05, 03:16
Is the plan to make a system your bot can only play with itself? Is the plan to make the best system a bot can play with another bot? Or is the plan to make the best system a human with a decent but not infinite memory can play? If the latter, you would need to describe the system's bids in some form a human can understand, which would be very different from the others.
#34
Posted 2019-March-05, 15:05
etha, on 2019-March-05, 03:16, said:
Once the algorithm is in place for a particular setting (say, 1st seat at red/red), it would be easy to run it again under other settings.
As for position, there are really only two: first and second seat. This is because a 3rd seat opening should not be seen as an opening but rather as a response to the first seat pass.
But maybe the objective is, for simplicity, to make a single system that applies in all seats and vulnerabilities. In that case, one could optimize it for average vulnerability, i.e. a 500-point game bonus and 75 points per undoubled undertrick.
A system that is optimized for both 1st and 2nd seat is slightly more complicated. One could include both 1st- and 2nd-seat positions in the training set, but obviously a response to a 1st-seat pass is very different from a response to a 2nd-seat pass, so this isn't really good.
I think I would just optimize it for 1st seat. There are a lot of complexities in bridge (legality of the system, vulnerability, risk willingness, opps' style, etc.) which are not very interesting from an AI research point of view. And the problem is still hugely complex even if it is simplified a little bit.
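For what it's worth, that "average vulnerability" scoring could be sketched like this (slam and doubling bonuses omitted for brevity; the 500/75 figures are taken straight from the suggestion above, the trick values and partscore bonus are the standard ones):

```python
# Standard trick values per suit; NT is handled separately (40 for the
# first trick, 30 thereafter).
TRICK_VALUE = {"C": 20, "D": 20, "H": 30, "S": 30}

def contract_score(level, strain, tricks_made):
    """Score one undoubled contract at 'average' vulnerability:
    500-point game bonus, 50-point partscore bonus, 75 points per
    undertrick. Slam bonuses are deliberately left out."""
    needed = 6 + level
    if tricks_made < needed:
        return -75 * (needed - tricks_made)
    if strain == "NT":
        trick_score = 40 + 30 * (level - 1)
        overtrick = 30
    else:
        trick_score = TRICK_VALUE[strain] * level
        overtrick = TRICK_VALUE[strain]
    score = trick_score + overtrick * (tricks_made - needed)
    score += 500 if trick_score >= 100 else 50  # game vs partscore bonus
    return score
```

A single scoring function like this lets the AI compare all contracts on one scale without ever branching on seat or vulnerability.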
#35
Posted 2019-March-05, 23:21
#36
Posted 2019-March-10, 09:38
cherdano, on 2019-March-05, 23:21, said:
So you would choose opening bid rules manually and then let the AI develop responses and rebids after each one individually?
Pro: opening bids could be made disjoint.
Cons: my selection of opening bids may be unsuitable for the AI (moreover, having disjoint openings isn't a big deal, because the system will develop responses and other bids that cannot all be disjoint, since testing that would be NP-hard; therefore, the system will have to deal with "bid ambiguity" anyway).
#37
Posted 2019-March-10, 09:40
helene_t, on 2019-March-05, 15:05, said:
As for position, there are really only two: first and second seat. This is because a 3rd seat opening should not be seen as an opening but rather as a response to the first seat pass.
But maybe the objective is, for simplicity, to make a single system that applies in all seats and vulnerabilities. In that case, one could optimize it for average vulnerability, i.e. a 500-point game bonus and 75 points per undoubled undertrick.
A system that is optimized for both 1st and 2nd seat is slightly more complicated. One could include both 1st- and 2nd-seat positions in the training set, but obviously a response to a 1st-seat pass is very different from a response to a 2nd-seat pass, so this isn't really good.
I think I would just optimize it for 1st seat. There are a lot of complexities in bridge (legality of the system, vulnerability, risk willingness, opps' style, etc.) which are not very interesting from an AI research point of view. And the problem is still hugely complex even if it is simplified a little bit.
Good ideas! I agree with everything you say.