BBO Discussion Forums: Artificial intelligence for bridge bidding - BBO Discussion Forums


Artificial intelligence for bridge bidding

#21 User is offline   fuzzyquack 

  • PipPipPip
  • Group: Full Members
  • Posts: 84
  • Joined: 2019-March-03

Posted 2019-March-03, 12:54

madvorak,

I recommend you join Bridge Winners (https://bridgewinners.com) to connect with knowledgeable people. I know that some years ago a French researcher who runs a big project in the same area posted there.

Best in your endeavors
0

#22 User is offline   helene_t 

  • The Abbess
  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 17,080
  • Joined: 2004-April-22
  • Gender:Female
  • Location:UK

Posted 2019-March-03, 13:38

madvorak, on 2019-March-03, 06:41, said:

OK. How do you decide which deals are opened by 3 bananas from the opponent? I don't have trouble generating DD results. I have trouble selecting the deals if I want to do this specialized analysis.

I think seven bananas and 6-9 HCP would do. Of course, most pairs have more complex criteria, but for the opponents' bidding it doesn't have to be 100% realistic. If you were to optimize responses to the preempt, it would be different.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
0

#23 User is offline   thepossum 

  • PipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 2,371
  • Joined: 2018-July-04
  • Gender:Male
  • Location:Australia

Posted 2019-March-03, 21:07

madvorak, on 2019-March-03, 06:41, said:

I don't have trouble generating DD results


Hi

I hesitate to write anything that could be regarded as criticising the methodology of a project I know little about. Of course you have many constraints on time and resources and need to take some shortcuts in developing your methods, your engine and training. I also do not know how you are using the DD hands in training, so my following comment may not be valid.

However, my preference in any training is to use real data. It is available out there but may be hard to get. If you can get thousands of hands from a club or site, that would be my preference for training sets. I do understand you are trying to get your algorithm to develop a system rather than learn from existing play, so I may be on the wrong track. Real-world data is much better for training, as you know, since it has correct distributions and less biased variances than artificial simulated data. You can end up overfitting, and fitting to the wrong problem. DD results do not represent true bridge scores; they are already an overfit, with too little variance, and do not represent real bridge bidding and play.

However, I appreciate this is a Master's Computer Science thesis, so it is the principle of what you are doing that matters. And if you can create an engine that learns from DD results, you can then try training it with real data (of course, you would need to know the standard of the humans generating that real data). But please beware the risks of fitting to anything artificial - they are easy traps to fall into :)

regards P
0

#24 User is offline   madvorak 

  • PipPip
  • Group: Members
  • Posts: 14
  • Joined: 2019-March-02

Posted 2019-March-04, 01:08

kellonius, on 2019-March-03, 11:16, said:

Thanks for sharing, sounds interesting.

Out of curiosity what happens when this is treated as a supervised learning problem, e.g. learn to predict what the experts bid?


I am not going to do this. I would need tons of training data from real deals played by experts who all use exactly the same bidding system. I am afraid I would not be able to get such data.
0

#25 User is offline   madvorak 

  • PipPip
  • Group: Members
  • Posts: 14
  • Joined: 2019-March-02

Posted 2019-March-04, 01:17

thepossum, on 2019-March-03, 21:07, said:

Hi

I hesitate to write anything that could be regarded as criticising the methodology of a project I know little about. Of course you have many constraints on time and resources and need to take some shortcuts in developing your methods, your engine and training. I also do not know how you are using the DD hands in training, so my following comment may not be valid.

However, my preference in any training is to use real data. It is available out there but may be hard to get. If you can get thousands of hands from a club or site, that would be my preference for training sets. I do understand you are trying to get your algorithm to develop a system rather than learn from existing play, so I may be on the wrong track. Real-world data is much better for training, as you know, since it has correct distributions and less biased variances than artificial simulated data. You can end up overfitting, and fitting to the wrong problem. DD results do not represent true bridge scores; they are already an overfit, with too little variance, and do not represent real bridge bidding and play.

However, I appreciate this is a Master's Computer Science thesis, so it is the principle of what you are doing that matters. And if you can create an engine that learns from DD results, you can then try training it with real data (of course, you would need to know the standard of the humans generating that real data). But please beware the risks of fitting to anything artificial - they are easy traps to fall into :)

regards P


Thanks for your comment! You are right: DD will introduce a systematic bias into my learning algorithm. For example, 3NT is often made in reality even when DD says the result is 3NT-1.

On the other hand, I am able to generate many more DD results than I am able to download real tournament results. With unlimited training data, I am better able to fight overfitting, which is very likely to happen. What's worse, with real tournament data my AI can end up in a contract that nobody played on that deal in the tournament, but I still need to evaluate the contract. And if I use real tournament data for some contracts but DD data for others (which will most of the time be bad contracts), I end up with a worse bias than what DD causes by itself.
1

#26 User is offline   madvorak 

  • PipPip
  • Group: Members
  • Posts: 14
  • Joined: 2019-March-02

Posted 2019-March-04, 01:22

helene_t, on 2019-March-03, 13:38, said:

I think seven bananas and 6-9 HCP would do. Of course, most pairs have more complex criteria, but for the opponents' bidding it doesn't have to be 100% realistic. If you were to optimize responses to the preempt, it would be different.


Good idea! I can use some rough criteria (like HCP and suit length) to quickly filter which deals start with a preemptive bid. Because generating random deals is fast, and filtering by simple criteria is too (the "slow" part is getting the DD results, which I don't need for the rest of the deals), I may develop a fast way to learn something that's difficult for human players.
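A minimal sketch of this filtering idea, under my own assumptions (plain Python, no bridge libraries; the "seven-card suit, 6-9 HCP" criterion is the one helene_t suggested, used here as a rough proxy for a three-level preempt):

```python
import random

RANKS = "23456789TJQKA"
SUITS = "SHDC"
DECK = [r + s for s in SUITS for r in RANKS]
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def deal():
    """Shuffle the deck and split it into four 13-card hands."""
    cards = DECK[:]
    random.shuffle(cards)
    return [cards[i * 13:(i + 1) * 13] for i in range(4)]

def hcp(hand):
    """High-card points: A=4, K=3, Q=2, J=1."""
    return sum(HCP.get(card[0], 0) for card in hand)

def longest_suit(hand):
    return max(sum(1 for card in hand if card[1] == s) for s in SUITS)

def looks_like_preempt(hand):
    """Rough proxy for a three-level preempt: 7-card suit, 6-9 HCP."""
    return longest_suit(hand) == 7 and 6 <= hcp(hand) <= 9

# Keep only deals where the dealer would plausibly open with a preempt;
# only these would then be sent to the (slow) double-dummy solver.
random.seed(1)
preempt_deals = [d for d in (deal() for _ in range(10000))
                 if looks_like_preempt(d[0])]
print(len(preempt_deals), "candidate deals out of 10000")
```

The cheap filter runs in microseconds per deal, so rejecting the vast majority of random deals before DD analysis is essentially free.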
0

#27 User is offline   xamuk 

  • Pip
  • Group: Members
  • Posts: 2
  • Joined: 2011-March-08
  • Gender:Male
  • Location:Madagascar
  • Interests:Searching for good pairs to form Team and play major events in the world.
    My P is boobookely

Posted 2019-March-04, 01:25

So, something like AlphaZero in chess!? ===> AlphaBridgeZero :)
0

#28 User is offline   helene_t 

  • The Abbess
  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 17,080
  • Joined: 2004-April-22
  • Gender:Female
  • Location:UK

Posted 2019-March-04, 02:11

thepossum, on 2019-March-03, 21:07, said:

DD results do not represent true bridge scores and are already an overfit with too little variance that do not represent real bridge bidding and play

One simple way to (partially) address this is to use the DD result given the most "natural" lead, or given a random choice among all natural leads, instead of (as one would normally do) using the DD result given the "best" (i.e. most successful) lead.

For low-level contracts, this would probably give results closer to real bridge, as declarer's advantage is mostly due to defenders not finding the best lead. It would also reward bidding systems that avoid leaking information to the opponents and thereby make it more difficult to find the best lead.

For slam hands it is different, because there declarer in practice tends to make fewer tricks than he should according to DD. So using blind leads would make the bias worse and cause the AI to bid too many slams.

So maybe the way forward is to use calibrated DD results. You can probably find all the calibration factors you need on Richard Pavlicek's website; otherwise, it would be relatively easy to construct them using BrBr, the Vugraph archive, or Phil King's database.
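One hypothetical shape such a calibration could take: shift the raw DD trick count by an empirical per-level correction. The offsets below are invented placeholders, not real calibration factors; they would have to be estimated from a database of actual results, and slam bonuses and other scoring subtleties are ignored.

```python
# Invented offsets: positive means declarer tends to outperform DD
# (typical for partscores, where defenders misjudge the lead);
# negative means declarer underperforms DD (typical for slams).
CALIBRATION = {1: 0.4, 2: 0.4, 3: 0.3, 4: 0.1, 5: 0.0, 6: -0.2, 7: -0.3}

def calibrated_tricks(dd_tricks, contract_level):
    """Expected tricks at the table, given the DD result for this deal."""
    return dd_tricks + CALIBRATION[contract_level]

# Example: DD says 3NT goes one down (8 tricks), but the calibrated
# expectation nudges declarer closer to making (approx. 8.3).
print(calibrated_tricks(8, 3))
```

The learning algorithm would then score contracts against the calibrated expectation rather than the raw DD number.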
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
0

#29 User is offline   helene_t 

  • The Abbess
  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 17,080
  • Joined: 2004-April-22
  • Gender:Female
  • Location:UK

Posted 2019-March-04, 02:28

I have an idea which I didn't explore in the other thread.

Thinning of a random bidding system:
Start by clustering the hands the two partners can have. For example, in the follow-up to the (3)-3NT problem, overcaller could have eight different hand types: with 3-4 hearts or fewer, with 19 HCP or fewer, and balanced vs a long minor. Similarly, advancer's hands could be divided into some 20 clusters.
Then you decide that auction terminations fall into only three categories: 3NT, 4 and 4NT+. You assign a score (expected IMP loss) to each combination of hands and contracts, i.e. 8*20*3 = 480 different scores.
Now you start with a random bidding system: in response to 3NT, advancer will, for each of the 8 clusters, select each of the five possible actions (pass, 4, 4, 4, bypass) with 20% probability.
Finally, you iteratively remove calls from the system. For example, while the initial system would let an opener holding a balanced maximum respond to a 4 advance with 33% pass, 33% 4 and 33% bypass, you could remove the pass, thus changing it to 50% 4 and 50% bypass. Which call to remove is chosen on the basis of the improvement in the system's total expected IMPs.
Repeat until the system has become deterministic.
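The thinning procedure above can be sketched as a greedy loop. This is a toy sketch under my own assumptions: the payoff table is random placeholder data (in the real project it would come from DD evaluation of deals), and the action names are generic stand-ins.

```python
import random

ACTIONS = ["pass", "bid_a", "bid_b", "bid_c", "bypass"]
N_CLUSTERS = 8

# Invented payoff table: (cluster, action) -> expected IMPs.
random.seed(0)
PAYOFF = {(c, a): random.uniform(-5, 5)
          for c in range(N_CLUSTERS) for a in ACTIONS}

def expected_score(policy):
    """Average over clusters of the mixed strategy's expected payoff."""
    return sum(sum(p * PAYOFF[(c, a)] for a, p in policy[c].items())
               for c in range(N_CLUSTERS)) / N_CLUSTERS

def thin(policy):
    """Greedily delete one (cluster, action) at a time, renormalizing
    the remaining probabilities, until every cluster is deterministic."""
    while True:
        best = None
        for c in range(N_CLUSTERS):
            if len(policy[c]) == 1:
                continue  # this cluster is already deterministic
            for a in policy[c]:
                trial = dict(policy)
                kept = {k: v for k, v in policy[c].items() if k != a}
                total = sum(kept.values())
                trial[c] = {k: v / total for k, v in kept.items()}
                gain = expected_score(trial) - expected_score(policy)
                if best is None or gain > best[0]:
                    best = (gain, trial)
        if best is None:
            return policy  # nothing left to remove
        policy = best[1]

# Start from the uniform random system (each action 20% per cluster).
uniform = {c: {a: 1 / len(ACTIONS) for a in ACTIONS}
           for c in range(N_CLUSTERS)}
final = thin(uniform)
assert all(len(final[c]) == 1 for c in range(N_CLUSTERS))
```

With independent clusters, as here, the greedy loop simply keeps each cluster's best action; the idea only becomes interesting when removing a call in one place changes the meaning (and payoff) of the remaining calls.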
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
0

#30 User is offline   cloverenge 

  • Pip
  • Group: Members
  • Posts: 8
  • Joined: 2017-September-30

Posted 2019-March-04, 08:21

madvorak, on 2019-March-02, 02:59, said:

Hello!

I am a student of Computer Science at the Charles University in Prague. I am also an enthusiastic bridge player. You can probably guess where this combination of interests leads to ...

My master thesis deals with an artificial intelligence for bridge. Specifically, my goal is to let my computer design its own bidding system without any prior knowledge. I perform experiments with evolutionary algorithms (involving genetic programming and learning classifier systems) for this purpose.

Would you be interested in watching my progress? I think that posting information about my research could increase my motivation to progress faster.

I would love to see your progress!
0

#31 User is offline   Tramticket 

  • PipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 2,072
  • Joined: 2009-May-03
  • Gender:Male
  • Location:Kent (Near London)

Posted 2019-March-04, 11:17

Good luck with this project.

I have nothing helpful to add, but if you want a humorous take on AI and bridge, I recommend The Principle of Restricted Talent by Danny Kleinman.
0

#32 User is offline   bayhilljim 

  • Pip
  • Group: Members
  • Posts: 1
  • Joined: 2014-December-31

Posted 2019-March-04, 11:29

madvorak, on 2019-March-02, 02:59, said:

Hello!

I am a student of Computer Science at the Charles University in Prague. I am also an enthusiastic bridge player. You can probably guess where this combination of interests leads to ...

My master thesis deals with an artificial intelligence for bridge. Specifically, my goal is to let my computer design its own bidding system without any prior knowledge. I perform experiments with evolutionary algorithms (involving genetic programming and learning classifier systems) for this purpose.

Would you be interested in watching my progress? I think that posting information about my research could increase my motivation to progress faster.



I think you have a very interesting project and I'd love to be kept informed of your progress.

Good luck!
0

#33 User is offline   etha 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 252
  • Joined: 2005-August-25

Posted 2019-March-05, 03:16

I had a few thoughts about this. What are you doing about position and vulnerability? It seems to me there could be 16 systems depending on these (four seats times four vulnerability combinations). Surely you would want different agreements based on seat and vulnerability to be optimal.

Is the plan to make a system your bot can only play with itself? Is it to make the best system one bot can play with another bot? Or is it to make the best system you could play with a human, who has a decent but not infinite memory? If the latter, you would need to describe the system's bids in some form a human can understand, which would be very different from the others.
0

#34 User is offline   helene_t 

  • The Abbess
  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 17,080
  • Joined: 2004-April-22
  • Gender:Female
  • Location:UK

Posted 2019-March-05, 15:05

etha, on 2019-March-05, 03:16, said:

I had a few thoughts about this. What are you doing about position and vulnerability? It seems to me there could be 16 systems depending on these (four seats times four vulnerability combinations). Surely you would want different agreements based on seat and vulnerability to be optimal.

Once the algorithm is in place for a particular setting (say 1st seat red/red), it would be easy to run it again under other settings.

As for position, there are really only two: first and second seat. This is because a 3rd seat opening should not be seen as an opening but rather as a response to the first seat pass.

But maybe the objective is, for simplicity, to make a single system that applies in all seats and vulnerabilities. In that case, one could optimize it for average vulnerability, i.e. a 500-point game bonus and 75 points per undoubled undertrick.
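The "average vulnerability" idea can be made concrete as a scoring function. A minimal sketch, under my own assumptions: 500 splits the difference between the non-vulnerable (300) and vulnerable (500) game bonuses roughly as helene_t states, 75 is midway between the 50 and 100 point undertrick values, and slam bonuses are omitted for brevity.

```python
def trick_score(level, strain):
    """Trick points for a made, undoubled contract."""
    per_trick = {"C": 20, "D": 20, "H": 30, "S": 30, "N": 30}[strain]
    base = level * per_trick
    if strain == "N":
        base += 10  # first NT trick scores 40
    return base

def avg_vul_score(level, strain, tricks_taken):
    """Undoubled score at 'average' vulnerability (no slam bonuses)."""
    needed = 6 + level
    if tricks_taken < needed:
        return -75 * (needed - tricks_taken)  # midway between 50 and 100
    score = trick_score(level, strain)
    score += 500 if score >= 100 else 50  # averaged game bonus / partscore
    score += (tricks_taken - needed) * {"C": 20, "D": 20}.get(strain, 30)
    return score

print(avg_vul_score(3, "N", 9))  # 600: 100 trick points + 500 game bonus
print(avg_vul_score(3, "N", 8))  # -75: one undertrick
```

Training against a single score table like this avoids maintaining separate systems per vulnerability, at the cost of some realism.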

A system that is optimized for both 1st and 2nd seat is slightly more complicated. One could include both 1st- and 2nd-seat positions in the training set, but obviously a response to a 1st-seat pass is very different from a response to a 2nd-seat pass, so this isn't really good.

I think I would just optimize it to 1st seat. There are a lot of complexities in bridge (legality of the system, vulnerability, risk willingness, opps' style etc) which are not very interesting from an AI research point of view. And the problem is still hugely complex even if it is simplified a little bit :)
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
0

#35 User is offline   cherdano 

  • 5555
  • PipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 9,516
  • Joined: 2003-September-04
  • Gender:Male

Posted 2019-March-05, 23:21

I would start with something like uncontested 2NT auctions, i.e. fixing the 2NT opening bid to some standard-ish agreement.
The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke
1

#36 User is offline   madvorak 

  • PipPip
  • Group: Members
  • Posts: 14
  • Joined: 2019-March-02

Posted 2019-March-10, 09:38

cherdano, on 2019-March-05, 23:21, said:

I would start with something like uncontested 2NT auctions, i.e. fixing the 2NT opening bid to some standard-ish agreement.


So you would choose opening bid rules manually and then let the AI develop responses and rebids after each one individually?

Pro: opening bids could be made disjoint.
Con: my selection of opening bids may be unsuitable for the AI (moreover, having disjoint openings isn't a big deal, because the system will develop responses and other bids that cannot be kept disjoint, since testing for that would be NP-hard; therefore, the system will have to deal with "bid ambiguity" anyway).
0

#37 User is offline   madvorak 

  • PipPip
  • Group: Members
  • Posts: 14
  • Joined: 2019-March-02

Posted 2019-March-10, 09:40

helene_t, on 2019-March-05, 15:05, said:

Once the algorithm is in place for a particular setting (say 1st seat red/red), it would be easy to run it again under other settings.

As for position, there are really only two: first and second seat. This is because a 3rd seat opening should not be seen as an opening but rather as a response to the first seat pass.

But maybe the objective is, for simplicity, to make a single system that applies in all seats and vulnerabilities. In that case, one could optimize it for average vulnerability, i.e. a 500-point game bonus and 75 points per undoubled undertrick.

A system that is optimized for both 1st and 2nd seat is slightly more complicated. One could include both 1st- and 2nd-seat positions in the training set, but obviously a response to a 1st-seat pass is very different from a response to a 2nd-seat pass, so this isn't really good.

I think I would just optimize it to 1st seat. There are a lot of complexities in bridge (legality of the system, vulnerability, risk willingness, opps' style etc) which are not very interesting from an AI research point of view. And the problem is still hugely complex even if it is simplified a little bit :)


Good ideas! I agree with everything you say.
0
