BBO Discussion Forums: Can a bot learn bridge through experience? - BBO Discussion Forums


Page 1 of 1
  • You cannot start a new topic
  • You cannot reply to this topic

Can a bot learn bridge through experience? GIB programming

#1 User is offline   fstrick604 

  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 2012-March-17

Posted 2017-October-29, 08:37

I just read an interesting article in the 10/28 WSJ. It seems a program called AlphaGo Zero played AlphaGo and beat it 100 games to 0. AlphaGo was programmed like GIB, I imagine, and it famously beat a top Go expert not long ago. But AlphaGo Zero was taught nothing about Go by humans except the rules. It was then made to play 400,000 games against itself and was able to learn strategy through experience. This is partly how humans learn games, but I had thought computers could not do this yet. Evidently I was wrong. Bridge is different from games like Go and chess in that you don't know where all the pieces are. But my question is: could Matt program GIB to learn bridge by playing millions of hands against itself or against humans?
0

#2 User is offline   nige1 

  • 5-level belongs to me
  • PipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 7,045
  • Joined: 2004-August-30
  • Gender:Male
  • Location:Glasgow Scotland
  • Interests:Poems Computers

Posted 2017-October-29, 09:47

fstrick604, on 2017-October-29, 08:37, said:

I just read an interesting article in the 10/28 WSJ. It seems a program called AlphaGo Zero played AlphaGo and beat it 100 games to 0. AlphaGo was programmed like GIB, I imagine, and it famously beat a top Go expert not long ago. But AlphaGo Zero was taught nothing about Go by humans except the rules. It was then made to play 400,000 games against itself and was able to learn strategy through experience. This is partly how humans learn games, but I had thought computers could not do this yet. Evidently I was wrong. Bridge is different from games like Go and chess in that you don't know where all the pieces are. But my question is: could Matt program GIB to learn bridge by playing millions of hands against itself or against humans?

Matt Ginsberg enhanced 50+ year-old AI techniques with his partition search to create his GIB bridge program.
AlphaGo is based on two more modern neural nets and was trained on top human Go games.
AlphaGo Zero reduced that to one neural net and was taught only the rules of Go. It learned just by playing earlier versions of itself.
Bridge rules and scoring are more complex than Go's. Bridge has a large element of chance, including bluff and mixed-strategy components, and it requires partnership communication and co-operation. To emulate AlphaGo, GIB would need to be rebuilt from scratch around a neural net, but I think the DeepMind team could manage it, using their TPUs (Tensor Processing Units), although the program would probably need more games to train itself.
An interesting by-product might be much better bidding and carding systems.
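The learn-only-from-the-rules idea scales down to toy games. Below is a minimal sketch (my own illustration, nothing to do with GIB's or DeepMind's actual code) of tabular self-play learning on Nim: the program is told only the legal moves and who won, and after enough games against itself its greedy policy discovers the winning strategy on its own.

```python
import random

def train_selfplay(pile=10, episodes=20000, alpha=0.5, eps=0.2, seed=0):
    """Self-play on Nim (take 1-3 stones; taking the last stone wins).
    Q[(stones, move)] estimates the mover's chance of winning."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        n, history = pile, []
        while n > 0:
            moves = [m for m in (1, 2, 3) if m <= n]
            if rng.random() < eps:                         # explore
                move = rng.choice(moves)
            else:                                          # exploit
                move = max(moves, key=lambda m: Q.get((n, m), 0.5))
            history.append((n, move))
            n -= move
        reward = 1.0                 # the player who moved last won
        for state, move in reversed(history):
            q = Q.get((state, move), 0.5)
            Q[(state, move)] = q + alpha * (reward - q)
            reward = 1.0 - reward    # alternate: other player's turn

    def best(n):
        """Greedy move once training is done."""
        return max((m for m in (1, 2, 3) if m <= n),
                   key=lambda m: Q.get((n, m), 0.5))
    return best
```

With these settings the greedy policy reliably learns to leave the opponent a multiple of four stones, a "strategy" nobody coded in. Real bridge would need a neural net instead of a lookup table, but the feedback loop is the same shape.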
0

#3 User is offline   fstrick604 

  • Pip
  • Group: Members
  • Posts: 3
  • Joined: 2012-March-17

Posted 2017-November-02, 11:12

nige1, on 2017-October-29, 09:47, said:

Matt Ginsberg enhanced 50+ year-old AI techniques with his partition search to create his GIB bridge program.
AlphaGo is based on two more modern neural nets and was trained on top human Go games.
AlphaGo Zero reduced that to one neural net and was taught only the rules of Go. It learned just by playing earlier versions of itself.
Bridge rules and scoring are more complex than Go's. Bridge has a large element of chance, including bluff and mixed-strategy components, and it requires partnership communication and co-operation. To emulate AlphaGo, GIB would need to be rebuilt from scratch around a neural net, but I think the DeepMind team could manage it, using their TPUs (Tensor Processing Units), although the program would probably need more games to train itself.
An interesting by-product might be much better bidding and carding systems.

Thanks for the reply. Certainly it would be much more complex to do. It would have to have a bidding system, as it does now. I wonder, though, if it could develop bidding judgment by playing and learning from its mistakes. I don't know; it is interesting. I know I learned systems mostly from books, but judgment mostly from years of playing. Not always good judgment, either. One thing I have always struggled to overcome is a fear of doubling, because I find it so humiliating to have one wrapped. That should be so easy to overcome, but for me it isn't. I think a robot would do better in that area, as it would never be ruled by irrational emotions.
0

#4 User is offline   virgosrock 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 188
  • Joined: 2015-April-07

Posted 2017-November-02, 11:27

fstrick604, on 2017-November-02, 11:12, said:

Thanks for the reply. Certainly it would be much more complex to do. It would have to have a bidding system, as it does now. I wonder, though, if it could develop bidding judgment by playing and learning from its mistakes. I don't know; it is interesting. I know I learned systems mostly from books, but judgment mostly from years of playing. Not always good judgment, either. One thing I have always struggled to overcome is a fear of doubling, because I find it so humiliating to have one wrapped. That should be so easy to overcome, but for me it isn't. I think a robot would do better in that area, as it would never be ruled by irrational emotions.


Who is going to tell GIBBO it has made a mistake?

vrock
0

#5 User is offline   johnu 

  • PipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 1,526
  • Joined: 2008-September-10

Posted 2017-November-02, 13:19

fstrick604, on 2017-October-29, 08:37, said:

But my question is could Matt program GIB to learn bridge by playing millions of hands against itself or against humans?


Matt is no longer associated with GIB, so the simple answer is no. Could somebody else do the programming? GIB was not designed to "learn", so you would have to throw out all of the old code and start over in a different direction. A new program would have basically nothing in common with the old GIB.
0

#6 User is offline   smerriman 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 488
  • Joined: 2014-March-15
  • Gender:Male

Posted 2017-November-02, 13:24

People have definitely experimented with learning algorithms. See e.g. https://arxiv.org/pdf/1607.03290.pdf . The catch is that the best bidding system for a robot is completely incomprehensible to a human, and as stated near the end of that PDF, trying to make it human-understandable usually gives a much worse result.

(GIB itself was designed to play Moscito Byte, a much simpler system for robots than 2/1, but much more complicated for humans).
0

#7 User is offline   nige1 

  • 5-level belongs to me
  • PipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 7,045
  • Joined: 2004-August-30
  • Gender:Male
  • Location:Glasgow Scotland
  • Interests:Poems Computers

Posted 2017-November-03, 08:36

virgosrock, on 2017-November-02, 11:27, said:

Who is going to tell GIBBO it has made a mistake?

DeepMind became super-expert at Atari games by playing many of them. All it was aware of was the screen pixels and the score; the score provided the necessary feedback on performance.

Donald Michie's matchbox noughts-and-crosses machine (MENACE) would learn bad play against really poor players because, against them, bad play won more games than good play. Modern AI programs seem to cope with this kind of "plateau" problem.

Bridge is complex. For example, there is the chance element: a bid or play that would be inferior in the long run can "get lucky" on a particular layout. Neural networks seem to be able to cope with this kind of fuzziness.
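Michie's machine is easy to sketch in code. Below is a toy version of the MENACE mechanism (my own reconstruction; the exact bead numbers vary between accounts): one "matchbox" of beads per game state, moves drawn with probability proportional to bead count, beads added after a win and confiscated after a loss. Trained against a weak opponent, it will cheerfully reinforce whatever beats that opponent, which is exactly the plateau problem.

```python
import random
from collections import defaultdict

class Menace:
    """Toy MENACE: one matchbox of beads per game state; bead counts
    act as unnormalised move probabilities."""
    def __init__(self, initial_beads=4, rng=None):
        self.boxes = defaultdict(dict)    # state -> {move: bead count}
        self.initial = initial_beads
        self.history = []                 # (state, move) drawn this game
        self.rng = rng or random.Random()

    def choose(self, state, legal_moves):
        box = self.boxes[state]
        for m in legal_moves:
            box.setdefault(m, self.initial)
        moves = list(box)
        move = self.rng.choices(moves, weights=[box[m] for m in moves])[0]
        self.history.append((state, move))
        return move

    def reinforce(self, won):
        # Reward the whole line of play: add beads after a win,
        # confiscate one per move after a loss (never empty a box).
        delta = 3 if won else -1
        for state, move in self.history:
            box = self.boxes[state]
            box[move] = max(1, box[move] + delta)
        self.history = []
```

Nothing in the update asks whether the opponent was any good; the only signal is the result, which is where the plateau comes from.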
0

#8 User is offline   virgosrock 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 188
  • Joined: 2015-April-07

Posted 2017-November-03, 10:00

nige1, on 2017-November-03, 08:36, said:

DeepMind became super-expert at Atari games by playing many of them. All it was aware of was the screen pixels and the score; the score provided the necessary feedback on performance.

Donald Michie's matchbox noughts-and-crosses machine (MENACE) would learn bad play against really poor players because, against them, bad play won more games than good play. Modern AI programs seem to cope with this kind of "plateau" problem.

Bridge is complex. For example, there is the chance element: a bid or play that would be inferior in the long run can "get lucky" on a particular layout. Neural networks seem to be able to cope with this kind of fuzziness.


Does GIBBO know about Kelsey's Law of Vacant Places? Was it place or space? I forget.

To me, after playing Money Bridge on BBO ad nauseam, the solution seems simple: "Look at my hand/HCP. If this does not match the blurb, do something else." Not sure if that can be implemented in software - it should be. It would take care of a lot of the problems people are seeing. The Blurb Engine and Bidding Engine don't communicate.
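That sanity check is at least trivial to express. A minimal sketch (hypothetical names throughout; GIB's real blurb and bidding engines are not public): count the hand's high-card points and flag any bid whose explanation promises a range the hand doesn't fit.

```python
# Milton Work high-card point values.
HCP = {'A': 4, 'K': 3, 'Q': 2, 'J': 1}

def hcp(hand):
    """HCP of a hand given as a string of rank characters, e.g. 'AK543QJ98T432'."""
    return sum(HCP.get(rank, 0) for rank in hand)

def matches_blurb(hand, blurb_min, blurb_max):
    """Hypothetical check: does the hand fall inside the HCP range
    that the explanation ('blurb') advertised for the bid?"""
    return blurb_min <= hcp(hand) <= blurb_max
```

Of course a real check would also need shape constraints, not just HCP, but even this much would catch the grossest blurb/bid mismatches.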

vrock
0

#9 User is offline   johnu 

  • PipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 1,526
  • Joined: 2008-September-10

Posted 2017-November-03, 12:52

virgosrock, on 2017-November-03, 10:00, said:

Does GIBBO know about Kelsey's Law of Vacant Places? Was it place or space? I forget.


I'm not sure that GIB understands finesses :P It was my understanding that GIB always ran simulations (maybe not on opening leads?) to determine the best course of play, so if GIB used the best estimates of the unseen distributions, the simulations should produce the best percentage plays. But to the extent that GIB's descriptions frequently have little correlation with the actual hand, the simulations aren't going to be particularly accurate.

Another problem is that GIB doesn't seem to use enough simulations to correctly model low-percentage scenarios (e.g. a 4-0 split), so it won't make a no-cost safety play.
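The arithmetic behind that is easy to check. The sketch below (my own, nothing to do with GIB's actual code) computes the textbook ~9.6% chance that four outstanding cards break 4-0, then counts how often a small Monte Carlo sample would even contain such a layout.

```python
import random
from math import comb

# Four outstanding cards land among 26 unseen cards (13 per defender).
# P(either defender holds all four) = 2 * C(13,4) / C(26,4) ~ 9.6%.
p_4_0 = 2 * comb(13, 4) / comb(26, 4)

def split(rng):
    """Deal the four outstanding cards into the two unseen hands."""
    east = sum(rng.sample([0] * 13 + [1] * 13, 4))  # 1 = card with East
    return max(east, 4 - east)       # longer holding: 2, 3 or 4

rng = random.Random(7)
n = 50                               # a small simulation budget
bad_breaks = sum(split(rng) == 4 for _ in range(n))
# Only about 5 of the 50 sampled layouts will be 4-0 on average, so
# the line a safety play guards against barely registers in the totals.
```

If the simulation count really is in this ballpark, a rare break contributes so few sampled layouts that the double-dummy averages can easily rank the safety play below the "normal" line.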
0

#10 User is offline   virgosrock 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 188
  • Joined: 2015-April-07

Posted 2017-November-03, 18:01

johnu, on 2017-November-03, 12:52, said:

I'm not sure that GIB understands finesses :P It was my understanding that GIB always ran simulations (maybe not on opening leads?) to determine the best course of play, so if GIB used the best estimates of the unseen distributions, the simulations should produce the best percentage plays. But to the extent that GIB's descriptions frequently have little correlation with the actual hand, the simulations aren't going to be particularly accurate.

Another problem is that GIB doesn't seem to use enough simulations to correctly model low-percentage scenarios (e.g. a 4-0 split), so it won't make a no-cost safety play.


People in the know are stating quite aggressively that simulations are done only by advanced GIBBO, or in rare cases where the bidding has advanced quite a bit and it has to make "hard" decisions.
It rarely takes finesses; it mostly goes for strip-squeeze and endplay types of scenario. Based on GIBBO's actions, my understanding is that GIBBO was designed for endcases.

vrock
0

#11 User is offline   smerriman 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 488
  • Joined: 2014-March-15
  • Gender:Male

Posted 2017-November-03, 18:20

I said that basic GIB doesn't run simulations during bidding. The entire cardplay algorithm is based on double dummy analysis of simulated opposition hands (as is the case for all bridge robots).
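In outline that cardplay algorithm is simple, even though the double-dummy solver inside it is anything but. A sketch of the loop (with hypothetical hooks: `sample_deal` and `double_dummy_tricks` stand in for the real constraint sampler and solver, which are not public):

```python
import random
from collections import defaultdict

def choose_card(legal_cards, sample_deal, double_dummy_tricks,
                n_samples=100, rng=None):
    """Monte Carlo cardplay in the style described above: sample
    hidden-hand layouts consistent with the auction and play so far,
    score every legal card double dummy on each layout, and play the
    card with the best total."""
    rng = rng or random.Random()
    totals = defaultdict(int)
    for _ in range(n_samples):
        deal = sample_deal(rng)              # one consistent layout
        for card in legal_cards:
            totals[card] += double_dummy_tricks(deal, card)
    return max(legal_cards, key=lambda c: totals[c])
```

The whole design stands or falls on the sampler: if the sampled layouts don't match what the auction actually showed, the double-dummy answers are precise answers to the wrong question, which is essentially the complaint earlier in the thread.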
0

#12 User is offline   virgosrock 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 188
  • Joined: 2015-April-07

Posted 2017-November-03, 19:48

smerriman, on 2017-November-03, 18:20, said:

I said that basic GIB doesn't run simulations during bidding. The entire cardplay algorithm is based on double dummy analysis of simulated opposition hands (as is the case for all bridge robots).


I stand corrected Sir. Thank you.

vrock
0
