BBO Discussion Forums: ELO system for bbo? - BBO Discussion Forums

ELO system for bbo?

#1 User is offline   lmhk 

  • Group: Members
  • Posts: 4
  • Joined: 2015-December-14

Posted 2021-January-17, 03:59

I enjoy playing bridge as much as the next person. Even more, I enjoy playing competitive bridge. I know there are tournaments and masterpoints and even BBO prime on the site now. Instead of these platforms, how about creating an ELO system where each player is rated? I don't know how the programming would work, but rating systems exist in many games, including chess, Go, sports, esports and board games. From my experience in playing other games, having a rating system maximizes the competitiveness and fun. I also think it might be an interesting way to popularize the game (especially among younger people) and increase traffic. Currently, the only rating on BBO is a self-assessment of skill level, which, by the way, is highly inaccurate (extremely incompetent "advanced" players all over the site). Of course, you can still divide the site into two sections - casual and competitive, and those who merely want to relax and play some evening bridge without having to worry about others judging their rating could just stay in the casual section.
0

#2 User is online   pilowsky 

  • Group: Advanced Members
  • Posts: 3,622
  • Joined: 2019-October-04
  • Gender:Male
  • Location:Israel

Posted 2021-January-17, 04:33

Hi, in the short time that I've been looking at this Forum, this question has come up many times (including from me as an ex-junior chess player).
Bridge is similar to all other equilibrium games such as Chess and Go, but the big dissimilarity is the requirement for people to play as a pair.
This means that if you want a rating system, there would need to be a rating system for pairs rather than individuals.
Unfortunately, many Bridge partnerships are as evanescent as steam from a boiling kettle (I could meander more through that metaphor but you get the idea).

There is a workaround that I have used. You can download an ELO calculator from the web and then play in regular massive daylongs where the number of competitors is often greater than 1000 every day. The quality is very mixed so it can be characterised as an international tournament every day.
Map your results from these robot daylongs, and you will gain a rough idea of your equivalency to Chess ratings.
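
For anyone who would rather not download a calculator, the standard Elo arithmetic is short enough to sketch. This is a minimal sketch, not anything BBO provides: the field rating of 1500, the K-factor of 20, and the idea of turning a daylong percentage into a score between 0 and 1 are all assumptions made for illustration, and the percentages are made up.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expectation: a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_rating(rating: float, opponent_rating: float,
                  score: float, k: float = 20.0) -> float:
    """Move the rating toward the observed score by K times the surprise."""
    return rating + k * (score - expected_score(rating, opponent_rating))


# Hypothetical usage: treat each daylong as one "game" against an average field.
# FIELD_RATING and the percentage-to-score mapping are assumptions, not BBO data.
FIELD_RATING = 1500.0
my_rating = 1500.0
for percentage in (55.0, 48.0, 61.0):  # made-up matchpoint results
    my_rating = update_rating(my_rating, FIELD_RATING, percentage / 100.0)
print(round(my_rating, 1))
```

The number that comes out is only as meaningful as the assumed field rating, so it is better read as a trend than as a true chess-equivalent rating.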

Clearly, stars, masterpoints and success in previous tournaments - all of which contribute to overall ranking in Bridge - are of little value in determining a player's instantaneous skill level.
This is because masterpoints accumulate like barnacles and exist to make money for Bridge organisations who then provide the tournaments that we all enjoy.

There is an equivalent problem in Chess in that titles such as International Master and Grandmaster are awarded based on achieving excellent results in high-quality tournaments.
This is why I suggest using large daylongs if you really want to map Elo (his name was Arpad Elo by the way) rankings onto Bridge players.

Of course, there is the little problem that not everyone regards robot Bridge with the same affection as I do (litotes), but them's the breaks.
0

#3 User is online   paulg 

  • Group: Advanced Members
  • Posts: 5,053
  • Joined: 2003-April-26
  • Gender:Male
  • Location:Scottish Borders

Posted 2021-January-17, 05:13

One of the forerunners of BBO, OKBridge, had a rating system (Lehmans) but it generated behaviour that was generally seen as unacceptable.

A lot of members focused on their rating and would not play with weaker partners or opponents: even though the rating system protected them from weaker players, very few believed this and even fewer were willing to take the risk. This caused a lot of bad feeling when players were booted by the table host shortly after sitting down.

I believe that BBO deliberately avoided a ratings system initially to avoid this behaviour.

It is interesting to see that the English Bridge Union, and perhaps others, now have national rating schemes and the players are more accepting. Perhaps because there are more events targeted at them.
The Beer Card

I don't work for BBO and any advice is based on my BBO experience over the decades
1

#4 User is offline   lmhk 

  • Group: Members
  • Posts: 4
  • Joined: 2015-December-14

Posted 2021-January-17, 05:29

This is similar to ranking systems in team-based games. Personally I had some experience gaming and I understand the frustration there would be if you feel your partner is bringing you down. It is the same as losing a basketball game because of bad teammates, even if you are the best player on the team. However, similar to other games that need teammates, I believe in the long run, good players will have an increase in rating and lesser players the opposite. Or there could be partnership ratings as well as individual ratings. As for the poor behaviour, I assume there would be some terms to agree to before participating in the rating system. I believe there is a way around this if they really wanted to do it, and personally I think it would make bridge so much more fun and competitive.
0

#5 User is offline   lmhk 

  • Group: Members
  • Posts: 4
  • Joined: 2015-December-14

Posted 2021-January-17, 05:34

I am not trying to find out what my rating is, I am merely suggesting that having a rating system could make the game a lot more fun and competitive. There are ranking systems out there in the gaming industry that are also team-based. Everyone would still get an individual rating. The idea is, if you are a good player, in the long run, given random partners, you would yield winning results and move up and vice versa. I think it is definitely plausible and more fun for everyone if they can get matched up with players of similar skill levels.
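
To make the idea concrete, here is a rough sketch of how individual ratings could be updated from pair results. It assumes a pair's strength is simply the average of the two individual ratings and that every result is scaled to a score between 0 and 1; both are assumptions for illustration, as are the K-factor and the player names, and none of this describes an existing BBO feature.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expectation: a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_after_board(ratings: Dict[str, float], pair: Pair, opponents: Pair,
                       score: float, k: float = 16.0) -> None:
    """Update all four players in place after one result.

    score is the first pair's result scaled to 0..1.
    Pair strength is assumed to be the average of the two individual ratings.
    """
    pair_strength = sum(ratings[p] for p in pair) / 2.0
    opp_strength = sum(ratings[p] for p in opponents) / 2.0
    surprise = score - expected_score(pair_strength, opp_strength)
    for player in pair:
        ratings[player] += k * surprise
    for player in opponents:
        ratings[player] -= k * surprise


# Hypothetical table of four equally rated players.
ratings = {"north": 1500.0, "south": 1500.0, "east": 1500.0, "west": 1500.0}
update_after_board(ratings, ("north", "south"), ("east", "west"), score=0.75)
print(ratings)  # north and south each gain 4 points, east and west each lose 4
```

With random partners, a strong player's partners average out over time, which is the long-run effect described above.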
0

#6 User is offline   hrothgar 

  • Group: Advanced Members
  • Posts: 15,372
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2021-January-17, 06:21

lmhk, on 2021-January-17, 05:34, said:

I think it is definitely plausible and more fun for everyone if they can get matched up with players of similar skill levels.


The ride isn't worth the cost of the ticket

Introducing rating systems introduces an enormous range of problems.

The most direct are a series of incredibly painful social interactions when people are either

A. Trying to protect their ratings
B. Trying to blame other people because they just damaged their ratings

Running a close second is a series of incredibly painful interactions trying to explain to people that the rating system says that they are bad bridge players.

And of course, there is the never ending joy of trying to explain "How the ratings system works" and "No, the ratings system is not broken" to people who are mathematically illiterate and technophobes.

As was already discussed, Fred made a very conscious decision not to implement a ratings system after seeing how destructive this was to the OKB playing environment.

I think that he made the right choice at the time. Right now, I think that the arguments against creating a ratings system are somewhat less strong (I think that there are good enough machine learning algorithms to do a good job at creating a ratings system. 18 years back, I'm less sure). However, the critical issue is still the impacts on the social dynamics of the site. And here, I think that ratings systems are still rank poison.
Alderaan delenda est
0

#7 User is offline   pescetom 

  • Group: Advanced Members
  • Posts: 7,204
  • Joined: 2014-February-18
  • Gender:Male
  • Location:Italy

Posted 2021-January-17, 09:27

hrothgar, on 2021-January-17, 06:21, said:

Right now, I think that the arguments against creating a ratings system are somewhat less strong (I think that there are good enough machine learning algorithms to do a good job at creating a ratings system. 18 years back, I'm less sure). However, the critical issue is still the impacts on the social dynamics of the site. And here, I think that ratings systems are still rank poison.

Not sure we even need machine learning. As previously discussed, the EBU NGS ranking scheme seems to work and is designed to see beyond pairs. It could be adapted to the more mainstream BBO tournaments without problems I would expect.

As for the social issues, you may be right. I have played on another online card game site where the rating system poisons all social interactions, for the reasons you mention. But I would be curious to know if there are counter-experiences. Does Elo create chronic social problems in online chess, which is considerably more popular than bridge? If not, could it be because the rating is accepted as site-independent and true?
0

#8 User is offline   hrothgar 

  • Group: Advanced Members
  • Posts: 15,372
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2021-January-17, 09:31

pescetom, on 2021-January-17, 09:27, said:

Not sure we even need machine learning. As previously discussed, the EBU NGS ranking scheme seems to work and is designed to see beyond pairs. It could be adapted to the more mainstream BBO tournaments without problems I would expect.


"seems to work" is mighty thin soup

I can point out any number of flaws with the NGS system

If you are going to do this, do it right.
Alderaan delenda est
1

#9 User is offline   pescetom 

  • Group: Advanced Members
  • Posts: 7,204
  • Joined: 2014-February-18
  • Gender:Male
  • Location:Italy

Posted 2021-January-17, 09:40

hrothgar, on 2021-January-17, 09:31, said:

"seems to work" is mighty thin soup

You said that before, but I don't remember you pointing out any actual flaws.
Have you done so to the NGS Working Group, or on BridgeWinners which would be a better place than here?
0

#10 User is offline   hrothgar 

  • Group: Advanced Members
  • Posts: 15,372
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2021-January-17, 10:21

pescetom, on 2021-January-17, 09:40, said:

You said that before, but I don't remember you pointing out any actual flaws.
Have you done so to the NGS Working Group, or on BridgeWinners which would be a better place than here?


Yes I have.
They don't care.
They are happy with their archaic little scheme and aren't interested in considering other approaches.

In terms of the flaw, at the most basic the NGS scheme calculates results on a session-by-session basis. It is unable to adjust for the way in which differences in individual boards impact your results. Some boards are (naturally) going to be flat. Others have a lot more room for player skill to impact results. If you're unlucky enough to play flat boards versus weak pairs and complicated boards against strong pairs you're going to have a crappy session.
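
A toy sketch of the difference between the two views, under stated assumptions: each board carries a hypothetical expected score against the opponents met on that board, and a board's swinginess is measured by the spread of the field's results on it. This is not how NGS is computed and not a specific proposal from this thread; the `Board` fields and both function names are made up for illustration.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import List, Sequence


@dataclass
class Board:
    my_score: float                 # my result on the board, scaled to 0..1
    expected: float                 # assumed model prediction against these opponents
    field_scores: Sequence[float]   # everyone's results on the board, 0..1


def session_surprise(boards: List[Board]) -> float:
    """Session-level view: average result minus average expectation."""
    n = len(boards)
    return sum(b.my_score for b in boards) / n - sum(b.expected for b in boards) / n


def board_weighted_surprise(boards: List[Board]) -> float:
    """Board-level view: weight each board's surprise by the spread of the field."""
    weights = [pstdev(b.field_scores) for b in boards]
    total = sum(weights)
    if total == 0:
        return 0.0  # every board was flat, so there is no evidence either way
    return sum(w * (b.my_score - b.expected) for w, b in zip(weights, boards)) / total


# Made-up session: a flat board against a weak pair, a swingy board against a strong pair.
session = [
    Board(my_score=0.5, expected=0.6, field_scores=[0.5, 0.5, 0.5, 0.5]),
    Board(my_score=0.4, expected=0.4, field_scores=[0.0, 1.0, 0.0, 1.0]),
]
print(session_surprise(session), board_weighted_surprise(session))
```

In this made-up session the session-level view shows a shortfall, while the board-weighted view shows none, because the only "miss" came on a board where nobody could score anything but average.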

NGS, ELO and the like are based on methods that are 40+ years old.

In the world of machine learning and AI, 5 years is a lifetime.

The EBU's attitude seems to be

We came up with the following.
We think it's good enough
There's no reason for us to improve anything
We aren't even interested in doing a bake-off to evaluate other schemes or compare accuracy
Alderaan delenda est
1

#11 User is offline   nige1 

  • 5-level belongs to me
  • Group: Advanced Members
  • Posts: 9,128
  • Joined: 2004-August-30
  • Gender:Male
  • Location:Glasgow Scotland
  • Interests:Poems Computers

Posted 2021-January-17, 14:35

pescetom, on 2021-January-17, 09:27, said:

Not sure we even need machine learning. As previously discussed, the EBU NGS ranking scheme seems to work and is designed to see beyond pairs. It could be adapted to the more mainstream BBO tournaments without problems I would expect.

As for the social issues, you may be right. I have played on another online card game site where the rating system poisons all social interactions, for the reasons you mention. But I would be curious to know if there are counter-experiences. Does Elo create chronic social problems in online chess, which is considerably more popular than bridge? If not, could it be because the rating is accepted as site-independent and true?

Some of my partners have EBU NGS ratings. NGS ratings are crude and simple but they're easy to understand and friends generally like them. Rating schemes can cause problems and BBO has been set against them from the beginning. In the unlikely event that BBO adopts such a scheme, IMO, BBO should allow dissenting players to opt out and revert to the current daft self-rating system.
0

#12 User is online   johnu 

  • Group: Advanced Members
  • Posts: 4,834
  • Joined: 2008-September-10
  • Gender:Male

Posted 2021-January-17, 14:55

If you play ACBL pair games, the scores are recorded at

Colorado Springs Power Ratings

I can't vouch for their accuracy, but I did find a local cheating pair based on an extraordinarily high online rating.
0

#13 User is offline   pescetom 

  • Group: Advanced Members
  • Posts: 7,204
  • Joined: 2014-February-18
  • Gender:Male
  • Location:Italy

Posted 2021-January-17, 15:35

hrothgar, on 2021-January-17, 10:21, said:

Yes I have.
They don't care.
They are happy with their archaic little scheme and aren't interested in considering other approaches.

In terms of the flaw, at the most basic the NGS scheme calculates results on a session-by-session basis. It is unable to adjust for the way in which differences in individual boards impact your results. Some boards are (naturally) going to be flat. Others have a lot more room for player skill to impact results. If you're unlucky enough to play flat boards versus weak pairs and complicated boards against strong pairs you're going to have a crappy session.

NGS, ELO and the like are based on methods that are 40+ years old.

In the world of machine learning and AI, 5 years is a lifetime.

The EBU's attitude seems to be

We came up with the following.
We think it's good enough
There's no reason for us to improve anything
We aren't even interested in doing a bake-off to evaluate other schemes or compare accuracy


Thanks. Flat boards versus weak pairs and complicated boards against strong pairs is one reason we need to play so many boards at pairs: but if we do and the algorithm evaluates hundreds of such tournaments it's not obvious to me why the rating should be crappy in predicting the outcome of a pair of such tournaments. Many archaic things work. The NGS team say "we expect the standard deviation of the error in your current grade to be around 2%, provided you have a typical mix of partners". Are they way off? Is ELO?
0

#14 User is offline   sfi 

  • Group: Advanced Members
  • Posts: 2,576
  • Joined: 2009-May-18
  • Location:Oz

Posted 2021-January-17, 15:56

Australia actually adopted an optional ELO-like system about 10 years ago, and our local club decided to start using it. It was in place for two years and was never very popular with the membership. Now that could just be familiarity and it might have been more widely followed in time. However, we also had another club in town who had been using this scheme for many years, and there was some crossover in players between the two clubs.

Here the scheme caused real problems. See, the other club was of a lower standard but the initial data had no way to reflect that. So they would come to our club and their rating would drop. Once they realised that, these players would simply stop coming to "protect" their rating. The effect was so pronounced that the club simply dropped the scheme altogether after two years.
0

#15 User is offline   pescetom 

  • Group: Advanced Members
  • Posts: 7,204
  • Joined: 2014-February-18
  • Gender:Male
  • Location:Italy

Posted 2021-January-17, 16:12

sfi, on 2021-January-17, 15:56, said:

Australia actually adopted an optional ELO-like system about 10 years ago, and our local club decided to start using it. It was in place for two years and was never very popular with the membership. Now that could just be familiarity and it might have been more widely followed in time. However, we also had another club in town who had been using this scheme for many years, and there was some crossover in players between the two clubs.

Here the scheme caused real problems. See, the other club was of a lower standard but the initial data had no way to reflect that. So they would come to our club and their rating would drop. Once they realised that, these players would simply stop coming to "protect" their rating. The effect was so pronounced that the club simply dropped the scheme altogether after two years.


Sure, the diffusion problem is real, and not just in terms of clubs: just think about two players who only ever partner each other, the system cannot distinguish their strength. But there are ways of seeding diffusion and over a few years things should work out anyway - would Meckstroth really put up with me for years, does nobody ever leave that bad club? I figure that if there is so little interplay between clubs that they remain isolated for years then any kind of national or international rating is superfluous to them anyway.
People from a weak club visiting a strong club will finish near bottom more often than not. That will discourage them, with or without a rating system.
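
The fixed-partnership case can be shown in a few lines. This sketch assumes a pair's strength is modelled as the average of the two individual ratings (an assumption for illustration, not a statement about any particular scheme); the ratings are made up.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expectation: a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def pair_strength(a: float, b: float) -> float:
    """Assumed model: a pair is as strong as the average of its two players."""
    return (a + b) / 2.0


# Two hypothetical splits of the same fixed partnership.
equal_split = pair_strength(1600.0, 1600.0)
lopsided_split = pair_strength(1900.0, 1300.0)

# Against any opponents, both splits predict exactly the same result, so no
# amount of data from this partnership alone can tell them apart.
OPPONENTS = 1500.0
same_prediction = (expected_score(equal_split, OPPONENTS)
                   == expected_score(lopsided_split, OPPONENTS))
print(same_prediction)
```

Only when one of the two plays with someone else does the data start to separate them, which is the seeding mentioned above.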
0

#16 User is offline   sfi 

  • Group: Advanced Members
  • Posts: 2,576
  • Joined: 2009-May-18
  • Location:Oz

Posted 2021-January-17, 16:26

pescetom, on 2021-January-17, 16:12, said:

Sure, the diffusion problem is real, and not just in terms of clubs: just think about two players who only ever partner each other, the system cannot distinguish their strength. But there are ways of seeding diffusion and over a few years things should work out anyway - would Meckstroth really put up with me for years, does nobody ever leave that bad club? I figure that if there is so little interplay between clubs that they remain isolated for years then any kind of national or international rating is superfluous to them anyway.

Yes, it would have worked itself out eventually if people continued playing at both clubs. But the short-term impact was damaging enough both to the players and to the club that the end point was never reached, and the fact that players stopped playing at the other club meant that this point would take longer to reach. Another factor is that people don't like being told they're not as good as they think they are. And this would have been a long-term impact of this rating once it eventually started to reflect reality.

The bottom line was that the scheme was measurably hurting table numbers and not bringing any perceived benefit to either the players or the club. So it was dropped.
0

#17 User is offline   hrothgar 

  • Group: Advanced Members
  • Posts: 15,372
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2021-January-17, 17:06

pescetom, on 2021-January-17, 15:35, said:

Thanks. Flat boards versus weak pairs and complicated boards against strong pairs is one reason we need to play so many boards at pairs: but if we do and the algorithm evaluates hundreds of such tournaments it's not obvious to me why the rating should be crappy in predicting the outcome of a pair of such tournaments. Many archaic things work. The NGS team say "we expect the standard deviation of the error in your current grade to be around 2%, provided you have a typical mix of partners". Are they way off? Is ELO?


Given that the ACBL and the EBU refuse to release data sets or compare the accuracy of the algorithms that are being used, who the ***** can tell...

The big issue here is that groups like the EBU refuse to do appropriate due diligence

They have a system
They claim it works
Alderaan delenda est
0

#18 User is online   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,068
  • Joined: 2004-April-22
  • Gender:Female
  • Location:UK

Posted 2021-January-17, 18:36

You can judge the progress of your own play by seeing how your score against robots evolves under various conditions (Daylong, with three robots scored against random vugraph deals, etc). Probably your long-term trend will have a negative bias because the robots get better and better, but that will only be a serious problem if we are talking about decades rather than years or months.

Judging other players, or comparing your own bridge to that of other players, is something you should avoid, though. It would create lots of social issues:
- players blaming their partners for ruining their rating
- players leaving mid-hand when they are on their way to a result that would ruin their rating
- players selecting partners and opponents based on (mostly misguided) beliefs about how the choice influences their rating
- players avoiding playing when they are tired or distressed because of fear of ruining their own rating
- players accusing each other of cheating
- forum discussions being dominated by conspiracy theories about how the rating system is biased in favour of certain players
- players creating new accounts to start with a fresh rating (which in turn leads other players to be prejudiced against new accounts and thereby makes it difficult for genuine newbies to get into the community)

Fred and Uday had seen this and/or similar disasters happening on other sites so they rightly chose not to implement a rating system on BBO.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
3

#19 User is online   johnu 

  • Group: Advanced Members
  • Posts: 4,834
  • Joined: 2008-September-10
  • Gender:Male

Posted 2021-January-17, 21:19

helene_t, on 2021-January-17, 18:36, said:

You can judge the progress of your own play by seeing how your score against robots evolves under various conditions (Daylong, with three robots scored against random vugraph deals, etc). Probably your long-term trend will have a negative bias because the robots get better and better, but that will only be a serious problem if we are talking about decades rather than years or months.

Judging other players, or comparing your own bridge to that of other players, is something you should avoid, though. It would create lots of social issues:
- players blaming their partners for ruining their rating
- players leaving mid-hand when they are on their way to a result that would ruin their rating
- players selecting partners and opponents based on (mostly misguided) beliefs about how the choice influences their rating
- players avoiding playing when they are tired or distressed because of fear of ruining their own rating
- players accusing each other of cheating
- forum discussions being dominated by conspiracy theories about how the rating system is biased in favour of certain players
- players creating new accounts to start with a fresh rating (which in turn leads other players to be prejudiced against new accounts and thereby makes it difficult for genuine newbies to get into the community)

Fred and Uday had seen this and/or similar disasters happening on other sites so they rightly chose not to implement a rating system on BBO.

If you play only robot games, it shouldn't matter that the robots get better, or worse, over time. You are playing against other human players so as long as the average field stays the same, your score should vary according to your own skill level trends. If you are an improving player, your scores should go up accordingly.

As for the social issues, I don't think a rating system affects the playing environment nearly as much as some people think.

E.g. BBO implemented a self-rating system. Many BBO tables (players) state they don't want beginners/novices, or maybe just expert and above. So, many players overrate themselves so they can play in advanced games. Only problem is that it doesn't take more than a hand or two before they are exposed as beginners to intermediate players. Then they'll get bounced from the table. Having a data driven rating system won't make this worse.

As for most of the other bad behavior, you already see players randomly jumping to 7NT, redoubling for down many, leaving mid-hand whether or not the hand is going to be a disaster, actually cheating, whether for fun or for real-life masterpoints, getting banned and creating new accounts, etc. One "solution" would be to "fine" players a rating point or two for egregious bad behavior.
2

#20 User is online   pilowsky 

  • Group: Advanced Members
  • Posts: 3,622
  • Joined: 2019-October-04
  • Gender:Male
  • Location:Israel

Posted 2021-January-17, 22:04

All of these problems are the same in real-life (Bridge not being real-life in case anyone forgot) as well whenever there is competition.
Academics are constantly trying to devise new rankings so that they look better than other academics.
citations=likes
number of publications = masterpoints, and don't get me started on the H-index.
0
