You may remember that Uno was available “back in the day” on the Xbox 360. It was an incredibly popular game for the Arcade section of the console and had a, let’s call it, specialised audience. It was so popular that it was the first Arcade game to exceed a million downloads on the Xbox 360 programme, a huge achievement. Well, now it is back, developed by Ubisoft Chengdu in China, but this time it’s a toned-down, safer version, depending on your outlook.

Not familiar with Uno? Where have you been? It’s a wildly popular card game with simple rules, making it easy to pick up, and it is a staple of the items we take with us on family holidays. It closely resembles a few classic card games played with a traditional deck, so you may already be familiar with some of the mechanics. Players are dealt a hand of seven cards running from 0–9 across four different colours, and there are power cards in the pack as well. The aim of the game is to rid yourself of your hand first, declaring “UNO” once you are down to your last card. You do this by matching the colour or number of the last card placed down. The other players will try to do the same, while also blocking you and the other opponents by using power cards such as “pick up two” and “miss a turn”, reversing the direction of play or forcibly changing the colour of the next card to be laid. It’s a fun and fast game that can be over in a matter of minutes, or can last longer depending on the luck of the draw.

Searching for the optimal strategy in the UNO card game is a classical use case for Reinforcement Learning. The stochastic element, inherent in the randomly drawn cards as well as the opponents’ moves, requires numerous simulations to identify a long-run optimum. In a supervised machine learning set-up, a function is fitted to map the available features to the output variable; RL, on the other hand, evaluates each action sequentially and individually at each step. The basic techniques I applied are Monte Carlo and Q-Learning with a discrete state-action matrix; a sketch of the tabular update appears further below.

The game itself can thereby be framed as a finite Markov Decision Process (MDP), which implies the following characteristics:
- States: Each step of the game can be described by a state.
- Actions: The decision-maker interacts with the game by taking actions based on the state he is in.
- Reward: Taking certain actions can lead to a desirable terminal state (e.g. winning the game), which is rewarded.

Therefore, states, actions, and rewards need to be defined. A state can represent any information available to the decision-maker that is useful for describing the current situation of the game. This could be the type of cards he holds, the number of cards the opponents hold, or information regarding cards that have already been played.

At first sight, UNO might appear to be a very simplistic card game due to its limited set of rules. However, when figuring out the possible combinations of cards a player can hold, it quickly gets out of hand: there are roughly 10¹⁶³ combinations. Since the RL agent has to learn an optimal action for each state, it makes sense to limit the number of states. Thus, I formed a 16-digit state identification that captures which cards the agent holds and which he can play, differentiating only between colors and special cards (skip, reverse, etc.).
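The exact 16-digit layout is not spelled out above, so the sketch below is only a guess at the general idea: compress a hand into a short fixed-length key by counting cards per colour and flagging playable colours and special cards. The Card type, digit order, and helper names are my own assumptions for illustration, not the author’s code.

```python
from collections import namedtuple

# Minimal card representation assumed for this sketch.
Card = namedtuple("Card", ["color", "kind", "value"])  # e.g. Card("red", "number", 7)

COLORS = ["red", "green", "blue", "yellow"]
SPECIALS = ["skip", "reverse", "draw_two", "wild", "wild_draw_four"]

def encode_state(hand, top_card):
    """Compress a hand into a short digit string: per colour, how many number
    cards are held (capped at 9) and whether one of them is playable on the
    current top card, plus one flag per special-card type held. The real
    16-digit scheme from the article is not documented, so this layout is
    illustrative only."""
    held = [min(sum(1 for c in hand if c.kind == "number" and c.color == col), 9)
            for col in COLORS]
    playable = [int(any(c.kind == "number" and c.color == col
                        and (col == top_card.color or c.value == top_card.value)
                        for c in hand))
                for col in COLORS]
    specials = [int(any(c.kind == s for c in hand)) for s in SPECIALS]
    return "".join(str(d) for d in held + playable + specials)

# Example: a small hand against a red 5 on top of the discard pile.
hand = [Card("red", "number", 3), Card("blue", "number", 7), Card(None, "wild", None)]
print(encode_state(hand, Card("red", "number", 5)))  # a 13-character key in this sketch
```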
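For the Q-Learning half of the approach mentioned above, a minimal tabular agent over such string state keys might look like the following. The hyperparameters, action labels, and class name are placeholders chosen for this sketch, not the author’s implementation; the Monte Carlo variant would instead update values from complete episode returns.

```python
import random
from collections import defaultdict

class TabularQAgent:
    """Minimal tabular Q-Learning sketch: Q[state][action] approximates the
    expected long-run reward of taking that action in that state."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions  # e.g. ["play_number", "play_skip", "draw", ...]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.Q = defaultdict(lambda: {a: 0.0 for a in actions})

    def choose(self, state, legal_actions):
        # Epsilon-greedy: mostly exploit the best known action, occasionally explore.
        if random.random() < self.epsilon:
            return random.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.Q[state][a])

    def update(self, state, action, reward, next_state, done):
        # Q-Learning target: reward + gamma * max_a' Q(next_state, a'),
        # or just the reward when the game has ended.
        best_next = 0.0 if done else max(self.Q[next_state].values())
        target = reward + self.gamma * best_next
        self.Q[state][action] += self.alpha * (target - self.Q[state][action])
```

With a reward of 1 for the winning move and 0 otherwise, the value of good plays propagates backwards through the table over many simulated games.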
Figure: Number of state-action occurrences during 100,000 simulated games.

The axes of the heatmap denote a player’s number of hand cards, as well as the action taken at the respective point in time. It stands out that most of the time players are lingering in the “mid-game”, holding 4–5 cards. It also makes sense that normal cards (0–9) from all colors are played the most often, since they are the most common cards in a deck.
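The tally behind such a heatmap can be reproduced in spirit with a few lines. The simulate_game generator and the action labels below stand in for the article’s simulator, which is not shown here, so treat them as assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

ACTIONS = ["number", "skip", "reverse", "draw_two", "wild", "draw"]  # assumed action labels

def count_state_actions(simulate_game, n_games=100_000, max_hand=10):
    """Tally (hand size, action) occurrences across simulated games.
    `simulate_game` is assumed to yield (hand_size, action_label) tuples
    for every move of one game."""
    counts = np.zeros((max_hand, len(ACTIONS)), dtype=int)
    for _ in range(n_games):
        for hand_size, action in simulate_game():
            counts[min(hand_size, max_hand) - 1, ACTIONS.index(action)] += 1
    return counts

def plot_heatmap(counts):
    """Render the counts with hand size on the y-axis and action on the x-axis."""
    fig, ax = plt.subplots()
    im = ax.imshow(counts, aspect="auto")
    ax.set_xticks(range(len(ACTIONS)))
    ax.set_xticklabels(ACTIONS, rotation=45, ha="right")
    ax.set_yticks(range(counts.shape[0]))
    ax.set_yticklabels([str(n) for n in range(1, counts.shape[0] + 1)])
    ax.set_xlabel("action taken")
    ax.set_ylabel("number of hand cards")
    ax.set_title("State-action occurrences in simulated games")
    fig.colorbar(im, ax=ax)
    plt.show()
```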