Introduction

Othello, also known as Reversi has been played for over a hundred years and like chess and go has world championships and national federations. Though presumably named Reversi because it involves disk flipping it can also exhibit dramatic reversals of fortune for players.

Contents

It is a two person game played on an 8x8 board using disks which are white on one side and black on the other. One person plays black, the other white. The game always starts with two black and two white disks in particular positions in the centre of the board and black always plays first. The aim of the game is to finish with more of your colour disks face up on the board than those of your opponent. Players take turns to make moves. For each turn the player puts a single disk of her colour on the board. She can only place disks on squares adjacent to an opponent's disk. That opponent's disk must be at the end of an unbroken run of disks of the opponent's colour terminated at the other end by one of the player's disks. These runs can be along rows, columns or diagonals and may be in more than one direction. When the disk is played the opponent's disks between the pair(s) of player's disks are flipped over to reveal the player's colour. (Please refer to the caption of the Othello Example.) Then it is the opponent's turn. If no move is possible a player misses a turn. Usually the game ends when the board is full but if neither player can go the game is over and the winner is the player with most disks of her colour on the board.

For our version of the game the player usually plays against the computer which employs a range of algorithms of different playing strengths. By choosing the appropriate algorithm the non-expert human player can arrange to play games which suit her ability. Note that it is many years since a human was able to win against the strongest Othello computer programs but our game is nowhere near as sophisticated.

Our display has a short toolbar at the top with a menu button containing the game Options and settings, a white disk for starting a new game and a wand. During Human versus Machine games the wand is used for requesting and executing hints. For Machine versus Machine games or replaying transcripts the wand is used to step through the game, and so gives the player complete control over the speed of play. Next to this is a clock which times the human's moves, counting down from 30 minutes. To the right are the current human and machine disk counts. When a Human v Machine game is underway, if the clock it ticking it is the human's turn. If the mouse cursor is moved over the board it will highlight any square with a legal move. If the player left clicks on such a square the move is made, the player's disk will be placed and the affected opponent disks flipped.

Othello Example

Othello Example

A screenshot from othello showing a partially completed game. If the highlighted square is indicating black's next move, then the two adjacent white disks would be flipped to become black. If it is instead highlighting white's turn the two black disks diagonally below the square would be flipped to white.

Options

Game Type

You can play against the computer "Human v Machine" or using the wand to step through the moves, watch the computer play with itself "Machine v Machine".

Black Algorithm

Which of the 7 algorithms will be used by Black.

White Algorithm

Which of the 7 algorithms will be used by White.

Black Look Ahead

For most algorithms, when choosing a move the program checks all the possible moves for both players for several turns to come. If Black is using one of these algorithms, this is the number of turns ahead it is examining.

White Look Ahead

See Black Look Ahead.

Human's Colour

Choose your colour. This determines whether you play first, which algorithm will be used for hints and how far ahead the algorithm will look.

Move Pause

During Human versus machine games, in order to give players time to see the square the machine is going to use the program highlights the square for "Move Pause" seconds before playing and flipping the disks.

Save Board

At any time you can save the current state of the board to a file.

Load Board

You can load a saved board and continue playing a game or have the machine play both colours (depending on "Game Type") under control of the wand.

Save Transcript

You can save to a file a record of all the moves played during a game.

Load Transcript

You can load a file a containing a record of all the moves played during a game and then use the wand to step through all the moves.

Font Size

The size of the board can be set.

Set Colour

The colours of the board, grid and highlight can be chosen.

Save Settings

Save the current configuration to a file that will be loaded next time you start the game.

Othello Worked Example

A video from othello. The player uses the menu to set the game type to "Human v Machine", chooses to play White and sets the White algorithm to ab-weighted-diff and depth to 5, and then checks that the Machine algorithm is Random. Then she clicks the button to start the game and the clock shows 30.00. The machine is playing Black and so has first turn. Notice that the square the machine has chosen is highlighted briefly (Move Pause seconds) before the move is made and the disks flipped. Then it is the player's turn and the clock starts counting down. Notice that only the squares with legal moves are highlighted. The player makes her move and the clock pauses. Play continues until she decides to use the hint button. She accepts the hint by clicking again on the hint button. The machine moves and she uses the hint button again, but this time decides to ignore it and make her own choice. Play continues and she sees an opportunity to occupy a corner square (occupying corners and edges is about the extent of her tactical knowledge). Eventually, with a little more help from the hint button (using ab-weighted-diff, with depth 5) she wins 53 - 11.

Algorithms

For all the previous pzl games we have enjoyed writing our own solving algorithms. But, when we were researching Othello we came across the excellent book "Paradigms of Artificial Intelligence Programming: CASE STUDIES IN COMMON LISP" (1992) by Peter Norvig, which has a chapter on Othello. For those interested, the book can be downloaded as a PDF from here.

We also discovered that Daniel Connelly rewrote some of Norvig's programs in Python and made them available here.

We were so impressed by Norvig's and Connelly's work that rather than attempt to write our own algorithms we decided to use Connelly's code instead. So, the algorithms used by the "Machine" and the player's hints were effectively written by Daniel Connelly and we have tried to change his code as little as possible while incorporating it into our regular framework.

This has given us algorithms that are sufficiently strong to enable non-expert Othello players to have interesting games. It also provides a graphical user interface for those wishing to explore Norvig and Connelly's methods. In the book the algorithms (strategies in Norvig's terminology) are developed in increasing order of complexity and we follow that below.

Random

The program randomly chooses from the available legal moves.

Maximize difference

For every legal move the program adds up all the player's pieces and subtracts the opponent's. It chooses the one with the biggest difference. In general, this should beat "Random".

Maximize weighted difference

PAIP page 608: "Humans learn that the edge squares, for example, are valuable because the player dominating the edges can surround the opponent, while it is difficult to capture an edge. This is especially true of corner pieces, which can never be recaptured. Using this knowledge, a clever player can temporarily sacrifice pieces to obtain edge and corner squares in the short run, and win back pieces in the long run."

Using the reasoning from the quote above Norvig assigns values to each square on the board.

            120 -20  20   5   5  20 -20 120
            -20 -40  -5  -5  -5  -5 -40 -20
             20  -5  15   3   3  15  -5  20
              5  -5   3   3   3   3  -5   5
              5  -5   3   3   3   3  -5   5
             20  -5  15   3   3  15  -5  20
            -20 -40  -5  -5  -5  -5 -40 -20
            120 -20  20   5   5  20 -20 120

Then, like Maximize difference, we add up all the player's pieces and subtract the opponent's, but each piece is weighted according to the square it occupies. In general this should beat "Maximize difference".

Speed of operation gives computers a big advantage over humans. It enables them to look ahead a certain distance and try all the possible moves, examine their consequences and choose the best. The remaining algorithms all look ahead for a given depth (Look Ahead) when choosing the next move.

Minimax-diff

Using the Maximize difference measure defined above, search for the specified depth and find a move which gives the guaranteed minimum score achievable for the player.

Minimax-weighted-diff

Using the Maximize weighted difference measure defined above, search for the specified depth and find a move which gives the guaranteed minimum score achievable for the player.

The minimax algorithm evaluates more moves than is necessary. To quote Wikipedia:

"Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It stops completely evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision."

Ab-diff

Using the Maximize difference measure defined above and the alpha-beta strategy search for the specified depth and find a move which gives the guaranteed minimum score achievable for the player.

Ab-weighted-diff

Using the Weighted-difference measure defined above and the alpha-beta strategy search for the specified depth and find a move which gives the guaranteed minimum score achievable for the player.

As can be seen in the results shown below, of the methods we have implemented the Alpha-beta weighted difference algorithm provides players with the strongest and fastest opponent. They can alter its strength by varying the Look Ahead value.

Algorithm Comparisons

In the tables below we show the results of playing the Random algorithm against all the others using varying Look Ahead (depth) values. Each algorithm was used 100 times. Results for algorithms where Look Ahead is not applied have been left to show the variation in the counts. For each data point we show the number of wins for the algorithm and the number of seconds it took to play the complete set of 100 games. So, for example, in the first table minimax-diff with depth 3 won 85 games in a total time 50 seconds. There are 2 tables because we departed from Connelly's code by sorting the board weights so that when evaluating the board score the squares with the highest weight are tried first (PAIP pages 630-631). As can be seen for ab-weigthed-diff, and as Norvig states, this helps to improve the speed.


with sorted weights

depth                     1        2        3         4          5
max-diff              [73, 2], [77, 2],  [75, 2],  [75, 2],   [78, 2]
max_weighted-diff     [80, 2], [79, 3],  [80, 3],  [84, 3],   [82, 3]
minimax-diff          [69, 3], [71, 9],  [85, 50], [85, 356], [94, 3310]
minimax-weighted-diff [80, 3], [83, 11], [85, 76], [90, 746], [98, 6924]
ab-diff               [76, 3], [83, 8],  [94, 21], [87, 83],  [95, 270]
ab-weighted-diff      [79, 4], [83, 9],  [83, 26], [89, 109], [93, 368]


with unsorted weights

depth                     1        2        3         4          5
max-diff              [68, 2], [69, 2],  [68, 2],   [61, 2],   [60, 2]
max-weighted-diff     [83, 2], [88, 2],  [76, 3],   [77, 3],   [80, 3]
minimax-diff          [60, 3]  [66, 9],  [83, 51],  [85, 364], [91, 3393]
minimax-weighted-diff [79, 3], [77, 11], [86, 79],  [86, 756], [95, 7291]
ab-diff               [65, 3], [77, 8],  [78, 23],  [81, 81],  [83, 304]
ab-weighted-diff      [80, 4], [80, 10], [93, 40],  [86, 191], [94, 844]


Wins and time comparisons for 6 algorithms.

Transcripts

A transcript of each game is made and can be saved to a file before the next game is started. The transcript from the Othello Worked Example is shown below and is loaded and played in the video beneath that. The transcript is written as an 8x8 grid with the move number occupying the square where it was made. Other formats like that shown further down can also be read.


 36  35  23  24  48  39  40  16
 54  50  15  12  38  27   9  19
 31  26  22   1   4   8  14  32
 37  34   5   0   0   7  17  25
 52  49   2   0   0  45  55  20
 13  10   3   6  21  41  46  47
 51  11  29  28  30  42  59  60
 18  43  44  33  53  56  57  58
 
The transcript from the worked example

F5D6C3D3C4F4F6F3E6E7D7G6G5C5C6H6D8E8F8G4F7B3E3B4H4H5H7B5B6C7H3G3C2C8A5A4A6E2D2A3F1G8H2F2A2B7E1G2A8A7B8A1H8G7G1B1B2C1D1H1
An alternative format specifying grid coordinates in move order

Here the board squares are assigned letters A-H in the x direction and and 1-8 in y.

Othello Worked Transcript

Replaying the worked example by loading the transcript. The player steps through the moves by clicking on the wand. The file handling dialogue is not shown to maintain the player's anonymity.

Saving and Loading

When playing Human v Machine the board can be saved to a file at any point and then loaded at a later time so that the game can be continued. Below we show a saved game being loaded and finished. The file opening dialogue is deliberately not shown.

Othello Reversal Example

An example of a dramatic reversal of fortune. It is White to play with only one disk on the board. Notice that White immediately gets another turn as Black has no legal moves.