This page describes an algorithm for solving the classic 9x9 Sudoku puzzles. The source code page will allow you to download a Python script to automate the process. This script will solve all the variants described on the variants page.
|
We will start off with an easy example. This one appeared in the Evening Standard on 31st May 2005, where it was described as "Hard". It has 27 squares filled in, the examples they describe as "Easy" have 32. This doesn't make much difference to a computer algorithm, though if you were working through this by hand the more initial numbers you are given the quicker you should be able to solve it. I call it "easy" as it can be solved after applying only the simpler parts of the algorithm. |
The algorithm works by writing out all the possible answers and then eliminating those that are incompatible with the numbers published in the puzzle. The starting position is that every cell can contain any of the numbers 1-9. We represent this by writing all the possible values into each cell, which gives us a grid that looks like this.
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 | 123456789 |
The overall approach is to eliminate the impossible so that what remains must be the answer.
In the example above we have been given the values for 27 of the cells. The first thing to do is to apply these to our grid and eliminate all the values that are inconsistent with these. Working across the top row the first value we have is in the 5th cell. This is a 1. This means that there can be no other '1's on the top row or in the 5th column, or in the top-middle block of 9. Crossing out all these '1's leaves
| 23456789 | 23456789 | 23456789 | 23456789 | 1 | 23456789 | 23456789 | 23456789 | 23456789 |
| 123456789 | 123456789 | 123456789 | 23456789 | 23456789 | 23456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 23456789 | 23456789 | 23456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
| 123456789 | 123456789 | 123456789 | 123456789 | 23456789 | 123456789 | 123456789 | 123456789 | 123456789 |
Continuing this process; after entering the 9 given values on the top three rows we have
| 2368 | 2368 | 2368 | 23567 | 1 | 9 | 3578 | 4 | 3578 |
| 1239 | 1239 | 4 | 8 | 2357 | 2357 | 6 | 13579 | 13579 |
| 7 | 5 | 13689 | 346 | 346 | 346 | 1389 | 1389 | 2 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
| 12345689 | 12346789 | 12356789 | 12345679 | 23456789 | 12345678 | 12345789 | 12356789 | 13456789 |
This hasn't made much impact on the bottom two thirds of the grid because so far we've only had one value in each column. But we've got 18 more values to place. When we get to '6' in the 6th column of the 6th row the grid is becoming a lot sparser and we notice that the only possible value left for the sixth cell on the third row is '4'. So we have found our first missing number (shown in green).
| 2368 | 2368 | 2368 | 23567 | 1 | 9 | 3578 | 4 | 3578 |
| 1239 | 123 | 4 | 8 | 2357 | 57 | 6 | 13579 | 13579 |
| 7 | 5 | 13689 | 36 | 346 | 4 | 1389 | 1389 | 2 |
| 368 | 9 | 3678 | 1 | 578 | 2 | 3578 | 35678 | 4 |
| 12468 | 124678 | 12678 | 579 | 5789 | 3 | 125789 | 1256789 | 156789 |
| 5 | 12378 | 12378 | 4 | 789 | 6 | 123789 | 123789 | 13789 |
| 1234689 | 1234678 | 12356789 | 235679 | 23456789 | 14578 | 12345789 | 12356789 | 1356789 |
| 1234689 | 1234678 | 12356789 | 235679 | 23456789 | 14578 | 12345789 | 12356789 | 1356789 |
| 1234689 | 1234678 | 12356789 | 235679 | 23456789 | 14578 | 12345789 | 12356789 | 1356789 |
After we've used up all the given numbers (and the one we found) we we are left with the grid.
| 236 | 2368 | 238 | 3567 | 1 | 9 | 3578 | 4 | 578 |
| 1239 | 23 | 4 | 8 | 2357 | 57 | 6 | 159 | 1579 |
| 7 | 5 | 1389 | 36 | 36 | 4 | 1389 | 189 | 2 |
| 36 | 9 | 378 | 1 | 578 | 2 | 578 | 568 | 4 |
| 1246 | 24678 | 1278 | 579 | 578 | 3 | 125789 | 125689 | 156789 |
| 5 | 278 | 1278 | 4 | 78 | 6 | 12789 | 3 | 1789 |
| 8 | 24 | 259 | 56 | 456 | 15 | 1259 | 7 | 3 |
| 239 | 237 | 6 | 357 | 357 | 8 | 4 | 1259 | 159 |
| 34 | 1 | 357 | 2 | 9 | 57 | 58 | 568 | 568 |
So we move onto phase 2, "Looking for Singletons".
Phase two involves counting how often the undecided numbers appear in a set of cells (row, column or 3x3 super-cell). If we start with the first column, the undecided cells are the 1st, 2nd, 4th, 5th, 8th and 9th. The possibilities are '236', '1239', '36', '1246', '239' and '34'. If we count how often each undecided number appears (the known ones are excluded from this process) we get
| Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Frequency | 2 | 4 | 5 | 2 | - | 3 | - | - | 2 |
Every undecided number appears at least twice, so this doesn't help us. Repeating this exercise for columns 2 and 3 doesn't produce any useful information either. When we try column 4 though, we get
| Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Frequency | - | - | 3 | - | 4 | 3 | 3 | - | 1 |
Now the '9' only appears once, in the 5th row, so that is where it must be. We have found our second number. We can now set the '9' and repeat the process described in phase 1 giving.
| 236 | 2368 | 238 | 3567 | 1 | 9 | 3578 | 4 | 578 |
| 1239 | 23 | 4 | 8 | 2357 | 57 | 6 | 159 | 1579 |
| 7 | 5 | 1389 | 36 | 36 | 4 | 1389 | 189 | 2 |
| 36 | 9 | 378 | 1 | 578 | 2 | 578 | 568 | 4 |
| 1246 | 24678 | 1278 | 9 | 578 | 3 | 12578 | 12568 | 15678 |
| 5 | 278 | 1278 | 4 | 78 | 6 | 12789 | 3 | 1789 |
| 8 | 24 | 259 | 56 | 456 | 15 | 1259 | 7 | 3 |
| 239 | 237 | 6 | 357 | 357 | 8 | 4 | 1259 | 159 |
| 34 | 1 | 357 | 2 | 9 | 57 | 58 | 568 | 568 |
That didn't help very much, so we restart this phase at the beginning. The first four columns offer no additional insights, but now the frequencies in column 5 are
| Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Frequency | - | 1 | 3 | 1 | 5 | 2 | 5 | 3 | - |
Which gives us two more discoveries, '2' on the second row and '4' on the 7th. Filling these in using the phase 1 algorithm sets off a chain reaction...
| Action | Consequences |
| (2,5) = 2 | (2,2) = 3 |
| (2,2) = 3 | None |
| (7,5) = 4 | (7,2) = 2 |
| (7,2) = 2 | (8,2) = 7 |
| (8,2) = 7 | (6,2) = 8 |
| (6,2) = 8 | (1,2) = 6, (6,5) = 7 |
| (1,2) = 6 | (1,1) = 2, (5,2) = 4 |
| (6,5) = 7 | None |
| (1,1) = 2 | (1,3) = 8 |
| (5,2) = 4 | None |
| (1,3) = 8 | None |
Co-ordinates are (row, column) with the top left being (1,1). The grid now looks like this (with the new discoveries highlighted).
| 2 | 6 | 8 | 357 | 1 | 9 | 357 | 4 | 57 |
| 19 | 3 | 4 | 8 | 2 | 57 | 6 | 159 | 1579 |
| 7 | 5 | 19 | 36 | 36 | 4 | 1389 | 189 | 2 |
| 36 | 9 | 37 | 1 | 58 | 2 | 578 | 568 | 4 |
| 16 | 4 | 127 | 9 | 58 | 3 | 12578 | 12568 | 15678 |
| 5 | 8 | 12 | 4 | 7 | 6 | 129 | 3 | 19 |
| 8 | 2 | 59 | 56 | 4 | 15 | 159 | 7 | 3 |
| 39 | 7 | 6 | 35 | 35 | 8 | 4 | 1259 | 159 |
| 34 | 1 | 35 | 2 | 9 | 57 | 58 | 568 | 568 |
Restarting this phase counting down the columns shows us that the bottom cell in the first column must be '4'. This doesn't lead to any further discoveries. Trying again reveals that the top cell in the 4th column must be '7'. This second discovery forces both (1,9) and (2,6) to be '5', which in turn forces three more discoveries, leaving.
| 2 | 6 | 8 | 7 | 1 | 9 | 3 | 4 | 5 |
| 19 | 3 | 4 | 8 | 2 | 5 | 6 | 19 | 179 |
| 7 | 5 | 19 | 36 | 36 | 4 | 189 | 189 | 2 |
| 36 | 9 | 37 | 1 | 58 | 2 | 578 | 568 | 4 |
| 16 | 4 | 127 | 9 | 58 | 3 | 12578 | 12568 | 1678 |
| 5 | 8 | 12 | 4 | 7 | 6 | 129 | 3 | 19 |
| 8 | 2 | 59 | 56 | 4 | 1 | 59 | 7 | 3 |
| 39 | 7 | 6 | 35 | 35 | 8 | 4 | 1259 | 19 |
| 4 | 1 | 35 | 2 | 9 | 7 | 58 | 568 | 68 |
Searching the columns again we see that the third cell in the fifth column must be a 6. Setting this triggers another cascade of discoveries which culminates with the solution.
| 2 | 6 | 8 | 7 | 1 | 9 | 3 | 4 | 5 |
| 1 | 3 | 4 | 8 | 2 | 5 | 6 | 9 | 7 |
| 7 | 5 | 9 | 3 | 6 | 4 | 1 | 8 | 2 |
| 3 | 9 | 7 | 1 | 8 | 2 | 5 | 6 | 4 |
| 6 | 4 | 2 | 9 | 5 | 3 | 7 | 1 | 8 |
| 5 | 8 | 1 | 4 | 7 | 6 | 2 | 3 | 9 |
| 8 | 2 | 5 | 6 | 4 | 1 | 9 | 7 | 3 |
| 9 | 7 | 6 | 5 | 3 | 8 | 4 | 2 | 1 |
| 4 | 1 | 3 | 2 | 9 | 7 | 8 | 5 | 6 |
For this example that is as far as we need to go, which is why I've described it as "Easy". I don't have any examples, but any patterns that can be solved only by using the first phase could be described as "Trivial". The next sections discuss some more difficult examples of the problem.
The first two phases are sufficient to many examples, but here is a more difficult example. It only has 23 given numbers, 4 less than the previous example.
| 2 | ||||||||
| 5 | 8 | 6 | ||||||
| 3 | 8 | 5 | ||||||
| 1 | 4 | 7 | 6 | |||||
| 9 | 6 | 5 | 7 | |||||
| 7 | 3 | 9 | 4 | |||||
| 7 | 6 | 8 | ||||||
| 9 | 8 | 1 | ||||||
| 9 |
After applying phases 1 and 2 we are still left with.
| 134 | 379 | 134 | 1578 | 14589 | 14 | 2 | 679 | 13469 |
| 1234 | 5 | 8 | 127 | 1249 | 6 | 3479 | 79 | 1349 |
| 6 | 279 | 124 | 3 | 1249 | 124 | 479 | 8 | 5 |
| 238 | 1 | 23 | 4 | 7 | 5 | 6 | 29 | 289 |
| 9 | 4 | 6 | 128 | 128 | 12 | 5 | 3 | 7 |
| 5 | 28 | 7 | 6 | 3 | 9 | 1 | 4 | 28 |
| 7 | 6 | 12345 | 125 | 1245 | 8 | 349 | 259 | 2349 |
| 234 | 23 | 2345 | 9 | 2456 | 7 | 8 | 1 | 2346 |
| 1248 | 28 | 9 | 125 | 12456 | 3 | 47 | 2567 | 246 |
Which appears to have still a long way to go. If we examine the second column we can see two things of interest.
The first of these rules will allow us to simplify cells (1,2) and (3,2) which currently show the options 3,7,9 and 2,7,9. Because 7 and 9 only appear in these two cells, one must contain 7 and the other 9. Though we don't know which is which, this does allow us to remove the possibilities that these cells contain a 3 or a 2.
The second of these rules applies to cells (6,2) and (9,2), both of which can only contain a 2 or an 8. Again this means that one of the cells must contain 2 and the other 8. Although we don't know which is which we do no that none of the other cells in the column can be a 2 or an 8. This allows us to remove the 2s from (3,2) and (8,2), though in this case (3,2) has already had its two removed using the previous rule.
The table now looks like
| 134 | 79 | 134 | 1578 | 14589 | 14 | 2 | 679 | 13469 |
| 1234 | 5 | 8 | 127 | 1249 | 6 | 3479 | 79 | 1349 |
| 6 | 79 | 124 | 3 | 1249 | 124 | 479 | 8 | 5 |
| 238 | 1 | 23 | 4 | 7 | 5 | 6 | 29 | 289 |
| 9 | 4 | 6 | 128 | 128 | 12 | 5 | 3 | 7 |
| 5 | 28 | 7 | 6 | 3 | 9 | 1 | 4 | 28 |
| 7 | 6 | 12345 | 125 | 1245 | 8 | 349 | 259 | 2349 |
| 234 | 3 | 2345 | 9 | 2456 | 7 | 8 | 1 | 2346 |
| 1248 | 28 | 9 | 125 | 12456 | 3 | 47 | 2567 | 246 |
With the changed cells in green. The fact that 3 is now in a cell on it's own means that we can eliminate all the other 3's in the same block or row.
| 134 | 79 | 134 | 1578 | 14589 | 14 | 2 | 679 | 13469 |
| 1234 | 5 | 8 | 127 | 1249 | 6 | 3479 | 79 | 1349 |
| 6 | 79 | 124 | 3 | 1249 | 124 | 479 | 8 | 5 |
| 238 | 1 | 23 | 4 | 7 | 5 | 6 | 29 | 289 |
| 9 | 4 | 6 | 128 | 128 | 12 | 5 | 3 | 7 |
| 5 | 28 | 7 | 6 | 3 | 9 | 1 | 4 | 28 |
| 7 | 6 | 1245 | 125 | 1245 | 8 | 349 | 259 | 2349 |
| 24 | 3 | 245 | 9 | 2456 | 7 | 8 | 1 | 246 |
| 1248 | 28 | 9 | 125 | 12456 | 3 | 47 | 2567 | 246 |
In this case there are no more pairs that can be used to simplify the grid, so we move onto phase 4, "Eliminating bad guesses".
This phase is simple to describe but harder to implement. In the previous example, after exhausting phase 3 we are still left with
| 134 | 79 | 134 | 1578 | 14589 | 14 | 2 | 679 | 13469 |
| 1234 | 5 | 8 | 127 | 1249 | 6 | 3479 | 79 | 1349 |
| 6 | 79 | 124 | 3 | 1249 | 124 | 479 | 8 | 5 |
| 238 | 1 | 23 | 4 | 7 | 5 | 6 | 29 | 289 |
| 9 | 4 | 6 | 128 | 128 | 12 | 5 | 3 | 7 |
| 5 | 28 | 7 | 6 | 3 | 9 | 1 | 4 | 28 |
| 7 | 6 | 1245 | 125 | 1245 | 8 | 349 | 259 | 2349 |
| 24 | 3 | 245 | 9 | 2456 | 7 | 8 | 1 | 246 |
| 1248 | 28 | 9 | 125 | 12456 | 3 | 47 | 2567 | 246 |
The next phase involves making guesses and observing the consequences. There are three possible outcomes,
This first outcome at first glance appears to be ideal, but at this moment we don't know there is only one solution so we ignore it. Instead what we look for is outcome 2, the contradiction. When we find a contradiction we know that that guess must be wrong so we can remove that number from the possibilities.
Looking at the table above there are 141 possible guesses, starting with the top left cell being 1. This part of the algorithm involves making that guess and then applying the three previous phases to see where it leads us. In this case setting cell (1,1) to 1 leads to a contradiction so we can eliminate it leaving only 3 or 4 in the top left cell. Trying 3 causes no problems but 4 leads to a contradiction, so we can eliminate that too.
Now we only have 3 left so that must be the value in the top left cell. Fixing this value and applying the previous phases leads to the solution
| 3 | 9 | 1 | 8 | 5 | 4 | 2 | 7 | 6 |
| 4 | 5 | 8 | 7 | 2 | 6 | 3 | 9 | 1 |
| 6 | 7 | 2 | 3 | 9 | 1 | 4 | 8 | 5 |
| 8 | 1 | 3 | 4 | 7 | 5 | 6 | 2 | 9 |
| 9 | 4 | 6 | 1 | 8 | 2 | 5 | 3 | 7 |
| 5 | 2 | 7 | 6 | 3 | 9 | 1 | 4 | 8 |
| 7 | 6 | 4 | 2 | 1 | 8 | 9 | 5 | 3 |
| 2 | 3 | 5 | 9 | 6 | 7 | 8 | 1 | 4 |
| 1 | 8 | 9 | 5 | 4 | 3 | 7 | 6 | 2 |
If we take the example from the start of phase 3 (Pairs) and remove the '6' on the fifth row we are left with this puzzle.
| 2 | ||||||||
| 5 | 8 | 6 | ||||||
| 3 | 8 | 5 | ||||||
| 1 | 4 | 7 | 6 | |||||
| 9 | 5 | 7 | ||||||
| 7 | 3 | 9 | 4 | |||||
| 7 | 6 | 8 | ||||||
| 9 | 8 | 1 | ||||||
| 9 |
After applying the first 4 phases we are left with
| 134 | 79 | 134 | 78 | 1589 | 45 | 2 | 67 | 1346 |
| 1234 | 5 | 8 | 127 | 129 | 6 | 3479 | 379 | 134 |
| 126 | 79 | 26 | 3 | 129 | 14 | 479 | 8 | 5 |
| 8 | 1 | 35 | 4 | 7 | 25 | 6 | 239 | 39 |
| 9 | 34 | 346 | 168 | 68 | 12 | 5 | 23 | 7 |
| 56 | 2 | 7 | 56 | 3 | 9 | 1 | 4 | 8 |
| 7 | 6 | 1234 | 12 | 1245 | 8 | 349 | 359 | 2349 |
| 2345 | 34 | 2345 | 9 | 2456 | 7 | 8 | 1 | 2346 |
| 145 | 8 | 9 | 1256 | 12456 | 3 | 47 | 567 | 246 |
Phase 5 works by working through the possible values recursively until it finds a solution or a contradiction. It steps through the possible guesses sequentially. The first cell (1,1) can be 1, 3 or 4. Choosing 1 gives
| 1 | 79 | 34 | 8 | 59 | 45 | 2 | 67 | 346 |
| 34 | 5 | 8 | 7 | 2 | 6 | 349 | 39 | 1 |
| 26 | 79 | 26 | 3 | 19 | 14 | 479 | 8 | 5 |
| 8 | 1 | 35 | 4 | 7 | 25 | 6 | 239 | 39 |
| 9 | 34 | 346 | 16 | 8 | 12 | 5 | 23 | 7 |
| 56 | 2 | 7 | 56 | 3 | 9 | 1 | 4 | 8 |
| 7 | 6 | 1 | 2 | 45 | 8 | 349 | 359 | 349 |
| 2345 | 34 | 2345 | 9 | 456 | 7 | 8 | 1 | 346 |
| 45 | 8 | 9 | 156 | 1456 | 3 | 47 | 567 | 2 |
The next guess is to try 7 in cell (1,2). This leads to
| 1 | 7 | 3 | 8 | 9 | 5 | 2 | 6 | 4 |
| 4 | 5 | 8 | 7 | 2 | 6 | 39 | 39 | 1 |
| 2 | 9 | 6 | 3 | 1 | 4 | 7 | 8 | 5 |
| 8 | 1 | 5 | 4 | 7 | 2 | 6 | 39 | 39 |
| 9 | 3 | 4 | 6 | 8 | 1 | 5 | 2 | 7 |
| 6 | 2 | 7 | 5 | 3 | 9 | 1 | 4 | 8 |
| 7 | 6 | 1 | 2 | 4 | 8 | 39 | 5 | 39 |
| 3 | 4 | 2 | 9 | 5 | 7 | 8 | 1 | 6 |
| 5 | 8 | 9 | 1 | 6 | 3 | 4 | 7 | 2 |
The third guess, setting (2,7) to 3 leads to a solution
| 1 | 7 | 3 | 8 | 9 | 5 | 2 | 6 | 4 |
| 4 | 5 | 8 | 7 | 2 | 6 | 3 | 9 | 1 |
| 2 | 9 | 6 | 3 | 1 | 4 | 7 | 8 | 5 |
| 8 | 1 | 5 | 4 | 7 | 2 | 6 | 3 | 9 |
| 9 | 3 | 4 | 6 | 8 | 1 | 5 | 2 | 7 |
| 6 | 2 | 7 | 5 | 3 | 9 | 1 | 4 | 8 |
| 7 | 6 | 1 | 2 | 4 | 8 | 9 | 5 | 3 |
| 3 | 4 | 2 | 9 | 5 | 7 | 8 | 1 | 6 |
| 5 | 8 | 9 | 1 | 6 | 3 | 4 | 7 | 2 |
Stepping back one guess and trying (2,7) = 9 also leads to a solution. All in all there are 32 possibilities, see here.
| Visits since June 2005: |