CHAPTER 15

image

Generating Sudokus

To generate a valid Sudoku puzzle you have to go through the following two steps: generate a completed Sudoku and remove numbers from it in such a way that only one solution is possible with the clues that are left.

It sounds easy, but it isn’t—especially the part involved in checking that the solution is unique.

The Sudoku generator I am going to describe to you is the fourth I have written. The first three generators were more cumbersome, used the strategies I developed for the Solver, and were seriously affected by some damning bad choices of mine.

Generating a Solved Sudoku

When I started working on the Generator, I already knew that I wanted to generate many Sudokus. For that reason, I built in file support from the very start. Listing 15-1 shows the global-definitions file, while Listing 15-2 shows the beginning of the Generator up to and including where it generates a solved Sudoku.

Listing 15-1. def.h—Global Definitions for the Generator

/* def.h
 *
 * Definitions and declarations
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#ifndef DEF
#define DEF

// General definitions
#define FALSE  0
#define TRUE   1

// Used in some strategies for clarity
#define ROW 0
#define COL 1
#define BOX 2
extern char *unit_names[3];

// grid declarations
extern char grid[9][9];
extern char row[9][9][2];
extern char col[9][9][2];
extern char box[9][9][2];
extern char solved[9][9];

// Flags
extern int silent;

// Patch because Windows doesn't recognize srandom() and random()
#ifdef __WIN32__
#define srandom srand
#define random rand
#endif

#endif

The side effect of using different random generators on the Mac and on the PC is that puzzles generated with the same random seed are different on the two systems. I don’t think this is a problem. I executed many tests with srand() and rand() and am pretty confident that the statistical analyses of Chapter 16 remain perfectly valid, even if each individual puzzle is different.

Listing 15-2. sudoku_gen.c—Part 1: Generating a Solved Sudoku

/* sudoku_gen.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include "brute_comp.h"
#include "count_solved.h"
#include "def.h"
#include "display.h"
#include "display_string.h"
#include "fill.h"
#include "inconsistent_grid.h"
#include "inconsistent_unit.h"
#include "init.h"
#include "in_box.h"
#include "list_solved.h"
#include "multi_html.h"
#include "save_html.h"

#define LOG_TO_FILE___NO
#define FILE_NAME "puzzles.txt"

#define SAVE_HTML_PUZZLE
#define SAVE_HTML_SOLUTION

// N_GRIDS is defined in multi_html.h.
// When N_GRIDS is between 2 and 5, it triggers the creation of multi-grid
// puzzles. With any other value, the Generator creates a classic Sudoku.
#define DO_MULTI_GRID (N_GRIDS >= 2  &&  N_GRIDS <= 5)

// Parameters
#define N_SET_QUADS 5
#define N_SET_PAIRS 10
#define N_SET_CELLS 0
#define ADDITIONAL_CELLS TRUE
#define FIRST_SEED 123456
#define N_SEEDS 1

// Global variables
char *unit_names[3] = {"row", "column", "box"};
char grid[9][9];
char row[9][9][2];
char col[9][9][2];
char box[9][9][2];
char puzzle[9][9];
int silent = FALSE;

// Variables and functions local to this module
char solved[9][9];
int r_1[81];
int c_1[81];
int k_cell;
int remove_quads(int k_puz);
int remove_pairs(int k_puz);
void make_clue_list(void);
int remove_clues(int k_puz);
void remove_more_clues(int k_puz);
int check_uniqueness(void);

// The following table identifies the box of grid 0 that overlaps with other
// grids when creating multi-grid Sudokus.
// The first box refers to puzzle0 and second one to the other puzzle:
//
// N_GRIDS  kPuz=1  kPuz=2  kPuz=3  kPuz=4
//     2    b0-b8
//     3    b0-b8   b8-b0
//     4    b0-b8   b8-b0   b2-b6
//     5    b0-b8   b8-b0   b2-b6   b6-b2
//
// Puzzle:     0   1   2   3   4
int box0[] = {-1,  0,  8,  2,  6};

//======================================================================== main
int main(int argc, char *argv[]) {
  printf("*** sudoku_gen *** ");
  char mess[32];
  int n_seeds = N_SEEDS;
  int k_try = 0;

#if DO_MULTI_GRID
  // When creating multi-grid puzzles, set n_seed to the number of
  // puzzles that are needed and prepare to write the puzzles to file
  n_seeds = N_GRIDS;
#endif

  // Open a file to log the results
  FILE *fp = NULL;
#ifdef LOG_TO_FILE
  fp = fopen(FILE_NAME, "a");
  if (fp == NULL) {
    printf("Unable to open the file '%s' for reading ", FILE_NAME);
    return EXIT_FAILURE;                                                  //==>
    }
#endif

  // Try all the seeds in the given range
  unsigned long start_time = clock();

  for (int k_seed = 0; k_seed < n_seeds; k_seed++) {
    int seed = FIRST_SEED + k_seed;
    srand(seed);
    int brute_result;
    int n;

    // Keep repeating the generation until you find a unique solution
    char puzzle_string[82];
    char solution_string[82];
    do {

      // Generate a solved Sudoku
      do { init(); } while (fill());

      // Save the solved Sudoku
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          solved[k][j] = grid[k][j];
          puzzle[k][j] = grid[k][j];
          }
        }

At first sight, the definitions that precede the beginning of the executable code look similar to those of the Solver. I will talk about the parameters as you need them, but consider the definition of the Sudoku grid. When generating a Sudoku, unlike when solving one, there is no need to keep track of the candidates. Consequently, the grid array has one dimension less, and the unsolved cells are simply set to 0.

Also notice that, beside grid[9][9], there are also a puzzle[9][9] and a solved[9][9]. That is where the Generator stores a puzzle and its reference solution, against which it must then match all subsequent attempted solutions of the same puzzle, in order to ensure uniqueness.

As you can see, you use FIRST_SEED and N_SEEDS together with the control variable k_seed of the subsequent for-loop to seed a different pseudorandom sequence for each puzzle you want to generate. A valid seed is any value between 0 and 2,147,483,647, for each one of which the Generator most likely produces a different puzzle. You cannot be sure that different seeds result in different puzzles, but after creating thousands of puzzles, I haven’t yet encountered a case in which that is not true.

The functions declared below the comment “Variables and functions local to this module” are pieces of code of the main program that I carved out into separate functions to make the code easier to understand, although the main program only executes them once. I will explain the variables and the functions when you get to the point of removing clues.

Also the use of the variables n and brute_result will become clear as we progress through this chapter.

Listing 15-2 doesn’t show the while-condition associated with the do-loop, but the explanatory comment before the do will have to suffice for the time being. Let’s proceed one step at a time.

The first interesting bit is in the statement

do { init(); } while (fill());

This is where you do all the generation work. To understand it, you need to look at init() and fill() (Listing 15-3 and 15-4, respectively).

init() for the Generator

Listing 15-3. init.c for the Generator

/* init.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "init.h"

void init() {

  // Initialize the sudoku arrays
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      grid[k][j] = 0;
      row[k][j][0] = k;
      row[k][j][1] = j;
      col[j][k][0] = k;
      col[j][k][1] = j;
      box[k/3*3+j/3][k%3*3+j%3][0] = k;
      box[k/3*3+j/3][k%3*3+j%3][1] = j;
      }
    }
  }

init() for the Generator is simpler than its counterpart for the Solver described in Chapter 3 because the Sudoku grid for the Generator doesn’t keep track of cell candidates and also because there is no input string to process. So, you only need to initialize the four arrays grid[][], row[][][], col[][][], and box[][][].

fill()

Listing 15-4. fill.c

/* fill.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "display.h"
#include "fill.h"
#include "fill_digit.h"

int fill(void) {
  int problem_found = FALSE;
  int i;
  int kkount = 0;
  for (i = 1; i <= 9 && kkount < 729; i++) {
    int kount = 0;
    do {
      kount++;
      problem_found = fill_digit((char)i);
      if (!silent) printf("fill %d [%d %d]: %s ",
          i, kount, kkount, (problem_found) ? "failed" : "succeeded"
          );
      } while (problem_found && kount < 9);

    if (problem_found) {
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          if (grid[k][j] == i || grid[k][j] == i-1) grid[k][j] = 0;
          }
        }
      i -= 2;
      } // if (problem_found..
    kkount++;
    }
  return problem_found || kkount >= 729;
  }

fill() uses fill_digit() to fill in the Sudoku grid one number at a time, from 1 to 9. As you will see in the next section, fill_digit() relies on the random generator to choose among the available cells. Due to the fact that in a 9x9 matrix there are many more impossible combinations than valid Sudokus, fill_digit() often fails. When this happens, fill() clears the current number and resumes the attempt from the previous one. It subtracts -2 from the current number because i is automatically incremented at the end of the for-loop. This results in re-entering the code inside the for-loop with an i decremented by 1.

But this cannot go on forever. Therefore, after going through the loop 729 times, fill() gives up and returns with a failure indication. Then, the do-loop in the main program cleans up the Sudoku grid by executing init() and then restarts fill(). Why 729? It is 9 times 81 (i.e. 9 cubed). It seems a good value. The actual requirement is that the number of attempts is significantly greater than 81, so that fill() doesn’t give up too easily.

If you start the Generator with silent set to FALSE, the following is an example of what fills() logs onto the standard output:

fill 1 [1 0]: succeeded
fill 2 [1 1]: succeeded
fill 3 [1 2]: succeeded
fill 4 [1 3]: succeeded
fill 5 [1 4]: succeeded
fill 6 [1 5]: succeeded
fill 7 [1 6]: failed
fill 7 [2 6]: failed
fill 7 [3 6]: succeeded
fill 8 [1 7]: failed
fill 8 [2 7]: failed
fill 8 [3 7]: failed
fill 8 [4 7]: failed
fill 8 [5 7]: succeeded
fill 9 [1 8]: succeeded

And here is another example with some backtracking.

fill 1 [1 0]: succeeded
fill 2 [1 1]: succeeded
fill 3 [1 2]: succeeded
fill 4 [1 3]: succeeded
fill 5 [1 4]: failed
fill 5 [2 4]: succeeded
fill 6 [1 5]: failed
fill 6 [2 5]: failed
fill 6 [3 5]: succeeded
fill 7 [1 6]: failed
fill 7 [2 6]: failed
fill 7 [3 6]: failed
fill 7 [4 6]: failed
fill 7 [5 6]: failed
fill 7 [6 6]: failed
fill 7 [7 6]: failed
fill 7 [8 6]: failed
fill 7 [9 6]: failed
fill 6 [1 7]: failed
fill 6 [2 7]: failed
fill 6 [3 7]: failed
fill 6 [4 7]: failed
fill 6 [5 7]: succeeded
fill 7 [1 8]: failed
fill 7 [2 8]: failed
fill 7 [3 8]: failed
fill 7 [4 8]: succeeded
fill 8 [1 9]: succeeded
fill 9 [1 10]: succeeded

Both examples resulted in a solved Sudoku, as confirmed by the fact that the last entry in the log states that fill() succeeded in placing the 9s. But now have a look at fill_digit() (see Listing 15-5), which includes an interesting algorithm.

fill_digit()

Listing 15-5. fill_digit.c

/* fill_digit.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "fill_digit.h"

int fill_digit(char i) {
  const int other_box[9][2][2] = {  // [this box][row/column][..]
              {/* 0 */ {-1    }, {-1    }},
              {/* 1 */ { 0, -1}, {-1    }},
              {/* 2 */ { 0,  1}, {-1    }},
              {/* 3 */ {-1    }, { 0, -1}},
              {/* 4 */ { 3, -1}, { 1, -1}},
              {/* 5 */ { 3,  4}, { 2, -1}},
              {/* 6 */ {-1    }, { 0,  3}},
              {/* 7 */ { 6, -1}, { 1,  4}},
              {/* 8 */ { 6,  7}, { 2,  5}},
            };
  int solved_cells[2][9] = {{-1, -1, -1, -1, -1, -1, -1, -1, -1}};

  int problem_found = FALSE;
  int n_cells;
  int cell[9][2];
  for (int kB = 0; kB < 9 && !problem_found; kB++) {
    problem_found = TRUE;
    n_cells = 0;
    for (int k = 0; k < 9; k++) {
      int kR = box[kB][k][ROW];
      int kC = box[kB][k][COL];
      if (grid[kR][kC] == 0) {
        int rc[2];
        rc[ROW] = kR;
        rc[COL] = kC;
        int conflict = FALSE;
        for (int kRC = 0; kRC < 2 && !conflict; kRC++) {
          int kkS = other_box[kB][kRC][0];
          if (kkS >= 0) {
            if (rc[kRC] == solved_cells[kRC][kkS]) {
              conflict = TRUE;
              }
            else {
              kkS = other_box[kB][kRC][1];
              if (kkS >= 0 && rc[kRC] == solved_cells[kRC][kkS]) {
                conflict = TRUE;
                }
              }
            } // if (kkS..
          } // for (int kRC..

        if (!conflict) {
          cell[n_cells][ROW] = kR;
          cell[n_cells][COL] = kC;
          n_cells++;
          }
        } // if (grid[kR][kC]..
      } // for (int k..

    // Pick a cell of the box
    if (n_cells > 0) {
      problem_found = FALSE;
      int kE = rand() % n_cells;
      solved_cells[ROW][kB] = cell[kE][ROW];
      solved_cells[COL][kB] = cell[kE][COL];
      grid[solved_cells[ROW][kB]][solved_cells[COL][kB]] = i;
      }

    } // for (int kB..

  if (problem_found) {

    // Restore the grid to its initial status
    for (int m = 0; m < 9 && solved_cells[ROW][m] >= 0; m++) {
      grid[solved_cells[ROW][m]][solved_cells[COL][m]] = 0;
      }
    }
  return problem_found;
  }

To understand how the algorithm works, you first have to be clear about the purpose of other_box and how it is structured. For convenience, I will reproduce here as Figure 15-1 the figure with the box IDs that you have already seen in Chapter 1 as Figure 1-3.

9781484209967_Fig15-01.jpg

Figure 15-1. Box IDs (repeated)

Imagine that you want to place one number 7 (just because I like 7s) in each box. You start with box 0 and place it anywhere you like.

When you want to place a 7 in box 1, you have to ensure that it is in a row other than that containing the 7 you have already placed in box 0. And when you want to place a 7 in box 2, you are only left with one row, because what is available in box 2 depends on the rows already occupied with 7s in box 0 and 1.

The same reasoning for boxes 1 and 2 applies to boxes 3 and 6 if you replace in the previous paragraph all occurrences of the word “row” with the word “column.”

Box 4 is different, because to find what cells are still available to place a 7, you have to eliminate the row used in box 3 and the column used in box 1.

You’ve got the idea. The most difficult box is box 8 (obviously assuming that you proceed with the boxes in order) because by eliminating the rows used in 6 and 7 and the columns used in 2 and 5, you are left with only a single cell.

In any case, a box never depends on more than two other boxes for what concerns the rows and two other boxes for what concerns the columns. The purpose of other_box is to store these dependencies. Its first index is the box ID, from 0 to 8. Its second index distinguishes between rows and columns (remember that ROW is 0 and COL is 1). Its third dimension stores the IDs of the boxes each box depends on. For example, other_box[5][ROW][0] contains 3 and other_box[5][ROW][1] contains 4 because box 5 has to avoid conflicts with the rows of box 3 and 4. Similarly other_box[5][COL][] contains 2 and -1 because box 5 has to avoid conflicts with the columns of box 2. The -1 indicates that there are no further boxes whose columns you need to consider when working on box 5. That’s why the first locations of other_box[0][ROW][] and other_box[0][COL][] are both set to -1: box 0 doesn’t depend on any other box.

If you are thinking that fill_digit() cannot ever fail, think again. What you are not considering is that as fill() proceeds with the numbers from 1 to 9, there are always fewer places available in the Sudoku grid. fill_digit(1) will always succeed, fill_digit(2) almost always, fill_digit(3) most likely, and so on. But when you arrive at the high numbers, there might be no solution available. In fact, already the placement of the second number might fail. If you are not convinced, look at Figure 15-2. Where would you place the 2 of box 8?

9781484209967_Fig15-02.jpg

Figure 15-2. A failed fill_digit()

Because it can fail, it is necessary for fill_digit() to remember the coordinates of all the cells of the Sudoku grid that it solves, so that it can revert the modified cells to their original free state if necessary. If it didn’t do so, it would have to start every time from box 0 and hope to succeed. As it is, fill_digit() behaves as if it tried to find a path through a maze by backtracking to the last fork rather than going back to the entrance after each dead end it encounters. The array solved_cells keeps track of what fill_digit() has already achieved: whenever it finds a solution for a box, it saves its coordinates in solved_cells. For example, when in the example in Figure 15-2 the algorithm places the 1 for box 7 in (6,3), it updates solved_cells as follows:

solved_cells[ROW][7] = 6;
solved_cells[COL][7] = 3;

For each box (i.e., for each value of kB from 0 to 8), the for-loop with control variable k saves in the array cell the coordinates of all the cells of the current box that have not been solved and that do not conflict with cells containing the same number in boxes the current box depends on.

As an example, let’s follow what happens inside the for-loop with control variable k when fill_digit() processes box 5 for the number 2 in the situation of  Figure 15-2.

After the processing of boxes from 0 to 4, solved_cells contains the following values:

           box ID:  0  1  2  3  4  5  6  7  8
solved_cells[ROW]:  0  2  1  4  5 -1 -1 -1 -1
solved_cells[COL]:  1  4  6  2  3  x  x  x  x

where x means undefined because fill_digit() doesn’t initialize the column fields.

k == 0 sets            kR = box[5][0][ROW]  (i.e., 3)
       and           kC = box[5][0][COL]  (i.e., 6)

As grid[3][6] is 0, fill_digit() sets rc[ROW] to 3 and rc[COL] to 6, after which it enters the for-loop with control variable kRC set to ROW and sets kkS as follows:

kkS = other_box[5][ROW][0] (i.e., 3)

This means that

solved_cells[ROW][kkS] == 4

The cell of box 3 containing a 2 is in row 4. As rc[ROW] == 3, it means that there is no conflict. fill_digit() sets kkS to the second value stored in other_box.

kkS = other_box[5][ROW][1] (i.e., 4)

If the current box depended on a single other box for what concerns rows, this second value of kkS would be -1. As it is, you compare rc[ROW] with

solved_cells[ROW][4] == 5

The cell of box 4 containing a 2 is in row 5. Once more, there is no conflict. The for-loop continues with kRC set to COL. This time

kkS = other_box[5][COL][0] (i.e., 2)

which means that

solved_cells[COL][2] == 6

The cell of box 2 containing a 2 is in column 6. But this is identical to rc[COL]!

conflict is set to TRUE, which causes the for-loop with control variable kRC to exit and the saving of the cell coordinates into cell to be skipped.

The next iteration with k == 1 means

kR = box[5][1][ROW] = 3
kC = box[5][1][COL] = 7

This cell is free and doesn’t cause any conflict. Therefore, fill_digit() can save its coordinates.

cell[0][ROW] = 3;
cell[0][COL] = 7;
n_cells = 1;

When k exceeds 8, the for-loop exits and

cell = (3,7), (3,8)
n_cells = 2

That is, fill_digit() has in box 5 a choice of two cells to place a 2. In the example, it picks at random (3,8), which contributes to making it impossible to place a 2 in box 8.

Removing Clues to Make a Puzzle

Conceptually, it is simple: keep removing a clue at a time as long as the puzzle has a unique solution. When you see that the puzzle has multiple solutions, go back one removal and you are done.

In practice, though, it makes sense to start by removing more than one clue at a time. First because you are not interested in puzzles with too many clues anyway and second because you can remove sets of clues symmetrically, so that the resulting puzzle looks somewhat symmetrical. There is no logical reason for preferring more symmetrical puzzles, but they are more pleasing to the eye.

You can remove some quadruples of cells and then some couples before removing individual cells. To do so, you make lists of all possible symmetrical quadruplets, of all possible symmetrical pairs, and of all cells. Then, you arrange the elements of the lists in a random order, so that when you need them, you can go through the lists in sequence. The alternative would be to pick quadruplets, pairs, and cells at random when needed, but preparing the lists in advance is neater.

Figure 15-3 shows some examples of symmetrical quadruples of cells. There are a total of 20 possible quadruplets, which you can identify by placing the top-left cell in all positions grayed in Figure 15-3.

9781484209967_Fig15-03.jpg

Figure 15-3. Examples of symmetric quadruplets of cells

Figure 15-4 shows some examples of symmetrical pairs. There are 40 possible pairs, and to identify them all, you only need to place one of the cells in the grayed area of Figure 15-4.

9781484209967_Fig15-04.jpg

Figure 15-4. Examples of symmetric pairs of cells

Listing 15-6 shows the portion of the Generator that removes clues.

Listing 15-6. sudoku_gen.c—Part 2: Removing Clues from Quads, Pairs, and Cells

//========= Remove N_SET_QUADS quadruples of clues
if (N_SET_QUADS > 0) {
  int success = remove_quads(k_seed);
  if (!success) {
    brute_result = BRUTE_COMP_DIFFERENT;
    goto skip;                                                      //==>
    }
  }

//========= Remove N_SET_PAIRS pairs of clues
if (N_SET_PAIRS > 0) {
  int success = remove_pairs(k_seed);
  if (!success) {
    brute_result = BRUTE_COMP_DIFFERENT;
    goto skip;                                                      //==>
    }
  }

//========= Remove N_SET_CELLS individual clues and then some more
make_clue_list();
k_cell = 0;
if (N_SET_CELLS > 0) {
  int success = remove_clues(k_seed);
if (!success) {
  brute_result = BRUTE_COMP_DIFFERENT;
  goto skip;                                                      //==>

  }
if (ADDITIONAL_CELLS && k_cell < 81) remove_more_clues(k_seed);

As you can see, the Generator executes remove_quads() to remove N_SET_QUADS quadruplets, remove_pairs() to remove N_SET_PAIRS pairs, and remove_clues() to remove N_SET_CELLS individual clues. The variable k_cell is one of those defined at the beginning of the module. As you will see in a moment, the Generator uses it to keep track of how many random individual clues it has removed or attempted to remove. If, after removing N_SET_CELLS individual clues, k_cell hasn’t reached 81 and ADDITIONAL_CELLS is set, it means that the Generator has not exhausted the list of clues that it can attempt to remove. Therefore, it tries to remove more individual clues by executing remove_more_clues().

Notice how I have used goto statements to break out of nested loops. For many programmers the use of goto is taboo but, to avoid goto’s in this case, you would have to add four if-statements. This would increase the maximum level of indentation and, ultimately, make the code less readable. Programming is a rational activity and shouldn’t become enslaved to taboos.

The goto skip statements cause jumps to the end of the big do-loop of which you still have to see the while-condition. When it is executed, the Generator restarts from the do, fills a new Sudoku from scratch, and tries again to remove first the quadruples of clues, then the pairs, and finally the individual clues.

In Listing 15-7 follow what happens within remove_quads():

Listing 15-7. sudoku_gen.c – remove_quads()

#define N_QUADS 20
int remove_quads(int kPuz) {

  // Build a random list of cells to be quadrupled
  int r_4[N_QUADS];
  int c_4[N_QUADS];
  {
    char quads[9][9] = {{0}};
    for (int k = 0; k < N_QUADS; k++) {
      int kR;
      int kC;
      do {
        int kk = rand() % N_QUADS;
        kR = kk >> 2;
        kC = kk - (kR << 2);
        } while (quads[kR][kC] > 0);
      r_4[k] = kR;
      c_4[k] = kC;
      quads[kR][kC] = 1;
      }
    }

  // Change quadruples until you get a matching solution
  int k_quad = -1;
  int n_quads = 0;
  while (n_quads < N_SET_QUADS && k_quad < N_QUADS-1) {
    k_quad++;
    n_quads++;
    int quad[4][2] = {{0}}; // [index][row/col]
    int kR = r_4[k_quad];
    int kC = c_4[k_quad];
    quad[0][ROW] = kR;
    quad[0][COL] = kC;
    quad[1][ROW] = kR;
    quad[1][COL] = 8 - kC;
    if (kR == 4) {
      quad[2][ROW] = kC;
      quad[2][COL] = kR;
      quad[3][ROW] = 8 - kC;
      quad[3][COL] = kR;
      }
    else {
      quad[2][ROW] = 8 - kR;
      quad[2][COL] = kC;
      quad[3][ROW] = 8 - kR;
      quad[3][COL] = 8 - kC;
      }
    if (!silent) printf("Removed quad %d:", k_quad);
    for (int k = 0; k < 4; k++) {
      int kR = quad[k][ROW];
      int kC = quad[k][COL];

      // The following 'if' is only needed when creating multi-grid puzzles
      if (kPuz == 0  ||  !in_box(kR, kC, overlapping_box[kPuz])) {
        grid[kR][kC] = 0;
        }
      if (!silent) printf("(%d,%d)", kR, kC);
      }
    if (!silent) printf(" ");

    // Save the Sudoku puzzle after the removal
    for (int k = 0; k < 9; k++) {
      for (int j = 0; j < 9; j++) {
        puzzle[k][j] = grid[k][j];
        }
      }

    // Solve with brute() and see whether the solution matches
    // the reference
    int brute_result = brute_comp();
    if (!silent) printf("Brute after removing quad %d: %s ",
        k_quad, brute_comp_err[brute_result]
        );

    // If not, backtrack
    if (brute_result != BRUTE_COMP_OK) {
      if (!silent) printf("Backtracking the last quadruple ");
      puzzle[kR][kC] = solved[kR][kC];
      puzzle[kR][8-kC] = solved[kR][8-kC];
      puzzle[8-kR][kC] = solved[8-kR][kC];
      puzzle[8-kR][8-kC] = solved[8-kR][8-kC];
      n_quads--;
      }

    // Restore the puzzle to how it was before solving it
    for (int k = 0; k < 9; k++) {
      for (int j = 0; j < 9; j++) {
        grid[k][j] = puzzle[k][j];
        }
      }
    } // while (n_quads..
  int success = n_quads == N_SET_QUADS;
  if (!silent) {
    if (success) {
      printf("%d clues left after removing the quadruples ", count_solved());
      display();

      // Save the Sudoku puzzle after removing the quadruples
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          grid[k][j] = puzzle[k][j];
          }
        }
      }
    else {
      printf("No unique solution when removing quadruples. Run aborted. ");
      }
    }
  return success;
  } // remove_quads

The first part of remove_quads() generates the list of random cells that can be quadrupled—that is, of cells in the gray area of Figure 15-3. To do so, it keeps executing

int kk = random() % N_QUADS;
kR = kk >> 2;
kC = kk - (kR << 2);

until it hits a free place on the Sudoku grid. The shift-right-two-bits and shift-left-two-bits operations are equivalent, respectively, to dividing and multiplying by 4. I always like to replace divisions and multiplications by powers of two with bit-shifts. With optimizing compilers, it makes no difference, but it is for me a somewhat nostalgic reminder of when I had to deal with assembly language. You are obviously welcome to replace >> 2 with /4 and << 2 with *4. Incidentally, the parentheses enclosing the left-shift are unnecessary. They are only there because Eclipse otherwise issues an unjustified warning.

In any case, remove_quads() saves row and column coordinates of the cells it selects in r_4 and c_4.

The second part of the function is where remove_quads() actually removes the quadruples of clues from the Sudoku grid. It uses the k_quad variable to identify the quadruplet currently being removed, and n_quad to count the quadruplets already removed. It first saves the coordinates of the four symmetrical cells in quad[][], and then removes them with the following loop (here stripped of comments and printfs):

for (int k = 0; k < 4; k++) {
  int kR = quad[k][ROW];
  int kC = quad[k][COL];
  if (kPuz == 0  ||  !in_box(kR, kC, overlapping_box[kPuz])) {
    grid[kR][kC] = 0;
    }
  }

The reason for making the removal of the clue from the cell with coordinates (kR,kC) conditional will become clear in Chapter 18, where you will learn about multi-grid Sudokus (of which Samurai Sudokus are a particular case). There, you will also find a description of in_box(). For now, you can simply ignore it because, when generating classic Sudokus, kPuz is always 0. Therefore, the condition is always verified and the clue at coordinates (kR,kC) is always removed.

After removing the clues, remove_quads() saves the Sudoku grid into the aptly named variable puzzle, and attempts to solve it with a brute-force algorithm (explained in the following section). If the brute-force solution does not match the Sudoku grid as it was before the removal (i.e., as stored in the array solved), remove_quads() restores the Sudoku grid to how it was before removing the current quadruplet and tries the next quadruplet.

To avoid confusion, perhaps I should restate here that solved stores the reference solution, grid is where the various algorithms do their work, and puzzle is where you keep the current state of the puzzle as it goes through the various phases of the generation process.

remove_pairs() (see Listing 15-8) is very similar to remove_quads().

Listing 15-8. sudoku_gen.c – remove_pairs()

#define N_PAIRS 40
int remove_pairs(int kPuz) {

  // Build a random list of cells to be paired
  int r_2[N_PAIRS];
  int c_2[N_PAIRS];
  {
    char pairs[9][9] = {{0}};
    for (int k = 0; k < N_PAIRS; k++) {
      int kR;
      int kC;
      do {
        int kk = rand() % N_PAIRS;
        kR = kk / 9;
        kC = kk - kR * 9;
        } while (pairs[kR][kC] > 0);
      r_2[k] = kR;
      c_2[k] = kC;
      pairs[kR][kC] = 1;
      }
    }

  // Change pairs until you get a matching solution
  int k_pair = -1;
  int n_pairs = 0;
  while (n_pairs < N_SET_PAIRS && k_pair < N_PAIRS-1) {
    int kR;
    int kC;
    do {
      k_pair++;
      if (k_pair < N_PAIRS) {
        kR = r_2[k_pair];
        kC = c_2[k_pair];
        if (grid[kR][kC] == 0) {
          if (!silent) printf("Pair %d: (%d,%d) (%d,%d) overlaps"
              " with quadruple ", k_pair, kR, kC, 8-kR, 8-kC
              );
          }
        }
      } while (grid[kR][kC] == 0 && k_pair < N_PAIRS);
    if (k_pair < N_PAIRS) {

      // The following two 'if' are only needed when creating multi-grid puzzles
      if (kPuz == 0  ||  !in_box(kR, kC, overlapping_box[kPuz])) {
        grid[kR][kC] = 0;
        }
      if (kPuz == 0  ||  !in_box(8 - kR, 8 - kC, overlapping_box[kPuz])) {
        grid[8-kR][8-kC] = 0;
        }
      n_pairs++;
      if (!silent) printf("Removed pair %d: (%d,%d) (%d,%d) ",
          k_pair, kR, kC, 8-kR, 8-kC
          );

      // Save the Sudoku puzzle after the removal
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          puzzle[k][j] = grid[k][j];
          }
        }

      // Solve with brute() and see whether the solution matches
      // the reference
      int brute_result = brute_comp();
      if (!silent) printf("Brute after removing pair %d: %s ",
          k_pair, brute_comp_err[brute_result]
          );

      // If not, backtrack
      if (brute_result != BRUTE_COMP_OK) {
        if (!silent) printf("Backtracking the last pair ");
        puzzle[kR][kC] = solved[kR][kC];
        puzzle[8-kR][8-kC] = solved[8-kR][8-kC];
        n_pairs--;
        }

      // Restore the puzzle to how it was before solving it
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          grid[k][j] = puzzle[k][j];
          }
        }
      } // if (k_pair..
    } // while (n_pairs..

  int success = n_pairs == N_SET_PAIRS;
  if (!silent) {
    if (success) {
      printf("%d clues left after removing the pairs ", count_solved());
      display();

      // Save the Sudoku puzzle after removing the pairs
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          grid[k][j] = puzzle[k][j];
          }
        }
      }
    else {
      printf("No unique solution when removing pairs. Run aborted. ");
      }
    }
  return success;
  } // remove_pairs

After removing the number of quadruplets defined in N_SET_QUADS, the Generator removes pairs, but only if the brute-force algorithm executed while removing quadruplets was successful in finding a solution that matched the original one. And then, after removing the number of pairs defined in N_SET_PAIRS, the Generator removes individual clues, but, again, only if the brute-force algorithm executed while removing pairs was successful in finding a solution that matched the original one.

When it comes to removing individual clues, you need to make something slightly different (see Listing 15-9 and Listing 15-10).

Listing 15-9. sudoku_gen.c – make_clue_list()

void make_clue_list() {
  char singles[9][9] = {{0}};
  for (int k = 0; k < 81; k++) {
    int kR;
    int kC;
    do {
      int kk = rand() % 81;
      kR = kk / 9;
      kC = kk - kR * 9;
      } while (singles[kR][kC] > 0);
    r_1[k] = kR;
    c_1[k] = kC;
    singles[kR][kC] = 1;
    }
  } // make_clue_list

The first difference is that the making of the clue list and the removal of a set number of individual clues must be in separate functions. This is necessary because the Generator needs a list of random individual clues also when N_SET_CELLS == 0. The list of cells for the quadruplets is inside remove_quads() and the list of cells for the pairs is inside remove_pairs(), but if the list of cells for individual clues were inside remove_clues(), you would only make it when N_SET_CELLS > 0. This would prevent you from being able to set N_SET_CELLS to 0 and still remove additional clues by setting ADDITIONAL_CELLS to TRUE.

The second difference is that you must keep k_cell, which is equivalent to k_quad inside remove_quads() and k_pair inside remove_pairs(), outside remove_clues() because the Generator otherwise would lose track of the individual clues already removed and wouldn’t be able to remove additional individual clues.

Listing 15-10. sudoku_gen.c – remove_clues()

int remove_clues(int kPuz) {
  int success = TRUE;
  int n_cells = 0;
  while (n_cells < N_SET_CELLS && success) {
    int kR;
    int kC;
    do {
      kR = r_1[k_cell];
      kC = c_1[k_cell];
      if (grid[kR][kC] == 0) {
        if (!silent) printf("1 Cell %d: (%d,%d) overlaps with quadruple"
            " or pair ", k_cell, kR, kC
            );
        }
      k_cell++;
      } while (grid[kR][kC] == 0 && k_cell < 81);
    if (k_cell > 81) {
      if (!silent) printf("1 No more cells available after removing"
          " %d clues. Run aborted. ", n_cells
          );
      success = FALSE;
      }
    // The following 'if' is only needed when creating multi-grid puzzles
    else if (kPuz == 0  ||  !in_box(kR, kC, overlapping_box[kPuz])) {
      grid[kR][kC] = 0;
      n_cells++;
      if (!silent) printf("1 Clue removal %d, removed"
          " cell %d: (%d,%d) ", n_cells, k_cell-1, kR, kC
          );

      // Save the Sudoku puzzle after the removal
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          puzzle[k][j] = grid[k][j];
          }
        }

      // Solve with brute() and see whether the solution matches
      // the reference
      int brute_result = brute_comp();
      if (!silent) printf("1 Brute after removing cell %d: %s ",
          k_cell-1, brute_comp_err[brute_result]
          );

      // If not, backtrack
      if (brute_result != BRUTE_COMP_OK) {
        if (!silent) printf("1 Backtracking the last cell ");
        puzzle[kR][kC] = solved[kR][kC];
        n_cells--;
        }

      // Restore the puzzle to how it was before solving it
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          grid[k][j] = puzzle[k][j];
          }
        }
      } // if (k_cell.. else ..
    } // while (n_cells..

  if (success) {

    // Save the Sudoku puzzle after removing the individual clues
    for (int k = 0; k < 9; k++) {
      for (int j = 0; j < 9; j++) {
        grid[k][j] = puzzle[k][j];
        }
      }
    if (!silent) {
      printf("%d clues left after removing %d individual clues ",
        count_solved(), n_cells
        );
      display();
      }
    }
  return success;
  } // remove_clues

For initial tests, you can set N_SET_QUADS to 5, N_SET_PAIRS to 10, and N_SET_CELLS to 0. The result before the Generator executes remove_more_clues() is a puzzle with 81 – N_SET_QUADS*4 - N_SET_PAIRS*2 - N_SET_CELLS*0 = 41 clues.

The major difference between remove_more_clues() (see Listing 15-11) and the other remove_ functions is that it keeps going until the brute-force algorithm fails to find a solution. After that, it only needs to restore the last removed clue to obtain a valid puzzle. This is how the Generator minimizes the number of clues in each puzzle.

Listing 15-11. sudoku_gen.c – remove_more_clues()

void remove_more_clues(int kPuz) {
  int brute_result;
  do {
    int kR;
    int kC;
    do {
      kR = r_1[k_cell];
      kC = c_1[k_cell];
      if (grid[kR][kC] == 0) {
        if (!silent) printf("2 Cell %d: (%d,%d) overlaps with quadruple"
            " or pair ", k_cell, kR, kC
            );
        }
      k_cell++;
      } while (grid[kR][kC] == 0 && k_cell < 81);

    // The second part of the following 'if' is only needed when creating
    // multi-grid puzzles
    if (k_cell <= 81  &&
        (kPuz == 0  ||  !in_box(kR, kC, overlapping_box[kPuz]))
        ) {
      grid[kR][kC] = 0;
      if (!silent) printf("2 Clue removal %d, removed"
          " cell %d: (%d,%d) ", 81-count_solved(), k_cell-1, kR, kC
          );

      // Save the Sudoku puzzle after the removal
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          puzzle[k][j] = grid[k][j];
          }
        }

      // Solve with brute() and see whether the solution matches the reference
      brute_result = brute_comp();
      if (!silent) printf("2 Brute after removing cell %d: %s ",
          k_cell-1, brute_comp_err[brute_result]
          );

      // Restore the puzzle to how it was before solving it
      for (int k = 0; k < 9; k++) {
        for (int j = 0; j < 9; j++) {
          grid[k][j] = puzzle[k][j];
          }
        }
      } // if (k_cell..
    } while (brute_result == BRUTE_COMP_OK && k_cell < 81);

  // Restore the last clue removed
  if (brute_result != BRUTE_COMP_OK) {
    int kR = r_1[k_cell-1];
    int kC = c_1[k_cell-1];
    puzzle[kR][kC] = solved[kR][kC];
    }
  } // remove_more_clues

brute_comp()

This function has the dual purpose of attempting to execute a brute-force algorithm to solve a Sudoku puzzle and of analyzing its results (see Listing 15-12 and Listing 15-13).

Listing 15-12. brute_comp.h

/* brute_comp.h
 *
 * Solves a Sudoku by executing brute() and then compares the result with
 * the reference
 *
 * See below for the return codes
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#ifndef BRUTE_COMPARE
#define BRUTE_COMPARE

// If you modify the following list, change brute_comp_err accordingly
#define BRUTE_COMP_OK            0
#define BRUTE_COMP_DIFFERENT     1
#define BRUTE_COMP_TIMEOUT       2
#define BRUTE_COMP_INCONSISTENT  3
#define BRUTE_COMP_IMPOSSIBLE    4
#define BRUTE_COMP_PROBLEM       5

extern const char *brute_comp_err[];

int brute_comp(void);

#endif

Listing 15-13. brute_comp.c

/* brute_comp.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "brute.h"
#include "brute_comp.h"
#include "def.h"
#include "inconsistent_grid.h"

// This messages in clear must match the codes defined in brute_comp.h
const char *brute_comp_err[] = {
    /* 0 */ "same as reference",
    /* 1 */ "different from reference",
    /* 2 */ "timeout",
    /* 3 */ "grid inconsistent",
    /* 4 */ "puzzle impossible",
    /* 5 */ "unknown result"
    };

int brute_comp() {
  int result = BRUTE_COMP_PROBLEM;
  int brute_result = brute();
  switch (brute_result) {

    case BRUTE_SUCCESSFUL:
      result = BRUTE_COMP_OK;
      for (int kk = 0; kk < 9 && result == BRUTE_COMP_OK; kk++) {
        for (int jj = 0; jj < 9 && result == BRUTE_COMP_OK; jj++) {
          if (solved[kk][jj] != grid[kk][jj]) result = BRUTE_COMP_DIFFERENT;
          }
        }
      if (inconsistent_grid()) result = BRUTE_COMP_INCONSISTENT;
      break;

    case BRUTE_IMPOSSIBLE:
      result = BRUTE_COMP_IMPOSSIBLE;
      break;

    case BRUTE_TIMEOUT:
      result = BRUTE_COMP_TIMEOUT;
      break;

    default:
      result = BRUTE_COMP_PROBLEM;
      break;
    }
  return result;
  }

The code of brute_comp() is almost trivial. Notice that, unlike the Solver, which relied on the analytical strategies for backtracking, the Generator adopts a real brute-force approach, thereby making it unnecessary to include the Solver’s strategies. The interesting algorithm to solve a Sudoku with brute force is in brute() (see Listing 15-14 and Listing 15-15).

Listing 15-14. brute.h

/* brute.h
 *
 * Solves a Sudoku by brute force
 *
 * See below for the return codes
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#ifndef BRUTE
#define BRUTE

#define BRUTE_SUCCESSFUL  0
#define BRUTE_IMPOSSIBLE -1
#define BRUTE_TIMEOUT    -2

#define BRUTE_MAX_TIME 10

int brute(void);

#endif

Listing 15-15. brute.c

/* brute.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "brute.h"
#include "def.h"
#include "inconsistent_unit.h"

int brute() {
  int result = BRUTE_SUCCESSFUL;
  unsigned long start_time = clock()/CLOCKS_PER_SEC;
  unsigned long this_time;
  char initial[9][9];
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      initial[k][j] = grid[k][j];
      }
    }
  int k = 0;
  int j = 0;
  do {
    do {
      if (initial[k][j] == 0) {
        int i = grid[k][j] + 1;
        if (i > 9) {
          grid[k][j] = 0;
          do {
            do { j--; } while (j >= 0 && initial[k][j] != 0);
            if (j < 0) {
              k--;
              if (k < 0) {
                result = BRUTE_IMPOSSIBLE;
                goto done;                                                //==>
                }
              j = 8;
              }
            } while (initial[k][j] != 0);
          } // if (i..
        else {
          grid[k][j] = i;
          int kB = k/3*3+j/3;
          if (    !inconsistent_unit("row", k, row[k])
               && !inconsistent_unit("column", j, col[j])
               && !inconsistent_unit("box", kB, box[kB])
               ) {
            j++;
            }
          } // if (i.. else
        } // if (initial[k][j]..
      else {
        j++;
        }
      this_time = clock()/CLOCKS_PER_SEC;
      if (this_time - start_time > BRUTE_MAX_TIME) {
        result = BRUTE_TIMEOUT;
        goto done;                                                        //==>
        }
      } while (j < 9);
    k++;
    j = 0;
    } while (k < 9);

done:
  return result;
  }

The time-out interval BRUTE_MAX_TIME of 10 seconds is completely arbitrary. I just found it boring to wait longer than that, and, in most cases, 10 seconds, even on a computer that is a few years old, are more than enough to solve a puzzle with brute force.

The algorithm is conceptually straightforward: go through all the unsolved cells in sequence and try to solve them with 1. If 1 is in conflict with some other numbers, try 2. If 2 also causes a conflict, try 3, and so on until you find a number that doesn’t cause any conflict. At that point, go to the next cell and do the same. If all numbers from 1 to 9 cause conflicts, free the cell, go back to the previous one, and increase the number you find there. The algorithm goes forth and back until it completes the Sudoku. The implementation is not trivial, though. To explain how it works, refer to the example of Figure 15-5. It shows a puzzle generated after removing five quadruplets, ten pairs, and four individual cells.

9781484209967_Fig15-05.jpg

Figure 15-5. A puzzle to solve with brute()

To understand brute(), concentrate on the lines of code where grid[k][j] is on the left-hand side of an assignment. They are only two and, to make your life a bit easier, I have highlighted them in bold. grid[k][j] = i is where brute() tries a number, and grid[k][j] = 0 is where it resets a cell before backtracking to the previous one.

If you insert a printf showing row, column, and number after the statement where brute() tries a number, you get the list of entries shown in Listing 15-16 (limited to the first 140 of 1,078 entries).

Listing 15-16. Tracking brute()

(0,1)1 (0,1)2 (0,3)1 (0,3)2 (0,3)3 (0,3)4 (0,5)1 (0,5)2 (0,5)3 (0,7)1 (0,7)2 (0,7)3 (0,7)4 (0,7)5 (0,7)6 (0,7)7 (0,7)8 (0,7)9 (0,5)4 (0,5)5 (0,5)6 (0,5)7 (0,5)8 (0,5)9 (0,3)5 (0,3)6 (0,3)7 (0,3)8 (0,3)9 (0,1)3 (0,1)4 (0,1)5 (0,1)6 (0,3)1 (0,3)2 (0,3)3 (0,3)4 (0,5)1 (0,5)2 (0,7)1 (0,7)2 (0,7)3 (0,8)1 (0,8)2 (0,8)3 (0,8)4 (0,8)5 (0,8)6 (0,8)7 (0,8)8 (1,1)1 (1,1)2 (1,4)1 (1,4)2 (1,4)3 (1,4)4 (1,4)5 (1,4)6 (1,4)7 (1,4)8 (1,4)9 (1,5)1 (1,6)1 (1,6)2 (1,6)3 (1,6)4 (1,6)5 (1,6)6 (1,6)7 (1,7)1 (1,7)2 (1,7)3 (1,7)4 (1,7)5 (1,7)6 (1,7)7 (1,7)8 (1,7)9 (1,6)8 (1,6)9 (1,5)2 (1,5)3 (1,5)4 (1,5)5 (1,5)6 (1,5)7 (1,5)8 (1,5)9 (1,1)3 (1,1)4 (1,1)5 (1,1)6 (1,1)7 (1,4)1 (1,4)2 (1,4)3 (1,4)4 (1,4)5 (1,4)6 (1,4)7 (1,4)8 (1,4)9 (1,5)1 (1,6)1 (1,6)2 (1,7)1 (1,7)2 (1,7)3 (1,7)4 (1,7)5 (1,7)6 (1,7)7 (1,7)8 (1,7)9 (1,6)3 (1,6)4 (1,6)5 (1,6)6 (1,6)7 (1,6)8 (1,6)9 (1,5)2 (1,5)3 (1,5)4 (1,5)5 (1,5)6 (1,5)7 (1,5)8 (1,5)9 (1,1)8 (1,1)9 (0,8)9 (0,7)4 (0,7)5 (0,7)6 (0,7)7 (0,7)8 (0,7)9 (0,5)3 (0,7)1

Scanning the puzzle of Figure 15-5 from the top-left, the first unsolved cell is (0,1). As unsolved cells are represented by a zero, the statement

int i = grid[k][j] + 1;

sets i to 1. As i is not greater than 9, brute() sets grid[0][1] to 1 and prints the first entry.

But there is already a 1 in row 0. Therefore, inconsistent_unit() invoked for row 0 finds an inconsistency, and j (i.e., the column index) is not incremented. The control goes back to the beginning of the second do-statement with the same row and column IDs, where you set i to 2.

As i is still not greater than 9, brute() sets grid[0][1] to 2, prints the second entry, and executes inconsistent_unit(). This time, there is no inconsistency, and brute() increments j to 2. As initial[0][3] == 1 (i.e., the cell is solved), brute() increments j to 3 and prints the third entry of Listing 15-16.

From the entries, you see that brute() fails to solve (0,3) with 1, 2, and 3, before succeeding with 4. Similarly, brute() solves (0,5) with 3. But when brute() tries to solve (0,7), it finds out that all numbers cause a conflict. As a result, the test i > 9 succeeds and (0,7) is reset back to 0.

The two do-loops that follow the assignment grid[k][j] = 0 have the purpose of making current the cell previously solved by brute(). You cannot simply decrement j to go to the previous column, because you might be looking at the very first cell of a row. In that case, the previous cell is in the previous row. Moreover, it could be that the cell immediately preceding the current one was already solved before entering brute().

In the example, starting from (0,7), the do-loop in

do { j--; } while (j >= 0 && initial[k][j] != 0);

iterates twice, and leaves j set to 5. The current value of (0,5) is 3. That’s why the entry after (0,7)9 is (0,5)4. Unfortunately, no further value can solve (0,5). Same story with (0,3). The result is that (0,1) is progressively incremented to 6 (first entry in bold).

Everything goes smoothly for a while: brute() tentatively completes the solution of row 0 (961472538) and tackles row 1. But after 81 entries, it discovers that with the current row 0, no number can solve (1,1) (second entry in bold).

This time, starting from (1,1), the backtracking do-loop

do { j--; } while (j >= 0 && initial[k][j] != 0);

exits with j set to -1.

As a result, k (the row ID) is decremented from 1 to 0 and j is set to 8. This “wrapping around” is how brute() backtracks through the beginning of a row. Notice that if k goes below 0, it means that the puzzle is unsolvable. You know that the puzzle you are looking at is solvable (because you generated it starting from a valid completed grid), but this check will become necessary when you will use brute() to thoroughly verify the uniqueness of a solution.

You had left (0,8) to be 8, and as you can see, the entry following (1,1)9 is (0,8)9. It turns out that further backtracking is necessary, but by now you should have a pretty good idea of how brute() works.

inconsistent_unit() for the Generator

As I had already done with init(), I adapted the Solver’s version of inconsistent_unit() to the simpler representation of the Generator’s Sudoku grid (see Listing 15-17). inconsistent_grid() essentially remains the same as that for the Solver, shown in Listing 3-17.

Listing 15-17. inconsistent_unit.c for the Generator

/* inconsistent_unit.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "inconsistent_unit.h"

int inconsistent_unit(char *what, int kG, char unit[9][2]) {
  int result = FALSE;
  int i_vect[10] = {0};
  for (int k = 0; k < 9 && !result; k++) {
    int kR = unit[k][ROW];
    int kC = unit[k][COL];
    int i = grid[kR][kC];
    if (i > 0) {
      if (i_vect[i] == FALSE) {
        i_vect[i] = TRUE;
        }
      else {  // we have a duplicate solution
        result = TRUE;
        }
      } // if (i..
    } // for (int k..
  return result;
  }

Check for Uniqueness of the Solution

The removal of the clues results in a valid Sudoku puzzle. But you have to be completely sure that the solution is unique. At this point, the only verification of uniqueness is based on the fact that the brute-force solution of the puzzle and the reference solution are identical.

But it is not clear that all different ways of solving the puzzle would result in the same valid solution. To ensure that this is not a case, the Generator executes check_uniqueness() as shown in Listing 15-18. The function check_uniqueness() itself is shown in Listing 15-19.

Listing 15-18. sudoku_gen.c—Part 3: Checking for Uniqueness

//========= Check whether the solution is really unique
brute_result = check_uniqueness();

Listing 15-19. sudoku_gen.c – check_uniqueness()

int check_uniqueness() {
  int brute_result = BRUTE_COMP_OK;
  int incr = -8;
  while (incr < 9 && brute_result != BRUTE_COMP_DIFFERENT) {
    for (int k = 0; k < 9 && brute_result != BRUTE_COMP_DIFFERENT; k++) {
      for (int j = 0; j < 9 && brute_result != BRUTE_COMP_DIFFERENT; j++) {
        if (puzzle[k][j] == 0) {
          for (int kk = 0; kk < 9; kk++) {
            for (int jj = 0; jj < 9; jj++) {
              grid[kk][jj] = puzzle[kk][jj];
              } // for (int jj..
            } // for (int kk..
          grid[k][j] = solved[k][j] + incr;
          int kB = k/3*3+j/3;
          if (    grid[k][j] < 1
               || grid[k][j] > 9
               || inconsistent_unit("row", k, row[k])
               || inconsistent_unit("column", j, col[j])
               || inconsistent_unit("box", kB, box[kB])
               ) {
            grid[k][j] = 0;
            }
          else {
            brute_result = brute_comp();
            } // if (grid[k][j]..
          } // if (puzzle[k]j]..
        } // for (int j..
      } // for (int k..
    incr++;
    if (incr == 0) incr++;
    } // while (incr..
  return brute_result;
  } // check_uniqueness

For each unsolved cell of the puzzle, check_uniqueness() tries all possible numbers different from the number of the reference solution. For example, with the puzzle shown in Figure 15-5, the number that solves (0,1) in the reference solution is 1. Therefore, the Generator tries to solve the puzzle with brute force after setting (0,1) to 2, then to 3, 4, and so on until it tries 9. It keeps going as long as brute_comp() does not return BRUTE_COMP_DIFFERENT. If and when brute_comp() returns BRUTE_COMP_DIFFERENT, it means that it solved the puzzle although one of its unsolved cells was set to a number different from that of the reference solution. Therefore, the puzzle being tested doesn’t have a unique solution. check_uniqueness() then returns BRUTE_COMP_DIFFERENT, which forces the Generator to iterate through the very first do-loop of Listing 15-1—that is, to generate from scratch a brand-new puzzle while keeping the same random seed.

To be sure that all possibilities are considered, check_uniqueness() tries all possible numbers in all unsolved cells of the puzzle. For example, the puzzle of Figure 15-5 has 37 clues. This means that there are 81 – 37 = 44 unsolved cells. Therefore, check_uniqueness() tries to solve 44 * 8 = 352 puzzles, each one with 38 clues. The number of puzzles is not 44 * 9 because check_uniqueness() doesn’t need to try the correct number.

To try all possible numbers, check_uniqueness() adds to the number stored in the reference solution a value between -8 and +8, and only executes brute_comp() if the resulting number is between 1 and 9 and doesn’t make the puzzle impossible to solve.

Notice that check_uniqueness() only reacts when brute_comp() returns BRUTE_COMP_DIFFERENT. You might not be surprised that it ignores BRUTE_COMP_TIMEOUT, BRUTE_COMP_INCONSISTENT, BRUTE_COMP_IMPOSSIBLE, and even BRUTE_COMP_PROBLEM, but what about BRUTE_COMP_OK? Shouldn’t you be concerned if the altered puzzle produces the same solution? The fact is that it cannot happen, because the solution found with brute_comp() and the reference solution will always differ at the very least in the number that check_uniqueness() altered.

You might also be concerned that check_uniqueness() only alters one cell at a time. What about swapping two cells or, more in general, rotating any number of cells? If you think about it, you will see that they would be redundant tests. For example, in the reference solution, 2 solves (2,1) and 8 solves (6,1). If you swapped the two numbers, it would mean that you would try to solve a puzzle with the same 37 clues plus an 8 in (2,1) and a 2 in (6,1). But check_uniqueness() already tried to solve a puzzle with all possible values in (2,1), including 8, and then all possible values in (6,1), including 2. Therefore, if the puzzle with the swap led to a solution, check_uniqueness() would find it (even twice, if you let it).

Completing the Generator

After checking for the uniqueness of the solution, the Generator only needs to display some messages and report the results, as shown in Listing 15-20.

Listing 15-20. sudoku_gen.c—Part 4: Closing Off

//========= Check whether the solution is really unique
brute_result = check_uniqueness();

//========= Done
for (int k = 0; k < 9; k++) {
  for (int j = 0; j < 9; j++) {
    grid[k][j] = puzzle[k][j];
    }
  }
n = count_solved();
if (!silent && fp == NULL) {
  display();
  sprintf(mess, "seed %d %d", seed, n);
  display_string(mess);
  printf("The puzzle contains %d clues:", n);
  list_solved(stdout);
  }

skip:                                                                    // <==
  k_try++;
  printf("%d: %s ",
      k_try,
      (brute_result == BRUTE_COMP_DIFFERENT) ? "No" : "Yes"
      );
  } while (brute_result == BRUTE_COMP_DIFFERENT);

// Save puzzle and solution into strings
int kar = 0;
for (int k = 0; k < 9; k++) {
  for (int j = 0; j < 9; j++) {
    puzzle_string[kar] = puzzle[k][j] + '0';
    solution_string[kar] = solved[k][j] + '0';
    kar++;
    }
  }
puzzle_string[kar] = '';
solution_string[kar] = '';

#ifdef SAVE_HTML_PUZZLE
    save_html(puzzle_string, seed, "p");
#endif

#ifdef SAVE_HTML_SOLUTION
    save_html(solution_string, seed, "s");
#endif
    if (fp != NULL) {
      printf("#%d ", k_seed);
      fprintf(fp, "%s %d", puzzle_string, seed);
      if (brute_result != BRUTE_COMP_DIFFERENT) {
        fprintf(fp, " %d", n);
        list_solved(fp);
        }
      fprintf(fp, " ");
      }
    } // for (k_seed..

  unsigned long end_time = clock();
  printf("********* done in %ld microseconds ", end_time - start_time);
  if (fp != NULL) fclose(fp);
  return EXIT_SUCCESS;
  }

The output written to the file, if requested by defining LOG_TO_FILE at the beginning of the module, matches the format that the Solver expects as input.

list_solved(), shown in Listing 15-21, is a small utility function that prints out the number of times each number occurs in the puzzle.

Listing 15-21. list_solved.c

/* list_solved.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "list_solved.h"

void list_solved(FILE *fp) {
  if (fp == NULL) {
    fp = stdout;
    }
  char spacing = (fp == stdout) ? ' ' : ' ';
  int digits[10] = {0};
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      if (grid[k][j] != 0) {
        digits[(int)grid[k][j]]++;
        }
      } // for (int j..
    } // for (int k..
  for (int i = 1; i <= 9; i++) {
    fprintf(fp, "%c%d", spacing, digits[i]);
    }
  }

Utilities for the Generator

As the Generator uses a simpler Sudoku representation than the Solver, I had to adapt some of the utilities. You have already seen the Generator’s version of init() and inconsistent_unit(). In this section, you will find the listings of count_solved(), count_solved(), display(), display_string(), and save_html() (see Listing 15-22 to Listing 15-25).

Listing 15-22. count_solved.c for the Generator

/* count_solved.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "count_solved.h"
#include "def.h"

int count_solved() {
  int result = 0;
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      if (grid[k][j] != 0) result++;
      }
    }
  return result;
  }

list_solved(), shown in Listing 15-22, prints out the number of times each number occurs in the puzzle.

Listing 15-23. list_solved.c

/* list_solved.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "list_solved.h"

void list_solved(FILE *fp) {
  if (fp == NULL) {
    fp = stdout;
    }
  char spacing = (fp == stdout) ? ' ' : ' ';
  int digits[10] = {0};
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      if (grid[k][j] != 0) {
        digits[(int)grid[k][j]]++;
        }
      } // for (int j..
    } // for (int k..
  for (int i = 1; i <= 9; i++) {
    fprintf(fp, "%c%d", spacing, digits[i]);
    }
  }

Listing 15-24. display.c for the Generator

/* display.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "display.h"

void display() {
  char *h = "  ++---+---+---++---+---+---++---+---+---++";
  char *hh = "  ++===+===+===++===+===+===++===+===+===++";
  int jBase[] = {2, 6, 10, 15, 19, 23, 28, 32, 36};
  printf("     0   1   2    3   4   5    6   7   8 ");
  for (int k = 0; k < 9; k++) {
    if (k%3 == 0) {
      printf("%s ", hh);
      }
    else {
      printf("%s ", h);
      }
    //                000 000 111  111 122 222  223 333 333
    //                234 678 012  567 901 345  890 234 678
    char top[42] = "||   |   |   ||   |   |   ||   |   |   ||";
    char mid[42] = "||   |   |   ||   |   |   ||   |   |   ||";
    char bot[42] = "||   |   |   ||   |   |   ||   |   |   ||";
    char *displ[42] = {top, mid, bot};
    for (int j = 0; j < 9; j++) {
      if (grid[k][j] == 0) {
        mid[jBase[j]+1] = ' ';
        }
      else {
        mid[jBase[j]+1] = '0' + grid[k][j];
        }
      } // for (int j..
    printf("  %s ", displ[0]);
    printf("%d %s ", k, displ[1]);
    printf("  %s ", displ[2]);
    }
  printf("%s ", hh);
  }

Listing 15-25. display_string.c for the Generator

/* display_string.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "display_string.h"

void display_string(char *name) {
  for (int k = 0; k < 9; k++) {
    for (int j = 0; j < 9; j++) {
      printf("%d", grid[k][j]);
      }
    }
  if (name != NULL) printf(" "%s"", name);
  printf(" ");
  }

The utility function save_html() (see Listing 15-26) is specific to the Generator. Therefore, it doesn’t have any counterpart in the Solver. As the Solver deals with existing puzzles, it is not important to display them graphically. Thus, the simple format provided by display() is sufficient. But for the Generator, you must be able to display the puzzles and their solutions in graphical form, so that you can print them out, add them to a web site, or send them via e-mail.

Listing 15-26. save_html.c

/* save_html.c
 *
 * Copyright (C) 2015  Giulio Zambon  - http://zambon.com.au/
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include "def.h"
#include "save_html.h"

void save_html(char *puzzle, int seed, char *suffix) {
  char *header_1 =
      "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "
        ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> "
      "<html xmlns="http://www.w3.org/1999/xhtml"> "
      "<head> "
      "<title>"
      ;
  char *header_2 =
      "</title> "
      "<meta http-equiv="Content-type" "
        "content="text/html;charset=UTF-8"/> "
      "<style type="text/css"> "
      "table {empty-cells:show; border-collapse:collapse;} "
      "td { "
      "  width:50px; height:50px; "
      "  border-style:solid; border-width:1px; border-color:#000000; "
      "  text-align:center; vertical-align:middle; "
      "  font-family:Arial, Verdana, Sans-serif; font-size:2em; "
      "  } "
      ".c0 {border-top-width:3px; border-left-width:3px;} "
      ".c1 {border-top-width:3px;} "
      ".c2 {border-top-width:3px; border-right-width:3px;} "
      ".c3 {border-left-width:3px;} "
      ".c4 {} "
      ".c5 {border-right-width:3px;} "
      ".c6 {border-bottom-width:3px; border-left-width:3px;} "
      ".c7 {border-bottom-width:3px;} "
      ".c8 {border-bottom-width:3px; border-right-width:3px;} "
      "</style> </head> <body> <table> "
      ;
  char *footer = "</table> </body> </html>";

  char f_name[64] = {0};
  sprintf(f_name, "%d%s.html", seed, suffix);

  FILE *fp = fopen(f_name, "w");
  if (fp == NULL) {
    printf("Unable to open the file '%s' for writing ", f_name);
    }
  else {
    fprintf(fp, "%s%d%s", header_1, seed, header_2);
    for (int i = 0; i < 81; i++) {
      int kR = i / 9;
      int kC = i - kR * 9;
      if (kC == 0) fprintf(fp, "<tr>");
      fprintf(fp, "<td class="c%d">%c</td>",
          kR % 3 * 3 + kC % 3, ((puzzle[i] == '0') ? ' ' : puzzle[i])
          );
      if ((kC + 1) % 3 == 0) fprintf(fp, " ");
      if (kC == 8) fprintf(fp, "</tr> ");
      }
    fprintf(fp, "%s ", footer);
    fclose(fp);
    }
  }

Several graphic packages let you generate images, but they are complex to use, and it is always advisable to use the simplest possible tools that do the job.

Strictly speaking, save_html() has nothing to do with graphics, because it stores a puzzle on a web page by means of an HTML table. But you can use it as an alternative to a drawing; it scales very well, and it uses up the same bandwidth regardless of how large you want to display the puzzle. Effectively, save_html() uses any web browser to do the heavy lifting of displaying graphics.

As you can see, the code is pretty straightforward. Most of it consists of the definition of HTML lines as strings.

An interesting bit is how you use the style of the cells to show borders of different thickness. To do so, save_html() defines a different style class for each cell of a box, from top-left (c0, with thick top and left borders) to bottom right (c8, with thick right and bottom borders). It then determines the correct position of each cell (and therefore of its style class) by calculating:

kR % 3 * 3 + kC % 3

When writing a cell to disk as an element of an HTML table with the following statement:

fprintf(fp, "<td class="c%d">%c</td>",
    kR % 3 * 3 + kC % 3, ((puzzle[i] == '0') ? ' ' : puzzle[i])
    );

the string c%d results in the name of the appropriate class. What you are doing is automatically generating HTML code from C.

Note the use of new-line characters to ensure that the generated HTML code is readable. Listing 15-27 shows the resulting HTML file.

Listing 15-27. Output of save_html()

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" ➥"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>100</title>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8"/>
<style type="text/css">
table {empty-cells:show; border-collapse:collapse;}
td {
  width:50px; height:50px;
  border-style:solid; border-width:1px; border-color:#000000;
  text-align:center; vertical-align:middle;
  font-family:Arial, Verdana, Sans-serif; font-size:2em;
  }
.c0 {border-top-width:3px; border-left-width:3px;}
.c1 {border-top-width:3px;}
.c2 {border-top-width:3px; border-right-width:3px;}
.c3 {border-left-width:3px;}
.c4 {}
.c5 {border-right-width:3px;}
.c6 {border-bottom-width:3px; border-left-width:3px;}
.c7 {border-bottom-width:3px;}
.c8 {border-bottom-width:3px; border-right-width:3px;}
</style>
</head>
<body>
<table>
<tr><td class="c0">5</td><td class="c1"> </td><td class="c2">3</td>
<td class="c0"> </td><td class="c1"> </td><td class="c2"> </td>
<td class="c0">6</td><td class="c1"> </td><td class="c2">9</td>
</tr>
<tr><td class="c3">7</td><td class="c4"> </td><td class="c5"> </td>
<td class="c3"> </td><td class="c4"> </td><td class="c5"> </td>
<td class="c3"> </td><td class="c4"> </td><td class="c5">2</td>
</tr>
<tr><td class="c6"> </td><td class="c7">4</td><td class="c8">8</td>
<td class="c6">2</td><td class="c7"> </td><td class="c8">6</td>
<td class="c6">1</td><td class="c7">7</td><td class="c8"> </td>
</tr>
<tr><td class="c0"> </td><td class="c1">3</td><td class="c2"> </td>
<td class="c0">6</td><td class="c1"> </td><td class="c2">7</td>
<td class="c0"> </td><td class="c1">9</td><td class="c2"> </td>
</tr>
<tr><td class="c3"> </td><td class="c4"> </td><td class="c5"> </td>
<td class="c3"> </td><td class="c4">2</td><td class="c5"> </td>
<td class="c3"> </td><td class="c4"> </td><td class="c5"> </td>
</tr>
<tr><td class="c6"> </td><td class="c7">8</td><td class="c8"> </td>
<td class="c6">5</td><td class="c7"> </td><td class="c8">3</td>
<td class="c6"> </td><td class="c7">2</td><td class="c8"> </td>
</tr>
<tr><td class="c0"> </td><td class="c1">5</td><td class="c2">7</td>
<td class="c0">4</td><td class="c1"> </td><td class="c2">9</td>
<td class="c0">2</td><td class="c1">6</td><td class="c2"> </td>
</tr>
<tr><td class="c3">8</td><td class="c4"> </td><td class="c5"> </td>
<td class="c3"> </td><td class="c4"> </td><td class="c5"> </td>
<td class="c3"> </td><td class="c4"> </td><td class="c5">7</td>
</tr>
<tr><td class="c6">4</td><td class="c7"> </td><td class="c8">6</td>
<td class="c6"> </td><td class="c7"> </td><td class="c8"> </td>
<td class="c6">3</td><td class="c7"> </td><td class="c8">8</td>
</tr>
</table>
</body>
</html>

Figure 15-6 shows the screen capture of the page as it is displayed in a web browser.

9781484209967_Fig15-06.jpg

Figure 15-6. A puzzle displayed in a web browser

The code conforms to W3C’s XHTML 1.0 and CSS Level 2.1 standards. Perhaps I went a bit overboard with that, but I wanted to be sure that you would never have problems with any browser. To be sure, I still tested it on all the browsers I could put my hands on.

Summary

In this chapter you have learned in detail how the Generator works. In particular, you have seen how you can use the different parameters to create billions of puzzles, how you can save the generated puzzles to disk, and how you can display them in a web browser. In the next chapter, you will see how difficult the generated puzzles are and how you can generate many more with some clever thinking.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset