<< sudoku_findlocked Algorithms sudoku_findtwocolors >>

Sudoku Toolbox >> Algorithms > sudoku_findnakedsubset

sudoku_findnakedsubset

Find naked subsets of given length.

Calling Sequence

[X,C,L] = sudoku_findnakedsubset ( X , C , L , k )
[X,C,L] = sudoku_findnakedsubset ( X , C , L , k , stopatfirst )
[X,C,L] = sudoku_findnakedsubset ( X , C , L , k , stopatfirst , verbose )

Parameters

X:

a 9-by-9 matrix, with 0 for unknown entries

C:

a 9 x 9 cell of candidates

L:

the 9 x 9 matrix of number candidates

k:

the length of the naked subset

stopatfirst:

if %t, then stop when one or more candidates have been removed. (default = %t)

verbose:

a boolean. Set to %t to display the state of the matrix. (default = %f)

Description

Search for naked subsets of length k. For k=1, finds naked singles, for k=2, finds naked pairs, for k=3, finds naked triples, for k = 4, finds naked quads.

The algorithm computes a merged list of all candidates in the row, column or block. This merged list is simplified and only unique candidates are kept. The list is used to generate all combinations of k elements. Each combination S is searched in the row, column or block. The row, column or block contains a naked subset of length k if exactly k cells exactly have candidates all contained in S. If a naked subset of length k is found, these candidates are eliminated from the other cells in the row, column or block.

This generic algorithm can be specialized to find naked pairs, naked triples, naked quads, naked sextets, naked septets and naked octets. If k is near 4, the cost is higher. When k is near 1 or n, the cost is smaller. On the contrary that one may think at first, at worst, the number of naked subsets of length 7 is smaller than the number of triplets. Indeed, there are at most 9 possible candidates in the current row, column or block. Now, consider the following table of the binomial number (9,k) with k from 1 to 9: 9 36 84 126 126 84 36 9 1. Hence it costs less to try all naked subsets of length 7 (36 combinations), than to try all naked combinations of length 4 (126 combinations).

Note on interrupting the algorithm

Consider a row, column or block with n unfixed entries. Assume that there are p cells with a naked subset with candidates (a1,a2,...,ap). We can prove that there is an hidden subset of length n-p (see solvebylogic). The consequence is that, depending on the searched length k and the number of unfixed entries n. The following is the list of all cases, where Nk represents a Naked subset of length k and Hk represents a Hidden subset of length k. For each value of n, we give the list of possible combinations of naked and hidden subsets which can be found by the algorithm. n=9: N1+H8, N2+H7, N3+H6, N4+H5, N5+H4, N6+H3, N7+H2, N8+H1 In the case N2+H7, there a Naked pair and a Hidden 7-set. The hidden 7-set will not be searched, but the Naked pair will be found. Hence, the list of possible subsets which might be found is: N1, N2, N3, N4, H4, H3, H2, H1. We conclude that the maximum length of a subset is kmax = 4 if n=9.

In the following list of cases, we only detail the found subsets. n=9: N1, N2, N3, N4, H4, H3, H2, H1, kmax = 4 n=8: N1, N2, N3, N4, H4, H3, H2, H1, kmax = 4 n=7: N1, N2, N3, H3, H2, H1, kmax = 3 n=6: N1, N2, N3, H3, H2, H1, kmax = 3 n=5: N1, N2, H2, H1, kmax = 2 n=4: N1, N2, H2, H1, kmax = 2 n=3: N1, H1, kmax = 1 n=2: N1, H1, kmax = 1

The general rule is that the maximum length a subset in a row, column or block with n unfixed cells is kmax = ceil(n/2). If k is larger that kmax, we are sure that there will not be any subset of this length is the row, column or block.

Examples

X = [
4 0 0   3 9 0   0 0 2
2 6 0   0 5 8   3 9 0
5 9 3   6 0 0   1 8 0
..
1 0 0   8 6 0   0 0 9
6 0 5   9 0 0   2 0 0
0 3 9   2 4 5   0 1 6
..
0 5 6   0 0 9   0 2 0
0 1 4   7 0 0   9 0 5
9 0 0   5 3 0   0 0 0
];
[C,L] = sudoku_candidates(X);
// Find triple {1,7,8} in row 1 at (1,2), (1,3), (1,6).
sudoku_findnakedsubset ( X , C , L , %t );

Authors


<< sudoku_findlocked Algorithms sudoku_findtwocolors >>