Edit: I originally asked about the performance of my code to go along with the topic, but it seems it doesn't even fully work for all cases. I'll leave the topic, since it still stands as something worthwhile. My code is listed below where others have commented on it.
Peter's answer is long, and maybe hard to understand, but it goes through the rows and columns at the same time. I'm assuming that makes it faster and more efficient, since it loops through rows and columns separately.
It would be great if someone was willing to benchmark all the code posted by everyone so far to see whose was most efficient.
One way to measure the "efficiency" of Peter's solution is to count how many times each Sudoku square is visited. His solution starts with digit at 1 and counts how often it appears in column 1, then row 1 (it should be once each), then in column 2 and row 2 (again, once each). However, it should be just as efficient (though slightly longer in terms of writing code) if he visited each row first, then each column afterwards. Interleaving rows and columns doesn't make it more efficient.
To see this, imagine if you have a class of 25 students, seated in 5 rows with 5 columns. You say the name of each student, row by row, then column by column. Each student's name is called out twice, once as everyone's name is read row by row, and once again as it's being read column by column. Even if you alternated reading row 1, then column 1, then row 2, and column 2, you'd agree that each row is read once, and each column is read once, and so each name is read twice.
Since there are N * N squares and each are read twice, then there's 2 * N * N squares looked at for each digit. There are N digits, so the efficiency is 2 * N * N * N. In algorithms, this is said to be O(N^3). Constants like 2 are generally ignored.
Your code take a different approach. You are essentially checking for duplicates. In your first loop, you take a row and set count to the 0th element of that row, then search through the rest of the list to see if there's a match. However, you know k might be the same position of as row[count] (you call it i[count]), so you exclude it, and you double check that the number is not bigger than N. You should also check that the number is not less than 1.
Why? Take any valid solution of Sudoku. Negate all the numbers. This is no longer valid (since numbers must be between 1 and N). Yours would (in theory) say it's valid because all you're doing is checking for duplicates and there won't be any.
Now, it turns out your first loop doesn't work. What it's doing is looking at row 0 (which is a list) and at the first element of the list (when count is set to 0) and checking to see if the rest of the list has a number that matches the first element. What if the 3rd and 4th element are the same? You don't compare those two numbers to each other. Then, in the next iteration, you are looking at row 1 (the next row). count is 1, so you are looking at the second element of the list, and comparing every other number to the second number. Indeed, for each row, i, you are only check if the i_th element of the i_th row is duplicated, instead of comparing every pair of numbers within per row.
Your second loop does the same thing. Essentially, you compare the row 0, col 0 value to ever other value in that the column, but you don't make a check to see if, say, row 3, col 0 is equal to row 2, col 0. Essentially for each column, you're picking a specific number in that column and asking if the other numbers are equal to that number, but you are not checking every possible pair of numbers in the column.
So you weren't more inefficient, but being correct is the first priority, then efficient (one can always write an efficient incorrect solution). You had a good idea, but the code doesn't quite do what you thought it would do.
answered 15 Mar '12, 22:55
All of the code examples I've looked at so far seem to be running in time O(n^3), or in a few cases O(n^3 log n). It's not always obvious since Python can wrap a lot of processing into one line of code. Has anyone come up with an algorithm that does better than O(n^3)?
EDIT: Now that I think about it, I see that it can be done in only O(n^2 log n). A nice example of this is the beautiful one-line procedure by @EnTerr. It starts by appending the original Sudoku square (a list of row-lists) with its transpose (a list of column-lists), so there are 2n lists of n elements each. This bit of overhead should take O(n^2) operations, I think. Then each of these 2n lists is sorted (which should take O(n log n) steps) and then compared to the range 1,...,n (which will take O(n) steps). Altogether that's O(n^2) + 2n * [O(n log n) + O(n)]. The dominant term there is 2n*O(n log n), or in other words O(n^2 log n).
answered 18 Mar '12, 14:59