Efficiency of code for Sudoku question

Edit: I originally asked about the performance of my code to go along with the topic, but it seems it doesn't even fully work for all cases. I'll leave the topic, since it still stands as something worthwhile. My code is listed below where others have commented on it.

Peter's answer is long, and maybe hard to understand, but it goes through the rows and columns at the same time. I'm assuming that makes it faster and more efficient, since it loops through rows and columns separately.

It would be great if someone was willing to benchmark all the code posted by everyone so far to see whose was most efficient.

asked 15 Mar '12, 21:52

Kilby%20Sanchez's gravatar image

Kilby Sanchez
4921916

accept rate: 0%

edited 16 Mar '12, 00:47

Does your code work? It looks like it fails on cases such as

[[1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1], [4, 3, 2, 1]]

and

[[1, 2, -1, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
(15 Mar '12, 22:00)

fnenu-1 ♦♦

fnenu-1's gravatar image

My code, which is what fnenu is referring to:

def check_sudoku(s_list):
    count = 0
    for i in s_list:
        for k in i:
            if i[count] == k and i.index(k) != count or k > len(s_list):
                return False
        count += 1
    count = 0
    for i in s_list:
        for k in s_list:
            if i[count] == k[count] and i != k:
                return False
        count += 1
    return True
(16 Mar '12, 00:48)

Kilby Sanchez

Kilby%20Sanchez's gravatar image

3 Answers:

One way to measure the "efficiency" of Peter's solution is to count how many times each Sudoku square is visited. His solution starts with digit at 1 and counts how often it appears in column 1, then row 1 (it should be once each), then in column 2 and row 2 (again, once each). However, it should be just as efficient (though slightly longer in terms of writing code) if he visited each row first, then each column afterwards. Interleaving rows and columns doesn't make it more efficient.

To see this, imagine if you have a class of 25 students, seated in 5 rows with 5 columns. You say the name of each student, row by row, then column by column. Each student's name is called out twice, once as everyone's name is read row by row, and once again as it's being read column by column. Even if you alternated reading row 1, then column 1, then row 2, and column 2, you'd agree that each row is read once, and each column is read once, and so each name is read twice.

Since there are N * N squares and each are read twice, then there's 2 * N * N squares looked at for each digit. There are N digits, so the efficiency is 2 * N * N * N. In algorithms, this is said to be O(N^3). Constants like 2 are generally ignored.

Your code take a different approach. You are essentially checking for duplicates. In your first loop, you take a row and set count to the 0th element of that row, then search through the rest of the list to see if there's a match. However, you know k might be the same position of as row[count] (you call it i[count]), so you exclude it, and you double check that the number is not bigger than N. You should also check that the number is not less than 1.

Why? Take any valid solution of Sudoku. Negate all the numbers. This is no longer valid (since numbers must be between 1 and N). Yours would (in theory) say it's valid because all you're doing is checking for duplicates and there won't be any.

Now, it turns out your first loop doesn't work. What it's doing is looking at row 0 (which is a list) and at the first element of the list (when count is set to 0) and checking to see if the rest of the list has a number that matches the first element. What if the 3rd and 4th element are the same? You don't compare those two numbers to each other. Then, in the next iteration, you are looking at row 1 (the next row). count is 1, so you are looking at the second element of the list, and comparing every other number to the second number. Indeed, for each row, i, you are only check if the i_th element of the i_th row is duplicated, instead of comparing every pair of numbers within per row.

Your second loop does the same thing. Essentially, you compare the row 0, col 0 value to ever other value in that the column, but you don't make a check to see if, say, row 3, col 0 is equal to row 2, col 0. Essentially for each column, you're picking a specific number in that column and asking if the other numbers are equal to that number, but you are not checking every possible pair of numbers in the column.

So you weren't more inefficient, but being correct is the first priority, then efficient (one can always write an efficient incorrect solution). You had a good idea, but the code doesn't quite do what you thought it would do.

link

answered 15 Mar '12, 22:55

Charles%20Lin's gravatar image

Charles Lin
9.5k4295136

My code that charlzz is referring to:

def check_sudoku(s_list):
    count = 0
    for i in s_list:
        for k in i:
            if i[count] == k and i.index(k) != count or k > len(s_list):
                return False
        count += 1
    count = 0
    for i in s_list:
        for k in s_list:
            if i[count] == k[count] and i != k:
                return False
        count += 1
    return True

Funny enough, I actually got this marked right. The test cases weren't that extensive, I suppose. To fix my code would require adding an extra loop in each of the columns and rows sections, along with more count variables, which should make the entire thing very inefficient, I figure.

(16 Mar '12, 00:49)

Kilby Sanchez

Kilby%20Sanchez's gravatar image

All of the code examples I've looked at so far seem to be running in time O(n^3), or in a few cases O(n^3 log n). It's not always obvious since Python can wrap a lot of processing into one line of code. Has anyone come up with an algorithm that does better than O(n^3)?

EDIT: Now that I think about it, I see that it can be done in only O(n^2 log n). A nice example of this is the beautiful one-line procedure by @EnTerr. It starts by appending the original Sudoku square (a list of row-lists) with its transpose (a list of column-lists), so there are 2n lists of n elements each. This bit of overhead should take O(n^2) operations, I think. Then each of these 2n lists is sorted (which should take O(n log n) steps) and then compared to the range 1,...,n (which will take O(n) steps). Altogether that's O(n^2) + 2n * [O(n log n) + O(n)]. The dominant term there is 2n*O(n log n), or in other words O(n^2 log n).

link

answered 16 Mar '12, 01:01

Kurt%20Van%20Etten's gravatar image

Kurt Van Etten
6.6k1833

edited 16 Mar '12, 10:40

Actually, I wouldn't be surprised if creating the transposed was an O(1) operation. (I think zip is implemented as a generator, although I'm not sure.) Accesses are more expensive, of course.

(16 Mar '12, 10:59)

Anton Golov ♦

Anton%20Golov's gravatar image

Two nested loops so O(n^2) but I don't know about the complexity of append. Any ideas? Anything else expensive going on in there?

def check_sudoku(square):
    n=len(square)
    if n==0:
        return False
    colnum=0
    for e in square:   # n elements
        if len(e) != n:   
           return False
        checker=1
        rownum=0
        col=[]        
        for nums in square: # n elements
            col.append(square[rownum][colnum])  # complexity?
            rownum+=1
        colnum+=1
        for nums in square:   # n elements
            if checker not in e or checker not in col:  # I think this is probably O(1) -  hash table?
                return False             
            checker+=1            
      return True
(16 Mar '12, 23:36)

fnenu-1 ♦♦

fnenu-1's gravatar image

@fnenu: I had a little bit of trouble understanding your code at first because of the two "for nums in square" loops, since you don't actually use nums anywhere. You'd be better off doing something like "for rownum in range(n)" and "for checker in range(1, n+1)". But that shouldn't affect the overall complexity of your code.

I'd assume that append() is an O(1) operation. However, I don't think checking membership in a list is O(1). I don't know anything about how Python implements lists internally but it seems unlikely that it would use hash tables for this. However, you raise a good point--you could rewrite this to use a dict instead of a list to store the rows and columns, and then the membership check could be done in O(1). I haven't thought it through carefully yet, but it seems like this would bring your algorithm down to O(n^2).

(17 Mar '12, 23:51)

Kurt Van Etten

Kurt%20Van%20Etten's gravatar image

range(n) wasn't taught in the course up to that point so I worked around it. I prefer to work within the bounds of the course instead of using other things in what I submit. I guess I should have tidied it up before posting.

I tried to find some info on 'in' but couldn't work out what to look for to find its complexity. Doesn't look like it is O(1) though as you say.

Edit: http://wiki.python.org/moin/TimeComplexity so it's O(n).

(18 Mar '12, 00:12)

fnenu-1 ♦♦

fnenu-1's gravatar image

After finding if ... in... is O(n), I tried again. This code runs more slowly on small squares than my other code, but it's faster on larger for 100x100 squares. So have I missed anything 'expensive' on this?

def check_sudoku(square):
    n=len(square)
    if n==0:
        return False
    counter=0
    for row in square:   # n elements
        # check the rows are the right length
        if len(row) != n:   
           return False
        checker=1

        # create an empty column and row which will contain info as to whether 
        # the number has occurred before
        newcol = []
        newrow = []
        for i in range(n):
            newcol.append(False)
            newrow.append(False)

        # Change False to True if they're in the right range and aren't a duplicate. 
        # If it is a duplicate, return False
        for i in range(n): # n elements
            if square[i][counter] >0 and square[i][counter] <= n and not newcol[square[i][counter]-1]:
                newcol[square[i][counter]-1] = True
            else:
                return False
            if row[i] >0 and row[i] <= n and not newrow[row[i]-1]:
                newrow[row[i]-1] = True
            else:
                return False
        counter += 1

    return True

link

answered 18 Mar '12, 14:59

fnenu-1's gravatar image

fnenu-1 ♦♦
18.7k1981231

Hey, that looks really nice, and (unless I'm also missing something) it run in time O(n^2). And since you have to look at all n^2 entries at least once to check the Sudoku square, that's the best possible result one can get.

(20 Mar '12, 00:31)

Kurt Van Etten

Kurt%20Van%20Etten's gravatar image
Your answer
Question text:

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "Title")
  • image?![alt text](/path/img.jpg "Title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Tags

×29,746
×130
×118
×68
×63
×24

Asked: 15 Mar '12, 21:52

Seen: 501 times

Last updated: 20 Mar '12, 00:31