Comments and criticism are welcome
$69.96$ to $171.72$ bits
1.) Storing the puzzle implies storing the solution (information-theoretically).
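The puzzle has on average $t(\alpha)\alpha^2$ non-zero entries, where $t(\alpha)$ depends on the sub-box side $\alpha$; $t(3) = 2.44444$ to $3$.
$P$ is the puzzle written as a vector of length $\alpha^4$ with $t(\alpha)\alpha^2$ non-zero entries.
$M$ is a $\beta \times \alpha^4$ matrix with $\beta \ge 2t(\alpha)\alpha^2$ rows (so that any $2t(\alpha)\alpha^2$ columns can be linearly independent, the usual condition for unique sparse recovery) and entries in $\{0, \pm 1\}$. Write $\beta = k\,t(\alpha)\alpha^2$ for some $k \ge 2$.
$V = MP$ is a vector of length $\beta$ whose entries are taken to have size about $|\alpha^2|$, since $M$ has entries in $\{0, \pm 1\}$.
Storing $V$ takes $\beta \log \alpha^2 = 2k\,t(\alpha)\alpha^2 \log \alpha$ bits.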
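In your case, $\alpha = 3$ and $t(\alpha)$ lies between $2.44444$ and $3$, so $2k\,t(\alpha)\alpha^2 \log \alpha$ ranges from $69.96k$ bits to $85.86k$ bits (taking $\log_2 3 \approx 1.59$). With $k = 2$, the minimum required, this gives roughly $139.92$ to $171.72$ bits as a lower bound for the average case.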
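The arithmetic above can be checked with a short script. This is only a sketch: the function name `bits_for_measurements` is mine, the logarithm is assumed to be base 2, $\log_2 3$ is rounded to $1.59$ because that is what the quoted figures appear to use, and $2.44444$ is taken to be $22/9$.

```python
import math

def bits_for_measurements(alpha, t, k, log_alpha=None):
    """Bits to store V: beta * log2(alpha^2) = 2 * k * t * alpha^2 * log2(alpha)."""
    if log_alpha is None:
        log_alpha = math.log2(alpha)
    return 2 * k * t * alpha ** 2 * log_alpha

# Assumption: log2(3) rounded to 1.59, matching the figures quoted in the answer.
LOG3 = 1.59
for k in (1, 2):
    low = bits_for_measurements(3, 22 / 9, k, LOG3)   # t(3) = 2.44444...
    high = bits_for_measurements(3, 3, k, LOG3)       # t(3) = 3
    print(f"k={k}: {low:.2f} to {high:.2f} bits")
```

Calling the function without `log_alpha` uses the unrounded logarithm, which gives slightly smaller numbers (about $85.59k$ bits instead of $85.86k$ for $t = 3$).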
Note that I have hand-waved some assumptions, such as the sizes of the entries of $MP$ and the average number of entries in the puzzle.
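To see what the hand-waved size assumption looks like in practice, one can build a toy instance of $V = MP$ and compare the largest entry of $V$ with $\alpha^2$. The sketch below makes several assumptions that are not in the answer: $M$ is drawn uniformly at random from $\{0, \pm 1\}$ (the answer does not say how $M$ is constructed, and a random draw is not guaranteed to satisfy the column-independence condition), the non-zero cells of $P$ are placed at random without respecting sudoku rules, and all function names are hypothetical.

```python
import random

def random_sparse_puzzle(alpha, nonzeros, rng):
    """Length-alpha^4 vector with `nonzeros` entries in 1..alpha^2 (not a valid sudoku)."""
    n = alpha ** 4
    p = [0] * n
    for idx in rng.sample(range(n), nonzeros):
        p[idx] = rng.randint(1, alpha ** 2)
    return p

def random_sign_matrix(rows, cols, rng):
    """rows x cols matrix with entries drawn uniformly from {-1, 0, +1}."""
    return [[rng.choice((-1, 0, 1)) for _ in range(cols)] for _ in range(rows)]

def measure(m, p):
    """Compute V = M P as a plain list of integers."""
    return [sum(a * b for a, b in zip(row, p)) for row in m]

rng = random.Random(0)
alpha, t, k = 3, 3, 2
s = t * alpha ** 2      # number of non-zero entries, t(alpha) * alpha^2
beta = k * s            # number of measurements, k * t(alpha) * alpha^2
P = random_sparse_puzzle(alpha, s, rng)
M = random_sign_matrix(beta, alpha ** 4, rng)
V = measure(M, P)

# Compare the largest |V_i| with alpha^2, the size assumed in the bit count.
print(len(V), max(abs(v) for v in V), alpha ** 2)
```

If the largest entry comes out well above $\alpha^2$, the $\log \alpha^2$ bits per entry used in the estimate are optimistic for this choice of $M$.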
A.) Of course, it might be possible to reduce $k$ below 2, since in sudoku the positions of the sparse entries are not mutually independent. Each entry has on average $t(\alpha) - 1$ other entries in each of its row, column and sub-box. That is, given that some entries are present in a sub-box, column or row, one can find the odds of other entries being present in the same row, column or sub-box.
B.) Each row, column or sub-box is assumed to have on average $t(\alpha)$ non-zero entries with no repeated symbols. This means some types of vectors with $t(\alpha)$ non-zero entries will never occur, thereby reducing the search space of solutions. This could also reduce $k$. For instance, fixing $t(\alpha)$ entries in a sub-box, a row and a column would reduce the search space from ${}^{\alpha^4}C_{t(\alpha)\alpha^2}$ to ${}^{\alpha^4-(3\alpha^2-1)}C_{t(\alpha)\alpha^2-3t(\alpha)}$.
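The size of the reduction in B.) can be evaluated numerically. The sketch below only plugs numbers into the two binomial coefficients exactly as written; the function name is hypothetical, $t(\alpha)$ is assumed to be an integer so that the binomials are defined, and the counts are reported as base-2 logarithms.

```python
import math

def search_space_bits(alpha, t):
    """log2 of the support counts before and after fixing a sub-box, a row and a column."""
    n = alpha ** 4                 # number of cells
    s = t * alpha ** 2             # number of non-zero entries
    before = math.comb(n, s)
    after = math.comb(n - (3 * alpha ** 2 - 1), s - 3 * t)
    return math.log2(before), math.log2(after)

before_bits, after_bits = search_space_bits(3, 3)
print(f"{before_bits:.1f} bits -> {after_bits:.1f} bits")
```

For $\alpha = 3$ and $t = 3$ this compares ${}^{81}C_{27}$ with ${}^{55}C_{18}$. It counts positions of the non-zero entries only, not their values.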
A comment: Maybe a multi-user, arbitrarily correlated Slepian-Wolf model would help make the entries independent while still respecting the criterion of at most $t(\alpha)\alpha^2$ non-zero entries. However, if one could use it, one would not need to go through the compressed sensing route at all. So applying Slepian-Wolf might be hard.
C.) From an error correction analogy, an even more significant reduction may be possible, since in higher dimensions there can be gaps between the Hamming balls of radius half the minimum distance around the code points, which makes it possible to correct more errors. This should also lead to a reduction of $k$.
D.) $V$ itself can be entropy compressed. If the entries of $V$ are quite similar in size, can we assume that the difference between any two of the entries is at most $O(\sqrt{V_{\max}}) = O(\sqrt{|\alpha^2|})$? If so, and if encoding the differences between the entries suffices, this alone will remove the factor 2 in $\beta \log \alpha^2 = 2k\,t(\alpha)\alpha^2 \log \alpha$.
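The claim in D.) follows from $\log \sqrt{\alpha^2} = \log \alpha = \tfrac{1}{2}\log \alpha^2$. The sketch below compares the two bit counts, assuming base-2 logarithms and ignoring both the constant hidden in the $O(\cdot)$ and the cost of storing one reference entry in full; the function names are mine.

```python
import math

def bits_plain(alpha, t, k):
    """beta entries at log2(alpha^2) bits each."""
    beta = k * t * alpha ** 2
    return beta * math.log2(alpha ** 2)

def bits_differences(alpha, t, k):
    """beta differences at log2(sqrt(alpha^2)) = log2(alpha) bits each."""
    beta = k * t * alpha ** 2
    return beta * math.log2(math.sqrt(alpha ** 2))

plain = bits_plain(3, 3, 2)
diff = bits_differences(3, 3, 2)
print(f"{plain:.2f} bits plain, {diff:.2f} bits for differences, ratio {plain / diff:.2f}")
```

Under these assumptions the difference encoding costs $k\,t(\alpha)\alpha^2 \log \alpha$ bits, exactly half of the plain encoding.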
It would be interesting to see whether $2k$ can be made equal to or less than 2 using A.), B.), C.) and D.). This would be better than 89 bits (the best so far in the other answers) and, for the best case, better than the absolute minimum for all puzzles, which is around 73 bits.