Time-space tradeoff for the missing element problem


14

Here is a well-known problem.

Given an array $A[1..n]$ of positive integers, output the smallest positive integer that is not contained in the array.

The problem can be solved in $O(n)$ space and time: read the array, track in $O(n)$ space whether each of $1, 2, \dots, n+1$ has occurred, then scan for the smallest element that has not.
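The linear-time, linear-space approach described above could be sketched as follows. This is only a minimal illustration; the function name `smallest_missing_linear` and the use of a Python list as the bookkeeping table are assumptions, not part of the question.

```python
def smallest_missing_linear(a):
    """Smallest positive integer not in a, using O(n) time and O(n) extra space."""
    n = len(a)
    # seen[v] records whether the value v (1 <= v <= n+1) occurred; index 0 is unused.
    seen = [False] * (n + 2)
    for v in a:
        if 1 <= v <= n + 1:  # larger values cannot affect the answer
            seen[v] = True
    for v in range(1, n + 2):
        if not seen[v]:
            return v
    # Not reachable: n input values cannot cover all n+1 candidates.
    raise AssertionError("unreachable")
```

Only the candidates $1, \dots, n+1$ need to be tracked, because $n$ input values cannot cover all $n+1$ of them.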

I noticed that you can trade space for time. If you have only $O(n/k)$ memory, you can do it in $k$ rounds and get time $O(kn)$. As a special case, there is obviously an algorithm with quadratic time and constant space.
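One way to read the round-based tradeoff is sketched below, assuming the candidates $1, \dots, n+1$ are split into consecutive blocks and each block costs one pass over the read-only input. The name `smallest_missing_rounds` and the parameter `block` (the number of candidates tracked per round, roughly $n/k$) are hypothetical.

```python
def smallest_missing_rounds(a, block):
    """Smallest positive integer not in a, using O(block) extra space.

    The candidates 1..n+1 are handled in about (n+1)/block rounds,
    and every round makes one full pass over the read-only input.
    """
    if block < 1:
        raise ValueError("block must be at least 1")
    n = len(a)
    lo = 1
    while lo <= n + 1:
        hi = min(lo + block - 1, n + 1)
        seen = [False] * (hi - lo + 1)  # flags for the candidates lo..hi only
        for v in a:
            if lo <= v <= hi:
                seen[v - lo] = True
        for offset, flag in enumerate(seen):
            if not flag:
                return lo + offset
        lo = hi + 1
    # Not reachable: n input values cannot cover all n+1 candidates.
    raise AssertionError("unreachable")
```

With `block = 1` this degenerates into the quadratic-time, constant-space variant; with `block = n + 1` it is the single-pass linear algorithm.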

My question is:

Is this the optimal tradeoff, i.e. is $\text{time} \cdot \text{space} = \Omega(n^2)$? How does one prove such bounds in general?

Assume a RAM model with bounded arithmetic and random access to arrays in $O(1)$.

Inspiration for this problem: time-space tradeoff for palindromes in the one-tape model (see for example here).


2
No, you could sort your array in $O(n \log n)$ and then find the missing number in $O(n)$ (the first number should be 1, the second should be 2, and so on; otherwise you have found it). The sorting could be done with in-place mergesort, which means $O(1)$ extra space, so time times space belongs to $O(n \log n)$. I don't know whether I understood your problem exactly (which is why I didn't post this as an answer; I also don't know if there is a better bound).

I assume that the input is read-only. (If it is not, the problem can be solved optimally in $O(n)$ time and $O(1)$ space: multiply the input by 2 and use the parity to simulate the $O(n)$/$O(n)$ algorithm.)
sdcvvc
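One possible reading of the comment above, for a writable input, is sketched here. It assumes that doubling every value still fits in a machine word (Python integers never overflow, so this is only a modelling assumption), and the name `smallest_missing_in_place` is hypothetical. The freed lowest bit of cell `v - 1` plays the role of the "seen" flag for the value `v`.

```python
def smallest_missing_in_place(a):
    """Smallest positive integer not in a, using O(n) time and O(1) extra space.

    The input list is modified temporarily and restored before returning.
    """
    n = len(a)
    for i in range(n):
        a[i] *= 2              # frees the lowest bit of every cell
    for i in range(n):
        v = a[i] // 2          # original value, whether or not the flag bit is set
        if 1 <= v <= n:
            a[v - 1] |= 1      # flag: the value v occurs in the input
    answer = n + 1             # if 1..n all occur, n+1 is the answer
    for i in range(n):
        if a[i] & 1 == 0:
            answer = i + 1
            break
    for i in range(n):
        a[i] //= 2             # drop the flag bit and undo the doubling
    return answer
```

The value $n+1$ needs no flag of its own: if all of $1, \dots, n$ are flagged, the input is a permutation of them and $n+1$ is the answer.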

What is the constant-space algorithm? It seems like you would need $\log n$ space for the $n^2$ version that is "obvious" to me.
Xodarap

In this model, word-size integers take $O(1)$; if it's more convenient, you can answer any variant of the question with $\text{time} \cdot \text{space} = \Omega(n^2 / \log^k n)$ for some constant $k$.
sdcvvc

@sdcvvc, I can't understand your $O(n)$/$O(1)$ algorithm; would you describe it a bit more? (Just note that reading into bits takes $O(\log n)$.)

Answers:


2

This can be done in $O(n \log n)$ word operations and $O(1)$ words of memory (respectively $O(n \log^2 n)$ time and $O(\log n)$ memory in the bit-level RAM model). The solution is based on the following observation.

Say there are $n_0$ even and $n_1$ odd numbers in the range $[1, n+1]$ (so $n_0 \approx n_1$ and $n_0 + n_1 = n + 1$). Then there is $b \in \{0, 1\}$ such that there are at most $n_b - 1$ values with parity $b$ in the input. Indeed, otherwise there are at least $n_0$ even and at least $n_1$ odd values in the input, meaning that there are at least $n_0 + n_1 = n + 1$ values in the input, a contradiction (there are only $n$ of them). This means that we can continue searching for a missing number among only the odd or only the even numbers. The same argument can be applied to the higher bits of the binary representation as well.

So our algorithm will look like this:

  1. Suppose that we already know that there are only $x$ values in the input whose remainder modulo $2^b$ equals $r \in [0, 2^b)$, but there are at least $x + 1$ numbers in the range $[1, n+1]$ that have remainder $r$ modulo $2^b$ (at the start we know this for sure for $b = 0$, $r = 0$).

  2. Say there are $x_0$ values in the input with remainder $r$ modulo $2^{b+1}$ and $x_1$ values in the input with remainder $r + 2^b$ modulo $2^{b+1}$ (we can find these numbers in a single pass through the input). Clearly, $x_0 + x_1 = x$. Moreover, because there are at least $x + 1$ numbers in the range $[1, n+1]$ with remainder $r$ modulo $2^b$, at least one of the pairs $(r, b+1)$, $(r + 2^b, b+1)$ satisfies the requirements of step 1.

  3. We have found the missing number when $2^b \ge n + 1$: there is only one number in the range $[1, n+1]$ that may have remainder $r$ modulo $2^b$ (namely $r$ itself, or $2^b = n + 1$ when $r = 0$), so there are at most zero values in the input that have such a remainder. So that number is indeed missing.

Clearly, the algorithm halts in $O(\log n)$ steps, each of which needs $O(n)$ time (a single pass over the input array). Moreover, only $O(1)$ words of memory are required.
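The three steps above can be sketched as follows. The names `count_in_range` and `some_missing` are hypothetical helpers introduced for illustration. Note that, as the argument is stated, this finds some value of $[1, n+1]$ that is absent from the input, which is not necessarily the smallest one.

```python
def count_in_range(n, r, m):
    """Number of integers v in [1, n+1] with v % m == r, assuming 0 <= r < m."""
    if r > n + 1:
        return 0
    count = (n + 1 - r) // m + 1  # counts v in [0, n+1] with v % m == r
    if r == 0:
        count -= 1                # v = 0 is outside the range [1, n+1]
    return count


def some_missing(a):
    """Some value of [1, n+1] absent from a: O(log n) passes, O(1) extra words."""
    n = len(a)
    r, b = 0, 0
    x = n                         # every input value has remainder 0 modulo 2**0
    while (1 << b) < n + 1:
        m = 1 << (b + 1)
        x0 = sum(1 for v in a if v % m == r)  # one pass over the read-only input
        if count_in_range(n, r, m) >= x0 + 1:
            x = x0                # keep remainder r modulo 2**(b+1)
        else:
            r += 1 << b           # by the invariant, r + 2**b must qualify instead
            x -= x0               # this is x1 = x - x0
        b += 1
    # Exactly one candidate is left: r itself, or 2**b == n+1 when r == 0.
    return r if r >= 1 else 1 << b
```

For example, on the input `[3, 3]` the sketch returns 2, which is missing but larger than the smallest missing value 1.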


I'm happy to see the question answered after that time :)
sdcvvc

1

If I understand your definitions, this can be done in linear time with constant space. This is obviously optimal, because we need to at least read the entire input.

The answer given in this question satisfies these bounds.

It's impossible to run this with less time or space, and adding extra time or space is useless, so there's no space-time tradeoff here. (Observe that $n = O(n/k)$, so the tradeoff you observed doesn't hold asymptotically in any case.)

In terms of your general question, I don't know of any nice theorems offhand which will help you prove space-time tradeoffs. This question seems to indicate that there isn't a (known) easy answer. Basically:

Suppose some language $L$ is decidable in $t$ time (using some amount of space) and in $s$ space (using some amount of time). Can we find $f, g$ such that $L$ is decidable by a machine $M$ which runs in $f(t, s)$ time and $g(t, s)$ space?

is unknown, and a strong answer would solve a lot of open problems (most notably about SC), implying that no easy solution exists.


EDIT: Ok, with repetition (but I'm still assuming that with an input of size $n$ the maximum possible number is $n + 1$).

Observe that our algorithm needs to be able to differentiate between at least $n$ possible answers. Suppose that in each pass through the data we can get at most $k$ pieces of data. Then we will need $n/k$ passes to differentiate all answers. Assuming $k = n/s$, we run in $\frac{n}{n/s} \cdot n = sn$ time. So I think this proves what you want.

The difficulty is in showing that each time through we get only $k$ bits. If you assume that our only legal operation is $=$, then we're good. However, if you allow more complex operations, then you'll be able to get more information.


3
The question you linked assumes that each number appears at most once. I do not make this assumption, so the solution does not apply. Thank you for the second link.
sdcvvc

@sdcvvc: My mistake, I assumed you were using the version I'm familiar with. I don't have a complete answer, but it's too long for a comment - hopefully my edit is useful.
Xodarap

5
I don't buy your argument after "EDIT". Even if you can only collect $k$ bits in a single pass, that's enough in principle to distinguish $2^k$ possible outputs. So this argument can only imply a lower bound of $n/2^k$ passes, not $n/k$.
JeffE
Licensed under cc by-sa 3.0 with attribution required.