Time-space tradeoff for the missing element problem

Here is a well-known problem.

Given an array $A[1\dots n]$ of positive integers, output the smallest positive integer that does not occur in the array.

The problem can be solved in $O(n)$ space and time: read the array, keep track in $O(n)$ space of whether each of $1,2,\dots,n+1$ has occurred, then scan for the smallest element that has not.
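A minimal sketch of this linear-time, linear-space approach, assuming the input is a Python list of positive integers. The function name `smallest_missing_linear` is only illustrative.

```python
def smallest_missing_linear(a):
    """Return the smallest positive integer not occurring in the list a."""
    n = len(a)
    # seen[c] records whether candidate c (1 <= c <= n + 1) occurred; index 0 is unused.
    seen = [False] * (n + 2)
    for v in a:
        if 1 <= v <= n + 1:
            seen[v] = True
    # By the pigeonhole principle, at least one of 1..n+1 is absent from n values.
    for c in range(1, n + 2):
        if not seen[c]:
            return c
```

Values larger than $n+1$ are ignored, since they can never be the answer.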

I noticed that you can trade space for time. If you have only $O(\frac{n}{k})$ memory, you can do it in $k$ rounds and get time $O(kn)$. As a special case, there is obviously a quadratic-time, constant-space algorithm.
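One way to read the $k$-round idea, as a sketch under the assumption that the candidate range $[1, n+1]$ is cut into at most $k$ consecutive blocks and each round scans the whole input for one block. The name `smallest_missing_rounds` and the blocking scheme are illustrative choices.

```python
def smallest_missing_rounds(a, k):
    """Smallest missing positive integer using about n/k flags and at most k passes."""
    n = len(a)
    width = -(-(n + 1) // k)  # ceil((n + 1) / k): number of candidates per round
    for lo in range(1, n + 2, width):
        hi = min(lo + width - 1, n + 1)
        seen = [False] * (hi - lo + 1)  # flags for the candidates lo..hi only
        for v in a:  # one full pass over the (read-only) input
            if lo <= v <= hi:
                seen[v - lo] = True
        for i, s in enumerate(seen):
            if not s:
                return lo + i  # blocks are visited in increasing order, so this is minimal
```

With `k = 1` this degenerates to the linear-space algorithm, and with `k = n + 1` each round checks a single candidate, which gives the quadratic-time, constant-space special case.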

My question is:

Is this the optimal tradeoff, i.e. is $\operatorname{time} \cdot \operatorname{space} = \Omega(n^2)$? In general, how does one prove bounds of this type?

Assume a RAM model with bounded arithmetic and random access to arrays in $O(1)$.

Inspiration for this problem: the time-space tradeoff for palindromes in the one-tape model (see for example here).

complexity-theory time-complexity space-complexity

— sdcvvc

No, you could sort your array in $O(n \log n)$, then find the missing number in $O(n)$ (the first number should be 1, the second should be 2, and so on; otherwise you have found it). The sorting could be done with in-place mergesort, which means $O(1)$ extra space, so time $\cdot$ space belongs to $O(n \log n)$. I don't know whether I understood your problem exactly (because of this I didn't post it as an answer; also I don't know if there is a better bound).

I assume that the input is read-only. (If it is not, the problem can be solved optimally in $O(n)$ time / $O(1)$ space: multiply the input by 2 and use the parity to simulate the $O(n)/O(n)$ algorithm.)
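A sketch of how the parity trick in this comment might look, assuming the input is a writable list of positive integers and that doubling every value does not overflow a machine word (not a concern for Python integers). The name `smallest_missing_in_place` is illustrative, and restoring the input at the end is an extra step not mentioned in the comment.

```python
def smallest_missing_in_place(a):
    """Smallest missing positive integer with O(1) extra space on a writable list."""
    n = len(a)
    for i in range(n):
        a[i] *= 2  # free the lowest bit of every cell
    for i in range(n):
        v = a[i] // 2  # original value, regardless of the flag bit
        if 1 <= v <= n:
            a[v - 1] |= 1  # low bit of cell v-1 now means "v occurs in the input"
    answer = n + 1  # if all of 1..n occur, the answer is n + 1
    for i in range(n):
        if a[i] % 2 == 0:
            answer = i + 1
            break
    for i in range(n):
        a[i] //= 2  # drop the flag bits and restore the original values
    return answer
```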

— sdcvvc

What is the constant-space algorithm? It seems like you would need $\log n$ space for the $n^2$ version that is "obvious" to me.

— Xodarap

In this model, word-size integers take $O(1)$; if it's more convenient, you can answer any variant of the question with $\operatorname{time} \cdot \operatorname{space} = \Omega(\frac{n^2}{\log^k n})$ for some constant $k$.

— sdcvvc

@sdcvvc, I can't understand your $O(n)/O(1)$ algorithm; would you describe it a bit more? (Just note that reading into bits takes $O(\log n)$.)

Answers:

This can be done in $O(n \log n)$ word operations and $O(1)$ words of memory (respectively $O(n \log^2 n)$ time and $O(\log n)$ memory in bit-level RAM model). Indeed, the solution will be based on the following observation.

Say there are $n_0$ even and $n_1$ odd numbers in the range $[1, n + 1]$ (so $n_0 \approx n_1$ and $n_0 + n_1 = n + 1$). Then there is $b \in \{0, 1\}$ such that at most $n_b - 1$ values in the input have parity $b$. Indeed, otherwise there are at least $n_0$ even and at least $n_1$ odd values in the input, meaning that there are at least $n_0 + n_1 = n + 1$ values in the input, a contradiction (there are only $n$ of them). This means that we can continue searching for a missing number among only the even or only the odd numbers. The same argument can be applied to the higher bits of the binary representation as well.

So our algorithm will look like this:

1. Suppose that we already know that only $x$ values in the input have remainder $r \in [0, 2^b)$ modulo $2^b$, but at least $x + 1$ numbers in the range $[1, n + 1]$ have remainder $r$ modulo $2^b$ (at the start we know this for sure for $b = 0, r = 0$).
2. Say there are $x_0$ values in the input with remainder $r$ modulo $2^{b + 1}$ and $x_1$ values in the input with remainder $r + 2^b$ modulo $2^{b + 1}$ (we can find these numbers in a single pass through the input). Clearly, $x_0 + x_1 = x$. Moreover, because at least $x + 1$ numbers in the range $[1, n + 1]$ have remainder $r$ modulo $2^b$, at least one of the pairs $(r, b + 1), (r + 2^b, b + 1)$ satisfies the requirements of step 1.
3. We have found a missing number when $2^b \geqslant n + 1$: there is only one number in the range $[1, n + 1]$ that can have remainder $r$ modulo $2^b$ (namely $r$ itself, or $2^b = n + 1$ if $r = 0$), so no value in the input has that remainder, and that number is indeed missing.

Clearly, the algorithm halts in $O(\log n)$ steps, each of them needs $O(n)$ time (single pass over the input array). Moreover, only $O(1)$ words of memory are required.
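The steps above can be sketched as follows, assuming a read-only Python list of positive integers. The names `count_in_range` and `find_missing_by_bits` are illustrative helpers, not part of the answer. This sketch keeps refining until $2^b > n + 1$ (one step later than strictly necessary), so that the remainder $r$ is itself the missing number.

```python
def count_in_range(limit, r, m):
    """Number of c in [1, limit] with c % m == r, for 0 <= r < m."""
    if r == 0:
        return limit // m
    return 0 if r > limit else (limit - r) // m + 1


def find_missing_by_bits(a):
    """Return some number in [1, n+1] that does not occur in a (O(1) words, O(log n) passes)."""
    n = len(a)
    r, b = 0, 0  # invariant: fewer input values than candidates in [1, n+1] are r mod 2^b
    while (1 << b) <= n + 1:
        m = 1 << (b + 1)
        for r_next in (r, r + (1 << b)):
            in_input = sum(1 for v in a if v % m == r_next)  # one pass, constant memory
            if in_input < count_in_range(n + 1, r_next, m):
                r, b = r_next, b + 1  # the invariant holds for the refined residue class
                break
    return r  # now 2^b > n + 1, so r is the only candidate left in its class
```

Note that the counting argument guarantees a number that is absent from the input, not necessarily the smallest one: for the input `[1, 1, 3]` this sketch returns 4, although 2 is also missing.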

— Kaban-5

I'm happy to see the question answered after that time :)

— sdcvvc

If I understand your definitions, this can be done in linear time with constant space. This is obviously a lower bound, because we need to at least read the entire input.

The answer given in this question achieves these bounds.

It's impossible to run this with less time or space, and adding extra time or space is useless, so there's no space-time tradeoff here. (Observe that $n=O(n/k)$ , so the tradeoff you observed doesn't hold asymptotically, in any case.)

In terms of your general question, I don't know of any nice theorems offhand which will help you prove space-time tradeoffs. This question seems to indicate that there isn't a (known) easy answer. Basically:

Suppose some language $L$ is decidable in $t$ time (using some amount of space) and in $s$ space (using some amount of time). Can we find $f,g$ such that $L$ is decidable by a machine $M$ which runs in $f(t,s)$ time and $g(t,s)$ space?

is unknown, and a strong answer would solve a lot of open problems (most notably about SC), implying that no easy solution exists.

EDIT: Ok, with repetition (but I'm still assuming that with an input of size $n$ the maximum possible number is $n+1$ ).

Observe that our algorithm needs to be able to differentiate between at least $n$ possible answers. Suppose at each pass through the data we can get at most $k$ pieces of data. Then we will need $n/k$ passes to differentiate all answers. Assuming $k=n/s$ then we run in $\frac{n}{n/s}n=sn$ time. So I think this proves what you want.

The difficulty is in showing that each time through we get only $k$ bits. If you assume that our only legal operation is =, then we're good. However, if you allow more complex operations, then you'll be able to get more information.

— Xodarap

The question you linked assumes that each number appears at most once. I do not make this assumption, so the solution does not apply. Thank you for the second link.

— sdcvvc

@sdcvvc: My mistake, I assumed you were using the version I'm familiar with. I don't have a complete answer, but it's too long for a comment - hopefully my edit is useful.

— Xodarap

I don't buy your argument after "EDIT". Even if you can only collect $k$ bits in a single pass, that's enough in principle to distinguish $2^k$ possible outputs. So this argument can only imply a lower bound of $n/2^k$ passes, not $n/k$.

— JeffE