This question has an exact answer (in the form of a matrix product, presented in point 4 below). A reasonably efficient algorithm for computing it follows from these observations:
1. A random shuffle of $N+k$ cards can be generated by randomly shuffling $N$ cards and then randomly interspersing the remaining $k$ cards among them.
2. By shuffling only the aces and then (applying the first observation) interspersing the twos, then the threes, and so on, this problem can be viewed as a chain of thirteen steps.
3. We need to keep track of more than the value of the card we are seeking. In doing so, however, we do not need to account for the position of the mark relative to all the cards, only its position relative to cards of equal or smaller value.
   Imagine placing a mark on the first ace, then marking the first two found after it, and so on. (If at any stage the deck runs out without showing the card we are currently seeking, we leave all the cards unmarked.) Let the "place" of each mark (when it exists) be the number of cards of equal or lower value that had been dealt when the mark was made (including the marked card itself). The places contain all the essential information.
4. The place after the $i^\text{th}$ mark is made is a random number. For a given deck, the sequence of these places forms a stochastic process. It is in fact a Markov process (with variable transition matrix). An exact answer can therefore be calculated from twelve matrix multiplications.
Using these ideas, this machine obtains a value of $5.8325885529019965$ (computing in double precision floating point) in $1/9$ second. This approximation of the exact value
$$\frac{1982600579265894785026945331968939023522542569}{339917784579447928182134345929899510000000000}$$
is accurate to all digits shown.
The rest of this post provides details, presents a working implementation (in R), and concludes with some comments about the question and the efficiency of the solution.
Randomly shuffling a deck
It is actually conceptually clearer and mathematically no more complicated to consider a "deck" (also known as a multiset) of $N = k_1 + k_2 + \cdots + k_m$ cards, of which there are $k_1$ of the lowest denomination, $k_2$ of the next lowest, and so on. (The question as asked concerns the deck determined by the $13$-vector $(4, 4, \ldots, 4)$.)
A "random shuffle" of $N$ cards is one permutation taken uniformly at random from the $N! = N \times (N-1) \times \cdots \times 2 \times 1$ permutations of the $N$ cards. These shuffles fall into groups of equivalent configurations, because permuting the $k_1$ "aces" among themselves changes nothing, permuting the $k_2$ "twos" among themselves also changes nothing, and so on. Therefore each group of permutations that look identical when the suits of the cards are ignored contains $k_1! \times k_2! \times \cdots \times k_m!$ permutations. These groups, whose number is therefore given by the multinomial coefficient
$$\binom{N}{k_1, k_2, \ldots, k_m} = \frac{N!}{k_1!\, k_2! \cdots k_m!},$$
are called "combinations" of the deck.
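As a quick numerical illustration of this count, the multinomial coefficient can be evaluated on the log scale so that the factorials do not overflow. The helper name `n.combinations` is introduced here for illustration only; it is not part of the implementation presented later in this post.

```r
# Number of distinct "combinations" of a deck whose denomination counts are k.
# `n.combinations` is an illustrative name, not part of the original code.
n.combinations <- function(k) {
  # log(N!) - sum(log(k_i!)), then back to the natural scale
  exp(lgamma(sum(k) + 1) - sum(lgamma(k + 1)))
}

n.combinations(c(2, 2))      # 4!/(2! 2!) = 6
n.combinations(rep(4, 13))   # 52!/(4!)^13, roughly 9.2e49
```

For the standard deck the count is on the order of $10^{49}$, far too many combinations to enumerate one by one.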
There is another way to count the combinations. The first $k_1$ cards can form only $k_1!/k_1! = 1$ combination. They leave $k_1+1$ "slots" between and around them in which the next $k_2$ cards can be placed. We could indicate this with a diagram in which "$*$" designates one of the $k_1$ cards and "$\_$" designates a slot that can hold between $0$ and $k_2$ additional cards:
$$\underbrace{\_*\_*\_\cdots\_*\_}_{k_1\text{ stars}}$$
Once the $k_2$ additional cards are interspersed, the pattern of stars and new cards splits the $k_1+k_2$ positions into two subsets, so the number of distinct arrangements is $\binom{k_1+k_2}{k_1,k_2} = \frac{(k_1+k_2)!}{k_1!\,k_2!}$.
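Repeating this procedure with the $k_3$ "threes," we find there are $\binom{(k_1+k_2)+k_3}{k_1+k_2,\,k_3} = \frac{(k_1+k_2+k_3)!}{(k_1+k_2)!\,k_3!}$ ways to intersperse them among the first $k_1+k_2$ cards. Therefore the total number of distinct ways to arrange the first $k_1+k_2+k_3$ cards in this manner equals
$$1 \times \frac{(k_1+k_2)!}{k_1!\,k_2!} \times \frac{(k_1+k_2+k_3)!}{(k_1+k_2)!\,k_3!} = \frac{(k_1+k_2+k_3)!}{k_1!\,k_2!\,k_3!}.$$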
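The cancellation in this product can be checked numerically. The following is a small sketch for an arbitrarily chosen example of three denominations with four cards each; the variable names are illustrative.

```r
# Counts of each denomination; three groups of four cards is an arbitrary example.
k <- c(4, 4, 4)
n <- cumsum(k)                     # deck size after each stage: 4, 8, 12

# Ways to intersperse each new group among the cards already placed.
stage.counts <- choose(n, k)       # 1, 70, 495
sequential <- prod(stage.counts)

# Direct multinomial count of the same arrangements.
direct <- factorial(sum(k)) / prod(factorial(k))

c(sequential = sequential, direct = direct)   # both equal 34650
```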
After finishing the last $k_m$ cards and continuing to multiply these telescoping fractions, we find that the number of distinct combinations obtained equals the total number of combinations as previously counted, $\binom{N}{k_1, k_2, \ldots, k_m}$. Therefore we have overlooked no combinations. That means this sequential process of shuffling the cards correctly captures the probabilities of each combination, assuming that at each stage each possible distinct way of interspersing the new cards among the old is taken with equal probability.
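To make the sequential shuffle concrete, here is a minimal sketch that builds a deck one denomination at a time. The helper `intersperse` is hypothetical (it does not appear in the implementation below); it assumes that drawing the new cards' positions with `sample.int` is an acceptable way to pick each distinct interspersing with equal probability.

```r
# `intersperse` is a hypothetical helper: it drops the cards in `new` into a
# uniformly chosen set of positions, keeping the cards in `old` in their order.
intersperse <- function(old, new) {
  n <- length(old)
  k <- length(new)
  deck <- integer(n + k)
  slots <- sort(sample.int(n + k, k))   # positions taken by the new cards
  deck[slots] <- new
  if (n > 0) deck[-slots] <- old        # old cards fill the rest, order preserved
  deck
}

set.seed(17)                            # arbitrary seed, for reproducibility only
deck <- integer(0)
for (value in 1:13) deck <- intersperse(deck, rep(value, 4L))
deck                                    # 52 card values, four of each
```

Only the values are tracked, not the suits, which matches the notion of a "combination" used in this section.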
The place process
Initially, there are $k_1$ aces and obviously the very first is marked. At later stages there are $n = k_1 + k_2 + \cdots + k_{j-1}$ cards, the place (if a marked card exists) equals $p$ (some value from $1$ through $n$), and we are about to intersperse $k = k_j$ cards around them. We can visualize this with a diagram like
$$\underbrace{\_*\_*\_\cdots\_*\_}_{p-1\text{ stars}}\odot\underbrace{\_*\_\cdots\_*\_}_{n-p\text{ stars}}$$
where "$\odot$" designates the currently marked symbol. Conditional on this value of the place $p$, we wish to find the probability that the next place will equal $q$ (some value from $1$ through $n+k$; by the rules of the game, the next place must come after $p$, whence $q \ge p+1$). If we can find how many ways there are to intersperse the $k$ new cards in the blanks so that the next place equals $q$, then we can divide by the total number of ways to intersperse these cards (equal to $\binom{n+k}{k}$, as we have seen) to obtain the transition probability that the place changes from $p$ to $q$. (There will also be a transition probability for the place to disappear altogether when none of the new cards follow the marked card, but there is no need to compute this explicitly.)
Let's update the diagram to reflect this situation:
$$\underbrace{\_*\_*\_\cdots\_*\_}_{p-1\text{ stars}}\odot\underbrace{**\cdots*}_{s\text{ stars}}\ |\ \underbrace{\_*\_\cdots\_*\_}_{n-p-s\text{ stars}}$$
The vertical bar "$|$" shows where the first new card occurs after the marked card: no new cards may therefore appear between the $\odot$ and the $|$ (and therefore no slots are shown in that interval). We do not know how many stars there are in this interval, so I have just called it $s$ (which may be zero). The unknown $s$ will disappear once we find the relationship between it and $q$.
Suppose, then, we intersperse $j$ new cards around the stars before the $\odot$ and then, independently of that, we intersperse the remaining $k-j-1$ new cards around the stars after the $|$. There are
$$\tau_{n,k}(s,p) = \binom{(p-1)+j}{j}\binom{(n-p-s)+(k-j)-1}{k-j-1}$$
ways to do this. Notice, though (this is the trickiest part of the analysis), that the place of $|$ equals $p+s+j+1$ because
- There are $p$ "old" cards at or before the mark.
- There are $s$ old cards after the mark but before $|$.
- There are $j$ new cards before the mark.
- There is the new card represented by $|$ itself.
Thus, $\tau_{n,k}(s,p)$ gives us information about the transition from place $p$ to place $q = p+s+j+1$. When we track this information carefully for all possible values of $s$, and sum over all these (disjoint) possibilities, we obtain the conditional probability of place $q$ following place $p$,
$$\Pr\nolimits_{n,k}(q \mid p) = \left(\sum_j \binom{p-1+j}{j}\binom{n+k-q}{k-j-1}\right) \Big/ \binom{n+k}{k}$$
where the sum starts at $j = \max(0, q-(n+1))$ and ends at $j = \min(k-1, q-(p+1))$. (The variable length of this sum suggests there is unlikely to be a closed formula for it as a function of $n$, $k$, $q$, and $p$, except in special cases.)
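One way to gain confidence in this sum is to compare it with a brute-force enumeration for a small case. The sketch below is illustrative only: `count.formula` restates the numerator of the formula, `count.brute` is a hypothetical checker that enumerates every set of positions the new cards can occupy, and the values $n=4$, $k=3$, $p=2$ are an arbitrary choice.

```r
# Numerator of the transition probability: arrangements taking place p to place q.
count.formula <- function(q, p, n, k) {
  j <- max(0, q - (n + 1)):min(k - 1, q - (p + 1))
  sum(choose(p - 1 + j, j) * choose(n + k - q, k - 1 - j))
}

# Brute force: enumerate every set of positions the k new cards can occupy.
count.brute <- function(q, p, n, k) {
  new.pos <- combn(n + k, k)                 # each column is one arrangement
  hits <- apply(new.pos, 2, function(pos) {
    old.pos <- setdiff(1:(n + k), pos)       # positions of the old cards, in order
    after <- pos[pos > old.pos[p]]           # new cards dealt after the marked card
    length(after) > 0 && min(after) == q     # is the first of them at place q?
  })
  sum(hits)
}

n <- 4; k <- 3; p <- 2                       # arbitrary small example
q.values <- (p + 1):(n + k)
by.formula <- sapply(q.values, function(q) count.formula(q, p, n, k))
by.brute   <- sapply(q.values, function(q) count.brute(q, p, n, k))
rbind(q = q.values, by.formula, by.brute)    # the two count rows should agree
```

Dividing either row of counts by `choose(n + k, k)` gives the transition probabilities out of place $p$.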
The algorithm
Initially there is probability $1$ that the place will be $1$ and probability $0$ it will have any other possible value in $2, 3, \ldots, k_1$. This can be represented by a vector $\mathbf{p}_1 = (1, 0, \ldots, 0)$.
After interspersing the next $k_2$ cards, the vector $\mathbf{p}_1$ is updated to $\mathbf{p}_2$ by multiplying it, as a row vector on the left, by the transition matrix $(\Pr_{k_1,k_2}(q \mid p),\ 1 \le p \le k_1,\ 1 \le q \le k_1+k_2)$. This is repeated until all $k_1 + k_2 + \cdots + k_m$ cards have been placed. At each stage $j$, the sum of the entries in the probability vector $\mathbf{p}_j$ is the chance that some card has been marked. Whatever remains to make the value equal to $1$ therefore is the chance that no card is left marked after step $j$. The successive differences in these values therefore give us the probability that we could not find a card of type $j$ to mark: that is the probability distribution of the value of the card we were looking for when the deck runs out at the end of the game.
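The update just described can be summarized in symbols. The names $n_j$, $\mathbb{T}_j$, and $g_j$ are notation assumed here for convenience; they are not part of the original analysis.

```latex
\begin{aligned}
n_j &= k_1 + k_2 + \cdots + k_j, \\
\mathbf{p}_j &= \mathbf{p}_{j-1}\,\mathbb{T}_j,
  \qquad (\mathbb{T}_j)_{p,q} = \Pr\nolimits_{n_{j-1},\,k_j}(q \mid p),
  \quad 1 \le p \le n_{j-1},\ 1 \le q \le n_j, \\
g_j &= 1 - \sum_{q=1}^{n_j} (\mathbf{p}_j)_q
  \qquad \text{(chance that no card is marked after step } j\text{)}, \\
\Pr(\text{the search ends on value } j) &= g_j - g_{j-1}, \qquad j = 2, \ldots, m.
\end{aligned}
```

Here $g_1 = 0$ because the first ace is always marked, and for the final value $g_m$ is taken to be $1$ because there is nothing further to look for once the last value has been marked.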
Implementation
The following R code implements the algorithm. It parallels the preceding discussion. First, calculation of the transition probabilities is performed by `t.matrix` (without normalization with the division by $\binom{n+k}{k}$, making it easier to track the calculations when testing the code):
t.matrix <- function(q, p, n, k) {
  # Sum over j, the number of new cards placed before the marked card.
  j <- max(0, q-(n+1)):min(k-1, q-(p+1))
  return (sum(choose(p-1+j,j) * choose(n+k-q, k-1-j)))
}
This is used by `transition` to update $\mathbf{p}_{j-1}$ to $\mathbf{p}_j$. It calculates the transition matrix and performs the multiplication. It also takes care of computing the initial vector $\mathbf{p}_1$ if the argument `p` is an empty vector:
#
# `p` is the place distribution: p[i] is the chance the place is `i`.
#
transition <- function(p, k) {
  n <- length(p)
  if (n==0) {
    q <- c(1, rep(0, k-1))
  } else {
    #
    # Construct the transition matrix.
    #
    t.mat <- matrix(0, nrow=n, ncol=(n+k))
    #dimnames(t.mat) <- list(p=1:n, q=1:(n+k))
    for (i in 1:n) {
      t.mat[i, ] <- c(rep(0, i), sapply((i+1):(n+k),
                                        function(q) t.matrix(q, i, n, k)))
    }
    #
    # Normalize and apply the transition matrix.
    #
    q <- as.vector(p %*% t.mat / choose(n+k, k))
  }
  names(q) <- 1:(n+k)
  return (q)
}
We can now easily compute the non-mark probabilities at each stage for any deck:
#
# `k` is an array giving the numbers of each card in order;
# e.g., k = rep(4, 13) for a standard deck.
#
# NB: the *complements* of the p-vectors are output.
#
game <- function(k) {
  p <- numeric(0)
  q <- sapply(k, function(i) 1 - sum(p <<- transition(p, i)))
  names(q) <- names(k)
  return (q)
}
Here they are for the standard deck:
k <- rep(4, 13)
names(k) <- c("A", 2:9, "T", "J", "Q", "K")
(g <- game(k))
The output is
A 2 3 4 5 6 7 8 9 T J Q K
0.00000000 0.01428571 0.09232323 0.25595013 0.46786622 0.66819134 0.81821790 0.91160622 0.96146102 0.98479430 0.99452614 0.99818922 0.99944610
According to the rules, if a king was marked then we would not look for any further cards: this means the value of 0.9994461 has to be increased to 1. Upon doing so, the differences give the distribution of the "number you will be on when the deck runs out":
> g[13] <- 1; diff(g)
2 3 4 5 6 7 8 9 T J Q K
0.014285714 0.078037518 0.163626897 0.211916093 0.200325120 0.150026562 0.093388313 0.049854807 0.023333275 0.009731843 0.003663077 0.001810781
(Compare this to the output I report in a separate answer describing a Monte-Carlo simulation: they appear to be the same, up to expected amounts of random variation.)
The expected value is immediate:
> sum(diff(g) * 2:13)
[1] 5.832589
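As an independent sanity check on this expectation, the game can also be simulated directly. The following is a minimal sketch and not the simulation referred to above: the helper `play`, the seed, and the number of replications are all assumptions made for illustration.

```r
# Hypothetical helper: play one game on a vector of card values (1 = ace, 13 = king)
# and return the number being sought when the deck runs out.
play <- function(deck) {
  target <- 1
  for (card in deck) {
    if (card == target) target <- target + 1
  }
  min(target, 13)    # once a king is marked there is nothing further to seek
}

set.seed(17)                       # arbitrary seed, for reproducibility only
deck <- rep(1:13, each = 4)
sims <- replicate(1e4, play(sample(deck)))
mean(sims)                         # should land near the exact value of about 5.83
table(sims) / length(sims)         # empirical counterpart of diff(g)
```

With ten thousand games the standard error of the mean is roughly $0.02$, so agreement to about one decimal place is all that should be expected from this sketch.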
All told, this required only a dozen lines or so of executable code. I have checked it against hand calculations for small values of k (up to 3). Thus, if any discrepancy becomes apparent between the code and the preceding analysis of the problem, trust the code (because the analysis may have typographical errors).
Remarks
Relationships to other sequences
When there is one of each card, the distribution is a sequence of reciprocals of whole numbers:
> 1/diff(game(rep(1,10)))
[1] 2 3 8 30 144 840 5760 45360 403200
The value at place $i$ is $i! + (i-1)!$ (starting at place $i=1$). This is sequence A001048 in the Online Encyclopedia of Integer Sequences. Accordingly, we might hope for a closed formula for the decks with constant $k_i$ (the "suited" decks) that would generalize this sequence, which itself has some profound meanings. (For instance, it counts sizes of the largest conjugacy classes in permutation groups and is also related to trinomial coefficients.) (Unfortunately, the reciprocals in the generalization for $k \gt 1$ are not usually integers.)
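The pattern is easy to confirm against the nine reciprocals printed above; this short check uses only base R.

```r
# Evaluate i! + (i-1)! for the first nine places.
i <- 1:9
a001048 <- factorial(i) + factorial(i - 1)
a001048   # 2 3 8 30 144 840 5760 45360 403200
```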
The game as a stochastic process
Our analysis makes it clear that the initial $i$ coefficients of the vectors $\mathbf{p}_j$, $j \ge i$, are constant. For example, let's track the output of `game` as it processes each group of cards:
> sapply(1:13, function(i) game(rep(4,i)))
[[1]]
[1] 0
[[2]]
[1] 0.00000000 0.01428571
[[3]]
[1] 0.00000000 0.01428571 0.09232323
[[4]]
[1] 0.00000000 0.01428571 0.09232323 0.25595013
...
[[13]]
[1] 0.00000000 0.01428571 0.09232323 0.25595013 0.46786622 0.66819134 0.81821790 0.91160622 0.96146102 0.98479430 0.99452614 0.99818922 0.99944610
For instance, the second value of the final vector (describing the results with a full deck of $52$ cards) already appeared after the second group was processed (and equals $1/\binom{8}{4} = 1/70$). Thus, if you want information only about the marks up through the $j^\text{th}$ card value, you only have to perform the calculation for a deck of $k_1 + k_2 + \cdots + k_j$ cards.
Because the chance of not marking a card of value $j$ quickly approaches $1$ as $j$ increases, after $13$ types of cards in four suits we have almost reached a limiting value for the expectation. Indeed, the limiting value is approximately $5.833355$ (computed for a deck of $4 \times 32$ cards, at which point double precision rounding error prevents going any further).
Timing
Looking at the algorithm applied to the $m$-vector $(k, k, \ldots, k)$, we see its timing should be proportional to $k^2$ and, using a crude upper bound, not any worse than proportional to $m^3$. By timing all calculations for $k=1$ through $7$ and $m=10$ through $30$, and analyzing only those taking relatively long times ($1/2$ second or longer), I estimate the computation time is approximately $O(k^2 m^{2.9})$, supporting this upper-bound assessment.
One use of these asymptotics is to project calculation times for larger problems. For instance, seeing that the case $k=4$, $m=30$ takes about $1.31$ seconds, we would estimate that the (very interesting) case $k=1$, $m=100$ would take about $1.31\,(1/4)^2\,(100/30)^{2.9} \approx 2.7$ seconds. (It actually takes $2.87$ seconds.)
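The projection is a one-line scaling of the reference timing by the fitted power law. The sketch below reproduces that arithmetic; the helper `project.time` and its argument names are hypothetical, and the reference numbers are the ones quoted above.

```r
# Reference timing quoted in the text: 4 cards per value, 30 values, 1.31 seconds.
t.ref <- 1.31
k.ref <- 4
values.ref <- 30

# Hypothetical helper: scale the reference time by k^2 * (number of values)^2.9.
project.time <- function(k, n.values) {
  t.ref * (k / k.ref)^2 * (n.values / values.ref)^2.9
}

project.time(1, 100)   # about 2.7 seconds
```

Such projections are only as good as the fitted exponents, so they are best treated as rough estimates.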