




  • $T_i$ = first position of a card whose value is $i$
  • $U_i$ = last position of a card whose value is $i$






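This is an interesting problem, but I am not sure about the math you write after "the problem is essentially". In the first statement, did you mean to write … rather than …? Even so, I am not sure the statement is correct. Consider a sequence beginning 2AAA2. We have $T_1 = 2$ and $T_2 = 1$, so $T_1 > T_2$, but if I understand your text description correctly, we can still pick the ace at the second position and then the 2 at the fifth position. So $T_1 < T_2$ is not a necessary condition?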

@TooTone Oh, I meant … as you said, and you are right: $T_1 < T_2$ is not a necessary condition.
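The counterexample in the comment above can be replayed in a few lines of R. This is only a sketch: the names `first_pos`, `last_pos`, and `pos` are made up for illustration, and the ace is coded as 1.

```r
# Deck from the comment: 2, A, A, A, 2 (ace coded as 1).
deck <- c(2, 1, 1, 1, 2)
values <- sort(unique(deck))

# first_pos[i] plays the role of T_i; last_pos[i] plays the role of U_i.
first_pos <- match(values, deck)
last_pos  <- length(deck) + 1 - match(values, rev(deck))

# Greedy pass: find the first ace, then the first 2 after it, and so on.
pos <- 0
for (v in values) {
  later <- which(deck == v & seq_along(deck) > pos)
  if (length(later) == 0) break   # this value never shows up again
  pos <- later[1]
}

first_pos  # T_1 = 2, T_2 = 1
pos        # position of the last card picked
```

Here $T_1 > T_2$, yet the pass still picks the ace at position 2 and the 2 at position 5, which is the point of the comment.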





    2     3     4     5     6     7     8     9     J     K     Q     T 
 1406  7740 16309 21241 19998 15127  9393  4906   976   190   380  2334 


# monte carlo card-drawing functions from here
# http://streaming.stat.iastate.edu/workshops/r-intro/lectures/5-Rprogramming.pdf

# create a straightforward deck of cards
create_deck <-
    function( ){
        suit <- c( "H" , "C" , "D" , "S" )
        rank <- c( "A" , 2:9 , "T" , "J" , "Q" , "K" )
        deck <- NULL
        for ( r in rank ) deck <- c( deck , paste( r , suit ) )
        deck
    }

# construct a function to shuffle everything
shuffle <- function( deck ){ sample( deck , length( deck ) ) }

# draw one card at a time
draw_cards <-
    function( deck , start , n = 1 ){
        cards <- NULL

        for ( i in start:( start + n - 1 ) ){
            if ( i <= length( deck ) ){
                cards <- c( cards , deck[ i ] )
            }
        }

        return( cards )
    }

# create an empty vector for your results
results <- NULL

# run your simulation this many times..
for ( i in seq( 100000 ) ){
    # create a new deck
    sdeck <- shuffle( create_deck() )

    d <- sdeck[ grep('A|2' , sdeck ) ]
    e <- identical( grep( "2" , d ) , 1:4 )

    # loop through ranks in this order
    rank <- c( "A" , 2:9 , "T" , "J" , "Q" , "K" )

    # start at this position
    card.position <- 0

    # start with a blank current.draw
    current.draw <- ""

    # start with a blank current rank
    this.rank <- NULL

    # start with the first rank
    rank.position <- 1

    # keep drawing until you find the rank you wanted
    while( card.position < 52 ){

        # increase the position by one every time
        card.position <- card.position + 1

        # store the current draw for testing next time
        current.draw <- draw_cards( sdeck , card.position )

        # if you draw the current rank, move to the next.
        if ( grepl( rank[ rank.position ] , current.draw ) ) rank.position <- rank.position + 1

        # if you have gone through every rank and are still not out of cards,
        # should it still be a king?  this assumes yes.
        if ( rank.position == length( rank ) ) break
    }


    # store the rank for this iteration.
    this.rank <- rank[ rank.position ]

    # at the end of the iteration, store the result
    results <- c( results , this.rank )
}


# print the final results
table( results )

# make A, T, J, Q, K numerics
results[ results == 'A' ] <- 1
results[ results == 'T' ] <- 10
results[ results == 'J' ] <- 11
results[ results == 'Q' ] <- 12
results[ results == 'K' ] <- 13
results <- as.numeric( results )

# and here's your expected value after 100,000 simulations.
mean( results )


You are right: it is one in 270725, or in R code, 1/prod( 48:1 / 52:5 )
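A quick check of that expression, assuming nothing beyond base R: the product has 48 ratios and its reciprocal is the binomial coefficient `choose(52, 4)`. The name `p_inverse` is just for illustration.

```r
# `:` binds tighter than `/`, so this is (48:1) / (52:5), elementwise.
p_inverse <- 1 / prod(48:1 / 52:5)

# Compare with the number of ways to choose 4 positions out of 52.
c(from_product = p_inverse, binomial = choose(52, 4))
```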





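In a simulation it is important to be correct as well as fast. Both objectives suggest writing code that targets the core capabilities of the programming environment and is as short and simple as possible, because simplicity lends clarity and clarity promotes correctness. Here is my attempt to achieve both in R: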

# Simulate one play with a deck of `n` distinct cards in `k` suits.
sim <- function(n=13, k=4) {
  deck <- sample(rep(1:n, k)) # Shuffle the deck
  deck <- c(deck, 1:n)        # Add sentinels to terminate the loop
  k <- 0                      # Count the cards searched for
  for (j in 1:n) {
    k <- k+1                          # Count this card
    deck <- deck[-(1:match(j, deck))] # Deal cards until `j` is found
    if (length(deck) < n) break       # Stop when sentinels are reached
  }
  return(k)                   # Return the number of cards searched
}

To apply this in a reproducible way, use the replicate function after setting the random number seed, as in

> set.seed(17);  system.time(d <- replicate(10^5, sim(13, 4)))
   user  system elapsed 
   5.46    0.00    5.46


> n <- length(d)
> mean(d)
[1] 5.83488

> sd(d) / sqrt(n)
[1] 0.005978956

The latter is the standard error: we expect the simulated mean to be within two or three SEs of the true value. That places the true expectation somewhere between 5.817 and 5.853.
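The interval quoted above is simply the simulated mean plus or minus three standard errors. A minimal sketch using the two numbers printed above (the variable names are illustrative):

```r
m  <- 5.83488       # simulated mean reported above
se <- 0.005978956   # its standard error reported above

# Three standard errors on either side of the mean.
ci <- round(m + c(-3, 3) * se, 3)
ci
```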


u <- table(d)
u.se <- sqrt(u/n * (1-u/n)) / sqrt(n)
cards <- c("A", "2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K")
dimnames(u) <- list(sapply(dimnames(u), function(x) cards[as.integer(x)]))
print(rbind(frequency=u/n, SE=u.se), digits=2)


                2       3      4      5      6      7       8       9       T       J       Q       K
frequency 0.01453 0.07795 0.1637 0.2104 0.1995 0.1509 0.09534 0.04995 0.02249 0.01009 0.00345 0.00173
SE        0.00038 0.00085 0.0012 0.0013 0.0013 0.0011 0.00093 0.00069 0.00047 0.00032 0.00019 0.00013


draw <- function(deck) {
  n <- length(sentinels <- sort(unique(deck)))
  deck <- c(deck, sentinels)
  k <- 0
  for (j in sentinels) {
    k <- k+1
    deck <- deck[-(1:match(j, deck))]
    if (length(deck) < n) break
  }
  return(k)
}

(It is possible to use draw in place of sim everywhere, but the extra work done at the beginning of draw makes it twice as slow as sim.)

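We can use this by applying it to every distinct shuffle of a given deck. Since the purpose here is just a few one-off tests, efficiency in generating those shuffles is unimportant. Here is a quick brute-force way: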

n <- 4 # Distinct cards
k <- 2 # Number of suits
d <- expand.grid(lapply(1:(n*k), function(i) 1:n))
e <- apply(d, 1, function(x) var(tabulate(x))==0)
g <- apply(d, 1, function(x) length(unique(x))==n)
d <- d[e & g,]


d$result <- apply(as.matrix(d), 1, draw)
(counts <- table(d$result))


   2    3    4 
 420  784 1316 

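(The value of 420 is, by the way, simple to understand: we would still be working on card 2 if and only if all the deuces precede all the aces. The chance of this happening (with two suits) is $1/\binom{2+2}{2} = 1/6$. Out of the 2520 distinct shuffles, $2520/6 = 420$ have this property.)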
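The arithmetic in this parenthetical can be checked directly. The sketch below counts the distinct shuffles of the 8-card deck and then enumerates the six relative orders of the two deuces and two aces; `two_pos` and `deuces_first` are illustrative names.

```r
# Distinct shuffles of 4 values in 2 suits: 8! / (2!)^4.
n_shuffles <- factorial(8) / factorial(2)^4

# Relative order of the 2 deuces and 2 aces: choose positions for the deuces.
two_pos <- combn(4, 2)
deuces_first <- apply(two_pos, 2, function(pos) {
  ace_pos <- setdiff(1:4, pos)
  max(pos) < min(ace_pos)       # every deuce comes before every ace
})

n_shuffles                        # number of distinct shuffles
mean(deuces_first)                # chance all deuces precede all aces
n_shuffles * mean(deuces_first)   # shuffles that stop at card 2
```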

We can test the output with a chi-squared test. To this end I applied sim 10,000 times to this case of n=4 distinct cards in k=2 suits:

>d.sim <- replicate(10^4, sim(n, k))
>print((rbind(table(d.sim) / length(d.sim), counts / dim(d)[1])), digits=3)

         2     3     4
[1,] 0.168 0.312 0.520
[2,] 0.167 0.311 0.522

> chisq.test(table(d.sim), p=counts / dim(d)[1])

    Chi-squared test for given probabilities

data:  table(d.sim) 
X-squared = 0.2129, df = 2, p-value = 0.899

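Because p is so high, we find no significant difference between what sim says and the values computed by exhaustive enumeration. Repeating this exercise for some other (small) values of n and k produces comparable results, giving us ample reason to trust sim when applied to n=13 and k=4.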


>y <- c(1660,8414,16973,21495,20021,14549,8957,4546,2087,828,313,109)
>chisq.test(cbind(u, y))

data:  cbind(u, y) 
X-squared = 142.2489, df = 11, p-value < 2.2e-16

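The enormous chi-squared statistic produces a p-value that is essentially zero: without a doubt, sim disagrees with the other answer. There are two possible resolutions of the disagreement: one (or both!) of these answers is incorrect, or they implement different interpretations of the question. For instance, I have interpreted "after the deck runs out" to mean after observing the last card and, if allowable, updating the "number you will be on" before terminating the procedure. Conceivably that last step was not meant to be taken. Perhaps some such subtle difference of interpretation will explain the disagreement, at which point we can modify the question to make it clearer what is being asked.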



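  1. A randomly shuffled deck of $N+k$ cards can be generated by randomly shuffling $N$ cards and then randomly interspersing the remaining $k$ cards among them.

  2. By shuffling only the aces, and then (applying the first observation) interspersing the twos, then the threes, and so on, this problem can be viewed as a chain of thirteen steps.

  3. We need to keep track of more than the value of the card we are seeking. When doing this, however, we do not need to account for the position of the mark relative to all cards, but only its position relative to cards of equal or smaller value.

    Imagine placing a mark on the first ace, then marking the first two found after it, and so on. (If at any stage the deck runs out without displaying the card we are currently seeking, we leave all cards unmarked.) Let the "place" of each mark (when it exists) be the number of cards of equal or lower value that had been dealt when the mark was made (including the marked card itself). The places contain all the essential information.

  4. The place after the $i$th mark is made is a random number. For a given deck, the sequence of these places forms a stochastic process. It is in fact a Markov process (with variable transition matrix). An exact answer can therefore be calculated from twelve matrix multiplications.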
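The first two observations can be sketched in R as follows. This is an illustration only, not the code used for the exact calculation; the function names `intersperse` and `sequential_shuffle` are invented here.

```r
# Insert k copies of `value` into `deck`, choosing their positions uniformly
# among all choose(n + k, k) possibilities; old cards keep their order.
intersperse <- function(deck, value, k) {
  n <- length(deck)
  if (k == 0) return(deck)
  out <- rep(value, n + k)
  new_pos <- sample.int(n + k, k)        # positions taken by the new cards
  if (n > 0) out[-new_pos] <- deck       # remaining positions hold old cards
  out
}

# Build a shuffled deck one card value at a time: aces, then twos, and so on.
sequential_shuffle <- function(k) {
  deck <- integer(0)
  for (v in seq_along(k)) deck <- intersperse(deck, v, k[v])
  deck
}

set.seed(1)
deck <- sequential_shuffle(rep(4, 13))
table(deck)   # every value should appear four times
```

Because the aces are all alike, "shuffling the aces" is a no-op here; only the interspersing steps introduce randomness.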

Using these ideas, this machine obtains a value of 5.8325885529019965 (computing in double-precision floating point) in 1/9 second. This approximates the exact value 1982600579265894785026945331968939023522542569…


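The remainder of this post provides details, presents a working implementation (in R), and concludes with some comments on the question and the efficiency of the solution.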


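It is actually conceptually clearer to consider a "deck" (aka multiset) of $N = k_1 + k_2 + \cdots + k_m$ cards, of which there are $k_1$ of the lowest denomination, $k_2$ of the next lowest, and so on. (The question as asked concerns the deck determined by the $13$-vector $(4, 4, \ldots, 4)$.)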

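A "random shuffle" of $N$ cards is one permutation taken uniformly and randomly from the $N! = N \times (N-1) \times \cdots \times 2 \times 1$ permutations of the $N$ cards. These shuffles fall into groups of equivalent configurations because permuting the $k_1$ "aces" among themselves changes nothing, permuting the $k_2$ "twos" among themselves also changes nothing, and so on. Therefore each group of permutations that look identical when the suits of the cards are ignored contains $k_1! \times k_2! \times \cdots \times k_m!$ permutations. These groups, whose number therefore is given by the multinomial coefficient

$$\binom{N}{k_1, k_2, \ldots, k_m} = \frac{N!}{k_1!\, k_2! \cdots k_m!},$$

are called "combinations" of the deck.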

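There is another way to count the combinations. The first $k_1$ cards can form only $k_1!/k_1! = 1$ combination. They leave $k_1+1$ "slots" between and around them in which the next $k_2$ cards can be placed. We could indicate this with a diagram where "$\star$" designates one of the $k_1$ cards and "$\_$" designates a slot that can hold between $0$ and $k_2$ additional cards:

$$\underbrace{\_\,\star\,\_\,\star\,\_\,\cdots\,\_\,\star\,\_}_{k_1\text{ stars}}$$

Interspersing the $k_2$ new cards amounts to choosing which $k_2$ of the resulting $k_1+k_2$ positions they occupy, which can be done in $\binom{k_1+k_2}{k_1,\,k_2} = \frac{(k_1+k_2)!}{k_1!\,k_2!}$ ways.
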
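Repeating this procedure with $k_3$ "threes," we find there are $\binom{(k_1+k_2)+k_3}{k_1+k_2,\,k_3} = \frac{(k_1+k_2+k_3)!}{(k_1+k_2)!\,k_3!}$ ways to intersperse them among the first $k_1+k_2$ cards. Therefore the total number of distinct ways to arrange the first $k_1+k_2+k_3$ cards in this manner equals

$$1 \times \frac{(k_1+k_2)!}{k_1!\,k_2!} \times \frac{(k_1+k_2+k_3)!}{(k_1+k_2)!\,k_3!} = \frac{(k_1+k_2+k_3)!}{k_1!\,k_2!\,k_3!}.$$
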
After finishing the last $k_m$ cards and continuing to multiply these telescoping fractions, we find that the number of distinct combinations obtained equals the total number of combinations as previously counted, $\binom{N}{k_1, k_2, \ldots, k_m}$. Therefore we have overlooked no combinations. That means this sequential process of shuffling the cards correctly captures the probabilities of each combination, assuming that at each stage each possible distinct way of interspersing the new cards among the old is taken with uniformly equal probability.
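The telescoping identity is easy to check numerically on a small deck. The sketch below uses an arbitrary example vector `k = c(2, 3, 4)` (my choice, not from the question) and compares the product of the stage-by-stage counts with the multinomial coefficient.

```r
k <- c(2, 3, 4)   # an arbitrary small deck: 2 aces, 3 twos, 4 threes

# Stage j contributes choose(k_1 + ... + k_j, k_j) ways to intersperse.
sequential  <- prod(choose(cumsum(k), k))

# Direct count: N! / (k_1! k_2! ... k_m!).
multinomial <- factorial(sum(k)) / prod(factorial(k))

c(sequential = sequential, multinomial = multinomial)
```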

The place process

Initially, there are $k_1$ aces and obviously the very first is marked. At later stages there are $n = k_1 + k_2 + \cdots + k_{j-1}$ cards, the place (if a marked card exists) equals $p$ (some value from $1$ through $n$), and we are about to intersperse $k = k_j$ cards around them. We can visualize this with a diagram like

$$\underbrace{\_\,\star\,\_\,\star\,\_\,\cdots\,\_\,\star\,\_}_{p-1\text{ stars}}\ \odot\ \underbrace{\_\,\star\,\_\,\cdots\,\_\,\star\,\_}_{n-p\text{ stars}}$$

where "$\odot$" designates the currently marked symbol. Conditional on this value of the place $p$, we wish to find the probability that the next place will equal $q$ (some value from $1$ through $n+k$; by the rules of the game, the next place must come after $p$, whence $q \ge p+1$). If we can find how many ways there are to intersperse the $k$ new cards in the blanks so that the next place equals $q$, then we can divide by the total number of ways to intersperse these cards (equal to $\binom{n+k}{k}$, as we have seen) to obtain the transition probability that the place changes from $p$ to $q$. (There will also be a transition probability for the place to disappear altogether when none of the new cards follow the marked card, but there is no need to compute this explicitly.)

Let's update the diagram to reflect this situation:

$$\underbrace{\_\,\star\,\_\,\star\,\_\,\cdots\,\_\,\star\,\_}_{p-1\text{ stars}}\ \odot\ \underbrace{\star\,\star\,\cdots\,\star}_{s\text{ stars}}\ |\ \underbrace{\_\,\star\,\_\,\cdots\,\_\,\star\,\_}_{n-p-s\text{ stars}}$$

The vertical bar "$|$" shows where the first new card occurs after the marked card: no new cards may therefore appear between the $\odot$ and the $|$ (and therefore no slots are shown in that interval). We do not know how many stars there are in this interval, so I have just called it $s$ (which may be zero). The unknown $s$ will disappear once we find the relationship between it and $q$.

Suppose, then, we intersperse $j$ new cards around the stars before the $\odot$ and then--independently of that--we intersperse the remaining $k-j-1$ new cards around the stars after the $|$. There are

$$\tau_{n,k}(s,p) = \binom{(p-1)+j}{j}\binom{(n-p-s)+(k-j-1)}{k-j-1}$$

ways to do this. Notice, though--this is the trickiest part of the analysis--that the place of $|$ equals $p+s+j+1$ because

  • There are p "old" cards at or before the mark.
  • There are s old cards after the mark but before |.
  • There are j new cards before the mark.
  • There is the new card represented by | itself.

Thus, $\tau_{n,k}(s,p)$ gives us information about the transition from place $p$ to place $q = p+s+j+1$. When we track this information carefully for all possible values of $s$, and sum over all these (disjoint) possibilities, we obtain the conditional probability of place $q$ following place $p$,

$$\Pr\nolimits_{n,k}(q \mid p) = \frac{\sum_j \binom{p-1+j}{j}\binom{n+k-q}{k-j-1}}{\binom{n+k}{k}},$$

where the sum starts at $j = \max(0, q-(n+1))$ and ends at $j = \min(k-1, q-(p+1))$. (The variable length of this sum suggests there is unlikely to be a closed formula for it as a function of $n$, $k$, $q$, and $p$, except in special cases.)
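One way to gain confidence in this sum is to compare it with brute-force enumeration for small cases. The sketch below is a check of my own construction, not part of the original derivation: `count_formula` evaluates the numerator of the sum, and `count_brute` enumerates every placement of the $k$ new cards and records where the first new card after the mark lands.

```r
# Numerator of the transition probability, as given by the sum above.
count_formula <- function(q, p, n, k) {
  j <- max(0, q - (n + 1)):min(k - 1, q - (p + 1))
  sum(choose(p - 1 + j, j) * choose(n + k - q, k - 1 - j))
}

# Brute force: try every set of positions for the k new cards.
count_brute <- function(q, p, n, k) {
  placements <- combn(n + k, k)
  hits <- apply(placements, 2, function(new_pos) {
    old_pos <- setdiff(seq_len(n + k), new_pos)
    mark <- old_pos[p]                    # where the marked old card lands
    later <- new_pos[new_pos > mark]      # new cards dealt after the mark
    length(later) > 0 && min(later) == q
  })
  sum(hits)
}

n <- 3; k <- 2; p <- 2
q <- (p + 1):(n + k)
rbind(formula = sapply(q, count_formula, p = p, n = n, k = k),
      brute   = sapply(q, count_brute,   p = p, n = n, k = k))
```

The two rows should agree; their total falls short of `choose(n + k, k)` by the number of placements in which no new card follows the mark.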

The algorithm

Initially there is probability $1$ that the place will be $1$ and probability $0$ it will have any other possible value in $2, 3, \ldots, k_1$. This can be represented by a vector $p_1 = (1, 0, \ldots, 0)$.

After interspersing the next $k_2$ cards, the vector $p_1$ is updated to $p_2$ by multiplying it (on the left) by the transition matrix $(\Pr_{k_1,k_2}(q \mid p),\ 1 \le p \le k_1,\ 1 \le q \le k_1+k_2)$. This is repeated until all $k_1 + k_2 + \cdots + k_m$ cards have been placed. At each stage $j$, the sum of the entries in the probability vector $p_j$ is the chance that some card has been marked. Whatever remains to make the value equal to $1$ therefore is the chance that no card is left marked after step $j$. The successive differences in these values therefore give us the probability that we could not find a card of type $j$ to mark: that is the probability distribution of the value of the card we were looking for when the deck runs out at the end of the game.


The following R code implements the algorithm. It parallels the preceding discussion. First, calculation of the transition probabilities is performed by t.matrix (without normalization with the division by $\binom{n+k}{k}$, making it easier to track the calculations when testing the code):

t.matrix <- function(q, p, n, k) {
  j <- max(0, q-(n+1)):min(k-1, q-(p+1))
  return (sum(choose(p-1+j,j) * choose(n+k-q, k-1-j)))
}

This is used by transition to update $p_{j-1}$ to $p_j$. It calculates the transition matrix and performs the multiplication. It also takes care of computing the initial vector $p_1$ if the argument p is an empty vector:

# `p` is the place distribution: p[i] is the chance the place is `i`.
transition <- function(p, k) {
  n <- length(p)
  if (n==0) {
    q <- c(1, rep(0, k-1))
  } else {
    # Construct the transition matrix.
    t.mat <- matrix(0, nrow=n, ncol=(n+k))
    #dimnames(t.mat) <- list(p=1:n, q=1:(n+k))
    for (i in 1:n) {
      t.mat[i, ] <- c(rep(0, i), sapply((i+1):(n+k), 
                                        function(q) t.matrix(q, i, n, k)))
    }
    # Normalize and apply the transition matrix.
    q <- as.vector(p %*% t.mat / choose(n+k, k))
  }
  names(q) <- 1:(n+k)
  return (q)
}

We can now easily compute the non-mark probabilities at each stage for any deck:

# `k` is an array giving the numbers of each card in order;
# e.g., k = rep(4, 13) for a standard deck.
# NB: the *complements* of the p-vectors are output.
game <- function(k) {
  p <- numeric(0)
  q <- sapply(k, function(i) 1 - sum(p <<- transition(p, i)))
  names(q) <- names(k)
  return (q)
}

Here they are for the standard deck:

k <- rep(4, 13)
names(k) <- c("A", 2:9, "T", "J", "Q", "K")
(g <- game(k))

The output is

         A          2          3          4          5          6          7          8          9          T          J          Q          K 
0.00000000 0.01428571 0.09232323 0.25595013 0.46786622 0.66819134 0.81821790 0.91160622 0.96146102 0.98479430 0.99452614 0.99818922 0.99944610

According to the rules, if a king was marked then we would not look for any further cards: this means the value of 0.9994461 has to be increased to 1. Upon doing so, the differences give the distribution of the "number you will be on when the deck runs out":

> g[13] <- 1; diff(g)
          2           3           4           5           6           7           8           9           T           J           Q           K 
0.014285714 0.078037518 0.163626897 0.211916093 0.200325120 0.150026562 0.093388313 0.049854807 0.023333275 0.009731843 0.003663077 0.001810781

(Compare this to the output I report in a separate answer describing a Monte-Carlo simulation: they appear to be the same, up to expected amounts of random variation.)

The expected value is immediate:

> sum(diff(g) * 2:13)
[1] 5.832589
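For readers who only want to replay the last two steps, the differencing and the expectation can be reproduced from the printed (rounded) output alone. This is a sketch: the values are copied from the table above, so the result agrees with the exact calculation only to about the precision shown.

```r
# Non-mark probabilities copied from the output above (rounded to 8 places).
g <- c(0.00000000, 0.01428571, 0.09232323, 0.25595013, 0.46786622,
       0.66819134, 0.81821790, 0.91160622, 0.96146102, 0.98479430,
       0.99452614, 0.99818922, 0.99944610)
names(g) <- c("A", 2:9, "T", "J", "Q", "K")

g[13] <- 1                      # the game always ends by the king
dist <- diff(g)                 # distribution of the final card value
expected <- sum(dist * 2:13)    # expected value of the final card

round(dist, 4)
expected
```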

All told, this required only a dozen lines or so of executable code. I have checked it against hand calculations for small values of k (up to 3). Thus, if any discrepancy becomes apparent between the code and the preceding analysis of the problem, trust the code (because the analysis may have typographical errors).


Relationships to other sequences

When there is one of each card, the distribution is a sequence of reciprocals of whole numbers:

> 1/diff(game(rep(1,10)))
[1]      2      3      8     30    144    840   5760  45360 403200

The value at place $i$ is $i! + (i-1)!$ (starting at place $i=1$). This is sequence A001048 in the Online Encyclopedia of Integer Sequences. Accordingly, we might hope for a closed formula for the decks with constant $k_i$ (the "suited" decks) that would generalize this sequence, which itself has some profound meanings. (For instance, it counts sizes of the largest conjugacy classes in permutation groups and is also related to trinomial coefficients.) (Unfortunately, the reciprocals in the generalization for $k>1$ are not usually integers.)
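The formula $i! + (i-1)!$ can be compared with the nine reciprocals printed above in one line of base R (the name `a` is illustrative):

```r
i <- 1:9
a <- factorial(i) + factorial(i - 1)   # i! + (i-1)! for i = 1, ..., 9
a
```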

The game as a stochastic process

Our analysis makes it clear that the initial $i$ coefficients of the vectors $p_j$, $j \ge i$, are constant. For example, let's track the output of game as it processes each group of cards:

> sapply(1:13, function(i) game(rep(4,i)))

[1] 0

[1] 0.00000000 0.01428571

[1] 0.00000000 0.01428571 0.09232323

[1] 0.00000000 0.01428571 0.09232323 0.25595013


 [1] 0.00000000 0.01428571 0.09232323 0.25595013 0.46786622 0.66819134 0.81821790 0.91160622 0.96146102 0.98479430 0.99452614 0.99818922 0.99944610

For instance, the second value of the final vector (describing the results with a full deck of 52 cards) already appeared after the second group was processed (and equals $1/\binom{8}{4} = 1/70$). Thus, if you want information only about the marks up through the $j$th card value, you only have to perform the calculation for a deck of $k_1 + k_2 + \cdots + k_j$ cards.
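That second value is the chance that all four twos precede all four aces, so it depends only on those eight cards. A one-line check against the printed figure (the name `p_no_two` is illustrative):

```r
# One of the choose(8, 4) equally likely arrangements of 4 aces and 4 twos
# puts every two ahead of every ace.
p_no_two <- 1 / choose(8, 4)
round(p_no_two, 8)
```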

Because the chance of not marking a card of value j is getting quickly close to 1 as j increases, after 13 types of cards in four suits we have almost reached a limiting value for the expectation. Indeed, the limiting value is approximately 5.833355 (computed for a deck of 4×32 cards, at which point double precision rounding error prevents going any further).


Looking at the algorithm applied to the $m$-vector $(k, k, \ldots, k)$, we see its timing should be proportional to $k^2$ and--using a crude upper bound--not any worse than proportional to $m^3$. By timing all calculations for $k=1$ through $7$ and $n=10$ through $30$, and analyzing only those taking relatively long times ($1/2$ second or longer), I estimate the computation time is approximately $O(k^2 n^{2.9})$, supporting this upper-bound assessment.

One use of these asymptotics is to project calculation times for larger problems. For instance, seeing that the case $k=4, n=30$ takes about $1.31$ seconds, we would estimate that the (very interesting) case $k=1, n=100$ would take about $1.31\,(1/4)^2\,(100/30)^{2.9} \approx 2.7$ seconds. (It actually takes $2.87$ seconds.)
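The projection is a direct application of the fitted scaling law; here it is spelled out, using only the numbers quoted above (the variable names are illustrative):

```r
t_ref <- 1.31            # seconds for k = 4, n = 30
k_ratio <- 1 / 4         # going from k = 4 to k = 1
n_ratio <- 100 / 30      # going from n = 30 to n = 100

# Time scales roughly like k^2 * n^2.9.
t_projected <- t_ref * k_ratio^2 * n_ratio^2.9
t_projected
```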


Hacked a simple Monte Carlo in Perl and found approximately 5.8329.


use strict;

my @deck = (1..13) x 4;

my $N = 100000; # Monte Carlo iterations.

my $mean = 0;

for (my $i = 1; $i <= $N; $i++) {
    my @d = @deck;
    fisher_yates_shuffle(\@d);   # shuffle in place (sub defined below)
    my $last = 0;
    foreach my $c (@d) {
        if ($c == $last + 1) { $last = $c }
    }
    $mean += ($last + 1) / $N;
}

print $mean, "\n";

# In-place Fisher-Yates shuffle of the array referenced by the argument.
sub fisher_yates_shuffle {
    my $array = shift;
    my $i = @$array;
    while (--$i) {
        my $j = int rand($i + 1);
        @$array[$i, $j] = @$array[$j, $i];
    }
}

Given the sharp discrepancy between this and all the previous answers, including two simulations and a theoretical (exact) one, I suspect you are interpreting the question in a different way. In the absence of any explanation on your part, we just have to take it as being wrong. (I suspect you may be counting one less, in which case your 4.8 should be compared to 5.83258...; but even then, your two significant digits of precision provide no additional insight into this problem.)

Yep! There was an off-by-one mistake.
