How do I create a weighted collection and then pick a random element from it?


34

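I have a loot box that I want to fill with a random item. But I want each item to have a different chance of being picked. For example:

  • 5% chance of 10 gold
  • 20% chance of a sword
  • 45% chance of a shield
  • 20% chance of armor
  • 10% chance of a potion

How can I make it so that it selects exactly one of the items above, where those percentages are the respective chances of getting the loot?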


1
FYI, in theory, O(1) time per sample is possible for any finite distribution, even a distribution whose entries change dynamically. See, for example, cstheory.stackexchange.com/questions/37648/…
Neal Young

Answers:


37

The Soft-coded Probabilities Solution

The hardcoded probability solution has the disadvantage that you need to set the probabilities in your code. You can't determine them at runtime. It is also hard to maintain.

Here is a dynamic version of the same algorithm.

  1. Create an array of pairs of actual items and weight of each item
  2. When you add an item, the weight of the item needs to be its own weight plus the sum of the weights of all items already in the array. So you should track the sum separately. Especially because you will need it for the next step.
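  3. To retrieve an object, generate a random number between 0 and the sum of the weights of all items
  4. Iterate the array from start to finish until you find an entry whose accumulated weight is greater than or equal to the random number

Here is a sample implementation in Java in the form of a generic class that you can instantiate for any object your game uses. You can then add objects with the method .addEntry(object, relativeWeight) and pick one of the entries you added previously with .getRandom()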

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class WeightedRandomBag<T extends Object> {

    private class Entry {
        double accumulatedWeight;
        T object;
    }

    private List<Entry> entries = new ArrayList<>();
    private double accumulatedWeight;
    private Random rand = new Random();

    public void addEntry(T object, double weight) {
        accumulatedWeight += weight;
        Entry e = new Entry();
        e.object = object;
        e.accumulatedWeight = accumulatedWeight;
        entries.add(e);
    }

    public T getRandom() {
        double r = rand.nextDouble() * accumulatedWeight;

        for (Entry entry: entries) {
            if (entry.accumulatedWeight >= r) {
                return entry.object;
            }
        }
        return null; //should only happen when there are no entries
    }
}

Usage:

WeightedRandomBag<String> itemDrops = new WeightedRandomBag<>();

// Setup - a real game would read this information from a configuration file or database
itemDrops.addEntry("10 Gold",  5.0);
itemDrops.addEntry("Sword",   20.0);
itemDrops.addEntry("Shield",  45.0);
itemDrops.addEntry("Armor",   20.0);
itemDrops.addEntry("Potion",  10.0);

// drawing random entries from it
for (int i = 0; i < 20; i++) {
    System.out.println(itemDrops.getRandom());
}

Here is the same class implemented in C# for your Unity, XNA or MonoGame project:

using System;
using System.Collections.Generic;

class WeightedRandomBag<T>  {

    private struct Entry {
        public double accumulatedWeight;
        public T item;
    }

    private List<Entry> entries = new List<Entry>();
    private double accumulatedWeight;
    private Random rand = new Random();

    public void AddEntry(T item, double weight) {
        accumulatedWeight += weight;
        entries.Add(new Entry { item = item, accumulatedWeight = accumulatedWeight });
    }

    public T GetRandom() {
        double r = rand.NextDouble() * accumulatedWeight;

        foreach (Entry entry in entries) {
            if (entry.accumulatedWeight >= r) {
                return entry.item;
            }
        }
        return default(T); //should only happen when there are no entries
    }
}

And here is one in JavaScript:

var WeightedRandomBag = function() {

    var entries = [];
    var accumulatedWeight = 0.0;

    this.addEntry = function(object, weight) {
        accumulatedWeight += weight;
        entries.push( { object: object, accumulatedWeight: accumulatedWeight });
    }

    this.getRandom = function() {
        var r = Math.random() * accumulatedWeight;
        return entries.find(function(entry) {
            return entry.accumulatedWeight >= r;
        }).object;
    }   
}

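Pros:

  • Can handle any weight ratios. You can have items with an astronomically small probability in the set if you so desire. The weights also don't need to add up to 100.
  • Can read the items and weights at runtime
  • Memory usage proportional to the number of items in the array

Cons:

  • Requires some more programming to get right
  • In the worst case, you might have to iterate the whole array (O(n) runtime complexity). So when you have a very large set of items and draw very often, it might become slow. A simple optimization is to put the most probable items first so the algorithm terminates early in most cases. A more complex optimization is to exploit the fact that the array is sorted and do a binary search. This only takes O(log n) time.
  • You need to build the list in memory before you can use it (although you can easily add items at runtime. Removing items could also be added, but that would require updating the accumulated weights of all items that come after the removed entry, which again has O(n) worst-case runtime)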
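
The binary-search optimization mentioned in the list above could be sketched roughly like this. It is a minimal illustration and not part of the original answer: pickIndex and the cumulative array are hypothetical names, and the random value r is passed in as a parameter so the behaviour is easy to check.

```javascript
// Sketch: find the first entry whose accumulated weight is >= r,
// using binary search over the ascending accumulated weights.
// Assumes `cumulative` is non-empty; all names are illustrative only.
function pickIndex(cumulative, r) {
    var lo = 0;
    var hi = cumulative.length - 1;
    while (lo < hi) {
        var mid = (lo + hi) >> 1;
        if (cumulative[mid] >= r) {
            hi = mid;       // mid could be the answer, keep it in range
        } else {
            lo = mid + 1;   // the answer must be to the right of mid
        }
    }
    return lo; // ends on the last entry if r exceeds every weight
}

// Accumulated weights for 10 Gold, Sword, Shield, Armor, Potion
var cumulative = [5, 25, 70, 90, 100];
var names = ["10 Gold", "Sword", "Shield", "Armor", "Potion"];
var r = Math.random() * cumulative[cumulative.length - 1];
console.log(names[pickIndex(cumulative, r)]);
```

Because the search always ends on a valid index, it also sidesteps the "return null" fallback of the linear loop.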

2
The C# code could be written using LINQ: return entries.FirstOrDefault(e => e.accumulatedWeight >= r). More importantly, there is a slight possibility that due to floating point precision loss this algorithm will return null if the random value gets just a tiny bit greater than the accumulated value. As a precaution, you might add a small value (say, 1.0) to the last element, but then you would have to explicitly state in your code that the list is final.
IMil

1
One small variant on this I've used personally, if you want the weight values in runtime to not be changed to the weight-plus-all-previous value, you can subtract the weight of each passed entry from your random value, stopping when the random value is less than the current items weight (or when subtracting the weight makes the value < 0)
Lunin
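
The subtraction variant described in this comment might look roughly like the sketch below. pickBySubtraction and the { object, weight } entry shape are assumptions made for illustration; each entry keeps its own weight rather than an accumulated one.

```javascript
// Sketch of the subtraction variant: entries keep their own weights,
// and the random value is reduced as each entry is passed.
// `pickBySubtraction` and the entry shape are illustrative assumptions.
function pickBySubtraction(entries, r) {
    for (var i = 0; i < entries.length; i++) {
        if (r < entries[i].weight) {
            return entries[i].object;
        }
        r -= entries[i].weight;
    }
    // Only reached if r >= total weight (e.g. rounding): use the last entry
    return entries.length > 0 ? entries[entries.length - 1].object : undefined;
}

var drops = [
    { object: "10 Gold", weight: 5 },
    { object: "Sword",   weight: 20 },
    { object: "Shield",  weight: 45 },
    { object: "Armor",   weight: 20 },
    { object: "Potion",  weight: 10 }
];
var total = drops.reduce(function (sum, e) { return sum + e.weight; }, 0);
console.log(pickBySubtraction(drops, Math.random() * total));
```

Since no accumulated values are stored, changing or removing one entry only requires recomputing the total.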

2
@BlueRaja-DannyPflughoeft premature optimization... the question was about selecting an object from an open loot box. Who is going to open 1000 boxes per second?
IMil

4
@IMil: No, the question is a general catch-all for selecting random weighted items. For lootboxes specifically, this answer is probably fine because there are a small number of items and the probabilities don't change (though, since those are usually done on a server, 1000/sec is not unrealistic for a popular game).
BlueRaja - Danny Pflughoeft

4
@opa then flag to close as a dupe. Is it really wrong to upvote a good answer just because the question has been asked before?
Baldrickk

27

Note: I created a C# library for this exact problem

The other solutions are fine if you only have a small number of items and your probabilities never change. However, with lots of items or changing probabilities (ex. removing items after selecting them), you'll want something more powerful.

Here are the two most common solutions (both of which are included in the above library)

Walker's Alias Method

A clever solution that's extremely fast (O(1)!) if your probabilities are constant. In essence, the algorithm creates a 2D dartboard ("alias table") out of your probabilities and throws a dart at it.

[Image: Dartboard]

There are plenty of articles online about how it works if you'd like to learn more.

The only issue is that if your probabilities change, you need to regenerate the alias table, which is slow. Thus, if you need to remove items after they're picked, this is not the solution for you.
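
For readers who want to see the idea in code, here is a rough sketch of Vose's variant of the alias method. It is not the code of the library mentioned above; buildAliasTable and sampleAlias are hypothetical names, and the optional random parameter exists only so the draw can be tested with fixed values.

```javascript
// Sketch of Vose's variant of the alias method. `buildAliasTable` and
// `sampleAlias` are hypothetical names, not the API of any library.
function buildAliasTable(weights) {
    var n = weights.length;
    var total = weights.reduce(function (a, b) { return a + b; }, 0);
    // Scale so that an "average" item has height exactly 1
    var scaled = weights.map(function (w) { return w * n / total; });
    var prob = new Array(n);
    var alias = new Array(n);
    var small = [];
    var large = [];
    scaled.forEach(function (p, i) { (p < 1 ? small : large).push(i); });

    while (small.length > 0 && large.length > 0) {
        var s = small.pop();
        var l = large.pop();
        prob[s] = scaled[s];   // column s keeps its own share...
        alias[s] = l;          // ...and the rest of the column goes to l
        scaled[l] = scaled[l] + scaled[s] - 1;
        (scaled[l] < 1 ? small : large).push(l);
    }
    // Leftovers are full columns (up to rounding error)
    large.concat(small).forEach(function (i) { prob[i] = 1; alias[i] = i; });
    return { prob: prob, alias: alias };
}

// O(1) draw: pick a column, then choose between the item and its alias
function sampleAlias(table, random) {
    random = random || Math.random;
    var column = Math.floor(random() * table.prob.length);
    return random() < table.prob[column] ? column : table.alias[column];
}

var table = buildAliasTable([5, 20, 45, 20, 10]);
console.log(sampleAlias(table)); // index into the item list
```

Building the table is O(n), which is why it has to be redone whenever a weight changes.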

Tree-based solution

The other common solution is to make an array where each item stores the sum of its probability and all the items before it. Then just generate a random number from [0,1) and do a binary search for where that number lands in the list.

This solution is very easy to code/understand, but making a selection is slower than Walker's Alias Method, and changing the probabilities is still O(n). We can improve it by turning the array into a binary-search tree, where each node keeps track of the sum-of-probabilities in all the items in its subtree. Then when we generate the number from [0,1), we can just walk down the tree to find the item it represents.

This gives us O(log n) to pick an item and to change the probabilities! This makes NextWithRemoval() extremely fast!
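
As a rough illustration of the O(log n) idea, the sketch below uses a Fenwick tree (binary indexed tree) in place of the balanced search tree described above; it is a different structure with the same bounds for picking an item and for changing a weight. WeightTree, setWeight, total and find are hypothetical names, and find assumes 0 <= r < total.

```javascript
// Sketch: a Fenwick (binary indexed) tree over item weights.
// Hypothetical helper, not the implementation used by any library.
function WeightTree(weights) {
    var n = weights.length;
    var tree = new Array(n + 1).fill(0);   // 1-based partial sums
    var current = new Array(n).fill(0);    // current weight of each item
    var total = 0;

    // Set the weight of item i in O(log n)
    this.setWeight = function (i, weight) {
        var delta = weight - current[i];
        current[i] = weight;
        total += delta;
        for (var k = i + 1; k <= n; k += k & -k) {
            tree[k] += delta;
        }
    };

    this.total = function () { return total; };

    // Find the item whose weight range contains r, in O(log n).
    // Assumes 0 <= r < total.
    this.find = function (r) {
        var pos = 0;
        var step = 1;
        while (step * 2 <= n) step *= 2;
        for (; step > 0; step >>= 1) {
            var next = pos + step;
            if (next <= n && tree[next] <= r) {
                pos = next;
                r -= tree[next];
            }
        }
        return pos; // 0-based index of the chosen item
    };

    for (var i = 0; i < n; i++) this.setWeight(i, weights[i]);
}

var tree = new WeightTree([5, 20, 45, 20, 10]);
var picked = tree.find(Math.random() * tree.total());
tree.setWeight(picked, 0); // "remove" the picked item so it cannot be drawn again
console.log(picked);
```

Setting a weight to zero acts as a removal here, so drawing without replacement stays O(log n) per draw.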

The results

Here are some quick benchmarks from the above library, comparing these two approaches

         WeightedRandomizer Benchmarks                  |    Tree    |    Table
-----------------------------------------------------------------------------------
Add()x10000 + NextWithReplacement()x10:                 |    4 ms    |      2 ms
Add()x10000 + NextWithReplacement()x10000:              |    7 ms    |      4 ms
Add()x10000 + NextWithReplacement()x100000:             |   35 ms    |     28 ms
( Add() + NextWithReplacement() )x10000 (interleaved)   |    8 ms    |   5403 ms
Add()x10000 + NextWithRemoval()x10000:                  |   10 ms    |   5948 ms

So as you can see, for the special case of static (non-changing) probabilities, Walker's Alias method is about 50-100% faster. But in the more dynamic cases, the tree is several orders of magnitude faster!


The tree-based solution also gives us a decent run-time (O(n log n)) when sorting items by weight.
Nathan Merrill

2
I'm skeptical of your results, but this is the correct answer. Not sure why this isn't the top answer, considering this is actually the canonical way to handle this problem.
whn

Which file contains the tree based solution? Second, your benchmark table: is Walker's Alias the "table" column?
Yakk

1
@Yakk: The code for the tree-based solution is here. It's built upon an open-source implementation of an AA-tree. And 'yes' to your second question.
BlueRaja - Danny Pflughoeft

1
The Walker part is pretty much just link-only.
Acccumulation

17

The Wheel of Fortune solution

You can use this method when the probabilities in your item pool have a rather large common denominator and you need to draw from it very often.

Create an array of options. But put each element into it multiple times, with the number of duplicates of each element proportional to its chance of appearing. For the example above, all elements have probabilities which are multiples of 5%, so you can create an array of 20 elements like this:

10 gold
sword
sword
sword
sword
shield
shield
shield
shield
shield
shield
shield
shield
shield
armor
armor
armor
armor
potion
potion

Then simply pick a random element of that list by generating one random integer between 0 and the length of the array - 1.

Disadvantages:

  • You need to build the array the first time you want to generate an item.
  • When one of your elements is supposed to have a very low probability, you end up with a really large array, which can require a lot of memory.

Advantages:

  • When you already have the array and want to draw from it multiple times, then it is very fast. Just one random integer and one array access.
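
A small sketch of how such an array could be built and drawn from. buildWheel and spin are names made up for this illustration, and the counts are the percentages from the question divided by their common step of 5%.

```javascript
// Sketch: build the "wheel" by repeating each item `count` times.
// `buildWheel` and `spin` are illustrative names only.
function buildWheel(slots) {
    var wheel = [];
    slots.forEach(function (slot) {
        for (var i = 0; i < slot.count; i++) {
            wheel.push(slot.item);
        }
    });
    return wheel;
}

// One random integer and one array access per draw
function spin(wheel) {
    return wheel[Math.floor(Math.random() * wheel.length)];
}

var wheel = buildWheel([
    { item: "10 gold", count: 1 },  //  5%
    { item: "sword",   count: 4 },  // 20%
    { item: "shield",  count: 9 },  // 45%
    { item: "armor",   count: 4 },  // 20%
    { item: "potion",  count: 2 }   // 10%
]);
console.log(wheel.length); // 20
console.log(spin(wheel));
```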

3
As a hybrid solution to avoid the second disadvantage, you can designate the last slot as "other," and handle it via other means, such as Philipp's array approach. Thus you might fill that last slot with an array offering a 99.9% chance of a potion, and just a 0.1% chance of an Epic Scepter of the Apocalypse. Such a two tiered approach leverages the advantages of both approaches.
Cort Ammon - Reinstate Monica

1
I use somewhat a variation of this in my own project. What I do is calculate each item & weight, and store those in an array, [('gold', 1),('sword',4),...], sum up all of the weights, and then roll a random number from 0 to the sum, then iterate the array and calculate where the random number lands (ie a reduce). Works fine for arrays that are updated often, and no major memory hog.

1
@Thebluefish That solution is described in my other answer "The Soft-coded Probabilities Solution"
Philipp

7

The Hard-coded Probabilities Solution

The simplest way to find a random item from a weighted collection is to traverse down a chain of if-else statements, where each if-else increases in probability, as the previous one did not hit.

int rand = random(100); //Random number between 1 and 100 (inclusive)
if(rand <= 5) //5% chance
{
    print("You found 10 gold!");
}
else if(rand <= 25) //20% chance
{
    print("You found a sword!");
}
else if(rand <= 70) //45% chance
{
    print("You found a shield!");
}
else if(rand <= 90) //20% chance
{
    print("You found armor!");
}
else //10% chance
{
    print("You found a potion!");
}

The reason each conditional's threshold equals its own chance plus the chances of all previous conditionals is that the previous conditionals have already eliminated the possibility of it being those items. So for the shield's conditional else if(rand <= 70), 70 is equal to the 45% chance of the shield, plus the 5% chance of the gold and the 20% chance of the sword.

Advantages:

  • Easy to program, because it requires no data structures.

Disadvantages:

  • Hard to maintain, because you need to maintain your drop-rates in your code. You can't determine them at runtime. So if you want something more future proof, you should check the other answers.

3
This would be really annoying to maintain. E.g. if you wish to remove gold, and make potion takes its spot, you need to adjust the probabilities of all items between them.
Alexander - Reinstate Monica

1
To avoid the issue that @Alexander mentions, you can instead subtract the current rate at each step, instead of adding it to each condition.
AlexanderJ93

2

In C# you could use a Linq scan to run your accumulator to check against a random number in the range 0 to 100.0f and .First() to get. So like one line of code.

So something like:

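var item = a.Select(x =>
{
    sum += x.prob;
    // items before the one the random number lands on map to null
    return rand < sum ? x.item : null;
}).FirstOrDefault(x => x != null); // skip the nulls, take the first hit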

sum is a zero initialized integer and a is a list of prob/item structs/tuples/instances. rand is a previously generated random number in the range.

This simply accumulates the sum over the list of ranges until it exceeds the previously selected random number, and returns either the item or null, where null would be returned if the random number range (e.g. 100) is less than the total weighting range by mistake, and the random number selected is outside the total weighting range.

However, you will notice that weights in OP closely match a normal distribution (Bell Curve). I think in general you will not want specific ranges, you will tend to want a distribution that tapers off either around a bell curve or just on a decreasing exponential curve (for example). In this case you could just use a mathematical formula to generate an index into an array of items, sorted in order of preferred probability. A good example is CDF in normal distribution

Also an example here.

Another example is that you could take a random value from 90 degrees to 180 degrees to get the lower right quadrant of a circle, take the x component using cos(r) and use that to index into a prioritized list.

With different formulae you could have a general approach where you just input a prioritized list of any length (e.g. N) and map the outcome of the formula (e.g. cos(x) is 0 to 1) by multiplication (e.g. N*cos(x) = 0 to N) to get the index.
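
One possible reading of the quarter-circle idea, as a sketch only: curveIndex is a hypothetical helper, the uniform value u is passed in so the mapping can be checked, and the absolute value and the flip (1 - x) are choices made here so that index 0 of the prioritized list ends up the most likely one.

```javascript
// Sketch: map a uniform value u in [0, 1) onto an index of a list of
// length n via a quarter circle. `curveIndex` is a hypothetical helper.
function curveIndex(n, u) {
    var angle = Math.PI / 2 + u * Math.PI / 2;   // 90 to 180 degrees
    var x = Math.abs(Math.cos(angle));           // 0 to 1, clustered near 1
    var index = Math.floor(n * (1 - x));         // flip so index 0 is most likely
    return Math.min(n - 1, Math.max(0, index));  // guard against rounding
}

var prioritized = ["shield", "sword", "armor", "potion", "10 gold"];
console.log(prioritized[curveIndex(prioritized.length, Math.random())]);
```

Note that the resulting probabilities come from the shape of the curve, so they will not match hand-picked percentages such as those in the question.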


3
Could you give us this line of code if it's just one line? I'm not as familiar with C# so I don't know what you mean.
HEGX64

@HEGX64 added but using mobile and editor not working. Can you edit?
Sentinel

4
Can you change this answer to explain the concept behind it, rather than a specific implementation in a specific language?
Raimund Krämer

@RaimundKrämer Erm, done?
Sentinel

Downvote without explanation = useless and antisocial.
WGroleau

1

Probabilities don’t need to be hard-coded. The items and the thresholds can be together in an array.

for X in itemsrange loop
  If items (X).threshold < random() then
     Announce (items(X).name)
     Exit loop
  End if
End loop

You do have to accumulate the thresholds still, but you can do it when creating a parameter file instead of coding it.


3
Could you elaborate on how to calculate the correct threshold? For example, if you have three items with 33% chance each, how would you build this table? Since a new random() is generated each time, the first would need 0.3333, the second would need 0.5 and the last would need 1.0. Or did I read the algorithm wrong?
pipe

You compute it the way others did in their answers. For equal probabilities of X items, the first threshold is 1/X, the second, 2/X, etc.
WGroleau

Doing that for 3 items in this algorithm would make the thresholds 1/3, 2/3 and 3/3 but the outcome probabilities 1/3, 4/9 and 2/9 for the first, second and third item. Do you really mean to have the call to random() in the loop?
pipe

No, that's definitely a bug. Each check needs the same random number.
WGroleau
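
A sketch of the corrected loop discussed in these comments: the random number is drawn once, before the loop, and an item is chosen when that number falls below the item's accumulated threshold. pickByThreshold and the { name, threshold } shape are assumptions made for illustration.

```javascript
// Sketch of the corrected loop: use ONE random number and compare it
// against each accumulated threshold in turn.
// `pickByThreshold` and the item shape are illustrative assumptions.
function pickByThreshold(items, r) {
    for (var i = 0; i < items.length; i++) {
        if (r < items[i].threshold) {
            return items[i].name;
        }
    }
    return undefined; // r was not below any threshold
}

// Three equally likely items: thresholds 1/3, 2/3 and 3/3
var items = [
    { name: "A", threshold: 1 / 3 },
    { name: "B", threshold: 2 / 3 },
    { name: "C", threshold: 1 }
];
console.log(pickByThreshold(items, Math.random())); // Math.random() is called once
```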

0

I wrote a function for this (https://github.com/thewheelmaker/GDscript_Weighted_Random). In your case you can use it like this:

on_normal_case([5,20,45,20,10],0)

It returns just a number from 0 to 4, but you can use it as an index into the array that holds your items.

item_array[on_normal_case([5,20,45,20,10],0)]

Or in function:

item_function(on_normal_case([5,20,45,20,10],0))

Here is the code. I wrote it in GDScript, but you can adapt it to another language; also check it for logic errors:

func on_normal_case(arrayy,transformm):
    var random_num=0
    var sum=0
    var summatut=0
    #func sumarrays_inarray(array):
    for i in range(arrayy.size()):
        sum=sum+arrayy[i]
#func no_fixu_random_num(here_range,start_from):
    random_num=randi()%sum+1
#Randomies be pressed down
#first start from zero
    if 0<=random_num and random_num<=arrayy[0]:
        #print(random_num)
        #print(array[0])
        return 0+ transformm
    summatut=summatut+arrayy[0]
    for i in range(arrayy.size()-1):
        #they must pluss together
        #if array[i]<=random_num and random_num<array[i+1]:
        if summatut<random_num and random_num<=summatut+arrayy[i+1]:
            #return i+1+transform
            #print(random_num)
            #print(summatut)
            return i+1+ transformm

        summatut=summatut+arrayy[i+1]
    pass

It works like this: on_normal_case([50,50],0) returns 0 or 1, each with the same probability.

on_normal_case([50,50],1) returns 1 or 2, each with the same probability.

on_normal_case([20,80],1) returns 1 or 2, with a bigger chance of getting 2.

on_normal_case([20,80,20,20,30],1) returns a random number in the range 1-5, where each number's likelihood follows its weight (so 2 is the most likely).

on_normal_case([20,80,0,0,20,20,30,0,0,0,0,33],45) rolls among the numbers 45, 46, 49, 50, 51 and 56; as you can see, an entry with weight zero never occurs.

So the function returns a single random number that depends on the length of the arrayy array and on the transformm number. The ints in the array are the probability weights with which a number may occur, where that number is its position in the array plus the transformm number.

Licensed under cc by-sa 3.0 with attribution required.