3.20 resampling implementation theory?

3
2

How does this(sebastian's code of sampling wheel) way of implementing really sample the particles according to the weights?
I just don't understand that....
please help...

asked 08 Mar '12, 17:31

Shailendra%20Rajput's gravatar image

Shailendra R...
8224
accept rate: 0%

edited 08 Mar '12, 20:04

jimgb-2's gravatar image

jimgb-2
7.3k3077148


6 Answers:

I think Alberto and Ralf are correct in why there is a "2" in

2 * max(w)

but maybe the explanation should be fuller so no steps are missed and the understanding is deepened. Treat each of the following thoughts individually. All these thoughts will come together at the end.

1) The "2" actually is in a more complex expression:

<increase of beta> = random.random() * 2 * max(w)

2) random.random() gives fractions that span from zero to one. The average of random.random() is 1/2

3) The smallest max(w) occurs when all weights are equal. When all weights are equal the distribution of weights is called a uniform distribution.

Here's how you can demonstrate to yourself that a uniform distribution has the smallest max(w). Draw a picture of a uniform distribution. If you reduce any one weight, one or more other weights must increase to keep the total weight the same. max(w) is supposed to be the largest of the weights. Since all weights were the same when you started, any weight could have been chosen to be max(w) but now you caused some other weight in the distribution to get even bigger. To reduce that new maximum weight you have to let all the other weights even out again (let them return to a uniform distribution). When all weights are the same the largest weight once more is the smallest it can be across all possible distributions.

4) We can't predict what the distribution of weights will be.

5) We'd like all particles to have a chance of being chosen. If you never visit a particle it cannot be chosen. We want to assure that we can most likely visit each particle at least once.

6) We have chosen to represent weights as slices of a pie.

7) There are N original particles and N original weights. There are N pie slices. We are going to take N random samples.

Let's put these seven thoughts together.

  • We have chosen to represent weights as slices of a pie.
  • There are N original particles and N original weights. There are N pie slices. We are going to take N random samples from these N pie slices.
  • We'd like all particles to have a chance of being chosen. If we never visit a particle it cannot be chosen. We want to assure that we can most likely visit each particle at least once.
  • We can't predict how large the pie slices will be (we can't predict what the distribution of weights will be). We don't know how far to advance with each step, on average, around the circumference of the pie so all particles can be evaluated for selection. If we advance by too small an amount we'll never make it fully around the pie and we'll miss visiting some particles.
  • How far should we advance? The worst case is when all the pie slices are the same size (uniform distribution). Remember, there are N pie slices and N evaluations. To visit each pie slice we have to advance by (at least) the size of a uniform pie slice each time. Stated another way, we have to advance on average by max(w) with each step. On average we could take a larger step but could not take a smaller step. When the distribution is not uniform, max(w) is bigger than the other pie slices so we certainly won't be taking steps that are too small.
  • We're taking N steps. In each step, max(w) is scaled by a random number between zero and one (on average, each step is scaled by 1/2). We need to average advancement by a full max(w), not by half of max(w) , so we multiply by 2.
link

answered 08 Mar '12, 21:21

melr's gravatar image

melr
3.6k193041

edited 10 Mar '12, 10:49

Here's another way to look at it.

The weights divide up the wheel into N parts, each of which has a size proportional to the weight. If you picked N uniformly distributed random points on the edge of the wheel, each point would select a wedge with a probability proportional to the size (or weight), which is exactly what we want. The problem is that, to figure out which wedge is picked, you'd have to know the sum of the weights before it -- this is what we're trying to avoid.

Instead, we could look at the sequence of weights as a repeating signal - a waveform, like a radio signal - and sample it sufficiently often. The highest frequency in the signal corresponds to the largest weight, so the Nyquist sampling theorem says we only have to sample up to twice that to faithfully represent the original signal. Taking uniformly distributed samples at intervals that are always smaller than the critical frequency means we're sampling the signal faithfully, which means our output distribution is proportional to the signal.

link

answered 08 Mar '12, 23:45

Scott%20Brickner's gravatar image

Scott Brickner
8225

Very interesting interpretation!! Congratulations!!

I like it a lot so I upvote this answer, although I know this will be difficult to understand to most of the people. It is required some theoretical knowledge about "Theory of communication" or about "Discrete signals and systems".

In fact, I'm a Telecom Engineer who has passed with good marks subjects like "Linear Systems" (discrete and continuous, the Fourier Transform, the Laplace Transform, and so on), "Communication Systems", "Statistics" (covering stochastic processes, probability distributions and so on), "Discrete signal processing", etc. And even with that knowledge I find difficult to decide if this interpretation is very suitable here. I mean, you have a voice signal and make it pass through a Low-Pass filter of about 4 KHz... Then you sample at a rate of 8 KHz (8000 samples per second) and you have a perfect representation of the original continuous signal. However, in ths case we don't have a "signal" (a variable that has quite precise values, changing over time) but a random process or stochastic process. I see the relation between frequencies (of signals) and probabilities (which are often seen as "relative frequencies"), even the w "for weight" look a lot like "lowercase omegas " but they are not exactly the same, so I have to think further about it. It could even have a connection to Quantum Mechanics and the duality Wave-Particle that is present in electromagnetic waves.

(09 Mar '12, 01:32) Alberto Cid Alberto%20Cid's gravatar image

This wheel implementation is just an optimization. It saves on computer time. If you already understand that some particles should be represented more than others based on their weights, you are getting the idea. If you know how to randomly resample particles based on their weights by some other method, then you've learned what you need to know about resampling. For you the wheel implementation may be saving on computer time while making you feel you are wasting your own time.

The wheel implementation is supposed to take less computer time than if you directly try to figure out which particle is represented by a random number from zero to one. The random number basically tells you what percentage of the way through the sum of all the weights you should go to locate the selected particle. Since weights for particles are not the same, for the most basic implementation you have to add particle weights until you get to the random percentage you've been tasked to find. Do this over and over again so particle filters can converge on a winning particle and it will consume computer time. Repeat the process numerous times through a journey and even more computer time gets consumed. Optimizing what you do often does add up to savings.

You may want to look at this detailed discussion among the answers here, and this discussion at another question for additional details.

link

answered 08 Mar '12, 18:04

melr's gravatar image

melr
3.6k193041

edited 09 Mar '12, 01:38

I don't understand where the constant 2 in 2 * max(w) comes from in the implementation. Could it be 1 or any number bigger than 2?

link

answered 08 Mar '12, 18:53

Takanzi's gravatar image

Takanzi
6.2k82463

1

I think that 2 is the smallest number that guarantees a full trip around the wheel.

Imagine it is 1... then, if all the weights are equal wmax would be 1/N and on average the random number Beta would be (1/N)/2 (the mean). After N resamplings, on average you are at 1/2 ... So, you lost half of the wheel !!! (half of the particles tipically would be ignored)

Now imagine it is 2 ... When Wmax is minimum you do a complete trip around the wheel on average.

Now imagine it is > 2 ... It's ok, but the algorithm becomes slower. (you would do more iterations to get the index in every step)

(08 Mar '12, 19:13) Alberto Cid Alberto%20Cid's gravatar image

I asked myself the same and I believe, that 2 * max(w) is a number that gives you a good chance to move at least once all around the wheel. But I don't think that any number can guarantee this, since a random number could theoretically always be close to zero.

So if you take 3 or 4 times max(w) I guess it would not make a big difference.

(08 Mar '12, 19:35) Ralf Knothe-1 Ralf%20Knothe-1's gravatar image

All right, there's no guarantee, I should have used other words to express the idea. Maybe: "the smallest number that makes the average at least equal to a full trip around the wheel"

4 * w_max works ok, but it uses a double ammount of time (on average) than 2 * w_max

(08 Mar '12, 22:17) Alberto Cid Alberto%20Cid's gravatar image

Yes, the Roulette Wheel could just be divided into equal segments...let's say 25 segments that are 15 degrees apart....so, you would have equal weighting and probability.
Or, you could just draw a random direction and then subdivide the wheel into equal increments from the random point.
The Resampling Roulette Wheel just implies a probability weight distribution; there are several ways of implementing it.

link

answered 08 Mar '12, 17:40

J%20Hal%20Grossman's gravatar image

J Hal Grossman
9751620

I would think, if you want to minimize the number of value tested, but still give every particle on the wheel an statically equal chance then you should pick a step value so that your total re-sampling picks would makes one trip around the wheel on average.

Wsum = sum(w) # Sum of Wheel Values

N = #number of re-samples needed

StepAvg = Wsum / N #Average step value needed to make 1 loop around the wheel

Smax = 2 * StepAvg #The average random result would be half the the max value. Use Smax in place of 2*Wmax

.

If you would like to use a multiple number of loops around wheel then:

Smax = 2 * StepAvg * NumberOfWheelLoopsWanted.

(Two loops around the wheel would probability be a good balance between speed and a fair distribution)

link

answered 10 Mar '12, 12:17

Glenn%20Hembree's gravatar image

Glenn Hembree
142

Your answer
Question text:

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "Title")
  • image?![alt text](/path/img.jpg "Title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Tags:

×5,185
×89
×4

Asked: 08 Mar '12, 17:31

Seen: 306 times

Last updated: 10 Mar '12, 12:17