I searched for some resampling algorithms here and didn't find one similar enough to the one I'm going to post here. I don't think it's the fastest but it is simple enough to give it a shot.
Anyway, I will show first the "basic" version, then a slight modification to make the "upgraded" one. I haven't benchmarked it but I think the "basic" is O(N logN) and the "upgraded" is O(N). I could be wrong though.
It can be seen that the slow part is
Leaving the rest almost untouched, making the algorithm faster (because the array is already sorted). One problem is that it doesn't follow the premise of "random" sampling of particles. Nonetheless, for N big it's not a problem I think, keeping the resampling fairly accurate.
Finally, there are other parts that can be optimized. For example tweaking the code a bit, it isn't necessary to normalize w. But that's for another discussion :)
This question is marked "community wiki".