Mindblowingly Boring Star Map

A case study in procedural generation

Krisztián Pintér, 2018
pinterkr@gmail.com


In this paper, we discuss a solution to a somewhat contrived problem which nevertheless, with some enhancements, could be the basis of a real game world, so it is not entirely useless. Here, however, we use the problem to showcase a particular solution which we believe is quite clever, and which you might learn from.

First, we give a description of the problem. Then we need to explore three concepts, two of them quite general, before we finally put all the parts together, so be patient.

At the end, we will top the meal up with two light desserts: a neat optimization, and some general insights into the role of procedural generation. So it is worth sticking around. Let's get started.

The Problem

Imagine a huge grid, for example 2^64✕2^64. We can go higher with this algorithm, but this is big enough to defeat most naive approaches. Place 2^64 stars pseudo-randomly in this grid, with uniform distribution. So a star goes into any one of the 2^128 grid cells at random; more than one can go into a cell (although with these numbers it will practically never happen, we allow it only for simplicity), and cells of course can be empty. The number, again, is arbitrary; we can handle many more stars. With these settings, most cells will be empty, since we have vastly more of them than stars. Thus the grid really will resemble a star map, with stars scattered around.

Develop an algorithm that can give you the number of stars in any given rectangular region. Make it so that we get our result in a reasonable timeframe, without sending all our money to the cloud. The reason for this requirement is to be able to zoom out quickly, that is, to draw large portions of the world with millions or billions of stars behind each pixel.

Random Access Random

The first piece of the puzzle is not too difficult, and is probably already used, perhaps unknowingly, in many places where procedural generation is required. If we want a huge world, the information content is huge, but also largely independent from part to part. The information is spread over the world, and when you want to deal with one part, it is not affected by other parts. For example, in a top down exploration game, you only care about the land visible on the screen. You want to find out whether a cell is water or land, grass or desert, has some ore or not, and so on. The characteristics of an area far away are of no concern.

If we hand crafted the map, we could just store it in an array. An array is a structure from which you can easily acquire the data for any given coordinates, and it is quick even if the map is enormous. If we want an algorithm to create this world, it is a perfectly valid approach to use a PRNG to fill this array, and store it the same way. But if you decide that you either don't want to or simply are not able to store it, you will need to create it part by part. And herein lies the problem. Most PRNGs work by giving you values in sequence. There might be subtleties to this, but this is the gist of it:

R0 = PRNG(seed)
R1 = PRNG(R0)
R2 = PRNG(R1)

This might be problematic if you want to access R545645814359. It does not matter how fast the algorithm is; for sufficiently big worlds, it will be prohibitively slow. Sometimes these generators have a "fast forward" option, but usually only in some very limited way. Luckily, we have PRNGs that work differently. They are used like this:

R0 = PRNG(seed, 0)
R1 = PRNG(seed, 1)
R2 = PRNG(seed, 2)

Such generators tend to be slower, and sometimes give you a block of data at a time, so Ri is not an integer but, for example, 64 bytes. Slower maybe, but at least they are usable for our purposes, unlike the sequential ones. One such PRNG is the cryptographic primitive ChaCha20 (or its predecessor Salsa20). A faster but smaller one is SplitMix. The former takes a seed and an index, 384 bits total, and generates 512 bits of random data. The latter takes a seed and an index, both 64 bits, and generates 64 bits.
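For illustration, here is a minimal Python sketch of an indexed SplitMix-style generator. The constants are the standard SplitMix64 ones; the indexed framing (deriving the state for any index directly instead of stepping) is our own, so treat this as a sketch rather than a vetted implementation:

MASK = (1 << 64) - 1  # emulate 64-bit wrapping arithmetic

def splitmix64(seed, index):
    # derive the state for this index directly, no sequential stepping needed
    z = (seed + (index + 1) * 0x9E3779B97F4A7C15) & MASK
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK
    return z ^ (z >> 31)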

With such tools in hand, we can create almost infinite Virtual ROMs. The PRNG function acts as a READ function for this Virtual ROM, with a certain cost to "retrieve" one block. You can even use the blocky nature to optimize performance, by trying to put attributes that are usually accessed together in the same block. The seed acts as a selector for different ROMs. So basically you have a Virtual ROM Jukebox, and you can pick any of the ROMs by specifying its seed.
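In code, the whole jukebox can be a thin wrapper around the generator sketched above (the class name and interface are purely illustrative):

class VirtualROM:
    def __init__(self, seed):
        self.seed = seed  # which ROM of the jukebox this is

    def read(self, index):
        # every read is pure: the same seed and index always give the same word
        return splitmix64(self.seed, index)

rom = VirtualROM(seed=42)
word = rom.read(545645814359)  # true random access, no fast-forwarding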

As a side note, when we say random access PRNGs are slow, we mean relative to sequential generators. In absolute terms, they are pretty fast. In particular, for this purpose, a reduced version of ChaCha20 can be used, called ChaCha20/8. After some optimization, it can give you a whopping 2GB/s, or approximately 30ns per block. SplitMix uses three 64-bit integer multiplications and a handful of xors and shifts. For some applications this is still too slow, but for others it is fine.

Uniform Encoding

If you use the Virtual ROM described in the previous section to encode your game world, you will run into an obvious problem: not all data leads to a sensible world. Consider the example we used earlier, a 2D adventure game. If all blocks are totally random, you will have a desert block right next to a water block, right next to a bedroom, and so on. This is not particularly useful.

The solution is not very difficult conceptually, but the devil is always in the details. Let's start with a very simple example: say we want every cell to have some ore with 7% probability. With a handmade map, you just set up a boolean field HasOre, and set 7% of the cells to True. However, the Virtual ROM can only give uniformly random values, which would mean a 50% ore probability. A possible solution is what most readers have probably already figured out: use an entire byte, and if the value is smaller than 18, put an ore there. That is not precisely 7%, but who cares, really. If you do care, you can get arbitrarily close by choosing a larger number, say 16 or 32 bits. Theoretically, the minimum number of bits needed to resolve this probability is about 3.8 (that is, −log₂ 0.07), so even 8, and especially 16, bits is a waste. That's a compromise we can live with though; being a little wasteful with our randomness is usually acceptable.

Another solution could be to split the map into smaller regions, say 16✕16 (a total of 256 cells), and have a fixed list of 18 coordinates where the ores are. One cell can be addressed with two coordinates of 4 bits each, thus 18 bytes describe all the ore locations. There is a chance that two or more locations will be identical, which changes the probability a little. But again, this is something we can probably live with.
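A sketch of the byte-threshold variant, on top of the Virtual ROM from earlier (the cell-to-index mapping and the function name are our choices; the 18/256 threshold is from the text):

def has_ore(rom, x, y, width):
    # one byte of randomness per cell; cells are indexed row by row
    word = rom.read(y * width + x)
    return (word & 0xFF) < 18  # 18/256 is roughly 7%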

As another example, we might want a parameter to vary with a gaussian distribution, rather than uniformly in a range. All we need to do is take the uniform random value and feed it to the inverse of the cumulative distribution function, in our case √2 erf⁻¹(2x−1), where erf is the so-called error function, which you can approximate. If your x's are uniformly distributed on the (0, 1) interval, the outputs will be normally distributed around 0. Getting random numbers in the (0, 1) range with uniform distribution is easy: we just take any b-bit integer and divide by 2^b, which is the same as treating the b-bit number as a fixed point real with b fractional bits.
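A sketch using Python's standard library, whose statistics.NormalDist.inv_cdf is exactly this inverse CDF; taking 53 bits (a float's mantissa) and the guard against an exact zero are our choices:

from statistics import NormalDist

def gaussian_at(rom, index, mean=0.0, stddev=1.0):
    b = 53                                        # bits that fit a float exactly
    u = (rom.read(index) >> (64 - b)) / (1 << b)  # uniform in [0, 1)
    u = max(u, 2.0 ** -b)                         # inv_cdf rejects exactly 0
    return NormalDist(mean, stddev).inv_cdf(u)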

As the world gets more complex, the algorithm gets more complex too. On a deeper level, this is to be expected. Since you start with noise that has no structure at all, and you want to end up with something that has structure, you can expect the algorithm itself to somehow represent that structure. To put it in layman's terms, your algorithm is a description of what a valid world looks like. Imagine you need to describe a world to a friend: a simple world is easily described, a more complex world takes more verbosity. After a while, you pick up this point of view: whatever features there are in your world, there has to be a corresponding element in the algorithm. Every distribution, categorization or structure needs to be represented in the code.

If you are familiar with Generative Adversarial Networks, the concept is quite similar. In a GAN, one neural net is asked to generate, say, an image of some object from random noise, so that another network (or a human) can't tell whether it is real-world input or generated by the algorithm. Such a network needs to learn, in a way, what defines the object in question, so it can generate instances that match the definition. GANs are part of machine learning, while we usually code our games manually, but the task is the same. We would not be too surprised to learn that GANs are already used in world generators, existing or in development.

[Illustration: transformations from uniform to structured random]

So to conclude this section, your goal is to find an encoding that can take any uniformly random input and transform it into something reasonable. It is okay for the algorithm to be a little wasteful with the randomness. It is also okay if your algorithm very rarely produces weird results, as long as they are not game breaking, or especially program breaking. You can always add safeguards to the algorithm to cover unwanted cases, but sometimes it is also acceptable to just ignore some possibilities. A failure occurring with probability 2^-128 is not going to happen, unless the user can somehow force it.

Detour

What we have covered so far is rather general, applying to many kinds of procedural generation. The next concept is more specific to our problem; in particular, it is specific to truly huge worlds that you want to look at at aggregated levels, that is, as some average or sum over larger areas. If you have a large amount of fine grained information, enumerating it would take a prohibitively long time, easily more than the expected lifetime of the universe. Even for much smaller datasets, we often use auxiliary structures like indexes or BSP trees to make finding or aggregating faster. But with datasets of the size we have here, neither calculating nor storing these structures is even remotely possible.

But before we advance, we are going to take a detour and solve another problem that might look equally or even more complex, but is in fact relatively easy. Let's draw a maze! There are many ways to generate random mazes; we are going to use a very basic one. The algorithm goes like this: first take a rectangle. Then divide it with a vertical line at a random position, and put one hole in the line, also at a random position. So one division requires a pair of numbers: one telling where the line is, the other telling where the hole is. The line divides the rectangle into two smaller rectangles. Recursively do the same splitting on the two child rectangles, but this time in the perpendicular direction, in an alternating fashion. The recursion ends when a rectangle is only one unit wide/tall.

[Illustration: first three steps of splitting a rectangle to eventually get a maze]

You can visualize the required data as a ragged binary tree, in which every split is followed by two other splits. It is ragged because we hit size one at different points, depending on previous choices. We can easily get rid of the raggedness by simply putting zeros in the empty positions. Actually the value does not matter, since we stop traversing the tree once we hit size one; further nodes on that branch are ignored. The numbers at each node need to be in the proper range, so they are not uniformly distributed. This is also easy to fix, because we can take a larger uniform number and scale it as needed. The nice thing about a balanced binary tree with normalized values is that it can be represented as an array in our Virtual ROM.

[Illustration: binary tree showing splits as nodes, each split being a pair of numbers]

With all this at hand, we can now develop an efficient algorithm to discover whether a given segment is a wall or a hole. We start by observing that with every split, we can throw away at least one side. If the segment we are looking for is on the divider line, we are done; we just need to check whether it is the hole or not. If the segment is not on the line, it is on one side of it, and the other side can be ignored. So we traverse the tree downward, always taking the appropriate direction, keeping track of the current rectangle in absolute coordinates, descending until we find the split line that contains the required segment. As binary trees go, the time is proportional to the logarithm of the size. This enables us to make enormous labyrinths. With some added cleverness, we can even draw the right path with a red line.
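A sketch of the descent, for vertical wall segments only (a horizontal query is symmetric). The conventions are ours: cells span [0, size)✕[0, size), each tree node consumes two words of the Virtual ROM, node i has children 2i+1 and 2i+2, and when a rectangle is too thin to split in the preferred direction we split in the other one:

def maze_wall_is_solid(rom, size, qx, qy):
    # Is the vertical wall segment between cells (qx-1, qy) and (qx, qy) solid?
    # Assumes 0 < qx < size and 0 <= qy < size; the outer border is always solid.
    x0, y0, x1, y1 = 0, 0, size, size   # current rectangle
    node = 0                            # implicit balanced tree
    vertical = True                     # preferred split direction
    while True:
        can_v, can_h = x1 - x0 >= 2, y1 - y0 >= 2
        use_v = vertical if (can_v and can_h) else can_v
        a = rom.read(2 * node)          # two random words per node:
        b = rom.read(2 * node + 1)      # line position and hole position
        if use_v:
            line = x0 + 1 + a % (x1 - x0 - 1)   # wall column, scaled into range
            hole = y0 + b % (y1 - y0)           # hole row on that wall
            if line == qx:
                return qy != hole       # our segment: solid unless it is the hole
            if qx < line:
                node, x1 = 2 * node + 1, line
            else:
                node, x0 = 2 * node + 2, line
        else:
            line = y0 + 1 + a % (y1 - y0 - 1)   # horizontal wall row; its hole
            if qy < line:                       # is irrelevant for this query
                node, y1 = 2 * node + 1, line
            else:
                node, y0 = 2 * node + 2, line
        vertical = not use_v            # next level splits the other way

Since qx stays strictly inside the current rectangle on every step, the rectangle always remains at least two cells wide, and the loop terminates when a vertical split lands exactly on the queried column.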

Role Reversal

The labyrinth problem was easy because the generating algorithm itself served as a search tree, namely a BSP tree. As we generated the maze, we could simply skip large portions of the generation based on our location of interest. On the other hand, the naive way to generate a number of random stars is either to store a large matrix of bits representing whether a certain cell holds a star or is empty, or to store a list of coordinates. With the matrix approach, it is easy to draw only a portion of the world, since the array can be accessed directly for every coordinate. Summing over large areas is slow though. The list approach is even worse: you can't even draw a portion without enumerating every star. We need some other way.

Let's start by describing a method for the smaller case, when we can enumerate the stars in reasonable time and can store accelerator structures. Let's pick a simple method: store additional grids at lower resolutions. Say the entire grid is 65536✕65536. We group every 2✕2 tiles and store the resulting 32768✕32768 grid of sums. Then we do the same with that grid, so in the next layer we have 16384✕16384 cells, each of them covering 4✕4 of the original grid. And so on, until we get to the top. With such a structure, we can approximate any rectangle by first covering it with the largest stored blocks that fit inside, then adding blocks from the next lower layer, and continuing until we think we are close enough.
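A sketch of building such a pyramid for the small, in-memory case (plain Python lists for clarity; a real renderer would use something faster):

def build_pyramid(grid):
    # grid: 2^k x 2^k list of lists of star counts; returns all coarser levels
    levels = [grid]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        n = len(prev) // 2
        levels.append([[prev[2*i][2*j] + prev[2*i][2*j+1] +
                        prev[2*i+1][2*j] + prev[2*i+1][2*j+1]
                        for j in range(n)] for i in range(n)])
    return levels  # levels[0] is the full grid, levels[-1] the 1x1 total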

However, this storage cannot just be chosen at random. First, the numbers get larger as we go up. On the bottom layer, cells will typically have the value 0, 1 or some other small number. As we go up, the numbers get bigger and bigger. Second, and more importantly, the numbers need to add up. If we have the value 123 in one block, the four subblocks it is composed of must sum to 123, say 23, 25, 40, 35. And the same must hold on the bottom level: the aggregated cells must match the actual star positions. The first problem is easier to solve, because we know the distribution of numbers on each level, and as we have seen, we have techniques to go from a uniform distribution to anything we want. The summing problem, on the other hand, is pretty bad.

All right, enough of the problems, let's talk solutions! The proposal is this: reverse the roles of the actual data and the accelerator constructs. Let the accelerator construct be the data, and let it spit out the fine grained data at the end. With the structure described above, it would look like this: we already have the total, or we just pick one. Then we split the total into 2✕2 quarters. Then we split each quarter into 2✕2 again, and so on. When we split, we need to respect the probability distribution, in this case the binomial distribution; otherwise the result will be blocky and uneven. We also want the four quarters to add up to the total. This is easily doable. For example, we take three random numbers in the range (0, 1), use the first to split the block vertically, and the two others to split the halves horizontally. For each split, we calculate one half according to the binomial distribution, and the other half receives the rest.
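A sketch of one such split. Exact binomial inversion works for small counts; for astronomic counts we substitute the usual normal approximation of Binomial(n, 1/2), which is our choice here, not the only option, and it quietly loses precision once counts exceed what a float can represent exactly:

from statistics import NormalDist

def binomial_half(n, u):
    # how many of n stars fall into one half (p = 1/2), driven by uniform u in (0,1)
    if n == 0:
        return 0
    z = NormalDist().inv_cdf(u)             # standard normal deviate
    k = round(n / 2 + z * (n ** 0.5) / 2)   # mean n/2, stddev sqrt(n)/2
    return min(max(k, 0), n)                # clamp into the valid range

def split_quarters(total, u1, u2, u3):
    # split a block's star count into four quarters that add up exactly
    left = binomial_half(total, u1)          # vertical cut first
    tl = binomial_half(left, u2)             # left half cut horizontally
    tr = binomial_half(total - left, u3)     # right half cut horizontally
    return tl, tr, left - tl, total - left - tr   # (tl, tr, bl, br)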

We can represent the required numbers as a 4-way tree of 3-tuples. Unlike the maze tree, this tree is naturally balanced. It is possible to hit zero at some point, after which all the sub-regions will be zero, but this is not an exception here; the algorithm handles it just fine. 4-way trees, like binary trees, translate nicely to arrays.

So there you have it. We represented the star field with a random array. We can easily calculate any rectangle of the form (I✕2^R, J✕2^R) to ((I+1)✕2^R−1, (J+1)✕2^R−1), that is, a square of size 2^R located on a 2^R boundary. To get one of these squares, we traverse the tree from the root, taking (64−R) steps, always in the right direction. Since the number of steps is logarithmic in the map size, we can work with enormous maps.
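A sketch of the descent, using split_quarters from above. The node-to-index mapping for the implicit 4-way tree (children of node n are 4n+1..4n+4, three ROM words per node) and the fixed point scaling are our choices:

def unit_interval(word):
    # treat a 64-bit word as a fixed point fraction strictly inside (0, 1)
    return (word + 0.5) / 2.0 ** 64

def count_square(rom, total, R, i, j, world_log2=64):
    # number of stars in the 2^R x 2^R square at (i * 2^R, j * 2^R);
    # i is the horizontal index, j the vertical one
    count, node = total, 0
    for level in range(world_log2 - R):
        shift = world_log2 - R - 1 - level
        qx, qy = (i >> shift) & 1, (j >> shift) & 1    # which quarter next
        u1, u2, u3 = (unit_interval(rom.read(3 * node + k)) for k in range(3))
        quarters = split_quarters(count, u1, u2, u3)   # (tl, tr, bl, br)
        count = quarters[qy * 2 + qx]                  # keep the right quarter
        node = 4 * node + 1 + (qy * 2 + qx)            # descend in the 4-way tree
    return count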

[Illustration: splitting the big square into quarters based on the random ROM]

Grid

In this section we introduce an algorithm that calculates a region of the entire map, that is, a grid with a certain origin, width, height and cell size. We already have an algorithm to calculate a single square. But it might take too much time to calculate the thousands or millions of squares needed to fill a screen one by one. This method brings rendering into the realm of the practical.

The algorithm is actually pretty simple; we could call it "zoom and crop". We start with the whole area. We do one level of splitting, doubling the current image resolution, and then discard the squares that fall outside the zone we need. We repeat these steps until we are satisfied with the accuracy. This algorithm will always give us squares of size 2^R on 2^R boundaries, so if the grid is not nicely aligned, we will need some cheating. We can act as if all the stars of a block were sitting at its center, and count them toward the corresponding pixel. Or we can split them proportionally to the overlapping areas. We can increase the accuracy by going a few levels deeper. But we can also just align our grid to the boundaries provided by the algorithm.
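A sketch of the loop, keeping the live blocks in a dictionary keyed by block coordinates. The names and the stopping criterion are illustrative; split_quarters, unit_interval and the ROM indexing are as before, and the requested region (rx0, ry0)..(rx1, ry1) is given in world cell coordinates:

def zoom_and_crop(rom, total, world_log2, rx0, ry0, rx1, ry1, target_log2):
    # block (i, j) on the current level covers a square of size
    # 2^(world_log2 - level) at world position (i, j) times that size
    blocks = {(0, 0): (0, total)}           # (i, j) -> (tree node, star count)
    for level in range(world_log2 - target_log2):
        size = 2 ** (world_log2 - level - 1)        # child block size
        nxt = {}
        for (i, j), (node, count) in blocks.items():
            u1, u2, u3 = (unit_interval(rom.read(3 * node + k)) for k in range(3))
            for q, c in enumerate(split_quarters(count, u1, u2, u3)):
                qx, qy = q & 1, q >> 1
                ci, cj = 2 * i + qx, 2 * j + qy
                # crop: keep only children overlapping the requested region
                if ci * size < rx1 and (ci + 1) * size > rx0 and \
                   cj * size < ry1 and (cj + 1) * size > ry0:
                    nxt[(ci, cj)] = (4 * node + 1 + q, c)
        blocks = nxt
    return blocks   # counts for every 2^target_log2-sized block touching the region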

[Illustration: zoom and crop; outside blocks are ignored for the rest of the process]

Let's observe that we can preallocate the final grid and work inside it. As can be seen in the above image, the resolution increases monotonically with each iteration. We start with 1✕1, then advance to 2✕2, then 4✕3 and 7✕5. On top of that, we can always find an order in which to split cells and place their results in their final positions without overwriting cells we still need. For example, we can make sure that every intermediate image is located in the top left corner of the array, and process each round starting from the bottom right corner.

With this algorithm, it is pretty straightforward to implement your supersized star map in an actual video game, zoom feature included. In the next section, we will explain why you shouldn't.

Lowering Expectations

We're very sorry for not providing you with the software itself, or any images generated with it. That is a longer story. Instead, we will just tell you what to expect. It will be a disappointment in many senses, but there is an important lesson to learn here.

We will describe the results with enough detail that you can imagine them. At the bottom zoom level, you see a star field that looks like all star fields do: dots without any structure at all. If you look for it, you might identify a cart, a stick figure, some animal, but of course that is more our brain's doing than actual shapes in the stars. As you zoom out, the sheer lack of structure becomes painfully apparent. First the stars blend into noise, then into a uniform gray paint. All the features of our stellar neighborhood, like nebulae and clusters, anything of the sort, are absent. Why is our virtual world so boring?

Actually, we have already told you why. As was said: "whatever features there are in your world, there has to be a corresponding element in the algorithm". But there is nothing in our algorithm other than the uniform distribution. We told the system to generate uniform output, so it did. We have no right to lament; we got what we asked for. We hoped that some interesting structure would show up, but this is real life, wishes don't work around here. If we want something to look at, we need to code it.

This extends to every feature we might add. Make the distribution not uniform but denser near the center: you get exactly that, a boring sea of stars that just gets more crowded near the center. Same difference. If you add waves, you get waves. If you add arms, you get arms. It will not be any less boring. The reason our real galaxy is rich in features is that it has a lot of rules, the laws of physics, a lot of initial variation, and a long, long time to evolve. A lot of computational complexity went into the universe. You will not generate an interesting world with a few lines of code. Randomness will not do your job; it just changes what your job is. Remember that when you decide to include procedural generation in your game, or else you might get involved in the biggest debacle in video game history. You know the one I mean.