Skip to content

Alternative seed searcher implementation, take 2#336

Closed
moulins wants to merge 13 commits into
toolbox4minecraft:masterfrom
moulins:seed-filter
Closed

Alternative seed searcher implementation, take 2#336
moulins wants to merge 13 commits into
toolbox4minecraft:masterfrom
moulins:seed-filter

Conversation

@moulins
Copy link
Copy Markdown
Contributor

@moulins moulins commented Jun 23, 2017

Here's my new implementation for the seed search feature (issue #120). It allows for the creation of filters complex nested criteria, following this syntax.

It isn't quite ready to be merged, as some features are still missing, but I would like to have some opinion on it.

How does it work ?

Each criterion present in the .json file is represented by an object implementing the Criterion interface.

To determine if a seed matches the filter, a simple approach would be to validate each filter one by one, bailing out as soon as a required filter doesn't match. However, this method has a serious drawback: if several criteria cover the same area, the corresponding biome data will be computed several times (we could cache the whole area, but it could be arbitrarily big, and we don't want to use too much memory).

To have better performance, this implementation divides the area to search into fragments, and submit these fragments to all criteria at once. For each fragment, the biome data is implicitely cached by an instance of CachedBiomeDataOracle.

As the criteria classes are immutable, each criterion tracks its progress with an auxiliary class implementing CriterionResult; these results are grouped inside a ResultsMap which holds all the state relative to the currently matching filter.

To avoid submitting fragments which won't influence the final result, instead of creating the fragment list
beforehand, the filter is queried each time for the next fragment to consider:

//rough algorithm
ResultMap results = filter.createResults();
while (!filter.isDone()) {
  Region region = filter.getNextRegionToCheck(results);
  filter.checkRegion(results, world, region);
}

Other changes

While adding this features, I also made some other changes to the codebase, mainly:

  • CoordinatesInWorld are now called Coordinates and use ints instead of longs (there's no risk of overflow, as coordinates are always between -30,000,000 and 30,000,000)
  • JSON parsing is more strict: all JSON classes must be explicitely annotated with @GsonObject, and parsing will fail by default if unknown fields are encountered
  • Instead of being stored in a short[][], biome data is now stored in a instance of BiomeData, which internally holds a int[]. This allows us to directly use the arrays returned by the MinecraftInterface, without any unnecessary copying
  • and some other small stuff (roughly one feature by commit)

Things left to do

  • Improve performance of structure filters by caching the results
  • Add a scoring system
  • Add support for structure clusters
  • Add a better reporting interface
  • Try to be more efficient when selecting the next region to search
  • and maybe more...

moulins added 13 commits June 23, 2017 12:20
This type bundles the WorldSeed, the WorldType and the String
configuration of a World into a single object
Replace @GsonConstructor by @GsonObject; add @JsonField.
All json classes without custom parsing code must be annotated with
@GsonConstructor; the parsing will give an error if the json object
contains non-existing fields.
The Region class represents a rectangular or circular region of the
world.
This class encapsulates the biome array returned by the
MinecraftInterface. This has the added benefit of avoiding copies when
unnecessary.
This class caches biome data in a particular region between invocations
of getBiomeData.
@stefandollase
Copy link
Copy Markdown
Contributor

First of all, please have a look at my previous rant. Now that you know that I am not a fan of yet another implementation of the seed searcher that is mainly different, lets have a closer look at the actual changes in this pull request:

I see you added a CachedBiomeDataOracle. A while ago, I tried to do the same thing for the normal usage of the BiomeDataOracle, but failed with a proper cache invalidation. I then decided that the Fragment system that is used by the GUI already works like a cache for the full resolution. The quarter resolution should be quick enough so it does not need to be cached. However, we might actually need a CachedBiomeDataOracle for the Seed Searcher.

I don't see a reason to rename CoordinatesInWorld to Coordinates. The name CoordinatesInWorld was chosen to clearly separate it from coordinates on the screen. Indeed, the risk of mixing these up might by now be significantly lower due to other code improvements, however I don't see a strong reason for the change.

You basically added a small self made library for JSON validation. I did not feel there was an issue, so I am not sure whether the new code carries its own weight. I would rather omit it.

You skipped the copying from int[] to short[][]. I also made that change a while ago, but changed it back for a simple reason: int is 4 byte while short is 2 byte. Thus, this change doubles the memory consumption for the biome data which is actually the main memory consumption of Amidst, so this is not a good idea.

You switched the data type for coordinates from long to int. I am quite nervous about that change, because of bugs like #143. At least the structure algorithms actually use the overflow of mathematical operations. Again, I would rather omit this change.

I like the change about special biome variants. I actually never noticed that the specified colors are exactly the same. I checked it and there is only one exception to this rule. Ice plains is already pure white (255, 255, 255) and thus ice plains spikes cannot simply lighten this color. Instead, it uses a completely different color. I would like to merge this change, if you also have a solution for the ice plains (spikes) issue. Also, after merging this, we should update the Biome Color Table in the wiki. There is a dev-tool to regenerate it.

I like the new class called WorldOptions. This change was actually already on my list for the implementation of another feature, so this works out nicely.

You merged CoordinatesInWorld with CoordinateUtils. This merge seems to work out nicely, so I like it.


As you can see, my feedback is mixed. I don't think there is a chance to merge this pull request. However, I would be happy to merge individual changes if you create separate pull requests for them:

  • special biome variants
  • WorldOptions
  • merge of CoordinatesInWorld and CoordinateUtils

@moulins
Copy link
Copy Markdown
Contributor Author

moulins commented Aug 30, 2017

I see you added a CachedBiomeDataOracle. A while ago, I tried to do the same thing for the normal usage of the BiomeDataOracle, but failed with a proper cache invalidation. I then decided that the Fragment system that is used by the GUI already works like a cache for the full resolution. The quarter resolution should be quick enough so it does not need to be cached. However, we might actually need a CachedBiomeDataOracle for the Seed Searcher.

The CachedBiomeDataOracle is actually only used for the seed searcher. At first, I tried to thread the biome data for the current region through all the code (structures, etc), but it was too cumbersome I decided to use a global variable of sorts (the cache). This allows to leave the structure code unchanged, but the cache management must be done by hand.

I don't see a reason to rename CoordinatesInWorld to Coordinates. The name CoordinatesInWorld was chosen to clearly separate it from coordinates on the screen. Indeed, the risk of mixing these up might by now be significantly lower due to other code improvements, however I don't see a strong reason for the change.

This was simply laziness on my part, when I started using extensively the CoordinatesInWorld API. ;) For me, since the type is already inside a subpackage world, having world.CoordinatesInWorld seems redondant.

You basically added a small self made library for JSON validation. I did not feel there was an issue, so I am not sure whether the new code carries its own weight. I would rather omit it.

I added this to have better error messages when reading .json search files. You don't want the engine to silently use the default value for a parameter because you mispelled its name.

You skipped the copying from int[] to short[][]. I also made that change a while ago, but changed it back for a simple reason: int is 4 byte while short is 2 byte. Thus, this change doubles the memory consumption for the biome data which is actually the main memory consumption of Amidst, so this is not a good idea.

Small nitpick: the BufferedImages inside the Fragments already use 4 bytes by pixel, so using a int[] instead of a short[] only increase the memory usage from 6 bytes per pixel to 8 bytes per pixel (it still may be too much for you).

However, I think that encapsulating the biome data is still a good idea, for two reasons:

  • We can switch the internal representation to a flat array, which simplify the copying from the Minecraft int[]s;
  • The CachedBiomeDataOracle relies on the BiomeData.view() API to avoid copying data already in its cache (especially useful when doing biome checks around structures).

Should I open a separate issue to discuss this?

You switched the data type for coordinates from long to int. I am quite nervous about that change, because of bugs like #143. At least the structure algorithms actually use the overflow of mathematical operations. Again, I would rather omit this change.

I remember being very careful when doing this change; and I never found regressions on the worlds I tested. In fact, the minecraft code use int coordinates everywhere, so I don't think this change could impact the algorithms.

I like the change about special biome variants. I actually never noticed that the specified colors are exactly the same. I checked it and there is only one exception to this rule. Ice plains is already pure white (255, 255, 255) and thus ice plains spikes cannot simply lighten this color. Instead, it uses a completely different color. I would like to merge this change, if you also have a solution for the ice plains (spikes) issue. Also, after merging this, we should update the Biome Color Table in the wiki. There is a dev-tool to regenerate it.

Woups, I didn't see the special case for Ice Plains (Spikes) biome, but I don't think adding it ad hoc would be a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants