Deprecate deduplicate setting for SOBOL Generator; Set GenerationNode.should_deduplicate=True by default#4887
Open
sunnyshen321 wants to merge 1 commit intofacebook:mainfrom
Open
Conversation
…nNode.should_deduplicate=True` by default Summary: Previously, we had dedup settings on both `Generator` and `GenerationNode` level, which is unnecessary and error-prone for accidentally using 2 different dedup settings at the 2 levels (see D92094386) Given that only `SobolGenerator` has special dedup logic, we've decided to deprecate `deduplicate` setting in `Generator` and only rely on `GenerationNode` level deduplication. Furthermore, since we generally expect dedup to be True for OSS use cases and AutoML use cases, we set `GenerationNode.should_deduplicate=True` by default while explicitly overwriting GNode dedup to False for online use cases. **One Potential Issue** Currently `RandomAdapter._gen` solely relies on SOBOL generator to dedup (by passing in `generated_points` to sobol generator and then `rejection_sample`- https://fburl.com/code/zmqfhn8g) and if we remove that, anyone that's using standalone `RandomAdapter` (without a GNode) may get duplicated trials if they call `RandomAdapter.gen` multiple times. This is partially mitigated by advancing `init_position` on sobol generator -- there shouldn't be duplicated points for continuous parameters, but if there are discrete parameters we can be producing duplicated points after rounding. The recommended path is through GenerationStrategy/GenerationNode, where dedup is properly handled. But there's no enforcement on never using standalone RandomAdapter. (ps. `TorchAdapter` passes pending_observations through to the BoTorch acquisition function as X_pending so it doesn't have the same problem) Differential Revision: D92884352
|
@sunnyshen321 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D92884352. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
Previously, we had dedup settings on both
GeneratorandGenerationNodelevel, which is unnecessary and error-prone for accidentally using 2 different dedup settings at the 2 levels (see D92094386)Given that only
SobolGeneratorhas special dedup logic, we've decided to deprecatededuplicatesetting inGeneratorand only rely onGenerationNodelevel deduplication.Furthermore, since we generally expect dedup to be True for OSS use cases and AutoML use cases, we set
GenerationNode.should_deduplicate=Trueby default while explicitly overwriting GNode dedup to False for online use cases.One Potential Issue
Currently
RandomAdapter._gensolely relies on SOBOL generator to dedup (by passing ingenerated_pointsto sobol generator and thenrejection_sample- https://fburl.com/code/zmqfhn8g) and if we remove that, anyone that's using standaloneRandomAdapter(without a GNode) may get duplicated trials if they callRandomAdapter.genmultiple times. This is partially mitigated by advancinginit_positionon sobol generator -- there shouldn't be duplicated points for continuous parameters, but if there are discrete parameters we can be producing duplicated points after rounding. The recommended path is through GenerationStrategy/GenerationNode, where dedup is properly handled. But there's no enforcement on never using standalone RandomAdapter. (ps.TorchAdapterpasses pending_observations through to the BoTorch acquisition function as X_pending so it doesn't have the same problem)Differential Revision: D92884352