Integrating MongoDB MVP by Gawthaman · Pull Request #383 · AccelerationConsortium/ac-dev-lab

Gawthaman · 2025-07-07T21:48:55Z

Integrating MongoDB MVP

Copilot

Pull Request Overview

This PR integrates a MongoDB-backed MVP for persisting and replaying Ax optimization trials for the Branin benchmark.

Adds a MongoDB client connection and collections for storing experiments and trials
Implements loading of existing trials and replaying them before starting new ones
Persists each completed trial back to MongoDB

Comments suppressed due to low confidence (1)

scripts/mongodbintegrationmvp.py:22

[nitpick] The variable name tmongo_client is unclear. Rename it to mongo_client for better readability and consistency.

tmongo_client = MongoClient("mongodb://localhost:27017/")

sgbaird · 2025-07-07T22:39:36Z

@Gawthaman thanks! Could you have a look at facebook/Ax#2975 (comment)? I think you may need to use the built in serialization function directly.

sgbaird

While this avoids the issue of never switching to the next generation strategy, there are some other issues with it related to preserving space-filling properties. Let's try with Ax's built-in method (see comment).

sgbaird · 2025-07-09T09:08:43Z

scripts/mongodbintegrationmvp.py

+    remaining_sobol = max(SOBOL_TRIALS - n_existing, 0)
+
+    if remaining_sobol > 0:
+        generation_strategy = GenerationStrategy([


Please give this a try using Ax's built-in serialization method (save_to_json_file), with the JSON snapshots being sent to and reloaded from MongoDB. The current implementation doesn't preserve the Sobol sequence.

It should follow the same logic in https://colab.research.google.com/drive/1A2p1oUSsD8Edlu2haaB-FBSSjim3HTLS?usp=sharing

Please also add to each MongoDB document the timestamp of when the snapshot is saved and use the timestamp to load the most recent snapshot each time.
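A minimal sketch of the timestamped-snapshot pattern requested above. It assumes a pymongo-style collection (any object exposing `insert_one` and `find_one`) and a snapshot dict such as the one returned by `AxClient.to_json_snapshot()`. The function names and the document fields `experiment_name`, `snapshot`, and `timestamp` are illustrative assumptions, not taken from the PR.

```python
from datetime import datetime, timezone


def save_snapshot(collection, experiment_name, snapshot):
    """Insert one snapshot document and return its database ID."""
    doc = {
        "experiment_name": experiment_name,
        "snapshot": snapshot,  # e.g. ax_client.to_json_snapshot()
        "timestamp": datetime.now(timezone.utc),  # when the snapshot was saved
    }
    return collection.insert_one(doc).inserted_id


def load_latest_snapshot(collection, experiment_name):
    """Return the most recent snapshot dict, or None if none exists."""
    doc = collection.find_one(
        {"experiment_name": experiment_name},
        sort=[("timestamp", -1)],  # newest first
    )
    return doc["snapshot"] if doc else None
```

With Ax, the loaded dict would then presumably be passed to `AxClient.from_json_snapshot(...)` to rebuild the client, including the state of its generation strategy.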

sgbaird

Not sure if you were ready for a review, but I took a look and made a couple comments. Makes sense to have the save and load helper functions. Overall the logic looks solid to me

sgbaird · 2025-07-09T16:54:09Z

scripts/mongodbintegrationmvp.py

+
+if ax_client is None:
+    # Create new experiment
+    generation_strategy = create_generation_strategy(SOBOL_TRIALS)


Please unwrap this function

sgbaird

@Gawthaman nice! Overall seems to capture the right logic. Many print statements and try-excepts can be removed. See other comments.

Once addressed, could you record a video demonstrating the usage of the script and share? (Running the campaign, artificially/ungracefully stopping partway at different points in the script, restarting it, etc.).

Probably upload to YouTube as unlisted and paste here (video would probably be over 100 MB). Narrated or unnarrated is fine.

sgbaird · 2025-07-09T22:14:31Z

scripts/mongodbintegrationmvp.py

+except errors.ServerSelectionTimeoutError:
+    print("Failed to connect to MongoDB. Is MongoDB running?")
+    exit(1)
+except Exception as e:


Do we need this explicit error handling? Seems like it would just bubble up naturally (unless you found that the error that bubbled up naturally was nondescript).

As a note for later, we'll set this up with a MongoDB Atlas cluster

sgbaird · 2025-07-24T20:12:40Z

scripts/mongodbintegrationmvp.py

+SOBOL_TRIALS = 5
+
+
+def save_ax_snapshot_to_mongodb(ax_client, experiment_name):


Should this return the database ID of the snapshot?

sgbaird · 2025-07-24T20:15:40Z

scripts/mongodbintegrationmvp.py

+        if record:
+            # Save snapshot data to temporary file
+            temp_file = f"temp_{experiment_name}_snapshot.json"
+            with open(temp_file, "w") as f:


How does this handle the case where the file already exists? Worth adding the database ID to the filename in addition to experiment name to avoid write conflicts?
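For context, `open(path, "w")` truncates an existing file rather than failing, so two runs that share an experiment name would silently overwrite each other's temp file. One possible way to avoid that is sketched below using the standard library's `tempfile` module, swapped in here for the database-ID-in-filename idea suggested above. The helper name is hypothetical.

```python
import json
import os
import tempfile


def write_snapshot_to_temp_file(snapshot, experiment_name):
    """Write the snapshot to a uniquely named JSON file and return its path."""
    # mkstemp creates a brand-new file with a unique name, so two runs
    # never write to the same path.
    fd, path = tempfile.mkstemp(prefix=f"temp_{experiment_name}_", suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(snapshot, f)
    return path
```

The caller is responsible for deleting the file (for example with `os.remove(path)`) once the snapshot has been loaded.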

sgbaird · 2025-07-24T20:17:18Z

scripts/mongodbintegrationmvp.py

+
+
+# Load existing experiment or create new one
+ax_client = load_ax_snapshot_from_mongodb(obj1_name)


Probably don't reuse obj1_name like this. Instead define a separate variable. Could incorporate obj1_name via f-string. Probably good to also add a hard-coded set of 4 characters (randomly generated externally) to it.
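A short sketch of what this could look like. The variable names `EXPERIMENT_SUFFIX` and `experiment_name` and the suffix value are made up for illustration; only `obj1_name = "branin"` comes from the script.

```python
obj1_name = "branin"

# Hypothetical 4-character suffix, generated once outside the script and
# then hard-coded so that every restart resolves to the same campaign.
EXPERIMENT_SUFFIX = "k7q2"

# Experiment identifier, kept separate from the objective name.
experiment_name = f"{obj1_name}_experiment_{EXPERIMENT_SUFFIX}"
```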

sgbaird · 2025-07-24T20:19:02Z

scripts/mongodbintegrationmvp.py

+                max_parallelism=5,
+                model_kwargs={"seed": 999},  # For reproducibility
+            ),
+            GenerationStep(


Since you're using snapshots, probably ok to leave to defaults and not specify an explicit generation strategy

sgbaird · 2025-07-24T20:19:25Z

scripts/mongodbintegrationmvp.py

+    print(f"Created new experiment with {SOBOL_TRIALS} Sobol trials")
+
+    # Save initial snapshot
+    save_ax_snapshot_to_mongodb(ax_client, obj1_name)


Same comment about experiment name

sgbaird · 2025-07-24T20:21:50Z

scripts/mongodbintegrationmvp.py

+    print("Resuming existing experiment")
+
+# Get current trial count to determine how many more trials to run
+current_trials = ax_client.get_trials_data_frame()


Ah, good point about needing to handle max trials in the for loop. I suppose an alternative would be to have a budget variable stored in MongoDB that gets updated, but ignore that for now. More of a note to self/musing.
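A minimal sketch of the max-trials handling being discussed, assuming the number of existing trials is derived from the trials data frame after a restart. The helper name `remaining_trials` is hypothetical; `MAX_TRIALS = 19` matches the constant in the script.

```python
MAX_TRIALS = 19  # same configuration constant as in the script


def remaining_trials(n_existing, max_trials=MAX_TRIALS):
    """Number of trials still to run when resuming with n_existing trials."""
    # Clamp at zero so a campaign that already hit its budget runs nothing.
    return max(max_trials - n_existing, 0)


for _ in range(remaining_trials(n_existing=7)):
    pass  # get the next trial, evaluate it, complete it, save a snapshot
```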

sgbaird · 2025-07-24T20:23:16Z

scripts/mongodbintegrationmvp.py

+                f"Trial {trial_index}: result={results:.3f} | "
+                f"Best so far: {best_value:.3f}"
+            )
+        except Exception:


Probably remove these try-excepts

sgbaird · 2025-07-24T20:23:38Z

scripts/mongodbintegrationmvp.py

+        print(f"Total trials completed: {len(trials_df)}")
+        print(f"Best objective value: {trials_df[obj1_name].min():.6f}")
+
+except Exception as e:


Again, no need for try except here

sgbaird · 2025-07-24T20:24:34Z

scripts/mongodbintegrationmvp.py

+    print(f"Error getting best parameters: {e}")
+
+# Clean up MongoDB connection
+mongo_client.close()


Good to close the client. I suppose a top-level try-except-finally could be implemented at some point if we find that lots of mongo connections are being left over during restarts, but I think probably not an issue. Again, just a musing
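If that ever becomes necessary, the top-level cleanup could look roughly like the sketch below. It assumes only that the client exposes `close()`, as `mongo_client` does in the script; the wrapper name `run_with_cleanup` and the `campaign` callable are hypothetical.

```python
def run_with_cleanup(client, campaign):
    """Run campaign(client) and always close the client afterwards."""
    try:
        return campaign(client)
    finally:
        # Runs on normal exit, on exceptions, and on KeyboardInterrupt.
        client.close()
```

Note that a `finally` block does not run if the process is killed outright (for example with SIGKILL), so an ungraceful stop of that kind would still leave cleanup to the server side.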

sgbaird · 2025-07-24T20:36:42Z

Also good to switch to using a free-tier MongoDB Atlas instance prior to recording video, and showing the snapshots in the cloud.

Some (not all) bits from https://github.com/ACC-HelloWorld/5-data-logging#mongodb-setup-and-github-secrets can help with setup. Ignore AWS setup for example.

Gawthaman · 2025-07-28T18:21:33Z

https://youtu.be/HtDo0snpmq4

sgbaird

Nice! Made a few comments. Overall looks clean and minimal. Thanks for making the edits I requested related to unwrapping

Please put this under a subdir called hitl-bo within https://github.com/Gawthaman/ac-training-lab/tree/main/scripts

sgbaird · 2025-07-28T18:33:16Z

scripts/check_persistence.py

Effectively a helper script for your demo video, correct?

sgbaird · 2025-07-28T18:36:08Z

scripts/mongodbintegration.py

+    return ''.join(random.choices(string.ascii_lowercase + string.digits, k=length))
+
+
+def get_user_choice():


I.e., if the script gets interrupted after a user has already begun their wetlab experiment, they get to say whether they want to provide an update of results from the previous experiment or not?

Yes, whenever the script is run, it asks the user whether they want to start a new experiment or continue from the last one. It mainly works as a workaround, since I don't think the script knows whether the last experiment was finished; if it was, the user can just start a new experiment. It also gives the user the chance to start from scratch if they decide they don't want to resume the last (interrupted) experiment.
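The body of `get_user_choice` is not shown in this thread. A hypothetical version of such a prompt, with the input function passed in so it can be exercised without a terminal, might look like this (the prompt wording and return values are assumptions):

```python
def get_user_choice(ask=input):
    """Ask whether to start a new experiment or continue the last one."""
    while True:
        answer = ask("Start a [n]ew experiment or [c]ontinue the last one? ")
        answer = answer.strip().lower()
        if answer in ("n", "new"):
            return "new"
        if answer in ("c", "continue"):
            return "continue"
        # Anything else: explain and ask again.
        print("Please enter 'n' or 'c'.")
```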

sgbaird · 2025-07-28T18:36:48Z

scripts/mongodbintegration.py

+
+
+# Get user choice and setup configuration
+user_choice = get_user_choice()


Oh, never mind: this is "experiment" in the Ax sense (i.e., the whole campaign). This seems fine.

sgbaird · 2025-07-28T18:42:14Z

scripts/mongodbintegration.py

+    print(f"Trial {trial_index}: x1={x1:.3f}, x2={x2:.3f}")
+
+    # Save snapshot before running experiment (preserves pending trial)
+    save_ax_snapshot_to_mongodb(ax_client, experiment_id)


Good that you're doing before and after snapshots. Not sure if we'd want to distinguish this difference in the payload (or filename)
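If the distinction does turn out to be useful, one option is an extra field in the document. The sketch below is hypothetical throughout: the `stage` labels, the field names, and the helper name are not from the PR.

```python
from datetime import datetime, timezone

SNAPSHOT_STAGES = ("before_trial", "after_trial")  # hypothetical labels


def build_snapshot_document(experiment_id, snapshot, stage, trial_index):
    """Build a MongoDB document that records where in the loop it was saved."""
    if stage not in SNAPSHOT_STAGES:
        raise ValueError(f"stage must be one of {SNAPSHOT_STAGES}, got {stage!r}")
    return {
        "experiment_id": experiment_id,
        "snapshot": snapshot,
        "stage": stage,  # saved before running the trial, or after completing it
        "trial_index": trial_index,
        "timestamp": datetime.now(timezone.utc),
    }
```

A "before_trial" document as the latest snapshot would then indicate that the campaign was interrupted while a trial was still pending.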

sgbaird · 2025-07-28T18:45:46Z

scripts/mongodbintegrationmvp.py

+obj1_name = "branin"
+MAX_TRIALS = 19  # Configuration constant
+
+# Experiment identifier (separate from objective name)


Suggested change

-# Experiment identifier (separate from objective name)
+# Experiment identifier (separate from objective name) with hardcoded unique ID

sgbaird · 2025-07-28T18:59:24Z

scripts/test_persistence.py

+
+    # Start the experiment as a subprocess
+    process = subprocess.Popen([
+        sys.executable, "Untitled-1.py"


I see lots of references to "Untitled-1.py"

I was initially working on this in a different temp folder to test out different scenarios, but I forgot to fix the file names. I'll fix that ASAP.

…ation scripts
- Deleted old persistence testing scripts: simple_test.py and test_persistence.py.
- Added new scripts for manual and automated testing of MongoDB persistence: check_persistence.py, mongodbintegration.py, mongodbintegrationmvp.py, simple_test.py, and test_persistence.py.
- Enhanced MongoDB connection handling and error reporting in all scripts.
- Implemented functionality to save and load experiment snapshots to/from MongoDB.
- Added user prompts for experiment continuation or creation in the new integration scripts.
- Improved trial management and recovery testing in the new test scripts.

sgbaird

We'll also need to think about what happens if someone decides to change the optimization setup later on. For example, adding a new parameter to explore. This script more or less assumes that the campaign doesn't change from start to end, and likewise that it starts from scratch (that's fine, this is meant to be a MWE, and the ability to stop/restart is well-received).

The previously saved AxClient objects will serve as a record regardless. While using these as checkpoints is convenient, it also reduces some of the flexibility.

Integrating MongoDB MVP

cfae831

Copilot AI review requested due to automatic review settings July 7, 2025 21:48

Copilot AI reviewed Jul 7, 2025

sgbaird reviewed Jul 7, 2025

sgbaird linked an issue Jul 7, 2025 that may be closed by this pull request

HiTL BO tutorial - Integrate BO with MongoDB instead of relying solely on a locally stored save_to_json_file #381

Open

Refactor experiment setup to include dynamic Sobol trial management and enhance MongoDB integration

4353f4f

sgbaird requested changes Jul 9, 2025

Fixing MongdDB Integration Using Built In Serialization Method

0d5ae9d

sgbaird reviewed Jul 9, 2025

Gawthaman added 4 commits July 9, 2025 18:10

Refactor snapshot handling in MongoDB integration and streamline generation strategy creation

fe004b8

Remove temporary file handling from MongoDB snapshot saving function

633f4b2

Refactor import statements and improve code formatting for MongoDB integration

c5c7a97

Clean up whitespace and improve code readability in MongoDB integration script

8c499f9

sgbaird reviewed Jul 24, 2025

Gawthaman added 2 commits July 25, 2025 12:58

Refactor MongoDB connection handling and improve snapshot saving logic

932d37b

Add scripts for MongoDB persistence testing and recovery demonstration

454d471

Gawthaman requested a review from sgbaird July 28, 2025 18:18

sgbaird reviewed Jul 28, 2025

Gawthaman added 2 commits August 11, 2025 14:22

Update references to 'Untitled-1.py' with 'mongodbintegrationmvp.py' in persistence testing scripts

89b7183

sgbaird reviewed Sep 5, 2025
