Hotel Receptionist Scenario Expansion#6186

Merged: tinalenguyen merged 9 commits into main from ShayneP/hotel-scenarios-2 on Jun 24, 2026
Conversation

ShayneP (Contributor) commented Jun 22, 2026

This PR:

  • Extends the simulation set to 100 scenarios
    • This also adds policies and tools to accomplish the tasks set out in these scenarios
  • Splits tools and instructions out into their own files so they're easier to grok
  • Optimizes the agent persona to answer as many of the scenarios correctly as possible

Next:

  • Comprehensive README explaining the architecture and design choices of the Hotel Receptionist
  • Performance tuning

ShayneP added 3 commits June 16, 2026 16:08
Adds 11 new scenarios (check-out early, dinner move/cancel, wake-up move,
red-eye hold, valuables/liability, local-area, callback-to-finish, hostile
free-night, can't-verify change, room/floor confirm) plus the example logic
they exercise: view-based room moves (agent.py, hotel_db.py, modify_booking.py)
and restaurant-reservation modification (hotel_db.py), and two new policy docs
(local_area.md, safe_deposit.md). Ported on top of the benchmark PR branch so
the gradeable expected_state versions of shared scenarios are preserved.
@ShayneP ShayneP requested a review from a team as a code owner June 22, 2026 18:43
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

Comment on lines +39 to +48
try:
    report_dict = report.to_dict()
    report_dict["tags"] = sorted(ctx.tagger.tags)
    report_dict["evaluations"] = ctx.tagger.evaluations
    report_dict["outcome"] = ctx.tagger.outcome
    report_dict["outcome_reason"] = ctx.tagger.outcome_reason
    with open(os.path.join(report_dir, f"session_report-{room}.json"), "w") as f:
        json.dump(report_dict, f, indent=2)
except Exception:
    logger.exception("error dumping session report")

🚩 run_artifacts.py references potentially new SessionReport/Tagger API surface

dump_run_artifacts calls report.to_dict(), ctx.tagger.evaluations, ctx.tagger.outcome, and ctx.tagger.outcome_reason (run_artifacts.py:40-44). These may be newer SDK APIs not present in older versions. The entire block is wrapped in a try/except so a missing attribute wouldn't crash the session, but the artifact dump would silently fail. Worth verifying these APIs exist in the targeted SDK version.
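
One way to make a version mismatch visible instead of losing the whole artifact is to probe each attribute before use and write whatever is available. The following is a minimal sketch under stated assumptions, not the PR's implementation: `build_report_dict` and `dump_report` are hypothetical helpers, and the only names taken from the reviewed snippet are `to_dict`, `tags`, `evaluations`, `outcome`, and `outcome_reason`. Whether those attributes exist in a given SDK version is exactly what still needs checking.

```python
import json
import logging
import os

logger = logging.getLogger("run_artifacts_sketch")

# Attribute names copied from the reviewed snippet. Their presence in any
# particular SDK version is an open question, so each one is probed.
TAGGER_FIELDS = ("evaluations", "outcome", "outcome_reason")


def build_report_dict(report, tagger):
    """Return (report_dict, missing) without raising on absent attributes."""
    missing = []

    to_dict = getattr(report, "to_dict", None)
    if callable(to_dict):
        report_dict = dict(to_dict())
    else:
        report_dict = {}
        missing.append("report.to_dict")

    tags = getattr(tagger, "tags", None)
    if tags is None:
        missing.append("tagger.tags")
    else:
        report_dict["tags"] = sorted(tags)

    for name in TAGGER_FIELDS:
        if hasattr(tagger, name):
            report_dict[name] = getattr(tagger, name)
        else:
            missing.append(f"tagger.{name}")

    return report_dict, missing


def dump_report(report, tagger, report_dir, room):
    """Write a possibly partial report and name every attribute that was absent."""
    report_dict, missing = build_report_dict(report, tagger)
    if missing:
        logger.warning("session report is partial; missing: %s", ", ".join(missing))
    path = os.path.join(report_dir, f"session_report-{room}.json")
    with open(path, "w") as f:
        json.dump(report_dict, f, indent=2)
    return path, missing
```

With this shape, an older SDK would still produce a JSON file, and the warning would list the absent attributes by name rather than leaving only a generic traceback.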


@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

Comment on lines +50 to +61
async def start_restaurant_booking(self, ctx: RunContext[Userdata]) -> str | None:
    """Start the restaurant-reservation flow. Call it the moment the caller wants a table - the flow collects date, party size, time, name, and phone itself. Its return is the FINAL result of the reservation: relay it and move on - nothing further to confirm or call afterwards."""
    reservation = await BookRestaurantTask(
        db=ctx.userdata.db, chat_ctx=speech_only(self.chat_ctx)
    )
    return (
        f"You're set for {speak_time(reservation.time)} on "
        f"{reservation.date.strftime('%A, %B %-d')} for "
        f"{reservation.party_size} guest{'s' if reservation.party_size != 1 else ''}. "
        f"Confirmation code: {_speak_code(reservation.code)}. "
        "| reservation complete - relay this to the caller; no further tool call is needed."
    )

🚩 Asymmetric duplicate-prevention: room bookings guarded but restaurant bookings are not

The PR adds a duplicate-prevention guard for start_room_booking (tools_rooms.py:183-195) using last_room_booking and caller_turns_at_last_booking in Userdata. No equivalent guard exists for start_restaurant_booking (tools_restaurant.py:50-61). The Userdata class in common.py has no last_restaurant_booking field. This is presumably intentional, since the room booking flow is longer and more prone to model re-entry than the restaurant flow, but it creates an asymmetry. If the same re-entry problem occurs with restaurant bookings, it would silently double-book a table.
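
If a symmetric guard were wanted, it could look roughly like the sketch below. This is an illustration only: the room-booking guard in tools_rooms.py is not shown in this conversation, so the logic here assumes that "re-entry" means the tool being called again before the caller has taken another turn. `UserdataSketch`, `RestaurantReservation`, `caller_turns`, `caller_turns_at_last_restaurant_booking`, and both helper functions are hypothetical names, and `last_restaurant_booking` is the field the review notes does not currently exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestaurantReservation:
    # Hypothetical stand-in for whatever the booking task returns.
    code: str
    summary: str


@dataclass
class UserdataSketch:
    # Hypothetical fields modeled on the room-booking ones named in the review.
    caller_turns: int = 0
    last_restaurant_booking: Optional[RestaurantReservation] = None
    caller_turns_at_last_restaurant_booking: Optional[int] = None


def duplicate_restaurant_booking(userdata: UserdataSketch) -> Optional[RestaurantReservation]:
    """Return the earlier reservation if the caller has not spoken since it was made."""
    last = userdata.last_restaurant_booking
    if last is None:
        return None
    if userdata.caller_turns == userdata.caller_turns_at_last_restaurant_booking:
        # Same caller turn as the previous booking: treat the call as model re-entry.
        return last
    return None


def record_restaurant_booking(userdata: UserdataSketch, reservation: RestaurantReservation) -> None:
    """Remember the reservation and the caller-turn count at which it was made."""
    userdata.last_restaurant_booking = reservation
    userdata.caller_turns_at_last_restaurant_booking = userdata.caller_turns
```

In a tool such as start_restaurant_booking, the check would run before the booking task starts and return the earlier confirmation instead of booking again. A caller who genuinely wants a second table would have spoken in between, so the turn counter would have advanced and the guard would not trigger.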


@tinalenguyen tinalenguyen merged commit 04f3fdd into main Jun 24, 2026
22 of 23 checks passed
@tinalenguyen tinalenguyen deleted the ShayneP/hotel-scenarios-2 branch June 24, 2026 20:33
2 participants