Commit 4abb2c3

Add CI, tests, and improve SDK docs and schemas
Co-authored-by: Shri Sukhani <shrisukhani@users.noreply.github.com>
1 parent 003db16 commit 4abb2c3

11 files changed: 500 additions & 99 deletions


.github/workflows/ci.yml

Lines changed: 40 additions & 0 deletions
```diff
@@ -0,0 +1,40 @@
+name: CI
+
+on:
+  pull_request:
+  push:
+    branches:
+      - main
+      - master
+      - "cursor/**"
+
+jobs:
+  lint-test-build:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9", "3.12"]
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          python -m pip install . pytest ruff build
+
+      - name: Lint
+        run: python -m ruff check .
+
+      - name: Test
+        run: python -m pytest -q
+
+      - name: Build package
+        run: python -m build
```

README.md

Lines changed: 91 additions & 68 deletions
````diff
@@ -1,105 +1,128 @@
 # Hyperbrowser Python SDK
 
-Checkout the full documentation [here](https://hyperbrowser.ai/docs)
+Python SDK for the Hyperbrowser API.
 
-## Installation
+- Full docs: https://hyperbrowser.ai/docs
+- Package: https://pypi.org/project/hyperbrowser/
 
-Currently Hyperbrowser supports creating a browser session in two ways:
+## Requirements
 
-- Async Client
-- Sync Client
+- Python `>=3.9`
 
-It can be installed from `pypi` by running :
+## Installation
 
-```shell
+```bash
 pip install hyperbrowser
 ```
 
 ## Configuration
 
-Both the sync and async client follow similar configuration params
+You can pass credentials directly, or use environment variables.
+
+```bash
+export HYPERBROWSER_API_KEY="your_api_key"
+export HYPERBROWSER_BASE_URL="https://api.hyperbrowser.ai" # optional
+```
+
+## Clients
 
-### API Key
-The API key can be configured either from the constructor arguments or environment variables using `HYPERBROWSER_API_KEY`
+The SDK provides both sync and async clients with mirrored APIs:
 
-## Usage
+- `Hyperbrowser` (sync)
+- `AsyncHyperbrowser` (async)
 
-### Async
+### Sync quickstart
+
+```python
+from hyperbrowser import Hyperbrowser
+
+with Hyperbrowser(api_key="your_api_key") as client:
+    session = client.sessions.create()
+    print(session.id, session.ws_endpoint)
+    client.sessions.stop(session.id)
+```
+
+### Async quickstart
 
 ```python
 import asyncio
-from pyppeteer import connect
 from hyperbrowser import AsyncHyperbrowser
 
-HYPERBROWSER_API_KEY = "test-key"
-
-async def main():
-    async with AsyncHyperbrowser(api_key=HYPERBROWSER_API_KEY) as client:
+async def main() -> None:
+    async with AsyncHyperbrowser(api_key="your_api_key") as client:
         session = await client.sessions.create()
+        print(session.id, session.ws_endpoint)
+        await client.sessions.stop(session.id)
 
-        ws_endpoint = session.ws_endpoint
-        browser = await connect(browserWSEndpoint=ws_endpoint, defaultViewport=None)
+asyncio.run(main())
+```
 
-        # Get pages
-        pages = await browser.pages()
-        if not pages:
-            raise Exception("No pages available")
+## Main manager surface
 
-        page = pages[0]
+Both clients expose:
 
-        # Navigate to a website
-        print("Navigating to Hacker News...")
-        await page.goto("https://news.ycombinator.com/")
-        page_title = await page.title()
-        print("Page title:", page_title)
+- `client.sessions`
+- `client.scrape` (+ `client.scrape.batch`)
+- `client.crawl`
+- `client.extract`
+- `client.web` (+ `client.web.batch_fetch`, `client.web.crawl`)
+- `client.agents` (`browser_use`, `cua`, `claude_computer_use`, `gemini_computer_use`, `hyper_agent`)
+- `client.profiles`
+- `client.extensions`
+- `client.team`
+- `client.computer_action`
 
-        await page.close()
-        await browser.disconnect()
-        await client.sessions.stop(session.id)
-        print("Session completed!")
+## Job polling (`start_and_wait`)
 
-# Run the asyncio event loop
-asyncio.get_event_loop().run_until_complete(main())
-```
-### Sync
+Long-running APIs expose `start_and_wait(...)`.
+
+These methods now support explicit polling controls:
+
+- `poll_interval_seconds` (default `2.0`)
+- `max_wait_seconds` (default `600.0`)
+
+Example:
 
 ```python
-from playwright.sync_api import sync_playwright
 from hyperbrowser import Hyperbrowser
+from hyperbrowser.models import StartExtractJobParams
+
+with Hyperbrowser(api_key="your_api_key") as client:
+    result = client.extract.start_and_wait(
+        StartExtractJobParams(
+            urls=["https://hyperbrowser.ai"],
+            prompt="Extract the main headline",
+        ),
+        poll_interval_seconds=1.5,
+        max_wait_seconds=300,
+    )
+    print(result.status, result.data)
+```
 
-HYPERBROWSER_API_KEY = "test-key"
+## Error handling
 
-def main():
-    client = Hyperbrowser(api_key=HYPERBROWSER_API_KEY)
-    session = client.sessions.create()
+SDK errors are raised as `HyperbrowserError`.
 
-    ws_endpoint = session.ws_endpoint
-
-    # Launch Playwright and connect to the remote browser
-    with sync_playwright() as p:
-        browser = p.chromium.connect_over_cdp(ws_endpoint)
-        context = browser.new_context()
-
-        # Get the first page or create a new one
-        if len(context.pages) == 0:
-            page = context.new_page()
-        else:
-            page = context.pages[0]
-
-        # Navigate to a website
-        print("Navigating to Hacker News...")
-        page.goto("https://news.ycombinator.com/")
-        page_title = page.title()
-        print("Page title:", page_title)
-
-        page.close()
-        browser.close()
-        print("Session completed!")
-        client.sessions.stop(session.id)
+```python
+from hyperbrowser import Hyperbrowser
+from hyperbrowser.exceptions import HyperbrowserError
 
-# Run the asyncio event loop
-main()
+try:
+    with Hyperbrowser(api_key="invalid") as client:
+        client.team.get_credit_info()
+except HyperbrowserError as exc:
+    print(exc)
 ```
+
+## Development
+
+```bash
+pip install -e . pytest ruff build
+python -m ruff check .
+python -m pytest -q
+python -m build
+```
+
 ## License
 
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+MIT see [LICENSE](LICENSE).
````
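The `poll_interval_seconds` / `max_wait_seconds` knobs in the README above correspond to a loop of roughly this shape. This is an illustrative sketch of generic job polling under assumed `start`/`get_status` callables, not the SDK's actual implementation:

```python
import time

def start_and_wait_sketch(start, get_status,
                          poll_interval_seconds=2.0, max_wait_seconds=600.0):
    """Start a job, then poll its status until it is terminal or we time out."""
    job_id = start()
    deadline = time.monotonic() + max_wait_seconds
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(poll_interval_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_wait_seconds}s")

# Fake job that becomes terminal on the third status check.
checks = iter(["pending", "running", "completed"])
print(start_and_wait_sketch(lambda: "job-1", lambda _id: next(checks),
                            poll_interval_seconds=0.01))  # → completed
```

A smaller `poll_interval_seconds` trades API calls for latency; `max_wait_seconds` bounds the total wait.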

hyperbrowser/client/sync.py

Lines changed: 6 additions & 0 deletions
```diff
@@ -40,3 +40,9 @@ def __init__(
 
     def close(self) -> None:
         self.transport.close()
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.close()
```
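Adding `__enter__`/`__exit__` makes the sync client usable in a `with` block, so `close()` runs even when an exception escapes. A minimal sketch of the same pattern on a stand-in class (not the real SDK client):

```python
class ClosableClient:
    """Stand-in illustrating the context-manager pattern added above."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        # Returning self lets `with ClosableClient() as client:` bind the instance.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs on both normal exit and exceptions; returning None re-raises.
        self.close()

with ClosableClient() as client:
    assert client.closed is False
print(client.closed)  # → True
```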

hyperbrowser/py.typed

Lines changed: 1 addition & 0 deletions
```diff
@@ -0,0 +1 @@
+
```

The file is an empty PEP 561 marker; its presence tells type checkers that the package ships usable inline type annotations.
hyperbrowser/tools/schema.py

Lines changed: 21 additions & 30 deletions
```diff
@@ -32,12 +32,7 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
             "description": "Whether to only return the main content of the page. If true, only the main content of the page will be returned, excluding any headers, navigation menus,footers, or other non-main content.",
         },
     },
-    "required": [
-        "include_tags",
-        "exclude_tags",
-        "only_main_content",
-        "formats",
-    ],
+    "required": [],
     "additionalProperties": False,
 }
 
@@ -51,7 +46,7 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
         },
         "scrape_options": get_scrape_options(),
     },
-    "required": ["url", "scrape_options"],
+    "required": ["url"],
     "additionalProperties": False,
 }
 
@@ -103,15 +98,7 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
         },
         "scrape_options": get_scrape_options(),
     },
-    "required": [
-        "url",
-        "max_pages",
-        "follow_links",
-        "ignore_sitemap",
-        "exclude_patterns",
-        "include_patterns",
-        "scrape_options",
-    ],
+    "required": ["url"],
     "additionalProperties": False,
 }
 
@@ -130,15 +117,18 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
             "description": "A prompt describing how you want the data structured, or what you want to extract from the urls provided. Can also be used to guide the extraction process. For multi-source queries, structure this prompt to request unified, comparative, or aggregated information across all provided URLs.",
         },
         "schema": {
-            "type": "string",
-            "description": "A strict json schema you want the returned data to be structured as. For multi-source extraction, design this schema to accommodate information from all URLs in a single structure. Ensure that this is a proper json schema, and the root level should be of type 'object'.",
+            "anyOf": [
+                {"type": "object"},
+                {"type": "string"},
+            ],
+            "description": "A strict JSON schema for the response shape. This can be either a JSON object schema or a JSON string that can be parsed into an object schema. For multi-source extraction, design this schema to accommodate information from all URLs in a single structure.",
         },
         "max_links": {
             "type": "number",
            "description": "The maximum number of links to look for if performing a crawl for any given url in the urls list.",
         },
     },
-    "required": ["urls", "prompt", "schema", "max_links"],
+    "required": ["urls"],
     "additionalProperties": False,
 }
 
@@ -147,12 +137,19 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
     "enum": [
         "gpt-4o",
         "gpt-4o-mini",
+        "gpt-4.1",
+        "gpt-4.1-mini",
+        "gpt-5",
+        "gpt-5-mini",
+        "claude-sonnet-4-5",
+        "claude-sonnet-4-20250514",
         "claude-3-7-sonnet-20250219",
         "claude-3-5-sonnet-20241022",
         "claude-3-5-haiku-20241022",
         "gemini-2.0-flash",
+        "gemini-2.5-flash",
     ],
-    "default": "gemini-2.0-flash",
+    "default": "gemini-2.5-flash",
 }
 
 BROWSER_USE_SCHEMA = {
@@ -164,27 +161,21 @@ def get_scrape_options(formats: Optional[List[scrape_types]] = None):
         },
         "llm": {
             **BROWSER_USE_LLM_SCHEMA,
-            "description": "The language model (LLM) instance to use for generating actions. Default to gemini-2.0-flash.",
+            "description": "The language model (LLM) instance to use for generating actions. Defaults to gemini-2.5-flash.",
         },
         "planner_llm": {
             **BROWSER_USE_LLM_SCHEMA,
-            "description": "The language model to use specifically for planning future actions, can differ from the main LLM. Default to gemini-2.0-flash.",
+            "description": "The language model to use specifically for planning future actions, can differ from the main LLM. Defaults to gemini-2.5-flash.",
         },
         "page_extraction_llm": {
             **BROWSER_USE_LLM_SCHEMA,
-            "description": "The language model to use for extracting structured data from webpages. Default to gemini-2.0-flash.",
+            "description": "The language model to use for extracting structured data from webpages. Defaults to gemini-2.5-flash.",
         },
         "keep_browser_open": {
             "type": "boolean",
             "description": "When enabled, keeps the browser session open after task completion.",
         },
     },
-    "required": [
-        "task",
-        "llm",
-        "planner_llm",
-        "page_extraction_llm",
-        "keep_browser_open",
-    ],
+    "required": ["task"],
     "additionalProperties": False,
 }
```
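The net effect of the relaxed `required` lists is that a tool call supplying only the essential field now validates, with every other property optional. A tiny sketch of that check (hypothetical `missing_required` helper, not SDK code):

```python
def missing_required(schema: dict, payload: dict) -> list:
    """Return the required properties absent from the payload."""
    return [key for key in schema.get("required", []) if key not in payload]

old_schema = {"required": ["url", "scrape_options"]}
new_schema = {"required": ["url"]}
payload = {"url": "https://example.com"}

print(missing_required(old_schema, payload))  # → ['scrape_options']
print(missing_required(new_schema, payload))  # → []
```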

pyproject.toml

Lines changed: 2 additions & 1 deletion
```diff
@@ -10,14 +10,15 @@ homepage = "https://github.com/hyperbrowserai/python-sdk"
 repository = "https://github.com/hyperbrowserai/python-sdk"
 
 [tool.poetry.dependencies]
-python = "^3.8"
+python = ">=3.9,<4.0"
 pydantic = ">=2.0,<3"
 httpx = ">=0.23.0,<1"
 jsonref = ">=1.1.0"
 
 
 [tool.poetry.group.dev.dependencies]
 ruff = "^0.3.0"
+pytest = "^8.0.0"
 
 
 [build-system]
```
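The tightened constraint `python = ">=3.9,<4.0"` can be expressed as a runtime check; a hedged sketch of the equivalent guard (illustrative only — nothing in this diff shows the SDK performing such a check itself):

```python
import sys

def python_supported(version=sys.version_info) -> bool:
    """True when the interpreter satisfies python = ">=3.9,<4.0"."""
    return (3, 9) <= (version[0], version[1]) < (4, 0)

print(python_supported((3, 8, 10)))  # → False
print(python_supported((3, 12, 1)))  # → True
```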
