Commit e556f95

Merge branch 'tembo/kernel-442-update-gemini-cua-integration-page' into main
2 parents cd9c593 + 45adb9c commit e556f95

3 files changed: 19 additions and 19 deletions

integrations/computer-use/anthropic.mdx

Lines changed: 0 additions & 6 deletions
@@ -4,12 +4,6 @@ title: "Anthropic"
 
 [Computer Use](https://docs.claude.com/en/docs/agents-and-tools/tool-use/computer-use-tool) is Anthropic's groundbreaking capability that enables Claude to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.
 
-With Computer Use, Claude can:
-- **Navigate websites and applications** by interpreting visual interfaces
-- **Click buttons and fill forms** just like a human would
-- **Take screenshots** to understand and verify its actions
-- **Perform multi-step workflows** that span multiple applications or web pages
-
 By integrating Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.
 
 ## Quick setup with Computer Use

integrations/computer-use/gemini.mdx

Lines changed: 16 additions & 12 deletions
@@ -2,30 +2,34 @@
 title: "Gemini"
 ---
 
-Google's [Gemini 2.5 Computer Use model](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is a specialized model built on Gemini 2.5 Pro's capabilities to power agents that can interact with user interfaces.
+[Gemini 2.5 Computer Use](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is Google's groundbreaking capability that enables AI models to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.
 
 By integrating Gemini 2.5 Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.
 
-## Quick setup with our example template
+## Quick setup with Computer Use
 
-Get started quickly with our TypeScript template that demonstrates Gemini 2.5 Computer Use with Kernel.
+Get started with Gemini Computer Use and Kernel using our pre-configured app template:
 
-Check out the [Open-source Gemini Template](https://github.com/onkernel/ts-stagehand-google-cua-agent) repository for a complete working example that shows how to:
-- Set up Gemini 2.5 Computer Use with Kernel
-- Use Stagehand for browser automation
-- Run AI-powered web interactions on cloud infrastructure
+```bash
+npx @onkernel/create-kernel-app my-computer-use-app
+```
 
-## Benefits of using Kernel with Gemini Computer Use
+Choose `TypeScript` as the programming language and then select `gemini-cua` as the template.
+
+Then follow the [Quickstart guide](/quickstart/) to deploy and run your Computer Use automation on Kernel's infrastructure.
+
+## Benefits of using Kernel with Computer Use
 
 - **No local browser management**: Run Computer Use automations without installing or maintaining browsers locally
-- **Scalability**: Launch multiple browser sessions in parallel for concurrent automations
-- **Stealth mode**: Built-in anti-detection features for web interactions
+- **Scalability**: Launch multiple browser sessions in parallel for concurrent AI agents
+- **Stealth mode**: Built-in anti-detection features for reliable web interactions
 - **Session persistence**: Maintain browser state across automation runs
-- **Live view**: Debug your automations with real-time browser viewing
+- **Live view**: Debug your Computer Use agents with real-time browser viewing
+- **Cloud infrastructure**: Run computationally intensive AI agents without local resource constraints
 
 ## Next steps
 
-- Check out [live view](/browsers/live-view) for debugging your automations
+- Check out [live view](/browsers/live-view) for debugging your Computer Use automations
 - Learn about [stealth mode](/browsers/stealth) for avoiding detection
 - Learn how to properly [terminate browser sessions](/browsers/termination)
 - Learn how to [deploy](/apps/deploy) your Computer Use app to Kernel

integrations/computer-use/openai.mdx

Lines changed: 3 additions & 1 deletion
@@ -2,7 +2,9 @@
 title: "OpenAI"
 ---
 
-[Computer Use](https://openai.com/index/computer-using-agent/) is OpenAI's feature that enables AI models to interact with computers like humans do - through screen observation, cursor movement, and keyboard input. By integrating with Kernel, you can run Computer Use automations with cloud-hosted browsers, allowing your AI agents to navigate websites, fill forms, and interact with web applications autonomously.
+[Computer Use](https://openai.com/index/computer-using-agent/) is OpenAI's feature that enables AI models to interact with computers the way humans do—by looking at screens, moving cursors, clicking buttons, and typing text. This powerful feature allows AI agents to control web browsers, navigate interfaces, and perform complex tasks across applications.
+
+By integrating Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents.
 
 ## Quick setup with our Computer Use example app
 