fix: use correct unscoped agent-browser package name (#77)

* fix: use correct unscoped agent-browser package name

The @anthropic-ai/agent-browser package does not exist on npm.
The correct package is the unscoped 'agent-browser' from Vercel Labs (v0.7.6).

Fixes the issue noted in PR #75 review comment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: correct agent-browser API to match actual package exports

- Change AgentBrowser to BrowserManager (correct export name)
- Update documentation to show actual CLI-based usage pattern
- Add snapshot + refs system examples (the real agent-browser API)
- Add programmatic BrowserManager API for advanced use cases
- Show CLI wrapper approach for script integration

The agent-browser package from Vercel is primarily a CLI tool, not a
JavaScript library with semantic methods. This corrects the documentation
to match the actual package API.

Fixes CodeRabbit review comment about incorrect import/export.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-19 15:12:48 +00:00 · 2026-01-25 22:04:47 -05:00
parent 5c63fa9006
commit 56ff5d444b
1 changed file with 58 additions and 20 deletions
--- a/agents/e2e-runner.md
+++ b/agents/e2e-runner.md
@@ -21,34 +21,72 @@ You are an expert end-to-end testing specialist. Your mission is to ensure criti

 ### Agent Browser Setup
 ```bash
-# Install agent-browser
-npm install @anthropic-ai/agent-browser
-# or
-pnpm add @anthropic-ai/agent-browser
+# Install agent-browser globally
+npm install -g agent-browser
+
+# Install Chromium (required)
+agent-browser install
 ```

-### Agent Browser Usage
+### Agent Browser CLI Usage (Primary)
+
+Agent Browser uses a snapshot + refs system optimized for AI agents:
+
+```bash
+# Open a page and get a snapshot with interactive elements
+agent-browser open https://example.com
+agent-browser snapshot -i  # Returns elements with refs like [ref=e1]
+
+# Interact using element references from snapshot
+agent-browser click @e1                      # Click element by ref
+agent-browser fill @e2 "user@example.com"   # Fill input by ref
+agent-browser fill @e3 "password123"        # Fill password field
+agent-browser click @e4                      # Click submit button
+
+# Wait for conditions
+agent-browser wait visible @e5               # Wait for element
+agent-browser wait navigation                # Wait for page load
+
+# Take screenshots
+agent-browser screenshot after-login.png
+
+# Get text content
+agent-browser get text @e1
+```
+
+### Agent Browser in Scripts
+
+For programmatic control, use the CLI via shell commands:
+
 ```typescript
-import { AgentBrowser } from '@anthropic-ai/agent-browser'
+import { execSync } from 'child_process'

-const browser = new AgentBrowser()
+// Execute agent-browser commands
+const snapshot = execSync('agent-browser snapshot -i --json').toString()
+const elements = JSON.parse(snapshot)

-// Semantic navigation - describe what you want
+// Find element ref and interact
+execSync('agent-browser click @e1')
+execSync('agent-browser fill @e2 "test@example.com"')
+```
+
+### Programmatic API (Advanced)
+
+For direct browser control (screencasts, low-level events):
+
+```typescript
+import { BrowserManager } from 'agent-browser'
+
+const browser = new BrowserManager()
+await browser.launch({ headless: true })
 await browser.navigate('https://example.com')
-await browser.click('the login button')
-await browser.fill('email input', 'user@example.com')
-await browser.fill('password field', 'securepassword')
-await browser.click('submit button')

-// Wait for semantic conditions
-await browser.waitFor('dashboard to load')
-await browser.waitFor('user avatar to appear')
+// Low-level event injection
+await browser.injectMouseEvent({ type: 'mousePressed', x: 100, y: 200, button: 'left' })
+await browser.injectKeyboardEvent({ type: 'keyDown', key: 'Enter', code: 'Enter' })

-// Take screenshots
-await browser.screenshot('after-login.png')
-
-// Extract data semantically
-const username = await browser.getText('the username in the header')
+// Screencast for AI vision
+await browser.startScreencast()  // Stream viewport frames
 ```

 ### Agent Browser with Claude Code