Agent Browser Skill
You can control a headless browser to interact with web pages.
Capabilities
- Navigate: Open URLs and follow links
- Click: Click buttons, links, and interactive elements
- Fill forms: Type text into input fields, select dropdowns, check boxes
- Extract: Read page content, scrape text, get attributes
- Screenshot: Capture visual snapshots of pages
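
As a rough illustration, each capability above corresponds to a method on Playwright's `page` object. The sketch below assumes the caller has already created a `page`; the function name `demoCapabilities`, every selector, the sample values, and the screenshot file name are hypothetical placeholders rather than part of this skill.

```javascript
// Sketch: one Playwright Page call per capability.
// All selectors, values, and the file name are hypothetical placeholders.
async function demoCapabilities(page, url) {
  await page.goto(url);                                   // Navigate
  await page.click('[data-testid="accept"]');             // Click
  await page.fill('#email', 'user@example.com');          // Fill forms: text input
  await page.selectOption('#country', 'DE');              // Fill forms: dropdown
  await page.check('#newsletter');                        // Fill forms: checkbox
  const heading = await page.textContent('h1');           // Extract: text
  const href = await page.getAttribute('a.more', 'href'); // Extract: attribute
  await page.screenshot({ path: 'page.png', fullPage: true }); // Screenshot
  return { heading, href };
}
```

Taking `page` as a parameter keeps browser setup and teardown in one place, so the same function can be reused across several pages in a session.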
Usage with Playwright
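# Install the Chromium build that Playwright drives (one-time setup)
npx playwright install chromium
node -e "
const { chromium } = require('playwright');
(async () => {
  const browser = await chromium.launch(); // headless by default
  try {
    const page = await browser.newPage();
    await page.goto('{url}', { timeout: 30000 });
    // Interact with the page here (click, fill, extract)
    const content = (await page.textContent('body')) ?? '';
    console.log(content.substring(0, 2000));
  } finally {
    await browser.close(); // always release the browser, even on errors
  }
})().catch((err) => {
  console.error(err.message);
  process.exit(1);
});
"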
Guidelines
- Always close the browser after use to free resources
- Use specific selectors (id, data-testid) over generic ones when possible
- Set explicit timeouts for page loads and element waits so a stalled page fails quickly instead of hanging
- For forms, fill all required fields before submitting
- Respect robots.txt and rate limits
- Report any errors clearly (page not found, element not found, timeout)
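
The guidelines on cleanup, timeouts, and error reporting can be sketched as a few small helpers. This is a minimal sketch, not a definitive implementation: the names `withBrowser`, `describeError`, and `readElement` are hypothetical, the timeout values are illustrative assumptions, and `browserType` is expected to be something like Playwright's `chromium` export.

```javascript
// Always close the browser, even when the task throws.
async function withBrowser(browserType, task) {
  const browser = await browserType.launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Map a thrown error to a short report line.
// Playwright timeout errors carry the name 'TimeoutError'.
function describeError(err, context) {
  const kind = err && err.name === 'TimeoutError' ? 'timeout' : 'error';
  const detail = err && err.message ? err.message : String(err);
  return `${kind} while ${context}: ${detail}`;
}

// Load a page and read one element, returning a result object instead of throwing.
// The timeout values are illustrative assumptions.
async function readElement(page, url, selector) {
  page.setDefaultNavigationTimeout(15000); // page loads
  page.setDefaultTimeout(5000);            // element waits and actions
  try {
    const response = await page.goto(url);
    if (response && response.status() === 404) {
      return { ok: false, report: `page not found: ${url}` };
    }
  } catch (err) {
    return { ok: false, report: describeError(err, `loading ${url}`) };
  }
  try {
    // A missing element surfaces here as a TimeoutError once the wait expires.
    const element = await page.waitForSelector(selector);
    return { ok: true, text: await element.textContent() };
  } catch (err) {
    return { ok: false, report: describeError(err, `waiting for ${selector}`) };
  }
}
```

A call might look like `withBrowser(chromium, async (browser) => readElement(await browser.newPage(), url, '#price'))`, where `#price` stands in for a specific selector on the target page.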
Common Actions
- Login: Navigate to login page, fill credentials, submit form
- Data extraction: Open page, wait for content, extract structured data
- Form submission: Navigate to form, fill fields, click submit, verify success
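
The three actions above could be sketched as reusable functions over an existing Playwright `page`. The function names, the `data-testid` selectors, and the `fields` and `successSelector` parameters are assumptions made for illustration; a real site will need its own selectors and its own definition of success.

```javascript
// Login: navigate to the login page, fill credentials, submit the form.
// The selectors are hypothetical; replace them with the target site's own.
async function login(page, loginUrl, username, password) {
  await page.goto(loginUrl);
  await page.fill('[data-testid="username"]', username);
  await page.fill('[data-testid="password"]', password);
  await page.click('[data-testid="login-submit"]');
}

// Data extraction: open the page, wait for content, return structured data.
async function extractItems(page, url, itemSelector) {
  await page.goto(url);
  await page.waitForSelector(itemSelector);
  const texts = await page.locator(itemSelector).allTextContents();
  return texts.map((text, index) => ({ index, text: text.trim() }));
}

// Form submission: navigate, fill fields, click submit, verify success.
// `fields` maps selector -> value; `successSelector` marks the confirmation element.
async function submitForm(page, url, fields, submitSelector, successSelector) {
  await page.goto(url);
  for (const [selector, value] of Object.entries(fields)) {
    await page.fill(selector, value);
  }
  await page.click(submitSelector);
  await page.waitForSelector(successSelector); // rejects on timeout if never shown
  return true;
}
```

Verifying success by waiting for a confirmation element is one option among several; checking the resulting URL or a response status may suit some forms better.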