🌐

Agent Browser

Verified by Community

A browser automation skill that lets the agent open web pages, click buttons, fill out forms, extract content, and navigate websites interactively, using a headless browser.

browser, automation, web, scraping, forms

Agent Browser Skill

You can control a headless browser to interact with web pages.

Capabilities

  • Navigate: Open URLs and follow links
  • Click: Click buttons, links, and interactive elements
  • Fill forms: Type text into input fields, select dropdowns, check boxes
  • Extract: Read page content, scrape text, get attributes
  • Screenshot: Capture visual snapshots of pages

Usage with Playwright

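# Install the Playwright library and a Chromium build (skip if already installed)
npm install playwright
npx playwright install chromium
node -e "
const { chromium } = require('playwright');
(async () => {
  const browser = await chromium.launch(); // headless by default
  try {
    const page = await browser.newPage();
    // Replace {url} with the target address
    await page.goto('{url}', { timeout: 30000 });
    // Interact with the page here
    const content = (await page.textContent('body')) ?? '';
    console.log(content.substring(0, 2000));
  } finally {
    await browser.close(); // always release the browser, even on error
  }
})().catch((err) => { console.error(err.message); process.exit(1); });
"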

Guidelines

  • Always close the browser after use to free resources
  • Use specific selectors (id, data-testid) over generic ones when possible
  • Set reasonable timeouts for page loads and element waits
  • For forms, fill all required fields before submitting
  • Respect robots.txt and rate limits
  • Report any errors clearly (page not found, element not found, timeout)
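Several of these guidelines can be combined in one helper. This is a sketch under assumptions, not a prescribed implementation: the function name readElement and the 10-second default are hypothetical, and it relies on Playwright rejecting with an error named TimeoutError when a page load or element wait runs out of time. A missing element is therefore reported when the wait for it times out.

```javascript
// Sketch: explicit timeouts plus clear error messages. Names are hypothetical.
async function readElement(page, url, selector, timeoutMs = 10000) {
  let response;
  try {
    response = await page.goto(url, { timeout: timeoutMs });
  } catch (err) {
    if (err.name === 'TimeoutError') {
      throw new Error(`Timeout loading ${url} after ${timeoutMs} ms`);
    }
    throw err; // anything else (DNS failure, bad URL) is passed through
  }
  // goto can resolve with a response even for HTTP error statuses.
  if (response && response.status() === 404) {
    throw new Error(`Page not found: ${url}`);
  }
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    if (err.name === 'TimeoutError') {
      throw new Error(`Element not found: ${selector}`);
    }
    throw err;
  }
  return page.textContent(selector);
}
```

A call such as readElement(page, url, '[data-testid="price"]') follows the selector guideline: a data-testid attribute tends to survive layout changes better than a generic tag or class selector.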

Common Actions

  • Login: Navigate to login page, fill credentials, submit form
  • Data extraction: Open page, wait for content, extract structured data
  • Form submission: Navigate to form, fill fields, click submit, verify success
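The login and data extraction flows might look like the sketch below. It assumes a Playwright page object is passed in; the function names, all selectors (#username, #password, the account-menu test id, the table layout), and the 10-second wait are hypothetical and depend entirely on the target site.

```javascript
// Sketch of two common actions. All selectors are hypothetical.

// Login: navigate to the login page, fill credentials, submit, verify.
async function login(page, loginUrl, username, password) {
  await page.goto(loginUrl);
  await page.fill('#username', username);
  await page.fill('#password', password);
  await page.click('button[type="submit"]');
  // Treat the appearance of a post-login element as proof of success.
  await page.waitForSelector('[data-testid="account-menu"]', { timeout: 10000 });
}

// Data extraction: open the page, wait for content, return structured rows.
async function extractRows(page, url) {
  await page.goto(url);
  await page.waitForSelector('table tbody tr');
  // The callback runs inside the browser and returns one array per table row.
  return page.$$eval('table tbody tr', (rows) =>
    rows.map((row) =>
      Array.from(row.querySelectorAll('td'), (td) => td.textContent.trim())
    )
  );
}
```

Form submission follows the same shape as login: fill the required fields, click submit, then wait for an element that only appears on success.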