ScrapeSpace

AI Agent

How ScrapeSpace's AI agent controls a browser to automate tasks.

Overview

The AI agent is the core of ScrapeSpace. It controls a real cloud browser using screenshots and DOM analysis, browsing the same way a human would, but faster and more precisely.

What the agent can do

Navigation

  • Navigate to any URL, go back, reload, and switch between browser tabs
  • Switch into iframes to interact with embedded content
  • Wait for elements or navigation to complete before continuing
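
As an illustration of the "wait before continuing" behavior, here is a minimal Python sketch of a polling wait with a timeout. It is not ScrapeSpace's implementation or API: the `wait_for` helper and the `probe` callback are hypothetical names, and the probe stands in for whatever check (element present, navigation finished) the agent performs.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(probe: Callable[[], Optional[T]],
             timeout: float = 5.0,
             interval: float = 0.1) -> T:
    """Call `probe` repeatedly until it returns a non-None value.

    Raises TimeoutError if `timeout` seconds pass without a result.
    (Hypothetical helper, for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated page: the element only "appears" on the third check.
    checks = {"count": 0}

    def find_submit_button() -> Optional[str]:
        checks["count"] += 1
        return "submit-button" if checks["count"] >= 3 else None

    print(wait_for(find_submit_button, timeout=1.0, interval=0.01))
```

Polling against a deadline, rather than sleeping for a fixed time, lets the caller continue as soon as the page is ready and fail clearly when it never is.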

Interaction

  • Click, double-click, and hover over any element
  • Type into inputs, search boxes, and forms
  • Select options from dropdowns
  • Press keyboard keys and shortcuts (Enter, Tab, Escape, Ctrl+A, etc.)
  • Scroll pages, or scroll directly to a specific element
  • Expand collapsible sections and accordions
  • Low-level mouse and keyboard control for elements that resist normal interaction

Data extraction

  • Extract text or HTML attributes (href, src, etc.) from individual elements
  • Extract structured data from repeated lists (tables, cards, search results)
  • Run JavaScript directly on the page for complex or bulk extraction
  • Make HTTP requests to a site's API endpoints
  • Handle pagination — next buttons, infinite scroll, page numbers
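
The pagination handling listed above can be pictured as a loop that collects records page by page until no "next" link remains. The Python sketch below is an assumption-laden illustration, not ScrapeSpace code: `paginate`, `fetch_page`, and the in-memory `FAKE_SITE` are hypothetical, and a real run would read each page from the browser instead.

```python
from typing import Callable, Dict, Iterator, List, Optional, Set, Tuple

# One page of results: (records on the page, URL of the next page or None).
Page = Tuple[List[dict], Optional[str]]


def paginate(fetch_page: Callable[[str], Page],
             start_url: str,
             max_pages: int = 100) -> Iterator[dict]:
    """Yield records from every page, following "next" links.

    Stops when there is no next page, when a URL repeats (a loop),
    or after `max_pages` pages. (Hypothetical helper.)
    """
    url: Optional[str] = start_url
    seen: Set[str] = set()
    pages = 0
    while url is not None and url not in seen and pages < max_pages:
        seen.add(url)
        records, url = fetch_page(url)
        yield from records
        pages += 1


# Stand-in for a paginated listing; a real fetcher would load the page.
FAKE_SITE: Dict[str, Page] = {
    "/items?page=1": ([{"name": "A"}, {"name": "B"}], "/items?page=2"),
    "/items?page=2": ([{"name": "C"}], None),
}

if __name__ == "__main__":
    for record in paginate(FAKE_SITE.__getitem__, "/items?page=1"):
        print(record)
```

The `seen` set and `max_pages` cap are safety limits: some sites link the last page back to the first, and infinite scroll may never signal an end on its own.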

Authentication & security

  • Log in to websites using your stored credentials
  • Generate one-time passwords (TOTP) for two-factor authentication
  • Solve CAPTCHAs automatically — Cloudflare Turnstile, hCaptcha, reCAPTCHA v2/v3, FunCaptcha, GeeTest v3/v4, Amazon WAF, and MTCaptcha
  • Save and load cookies to maintain sessions
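
For readers curious how TOTP codes are derived, the sketch below implements the standard algorithm from RFC 6238 (HMAC-SHA1, 30-second steps) using only the Python standard library. It shows the general technique, not ScrapeSpace's internal code; the `totp` function name and its parameters are chosen here for illustration.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str,
         at: Optional[float] = None,
         step: int = 30,
         digits: int = 6) -> str:
    """Return the RFC 6238 TOTP code for a base32-encoded shared secret."""
    # Authenticator secrets are often shown with spaces and without padding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of `step`-second intervals since the epoch.
    now = time.time() if at is None else at
    counter = int(now // step)

    # HOTP (RFC 4226): HMAC the counter, then apply dynamic truncation.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % (10 ** digits)).zfill(digits)


if __name__ == "__main__":
    # Test secret from RFC 6238, Appendix B (ASCII "12345678901234567890").
    secret = base64.b32encode(b"12345678901234567890").decode()
    print(totp(secret, at=59))  # prints 287082
    print(totp(secret))         # code for the current time
```

Because the code depends only on the shared secret and the current time, it can be generated at the moment of login without access to a phone or authenticator app.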

What the agent cannot do

The agent controls a real browser, so it can do anything you could do manually. However, it will reject prompts that have nothing to do with browser automation:

  • Math problems, trivia, or general knowledge questions
  • Writing essays, emails, or other text
  • Anything that doesn't involve a website

Live activity

While the agent is working, you can watch its progress in real time in the activity log. Each entry records what the agent did and why, so you can follow its decision-making.