docs: new cloud ui workflows + new captcha page (#4698)

Co-authored-by: Ritik Sahni <ritiksahni0203@gmail.com>
Co-authored-by: Suchintan <suchintan@users.noreply.github.com>
Naman
2026-02-11 18:15:37 +05:30
committed by GitHub
parent 0c7b18cf56
commit 5e9c1f4f68
32 changed files with 1324 additions and 106 deletions


@@ -1,16 +1,14 @@
---
title: CAPTCHA & Bot Detection
subtitle: How Skyvern detects, solves, and avoids CAPTCHAs and anti-bot systems
subtitle: How Skyvern handles CAPTCHAs and avoids triggering anti-bot systems
slug: going-to-production/captcha-bot-detection
---
Websites use CAPTCHAs and bot detection to block automated traffic. Skyvern handles both — it detects CAPTCHAs using vision, solves them automatically on Skyvern Cloud, and configures browsers to avoid triggering anti-bot systems in the first place.
Skyvern detects CAPTCHAs using its vision model and solves them automatically.
---
However, CAPTCHA solving is not guaranteed. Skyvern's solvers can fail on novel challenges or rate-limited IPs.
## CAPTCHA detection
Skyvern detects CAPTCHAs using its LLM vision model. During each step, the AI analyzes the page screenshot and identifies whether a CAPTCHA is present and what type it is.
This guide helps you detect CAPTCHA failures and implement fallbacks to keep your automation running smoothly.
**Supported CAPTCHA types:**
@@ -24,17 +22,11 @@ Skyvern detects CAPTCHAs using its LLM vision model. During each step, the AI an
| `TEXT_CAPTCHA` | Distorted text/number images with an input field |
| `OTHER` | Unrecognized CAPTCHA types |
When the AI detects a CAPTCHA, it emits a `SOLVE_CAPTCHA` action with the identified `captcha_type`. What happens next depends on your deployment.
---
## CAPTCHA solving
## Use [`error_code_mapping`](/going-to-production/error-handling#step-3-use-error_code_mapping)
### Skyvern Cloud (automatic)
On [Skyvern Cloud](https://app.skyvern.com), CAPTCHAs are solved automatically. When the AI detects a CAPTCHA, Skyvern's solver service handles it in the background. No configuration needed — it works out of the box for all supported CAPTCHA types.
If the solver cannot resolve the CAPTCHA (rare edge cases or novel CAPTCHA types), the task continues with a `SOLVE_CAPTCHA` action failure. Handle this with [error code mapping](/going-to-production/error-handling):
When a CAPTCHA blocks your automation, this gives you a consistent error code instead of parsing free-text failure messages.
<CodeGroup>
```python Python
@@ -42,7 +34,7 @@ result = await client.run_task(
prompt="Submit the application form. COMPLETE when you see confirmation.",
url="https://example.com/apply",
error_code_mapping={
"captcha_failed": "Return this if a CAPTCHA blocks progress after multiple attempts",
"cant_solve_captcha": "return this error if captcha isn't solved after multiple retries",
},
)
```
@@ -53,7 +45,7 @@ const result = await client.runTask({
prompt: "Submit the application form. COMPLETE when you see confirmation.",
url: "https://example.com/apply",
error_code_mapping: {
captcha_failed: "Return this if a CAPTCHA blocks progress after multiple attempts",
cant_solve_captcha: "return this error if captcha isn't solved after multiple retries",
},
},
});
@@ -67,111 +59,64 @@ curl -X POST "https://api.skyvern.com/v1/run/tasks" \
"prompt": "Submit the application form. COMPLETE when you see confirmation.",
"url": "https://example.com/apply",
"error_code_mapping": {
"captcha_failed": "Return this if a CAPTCHA blocks progress after multiple attempts"
"cant_solve_captcha": "return this error if captcha isn't solved after multiple retries"
}
}'
```
</CodeGroup>
### Human Interaction block (workflows)
When a CAPTCHA blocks the run, the response includes `output.error = "cant_solve_captcha"`. You can branch your logic on this code.
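A minimal sketch of that branching follows. It assumes the run's `output` is available as a plain dictionary with an optional `error` key, as described above. The function name `next_action` and the three action labels are hypothetical, chosen only for illustration, and are not part of the Skyvern SDK.

```python
from typing import Any, Mapping, Optional

# Error code taken from the error_code_mapping example above.
CAPTCHA_ERROR_CODE = "cant_solve_captcha"


def next_action(output: Optional[Mapping[str, Any]]) -> str:
    """Decide what to do with a finished run based on its output.

    Assumes `output` is a dict-like object whose optional "error" key
    carries the mapped error code. The returned labels are illustrative.
    """
    error = (output or {}).get("error")
    if error == CAPTCHA_ERROR_CODE:
        # A CAPTCHA blocked the run: hand off to a person or retry later.
        return "escalate_to_human"
    if error:
        # Some other mapped error code was returned.
        return "investigate"
    # No error code: treat the run as unblocked.
    return "continue"


if __name__ == "__main__":
    print(next_action({"error": "cant_solve_captcha"}))
```

Keeping the decision in a small pure function means the fallback logic can be unit-tested without starting a real run.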
For workflows where you want a human to solve the CAPTCHA manually, use the **Human Interaction** block. The workflow pauses and notifies you, then resumes after you solve it.
<Info>
For more on error code mapping and handling failures programmatically, see [Error Handling](/going-to-production/error-handling).
</Info>
```yaml
blocks:
- block_type: navigation
label: fill_form
url: https://example.com/form
navigation_goal: Fill out the registration form
## Take Control from the browser stream
- block_type: human_interaction
label: solve_captcha
message: "Please solve the CAPTCHA and click Continue"
In the Cloud UI, go to Runs, open *your run*, then click the "Take Control" button over the browser stream.
- block_type: navigation
label: submit_form
navigation_goal: Click the submit button
```
This lets you manually solve CAPTCHAs.
<img src="/images/take-control.png" />
Release control when you're done and the agent resumes from where you left off.
---
## Bot detection avoidance
## Built-in bot detection avoidance
Skyvern automatically configures the browser to reduce bot detection triggers. These protections apply to every run — no configuration needed.
Skyvern automatically configures the browser to reduce bot detection triggers. These protections apply to every run. No configuration needed.
### Browser fingerprinting
Skyvern launches Chromium with settings that remove common automation signals:
- **`AutomationControlled` disabled** — Removes the Blink feature flag that marks the browser as automated
- **`navigator.webdriver` hidden** — The `enable-automation` flag is suppressed so JavaScript detection scripts don't see a webdriver
- **Viewport and user agent** — Set to match real consumer browsers
- **Locale and timezone** — Automatically matched to the proxy location (see [Proxy & Geolocation](/going-to-production/proxy-geolocation))
### Reducing detection risk
Beyond fingerprinting, how you structure your automation affects detection. Three patterns that help:
**1. Use residential proxies for sensitive sites.** Datacenter IPs are the most common bot signal. Residential proxies route through real ISP addresses. See [Proxy & Geolocation](/going-to-production/proxy-geolocation).
**2. Reuse browser sessions for multi-step flows.** Creating a fresh browser for every step looks suspicious. A persistent session maintains cookies, cache, and history — appearing as a returning user. See [Browser Sessions](/optimization/browser-sessions).
**3. Use browser profiles for repeat visits.** Profiles save browser state from a previous session. Starting with an existing profile means the site sees a known browser with familiar cookies, not a blank slate. See [Browser Profiles](/optimization/browser-profiles).
- **`AutomationControlled` disabled**: Removes the Blink feature flag that marks the browser as automated
- **`navigator.webdriver` hidden**: The `enable-automation` flag is suppressed so JavaScript detection scripts don't see a webdriver
- **Viewport and user agent**: Set to match real consumer browsers
- **Locale and timezone**: Automatically matched to the proxy location (see [Proxy & Geolocation](/going-to-production/proxy-geolocation))
---
## Self-hosted deployment
## Reducing detection risk
<Note>
The sections above apply to Skyvern Cloud. If you're running Skyvern locally, the following differences apply.
</Note>
Beyond fingerprinting, how you structure your automation affects detection:
### CAPTCHA solving
- **Use residential proxies for sensitive sites**: Datacenter IPs are the most common bot signal. Residential proxies route through real ISP addresses. Set `proxy_location="RESIDENTIAL"` or use `RESIDENTIAL_ISP` for static IPs.
- **Reuse browser sessions for multi-step flows**: Creating a fresh browser for every step looks suspicious. A persistent session maintains cookies, cache, and history. See [Browser Sessions](/optimization/browser-sessions).
- **Use browser profiles for repeat visits**: Profiles save browser state from a previous session. The site sees a known browser with familiar cookies instead of a blank slate. See [Browser Profiles](/optimization/browser-profiles).
- **Add wait blocks between rapid actions**: Instant actions can trigger behavioral detection. A short pause between steps looks more human.
The open-source version does **not** include automatic CAPTCHA solving. When a CAPTCHA is detected, the agent pauses for 30 seconds to allow manual intervention (e.g., solving it in the browser window yourself), then continues.
### If you get blocked
To handle CAPTCHAs in self-hosted workflows, use the Human Interaction block as described above.
### Browser extensions
Self-hosted deployments can load Chrome extensions for additional stealth or functionality:
```bash
# .env
EXTENSIONS=extension1,extension2
EXTENSIONS_BASE_PATH=/path/to/extensions
```
Extensions are loaded automatically when the browser launches.
### Proxies
Self-hosted deployments need their own proxy infrastructure. The `proxy_location` parameter is not available — configure proxies at the network level or via environment variables.
- **Increase `max_steps`**: Some bot challenges (like Cloudflare) loop through multiple verification pages. More steps give the solver more attempts.
- **Switch to a residential proxy**: `RESIDENTIAL_ISP` provides a static IP that services are more likely to trust.
- **Use a browser profile that previously passed**: If a profile has already cleared a Cloudflare challenge on a domain, it's more likely to pass again.
- **Load Chrome extensions**: Extensions can add additional stealth capabilities. Set `EXTENSIONS` and `EXTENSIONS_BASE_PATH` in your environment.
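The first two suggestions in the list above (more steps, a static residential IP) can be sketched as a helper that derives retry options from the options of a blocked run. This is an illustration under assumptions, not Skyvern's own retry logic. The helper name `escalate_run_options`, the `step_multiplier` argument, and the fallback value for a missing `max_steps` are hypothetical. Only the `max_steps` and `proxy_location` names and the `RESIDENTIAL_ISP` value come from this page.

```python
from typing import Any, Dict, Mapping


def escalate_run_options(
    options: Mapping[str, Any],
    step_multiplier: int = 2,
    fallback_max_steps: int = 10,  # placeholder, not a documented default
) -> Dict[str, Any]:
    """Return a copy of `options` adjusted for a retry after a block.

    - Multiplies `max_steps` so looping challenges get more attempts.
    - Switches `proxy_location` to the static residential option.
    The input mapping is left unchanged.
    """
    retry = dict(options)
    base_steps = int(options.get("max_steps", fallback_max_steps))
    retry["max_steps"] = base_steps * step_multiplier
    retry["proxy_location"] = "RESIDENTIAL_ISP"
    return retry


if __name__ == "__main__":
    first_try = {"url": "https://example.com/apply", "max_steps": 10}
    print(escalate_run_options(first_try))
```

The returned dictionary is meant to be passed as keyword arguments to whatever call started the original run. Check the parameter names against the SDK version in use before relying on them.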
---
## Troubleshooting
## Self-hosted deployments
### CAPTCHA blocks the run
Automatic CAPTCHA solving is not available for self-hosted deployments.
**On Skyvern Cloud:** This is rare. If it happens, the CAPTCHA type may be unsupported or the site changed its challenge. Add an `error_code_mapping` entry to detect the failure, and contact [support@skyvern.com](mailto:support@skyvern.com).
**Self-hosted:** Use a Human Interaction block, or solve it manually within the 30-second window.
### Bot detection triggered (access denied)
1. Switch to a residential proxy — `proxy_location="RESIDENTIAL"` or `RESIDENTIAL_ISP` for static IPs
2. Reuse a browser session instead of creating fresh browsers
3. Use a browser profile with existing cookies
4. Add `wait` blocks between rapid actions to reduce behavioral signals
### Cloudflare challenge page loops
Cloudflare sometimes loops through multiple challenges. If a task gets stuck:
- Increase `max_steps` to give the solver more attempts
- Use `RESIDENTIAL_ISP` for a static IP that Cloudflare is more likely to trust
- Use a browser profile that has previously passed the Cloudflare challenge on that domain
Instead, when a CAPTCHA is detected, the agent pauses for 30 seconds to allow manual intervention. Solve it in the browser window yourself within that window, and the agent then continues automatically.
---