docs: new cloud ui workflows + new captcha page (#4698)
Co-authored-by: Ritik Sahni <ritiksahni0203@gmail.com> Co-authored-by: Suchintan <suchintan@users.noreply.github.com>
@@ -1,16 +1,14 @@
---
title: CAPTCHA & Bot Detection
subtitle: How Skyvern detects, solves, and avoids CAPTCHAs and anti-bot systems
subtitle: How Skyvern handles CAPTCHAs and avoids triggering anti-bot systems
slug: going-to-production/captcha-bot-detection
---

Websites use CAPTCHAs and bot detection to block automated traffic. Skyvern handles both — it detects CAPTCHAs using vision, solves them automatically on Skyvern Cloud, and configures browsers to avoid triggering anti-bot systems in the first place.
Skyvern detects CAPTCHAs using its vision model and solves them automatically.

---
However, CAPTCHA solving is not guaranteed. Skyvern's solvers can fail on novel challenges or rate-limited IPs.

## CAPTCHA detection

Skyvern detects CAPTCHAs using its LLM vision model. During each step, the AI analyzes the page screenshot and identifies whether a CAPTCHA is present and what type it is.
This guide helps you detect CAPTCHA failures and implement fallbacks to keep your automation running smoothly.

**Supported CAPTCHA types:**

@@ -24,17 +22,11 @@ Skyvern detects CAPTCHAs using its LLM vision model. During each step, the AI an
| `TEXT_CAPTCHA` | Distorted text/number images with an input field |
| `OTHER` | Unrecognized CAPTCHA types |

When the AI detects a CAPTCHA, it emits a `SOLVE_CAPTCHA` action with the identified `captcha_type`. What happens next depends on your deployment.

---

## CAPTCHA solving
## Use [`error_code_mapping`](/going-to-production/error-handling#step-3-use-error_code_mapping)

### Skyvern Cloud (automatic)

On [Skyvern Cloud](https://app.skyvern.com), CAPTCHAs are solved automatically. When the AI detects a CAPTCHA, Skyvern's solver service handles it in the background. No configuration needed — it works out of the box for all supported CAPTCHA types.

If the solver cannot resolve the CAPTCHA (rare edge cases or novel CAPTCHA types), the task continues with a `SOLVE_CAPTCHA` action failure. Handle this with [error code mapping](/going-to-production/error-handling):
When a CAPTCHA blocks your automation, this gives you a consistent error code instead of parsing free-text failure messages.

<CodeGroup>
```python Python
@@ -42,7 +34,7 @@ result = await client.run_task(
    prompt="Submit the application form. COMPLETE when you see confirmation.",
    url="https://example.com/apply",
    error_code_mapping={
        "captcha_failed": "Return this if a CAPTCHA blocks progress after multiple attempts",
        "cant_solve_captcha": "return this error if captcha isn't solved after multiple retries",
    },
)
```
@@ -53,7 +45,7 @@ const result = await client.runTask({
    prompt: "Submit the application form. COMPLETE when you see confirmation.",
    url: "https://example.com/apply",
    error_code_mapping: {
      captcha_failed: "Return this if a CAPTCHA blocks progress after multiple attempts",
      cant_solve_captcha: "return this error if captcha isn't solved after multiple retries",
    },
  },
});
@@ -67,111 +59,64 @@ curl -X POST "https://api.skyvern.com/v1/run/tasks" \
    "prompt": "Submit the application form. COMPLETE when you see confirmation.",
    "url": "https://example.com/apply",
    "error_code_mapping": {
      "captcha_failed": "Return this if a CAPTCHA blocks progress after multiple attempts"
      "cant_solve_captcha": "return this error if captcha isn't solved after multiple retries"
    }
  }'
```
</CodeGroup>

### Human Interaction block (workflows)
When a CAPTCHA blocks the run, the response includes `output.error = "cant_solve_captcha"`. You can branch your logic on this code.

For workflows where you want a human to solve the CAPTCHA manually, use the **Human Interaction** block. The workflow pauses and notifies you, then resumes after you solve it.
<Info>
For more on error code mapping and handling failures programmatically, see [Error Handling](/going-to-production/error-handling).
</Info>

```yaml
blocks:
  - block_type: navigation
    label: fill_form
    url: https://example.com/form
    navigation_goal: Fill out the registration form
## Take Control from the browser stream

  - block_type: human_interaction
    label: solve_captcha
    message: "Please solve the CAPTCHA and click Continue"
In the Cloud UI, head to Runs > Click on *your run* > Click on the "Take Control" button over the browser stream.

  - block_type: navigation
    label: submit_form
    navigation_goal: Click the submit button
```
This lets you manually solve CAPTCHAs.

<img src="/images/take-control.png" />

Release control when you're done and the agent resumes from where you left off.
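
The two fallbacks described above (a mapped error code and manual takeover) can be combined: branch on the mapped code and route blocked runs to a person. This is a minimal sketch, assuming the run result has already been converted to a plain dict carrying the `output.error` field mentioned above. The helper names and the returned labels are hypothetical and not part of the Skyvern SDK.

```python
from typing import Any, Mapping, Optional

# Must match the key passed in error_code_mapping when the task was started.
CAPTCHA_ERROR_CODE = "cant_solve_captcha"


def extract_error_code(result: Mapping[str, Any]) -> Optional[str]:
    """Return output.error from a run result, or None if it is absent.

    Assumes a dict shaped like {"output": {"error": "..."}} (an assumption).
    """
    output = result.get("output")
    if isinstance(output, Mapping):
        error = output.get("error")
        return error if isinstance(error, str) else None
    return None


def next_step(result: Mapping[str, Any]) -> str:
    """Decide what to do with a finished run (labels are illustrative)."""
    error = extract_error_code(result)
    if error == CAPTCHA_ERROR_CODE:
        return "manual_takeover"  # e.g. notify someone to use Take Control
    if error is not None:
        return "investigate"  # some other mapped error code
    return "done"
```

Keeping the decision in a small pure function makes it easy to test without starting a browser run.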

---

## Bot detection avoidance
## Built-in bot detection avoidance

Skyvern automatically configures the browser to reduce bot detection triggers. These protections apply to every run — no configuration needed.
Skyvern automatically configures the browser to reduce bot detection triggers. These protections apply to every run. No configuration needed.

### Browser fingerprinting

Skyvern launches Chromium with settings that remove common automation signals:

- **`AutomationControlled` disabled** — Removes the Blink feature flag that marks the browser as automated
- **`navigator.webdriver` hidden** — The `enable-automation` flag is suppressed so JavaScript detection scripts don't see a webdriver
- **Viewport and user agent** — Set to match real consumer browsers
- **Locale and timezone** — Automatically matched to the proxy location (see [Proxy & Geolocation](/going-to-production/proxy-geolocation))

### Reducing detection risk

Beyond fingerprinting, how you structure your automation affects detection. Three patterns that help:

**1. Use residential proxies for sensitive sites.** Datacenter IPs are the most common bot signal. Residential proxies route through real ISP addresses. See [Proxy & Geolocation](/going-to-production/proxy-geolocation).

**2. Reuse browser sessions for multi-step flows.** Creating a fresh browser for every step looks suspicious. A persistent session maintains cookies, cache, and history — appearing as a returning user. See [Browser Sessions](/optimization/browser-sessions).

**3. Use browser profiles for repeat visits.** Profiles save browser state from a previous session. Starting with an existing profile means the site sees a known browser with familiar cookies, not a blank slate. See [Browser Profiles](/optimization/browser-profiles).
- **`AutomationControlled` disabled**: Removes the Blink feature flag that marks the browser as automated
- **`navigator.webdriver` hidden**: The `enable-automation` flag is suppressed so JavaScript detection scripts don't see a webdriver
- **Viewport and user agent**: Set to match real consumer browsers
- **Locale and timezone**: Automatically matched to the proxy location (see [Proxy & Geolocation](/going-to-production/proxy-geolocation))

---

## Self-hosted deployment
## Reducing detection risk

<Note>
The sections above apply to Skyvern Cloud. If you're running Skyvern locally, the following differences apply.
</Note>
Beyond fingerprinting, how you structure your automation affects detection:

### CAPTCHA solving
- **Use residential proxies for sensitive sites**: Datacenter IPs are the most common bot signal. Residential proxies route through real ISP addresses. Set `proxy_location="RESIDENTIAL"` or use `RESIDENTIAL_ISP` for static IPs.
- **Reuse browser sessions for multi-step flows**: Creating a fresh browser for every step looks suspicious. A persistent session maintains cookies, cache, and history. See [Browser Sessions](/optimization/browser-sessions).
- **Use browser profiles for repeat visits**: Profiles save browser state from a previous session. The site sees a known browser with familiar cookies instead of a blank slate. See [Browser Profiles](/optimization/browser-profiles).
- **Add wait blocks between rapid actions**: Instant actions can trigger behavioral detection. A short pause between steps looks more human.

The open-source version does **not** include automatic CAPTCHA solving. When a CAPTCHA is detected, the agent pauses for 30 seconds to allow manual intervention (e.g., solving it in the browser window yourself), then continues.
### If you get blocked

To handle CAPTCHAs in self-hosted workflows, use the Human Interaction block as described above.

### Browser extensions

Self-hosted deployments can load Chrome extensions for additional stealth or functionality:

```bash
# .env
EXTENSIONS=extension1,extension2
EXTENSIONS_BASE_PATH=/path/to/extensions
```

Extensions are loaded automatically when the browser launches.

### Proxies

Self-hosted deployments need their own proxy infrastructure. The `proxy_location` parameter is not available — configure proxies at the network level or via environment variables.
- **Increase `max_steps`**: Some bot challenges (like Cloudflare) loop through multiple verification pages. More steps give the solver more attempts.
- **Switch to a residential proxy**: `RESIDENTIAL_ISP` provides a static IP that services are more likely to trust.
- **Use a browser profile that previously passed**: If a profile has already cleared a Cloudflare challenge on a domain, it's more likely to pass again.
- **Load Chrome extensions**: Extensions can add additional stealth capabilities. Set `EXTENSIONS` and `EXTENSIONS_BASE_PATH` in your environment.
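
The escalation steps in this list can be applied one at a time across retries. This is a rough sketch, assuming `max_steps` and `proxy_location` are accepted as keyword arguments by `run_task`. The helper `escalate`, the ordering of the steps, and the step budget value are illustrative assumptions, not Skyvern defaults.

```python
from typing import Any, Dict

BASE_PARAMS: Dict[str, Any] = {
    "prompt": "Submit the application form. COMPLETE when you see confirmation.",
    "url": "https://example.com/apply",
}


def escalate(params: Dict[str, Any], attempt: int) -> Dict[str, Any]:
    """Return a copy of params hardened for the given retry attempt (0-based).

    attempt 0:  unchanged
    attempt 1:  switch to a static residential IP
    attempt 2+: also raise the step budget for looping challenge pages
    """
    updated = dict(params)  # never mutate the caller's dict
    if attempt >= 1:
        updated["proxy_location"] = "RESIDENTIAL_ISP"
    if attempt >= 2:
        updated["max_steps"] = 50  # illustrative value, tune per site
    return updated
```

Under those assumptions, each retry would call something like `client.run_task(**escalate(BASE_PARAMS, attempt))`.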

---

## Troubleshooting
## Self-hosted deployments

### CAPTCHA blocks the run
Automatic CAPTCHA solving is not available for self-hosted deployments.

**On Skyvern Cloud:** This is rare. If it happens, the CAPTCHA type may be unsupported or the site changed its challenge. Add an `error_code_mapping` entry to detect the failure, and contact [support@skyvern.com](mailto:support@skyvern.com).

**Self-hosted:** Use a Human Interaction block, or solve it manually within the 30-second window.

### Bot detection triggered (access denied)

1. Switch to a residential proxy — `proxy_location="RESIDENTIAL"` or `RESIDENTIAL_ISP` for static IPs
2. Reuse a browser session instead of creating fresh browsers
3. Use a browser profile with existing cookies
4. Add `wait` blocks between rapid actions to reduce behavioral signals

### Cloudflare challenge page loops

Cloudflare sometimes loops through multiple challenges. If a task gets stuck:

- Increase `max_steps` to give the solver more attempts
- Use `RESIDENTIAL_ISP` for a static IP that Cloudflare is more likely to trust
- Use a browser profile that has previously passed the Cloudflare challenge on that domain
Instead, when a CAPTCHA is detected, the agent pauses for 30 seconds to allow manual intervention. Solve it in the browser window yourself, then the agent continues automatically.

---