REST API
The Syttra API is plain HTTP + JSON. Async-first: create a job, poll status, download results. All timestamps are ISO 8601 UTC.
Base URL: https://api.syttra.com · Auth header: Authorization: Bearer sk_live_...
Crawl modes
Every job picks one of three modes. Each has a dedicated guide with worked examples:
- single: Crawl exactly one URL. The fastest mode, good for quick lookups, snippet extraction, and recurring checks of one page.
- full: Start at one URL and follow internal links breadth-first (BFS) up to max_depth, bounded by max_pages. Best when you want everything.
- select: Crawl an explicit list of URLs you've picked from the sitemap, so you spend quota only on the pages that are worth it.
Authentication
Every request must include your API key as a Bearer token. Keys are SHA-256 hashed at rest — we never store the plaintext. You can create and revoke keys from your dashboard.
curl https://api.syttra.com/v1/jobs \
  -H "Authorization: Bearer sk_live_abcdef123..." \
  -H "Content-Type: application/json"
Requests without a valid key return 401 Unauthorized. Requests from an over-quota account return 402 Payment Required.
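The same headers can be attached from Python. This is a minimal sketch using only the standard library; `build_request` and `BASE_URL` are illustrative names, not part of any Syttra SDK, and the function only builds the request without sending it.

```python
import urllib.request

# Base URL as shown at the top of this page.
BASE_URL = "https://api.syttra.com"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request carrying the Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_request("/v1/jobs", "sk_live_example")
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` sends it; a 401 or 402 response surfaces there as `urllib.error.HTTPError`, whose `code` attribute holds the status.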
Job lifecycle
Every crawl runs as a Job with one of five states:
- queued — accepted, waiting for a worker
- running — actively crawling
- completed — done, results available
- failed — a non-recoverable error
- cancelled — cancelled by the user
Single-page jobs typically complete in <5 seconds. Full-site crawls scale with max_pages and site responsiveness.
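A client can model the five states as an enum. The sketch below assumes that completed, failed, and cancelled are terminal (the list above implies this but does not state it); `JobStatus` and `is_terminal` are hypothetical names.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: a job never leaves these three states once it reaches one.
TERMINAL = frozenset({JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED})


def is_terminal(status: str) -> bool:
    """Return True if the job will not change state again.

    Raises ValueError for a status string outside the five documented ones.
    """
    return JobStatus(status) in TERMINAL
```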
Get job status
GET /v1/jobs/{id} · 200
Returns the current state of a job, including progress counters and any error details. The response has the same shape regardless of mode.
Example response
{
  "id": "9a47f5c1-...",
  "status": "running",
  "created_at": "2026-04-29T13:42:00Z",
  "started_at": "2026-04-29T13:42:03Z",
  "completed_at": null,
  "pages_discovered": 12,
  "pages_crawled": 7,
  "pages_failed": 0,
  "url": "https://example.com",
  "mode": "full",
  "error": null,
  "links": { "self": "/v1/jobs/...", "result": "/v1/jobs/.../result" }
}
Polling pattern
Poll every 2-5 seconds for small jobs, backing off to 15-30 seconds for large full-site crawls. A WebSocket or SSE streaming endpoint is on the post-launch roadmap.
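One way to implement this pattern is a loop whose delay grows toward a cap. The sketch below starts at 2 seconds and caps at 30, matching the ranges above; the 1.5x growth factor, the one-hour timeout, and the names `poll_job` and `fetch_status` are assumptions made for illustration. `fetch_status` is any callable that returns the parsed JSON of the job status endpoint.

```python
import time
from typing import Callable

# Assumption: these three states are final, so polling can stop.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_job(
    fetch_status: Callable[[], dict],
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    factor: float = 1.5,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state, backing off between calls."""
    delay = initial_delay
    waited = 0.0  # counts time spent sleeping, not time spent in requests
    while True:
        job = fetch_status()
        if job["status"] in TERMINAL_STATES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"job still {job['status']!r} after {waited:.0f}s of waiting")
        sleep(delay)
        waited += delay
        delay = min(delay * factor, max_delay)
```

The `sleep` parameter exists so the loop can be exercised in tests without real waiting.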
Download results
GET /v1/jobs/{id}/result · 200
Streams the scraped content back. Multi-page jobs (full / select) return one document per crawled page, concatenated with --- separators.
Query parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | string | first entry in export_formats | Which format to download if the job produced more than one. |
Results are cached for 24 hours after completion, then permanently deleted. After that the endpoint returns 410 Gone. Filename: result.md for single-page jobs, crawl-{id}.md for multi-page (so the difference is obvious on disk).
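A multi-page download can be split back into per-page documents on the client. The sketch below assumes each separator is a line containing only `---`; the exact separator layout is not specified here, and `split_documents` is a hypothetical helper.

```python
def split_documents(body: str) -> list[str]:
    """Split a multi-page result into one string per crawled page.

    Assumption: pages are separated by a line that contains only '---'.
    """
    docs = []
    current = []
    for line in body.splitlines():
        if line.strip() == "---":
            docs.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    docs.append("\n".join(current).strip())
    # Drop empty chunks, e.g. from a leading or trailing separator.
    return [doc for doc in docs if doc]
```

A page whose own Markdown contains a `---` horizontal rule would be split as well under this assumption, so treat the result as a best effort.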
List your jobs
GET /v1/jobs · 200
Returns a paginated list of the jobs you own, using cursor-based pagination.
Query parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| status | string enum | none | Filter by state: queued, running, completed, failed, cancelled. |
| limit | integer | 20 | Page size. Max 100. |
| cursor | string | none | Opaque cursor from the previous response's next_cursor. |
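Cursor pagination is usually consumed with a loop that feeds each `next_cursor` back into the next request. The sketch below makes two assumptions not confirmed by this page: that each response lists its jobs under a `jobs` key, and that `next_cursor` is null or absent on the last page. `iter_jobs` and `fetch_page` are hypothetical names; `fetch_page` takes a cursor (or None for the first page) and returns the parsed JSON response.

```python
from typing import Callable, Iterator, Optional


def iter_jobs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every job across all pages, following next_cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        # Assumption: the job objects live under a "jobs" key.
        yield from page.get("jobs", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break
```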
Cancel or delete a job
DELETE /v1/jobs/{id} · 204
For a running job, this cancels it. For a completed job, it deletes the cached result immediately. Metadata (ID, timing) is kept for 30 days either way.
No response body. Idempotent — calling on an already-cancelled job still returns 204.
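Because the call is idempotent, a client can safely retry it. A minimal sketch with the standard library follows; `cancel_or_delete_job` is a hypothetical helper, and the `send` parameter is only there so the function can be tested without network access.

```python
import urllib.request


def cancel_or_delete_job(job_id: str, api_key: str, send=urllib.request.urlopen) -> bool:
    """Send DELETE /v1/jobs/{id} and report whether the API answered 204."""
    req = urllib.request.Request(
        f"https://api.syttra.com/v1/jobs/{job_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with send(req) as resp:
        return resp.status == 204
```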
Error envelope
Every error response uses a consistent shape:
{
  "error": {
    "code": "quota_exceeded",
    "message": "You've used 100 / 100 pages this month.",
    "request_id": "req_01HQ..."
  }
}
Common codes
- 400 validation_error: bad request body
- 401 unauthorized: missing or invalid API key
- 402 quota_exceeded: over your monthly page limit
- 403 forbidden: job belongs to a different user
- 404 not_found: job or result does not exist
- 410 gone: result cache expired (after 24h)
- 413 payload_too_large: request body over 64 KiB
- 422 unsafe_url: URL resolves to a private / internal address
- 422 tos_blocked: target site's ToS forbids automated access
- 429 rate_limited: slow down
- 5xx internal_error: it's us, not you. The request_id helps debugging.
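Since every error shares this shape, a client can turn it into a single exception type. The sketch below is one possible approach; `SyttraError` and `parse_error` are hypothetical names, and treating 429 and 5xx as retryable is an assumption rather than documented behaviour.

```python
import json


class SyttraError(Exception):
    """Error built from the API's error envelope."""

    def __init__(self, status, code, message, request_id):
        super().__init__(f"{status} {code}: {message} (request_id={request_id})")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id

    @property
    def retryable(self) -> bool:
        # Assumption: only rate limiting and server-side errors are worth retrying.
        return self.status == 429 or self.status >= 500


def parse_error(status: int, body: bytes) -> SyttraError:
    """Parse an error response body, tolerating malformed or missing JSON."""
    try:
        err = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        err = {}
    if not isinstance(err, dict):
        err = {}
    return SyttraError(
        status,
        err.get("code", "unknown"),
        err.get("message", ""),
        err.get("request_id"),
    )
```

Logging `request_id` alongside the code makes it easier to report a 5xx failure.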
Rate limits
During private beta:
- 60 requests/minute on mutating endpoints (POST, DELETE)
- 120 requests/minute on read endpoints (GET)
- See your usage page for your current monthly page allowance.
Every response carries X-RateLimit-Remaining and X-RateLimit-Reset headers. Rate limits tighten or loosen with your plan at public launch.
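A client can use these headers to pause before it hits a 429. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds; this page does not state the format, so verify against a real response first. `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request; 0.0 if budget remains.

    Assumption: X-RateLimit-Reset holds a Unix timestamp in seconds.
    """
    # Missing headers are treated as "budget remains".
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

A plain dict lookup is case-sensitive, whereas the headers object on a `urllib` response matches names case-insensitively, so passing that object directly is the safer choice.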