What does Warpstack replace in my data stack?

Warpstack replaces your entire data acquisition pipeline: scrapers (Scrapy, Playwright, Puppeteer), orchestration (Airflow, Dagster), storage (Postgres, MongoDB, S3), serving infrastructure (FastAPI, Express, GraphQL), and proxy services (Bright Data, Oxylabs). One platform, one vendor, one bill.

How quickly can I get data pipelines into production?

Most teams ship their first production pipeline in an afternoon. Define your schema in natural language, point to your data sources, and Warpstack handles acquisition, structuring, hosting, serving, and monitoring automatically. No infrastructure to provision.

How does Warpstack handle complex websites and anti-bot measures?

Warpstack Vision Agents read pages like humans, using computer vision instead of brittle CSS selectors. They browse through our global proxy network across 190+ countries and handle JavaScript rendering, Canvas, WebGL, and dynamic content. CAPTCHA challenges cannot yet be solved reliably; see Current Limitations.

Is Warpstack compliant for enterprise use?

Yes. Warpstack provides full audit trails, forensic logging, and respects robots.txt. Enterprise plans include SSO, dedicated infrastructure, custom SLAs, and private cloud deployment options. Users are responsible for ensuring their use cases comply with applicable laws and website terms of service.

What happens when source websites change their layout?

Unlike scrapers that break on HTML changes, Warpstack agents understand semantic intent and adapt automatically. This adaptive capability means no selector maintenance, no broken pipelines, and your API endpoints stay online without manual intervention.

Getting Started

Name: Warpstack
Author: Warpstack

Introduction

Warpstack is the Data Acquisition Cloud — built for alternative datasets. Add new data sources in hours, stop maintaining scrapers, and ship with full audit trails from day one.

The full stack, handled for you:

Acquisition: AI agents connect to any source—websites, APIs, databases—and extract the data you need
Structuring: Raw data is automatically cleaned, validated, and transformed to your schema
Hosting: Your data lives in our cloud with low-latency access and automatic backups
Serving: Production-ready REST APIs with filtering, sorting, and pagination—ready from day one
Monitoring: Complete audit trails, observability, and adaptive handling when sources change

Our AI agents use vision to "see" pages like a human—reading charts, handling dynamic JavaScript, and adapting to layout changes without breaking. But Warpstack isn't just about extraction. It's about replacing your entire data stack with a single platform.

How It Works

From description to production API in minutes:

Describe your data: Tell us what you need in plain English—no code, no selectors
We generate your schema: AI creates a structured OpenAPI schema from your description
Deploy: Click "Deploy Warp" and we handle acquisition, structuring, and hosting
Access your API: Your data is immediately available via a production REST endpoint
Set a schedule: Keep data fresh with automatic hourly, daily, or weekly refreshes

Why Warpstack?

Alternative data vendors spend 30-60% of engineering time maintaining scrapers. Warpstack eliminates that burden:

Ship more datasets: Add new sources in hours, not months—expand coverage faster
Zero maintenance: AI agents adapt when sites change—no more 3am pages
Built-in compliance: Full audit trails for every record—answer "where did this data come from?" in seconds
Production APIs included: Every dataset is a REST API with filtering, sorting, and pagination
Managed hosting: We host your data with automatic backups and retention policies

Your Prompts and Schemas Matter

Quality in, quality out. Vague prompts or incorrect schemas lead to poor results. Like any AI system, Warpstack performs best when you're specific about what you want.

Take time to craft clear prompts and accurate schemas. See Best Practices for guidance.

Current Limitations

Warpstack works best with public data sources. Current limitations:

Login-protected content: Cannot authenticate to gated pages (coming soon)
CAPTCHA challenges: Cannot reliably solve CAPTCHAs
Real-time streaming: Designed for periodic acquisition, not sub-second updates
File processing: Cannot download or parse PDFs, spreadsheets, or documents

Core Concepts

Understanding these key concepts will help you get the most out of Warpstack.

Warp

A warp is an autonomous data extraction and serving job. Each warp is configured with:

Target URL: The web page to extract data from
Prompt: Natural language description of what data to extract
Schema: OpenAPI definition of the data structure
Schedule: How often to run (manual, hourly, daily, weekly)
Intelligence tier: Which AI model powers the extraction

Schema

The schema is the most important part of your warp. It serves two purposes:

API contract: Defines the JSON structure your API returns
Extraction instructions: Field descriptions guide the agent on what to extract

Schemas are written in OpenAPI 3.1.0 format and follow Structured Outputs rules. The quality of your prompt directly determines the quality of your schema, which determines extraction accuracy.

Stargate API

Stargate is the data serving layer. Once your warp extracts data, it's instantly available via your unique API endpoint:

https://{your-warp-id}.stargate.warpstack.ai

Stargate supports SQL-like filtering, sorting, pagination, column selection, and multiple output formats (JSON, CSV). Endpoints are public by default, but can be secured with access tokens.

Warp States

Warps move through different states during their lifecycle:

DRAFT: Being configured. Schema is editable, not yet deployed.
DEPLOYING: Setting up infrastructure for the first extraction.
ACTIVE: Deployed and serving data. Schema is locked.
RUNNING: Currently executing an extraction job.
SCHEDULED: Active with a recurring schedule configured.
PAUSED: Temporarily stopped. API still serves existing data.
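
The lifecycle above can be modeled as a small enumeration. This is a minimal sketch: the state names come from the list, but how Warpstack exposes them programmatically is not documented here, so the `WarpState` class and the `schema_is_editable` helper are hypothetical. The helper assumes the schema stays locked in every state after DRAFT, which matches the DRAFT and ACTIVE descriptions but is not stated for the other states.

```python
from enum import Enum


class WarpState(Enum):
    """Warp lifecycle states as listed in the documentation."""

    DRAFT = "DRAFT"
    DEPLOYING = "DEPLOYING"
    ACTIVE = "ACTIVE"
    RUNNING = "RUNNING"
    SCHEDULED = "SCHEDULED"
    PAUSED = "PAUSED"


def schema_is_editable(state: WarpState) -> bool:
    # Only DRAFT is documented as editable. Treating every later state
    # as locked is an assumption made for this sketch.
    return state is WarpState.DRAFT
```

A client could use a helper like this to disable schema editing controls once a warp leaves DRAFT.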

Intelligence Tiers

Choose the AI model that powers your extraction:

V1-Lite: For simple, straightforward pages.
V1-Base: Recommended. For complex sites with dynamic content.

See Intelligence Tiers for details.

Quickstart

Get your first warp running in under 5 minutes. This guide walks you through the essential steps.

Step 1: Create a Warp

From your dashboard, click "Create Warp". Enter the URL of the page you want to extract data from.

Step 2: Write Your Prompt

Describe what data you want to extract in natural language. Be specific about the fields you need.

Example prompt:

"Extract all pricing plans from this page. For each plan, get the plan name, monthly price as a number, yearly price if shown, and the list of features included."

Step 3: Generate Schema

Click "Generate Schema". The AI will analyze your prompt and create an OpenAPI schema. This takes 10-30 seconds. Review the generated fields to ensure they match your needs.

Step 4: Deploy

Click "Deploy Warp". The first extraction starts immediately, and you can watch the agent work in the Monitor tab.

Step 5: Access Your Data

Once extraction completes, your data is available via the Stargate API:

curl "https://{your-warp-id}.stargate.warpstack.ai/?limit=10"

That's it! Your data is now accessible as a REST API. Set a schedule to keep it fresh, or trigger manual runs when you need updates.
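
For callers who prefer Python over curl, the same request can be sketched with the standard library. The URL pattern and the `limit` parameter come from the example above; the helper names `stargate_url` and `fetch_rows` are hypothetical, and the shape of the returned JSON depends on your schema, so the sketch returns whatever the endpoint sends.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def stargate_url(warp_id: str, limit: int = 10) -> str:
    """Build the endpoint URL used in the curl example."""
    query = urlencode({"limit": limit})
    return f"https://{warp_id}.stargate.warpstack.ai/?{query}"


def fetch_rows(warp_id: str, limit: int = 10):
    """Fetch and decode the JSON response. Performs a network request."""
    with urlopen(stargate_url(warp_id, limit), timeout=30) as response:
        return json.load(response)
```

Calling `fetch_rows("your-warp-id")` assumes the endpoint is public; a token-secured endpoint would need credentials that this sketch does not cover.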

What's Next

Read the full User Guide for detailed step-by-step instructions
Learn about schema design to improve extraction accuracy
Explore Stargate API for filtering, sorting, and pagination

Usage & Billing

Warpstack offers custom enterprise pricing based on your data volume and requirements. Contact our sales team for a tailored quote.

Enterprise Plan

All enterprise agreements include full platform access, dedicated support, custom SLAs, and compliance features.

Full audit trail: Chain of custody logging for every data point.
PII detection & redaction: Automatic detection and redaction at the edge.
Custom SLAs: Uptime guarantees and response time commitments.
Dedicated support: Priority support with dedicated account manager.

Usage Components

Enterprise usage is charged based on three components. Rates are included in your custom enterprise agreement.

Extraction

Token-based usage during extraction. The rate is a blended measure combining input tokens (page content) and output tokens (extracted data).

V1-Lite: For simple, straightforward pages
V1-Base: For complex sites with dynamic content

Expect 20,000+ tokens per simple extraction. Complex pages use more.

Storage

Billed monthly based on total data stored across all warps.

Included: Generous storage allocation in enterprise plans

Serving (Bandwidth)

Based on data transferred when your applications call the Stargate API.

Included: API bandwidth in enterprise plans

How Enterprise Billing Works

Custom agreements: Pricing tailored to your volume and requirements.
Invoiced billing: Monthly invoicing with NET 30 terms.
Usage tracking: Monitor your usage in the dashboard.
Dedicated support: Contact your account manager for any billing inquiries.

Usage Example

Extracting products from an e-commerce site using V1-Base:

Extraction: ~25,000 tokens
Storage: ~1 MB per extraction
Serving: 10 API calls, ~100 KB each
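
The arithmetic behind this example can be sketched as follows. The figures are the approximate ones listed above; the function names are hypothetical, and no prices are computed because rates live in your enterprise agreement. Projecting over a schedule assumes every run uses about the same number of tokens.

```python
def single_run_usage(tokens=25_000, storage_mb=1.0, api_calls=10, kb_per_call=100):
    """Usage quantities for one extraction, using the example figures."""
    return {
        "extraction_tokens": tokens,
        "storage_mb": storage_mb,
        "serving_kb": api_calls * kb_per_call,  # 10 calls x 100 KB
    }


def extraction_tokens_over(days, runs_per_day, tokens_per_run=25_000):
    """Total extraction tokens for a schedule, assuming uniform runs."""
    return days * runs_per_day * tokens_per_run
```

With these figures, serving comes to about 1,000 KB per extraction, a daily schedule uses roughly 750,000 extraction tokens over 30 days, and an hourly schedule uses roughly 18,000,000.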

Request a demo to discuss pricing based on your volume requirements.

Intelligence Tier Comparison

V1-Lite: For simple, straightforward pages.
V1-Base: Recommended. For complex sites with dynamic content.

Start with V1-Base for new warps. See Intelligence Tiers for details.

Account & Authentication

Creating Your Account

Creating a Warpstack account takes less than a minute. Once registered, you'll have immediate access to the platform and can start building your first data acquisition agent right away.

Enterprise Onboarding

Enterprise accounts are provisioned by our sales team. Contact us to get started with custom pricing and dedicated onboarding support.

Getting Started

Request a Demo

Visit warpstack.ai and click "Request a Demo" to see the platform in action and discuss your requirements.

Enter Your Information

Fill out the registration form with your details:

Full Name (Required)

Your display name used in the dashboard and for billing purposes.

Phone Number (Required)

Your mobile phone number for SMS verification. This is your primary verification method.

• Include country code (e.g., +1 for US, +44 for UK)
• Must be a mobile number capable of receiving SMS
• A 6-digit verification code will be sent via SMS

Email Address (Required)

Your primary contact email for login, email verification, and important notifications like extraction failures or billing alerts.

Password (Required)

Create a secure password for your account:

• Minimum 8 characters
• Mix of letters and numbers recommended
• Special characters allowed (!@#$%^&*)

Confirm Password (Required)

Re-enter your password to confirm it matches.

Accept Terms & Create Account

Check the box to agree to our Terms of Service and Privacy Policy, then click "Create Account." Your account will be created immediately and an SMS verification code will be sent.

Verify Your Phone (Step 1)

You'll receive a 6-digit verification code via SMS within seconds. Enter this code to verify your phone number. This is the first of two verification steps.

Didn't receive it? You can request a new code after 2 minutes.

Verify Your Email (Step 2)

After phone verification, check your inbox for a verification email from Warpstack. Click the verification link to complete account setup.

Can't find it? Check your spam folder or use the "Resend Verification" option.

Two-Step Verification

Warpstack uses two-step verification to secure your account:

Step 1: Phone Verification

6-digit SMS code sent via Twilio. Confirms you have access to the mobile number.

Step 2: Email Verification

Secure link sent to email. Confirms email ownership for notifications and recovery.

What Happens After Registration

SMS Code Sent

A 6-digit verification code is sent to your phone immediately. You'll be taken to the phone verification screen to enter this code.

Two-Step Verification

Complete phone verification first (via SMS), then email verification (via link). Both steps are required for full platform access.

Enterprise Account Active

Your enterprise account is provisioned with custom billing. Contact your account manager for usage details and billing inquiries.

Ready to Create

After both verifications are complete, you're ready to create your first warp. Head to the dashboard and start building your data acquisition agent.

Registration Errors

Common issues and how to resolve them:

"Email already exists": An account with this email already exists. Try logging in or use password reset.
"Phone number already exists": This phone number is already associated with another account.
"Invalid phone number": Ensure you've entered a valid mobile number with country code (e.g., +1 for US).
"Password too short": Password must be at least 8 characters long.
"Passwords do not match": Ensure your password confirmation matches exactly.

Phone Verification

Phone verification is the first step of Warpstack's two-step verification process. A 6-digit verification code is sent via SMS to confirm you have access to the mobile number you provided during registration.

Powered by Twilio

Warpstack uses Twilio's Verify API for secure, reliable SMS delivery. Verification codes are automatically generated and validated, ensuring a seamless verification experience.

How Phone Verification Works

SMS Code Sent Automatically

Immediately after creating your account, a 6-digit verification code is sent to your phone via SMS. This typically arrives within seconds.

Enter the 6-Digit Code

Enter the verification code from the SMS into the verification screen. The code is validated automatically as you type.

Phone Verified

Once verified, your phone number is confirmed and you'll proceed to email verification. Your phone verification status is stored securely in your account.

Resending Verification Code

If you didn't receive the SMS or the code expired, you can request a new one:

Rate Limiting

You can request a new code every 2 minutes. This prevents abuse while ensuring you can always get a fresh code if needed. A countdown timer shows when you can resend.

Change Phone Number

Made a typo? You can update your phone number and receive a new code. Previous codes are automatically invalidated when you change your number or request a resend.

Phone Number Format

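Phone numbers must start with + and a country code, following the E.164 international format. The examples below include spaces and punctuation for readability; strict E.164 contains only the + sign and digits.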

✓ Valid Examples

+1 (555) 123-4567 (US)
+44 7911 123456 (UK)
+49 151 12345678 (Germany)
+61 412 345 678 (Australia)

✗ Invalid Examples

555-123-4567 (missing country code)
1234567890 (no + prefix)
+1 (800) 555-1234 (toll-free not supported)
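
A quick way to pre-check a number before submitting it is to strip the formatting characters and test the result against the E.164 shape (a + sign followed by up to 15 digits, the first of which is not zero). This is a sketch, not Warpstack's own validation: the `to_e164` helper is hypothetical, and it only checks the shape, so it cannot tell mobile numbers from toll-free or landline ones.

```python
import re

# E.164: "+" followed by 2 to 15 digits, with no leading zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def to_e164(raw: str):
    """Strip spaces, parentheses, dots and hyphens, then check the shape.

    Returns the compact number, or None if it is not E.164-shaped.
    """
    compact = re.sub(r"[\s().-]", "", raw)
    return compact if E164_PATTERN.fullmatch(compact) else None
```

For example, `to_e164("+44 7911 123456")` yields the compact form `+447911123456`, while `to_e164("555-123-4567")` yields `None` because the country code and + prefix are missing.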

Troubleshooting SMS Issues

SMS not arriving: Ensure your phone has signal and isn't blocking unknown senders. Check that SMS from short codes is enabled.
Wrong phone number: Use the "Change Phone Number" option to update and receive a new code.
Code expired: Verification codes expire after 10 minutes. Request a new one using "Resend Code."
International issues: Ensure your carrier supports international SMS. Some carriers may delay messages.
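
The two timing rules in this section (a new code every 2 minutes, codes valid for 10 minutes) can be expressed as a small sketch, for instance to drive a countdown in a client. The function names are hypothetical, and the actual enforcement happens on Warpstack's side.

```python
from datetime import datetime, timedelta

RESEND_COOLDOWN = timedelta(minutes=2)   # wait between code requests
CODE_LIFETIME = timedelta(minutes=10)    # validity of a sent code


def seconds_until_resend(last_sent: datetime, now: datetime) -> int:
    """Seconds left before another code can be requested (0 if allowed now)."""
    remaining = RESEND_COOLDOWN - (now - last_sent)
    return max(0, int(remaining.total_seconds()))


def code_expired(sent_at: datetime, now: datetime) -> bool:
    """True once more than 10 minutes have passed since the code was sent."""
    return now - sent_at > CODE_LIFETIME
```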

Email Verification

Email verification is the second step of Warpstack's two-step verification process, completed after phone verification. This confirms your email address for notifications, billing, and account recovery.

Why Verification is Required

Account Security

Confirms you own the email address, preventing unauthorized account creation with your email.

Critical Notifications

Enables alerts for extraction failures, schedule changes, and important account updates.

Billing Communications

Required for payment receipts, usage alerts, and subscription management notifications.

Support Access

Provides verified contact for account recovery, support tickets, and security updates.

Verification Process

Check Your Inbox

After registration, a verification email is sent immediately. Look for an email from noreply@warpstack.ai with the subject "Verify Your Email Address."

Click the Verification Link

The email contains a secure, one-time-use verification link. Click it to confirm your email address.

Confirmation

You'll see a success message and receive a confirmation email. Your account now has full access to all Warpstack features.

Resending the Verification Email

If you didn't receive the verification email or the link expired, you can request a new one:

Check spam/junk folder — Verification emails sometimes get filtered.

Log in to your account — You'll see a verification prompt if not yet verified.

Click "Resend Verification Email" — A new verification link will be sent. You can also visit /resend-verification directly.

Token Security

Verification links are single-use and securely hashed. Once used, the same link cannot be clicked again. Each resend generates a fresh token, invalidating any previous links.

Logging In

Access your Warpstack account using your registered email address and password. Sessions expire after 60 minutes of inactivity, after which you must log in again.

How to Log In

Visit the Login Page

Go to warpstack.ai and click "Log In" or navigate directly to app.warpstack.ai/login.

Enter Your Credentials

Provide your registered email address and password. Optionally check "Remember me" to stay logged in longer.

Access Your Dashboard

Upon successful authentication, you'll be redirected to your dashboard where you can view and manage your warps.

Unverified Email Restriction

You can log in with an unverified email, but you'll be prompted to verify before creating warps or accessing most features. A banner will appear at the top of the dashboard with a link to resend verification.

Session Management

Session Duration

Sessions expire after 60 minutes of inactivity. Activity extends your session automatically; once a session expires, you'll need to log in again.

Secure Tokens

Authentication uses JWT tokens stored securely in your browser. Tokens are refreshed automatically during active sessions.

Password Management

Keep your account secure by using a strong password and changing it periodically. Warpstack provides secure password change and reset functionality.

Password Requirements

Minimum 8 characters
Letters and numbers recommended
Special characters allowed (!@#$%^&*)
Avoid passwords that resemble your email or name (suggested)
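
These rules can be pre-checked before submitting the form. The sketch below separates the one hard requirement (8 characters) from the recommendations; the `check_password` helper is hypothetical, and its similarity test is a simple substring comparison rather than whatever Warpstack applies.

```python
def check_password(password: str, email: str = "", name: str = ""):
    """Return (errors, suggestions) for a candidate password."""
    errors, suggestions = [], []

    # Hard requirement: minimum length of 8 characters.
    if len(password) < 8:
        errors.append("Password too short")

    # Recommendation: include both letters and numbers.
    has_letter = any(c.isalpha() for c in password)
    has_digit = any(c.isdigit() for c in password)
    if not (has_letter and has_digit):
        suggestions.append("Mix letters and numbers")

    # Suggestion: avoid containing the email local part or the name.
    lowered = password.lower()
    for personal in (email.split("@")[0].lower(), name.lower()):
        if personal and personal in lowered:
            suggestions.append("Avoid similarity to email or name")
            break

    return errors, suggestions
```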

Changing Your Password

Change your password anytime from the Settings page while logged in:

Go to Settings → Security — Find password options in your account settings.

Enter your current password — This verifies you're the account owner.

Enter and confirm your new password — Make sure both fields match exactly.

Click "Change Password" — Your password is updated immediately. Current sessions remain active.

Forgot Your Password?

If you've forgotten your password, you can reset it via email:

Click "Forgot password?"

On the login page, click the "Forgot password?" link below the password field.

Enter Your Email

Provide the email address associated with your account. For security, you'll receive a success message regardless of whether the email exists in our system.

Check Your Email

If an account exists, you'll receive an email with a secure reset link. This link expires after 30 minutes.

Set Your New Password

Click the link in the email, enter your new password (minimum 8 characters), confirm it, and submit. You'll receive a confirmation email once complete.

Password Reset Security

Password reset tokens are:

• Single-use: Can only be used once, then invalidated
• Time-limited: Expire after 30 minutes
• Securely stored: Only hashed versions are saved, never plain text
• User-specific: Each token is tied to a specific user ID
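
To make these four properties concrete, here is a minimal in-memory sketch of a token store with the same guarantees. It is an illustration, not Warpstack's implementation: the `ResetTokenStore` class is hypothetical, and SHA-256 is an assumed choice, since the documentation only says tokens are hashed.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(minutes=30)


class ResetTokenStore:
    """Keeps only token hashes, each tied to a user ID and an expiry time."""

    def __init__(self):
        self._records = {}  # sha256 hex digest -> (user_id, expires_at)

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: str, now: datetime) -> str:
        token = secrets.token_urlsafe(32)
        # Securely stored: the plain token is returned but never saved.
        self._records[self._digest(token)] = (user_id, now + TOKEN_LIFETIME)
        return token

    def redeem(self, token: str, now: datetime):
        # Single-use: pop removes the record on the first attempt.
        record = self._records.pop(self._digest(token), None)
        if record is None:
            return None
        user_id, expires_at = record
        # Time-limited: reject tokens older than 30 minutes.
        return user_id if now <= expires_at else None
```

Redeeming a token returns the user ID it was issued for, and a second attempt with the same token returns `None`.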

User Guide

Creating Your First Warp

A warp is an autonomous data extraction and serving job that acquires structured data from any website and serves it as a production REST API. The quality of your warp depends entirely on the quality of your schema, which depends on the quality of your prompt. This guide will walk you through each step in detail.

The Quality Chain — Most Failures Start Here

Your prompt → Your schema → Extracted data → Your API

Like any AI tool, Warpstack is only as good as the instructions you give it. A vague prompt produces a generic schema, which produces unreliable data. Most extraction failures trace back to imprecise prompts or misconfigured schemas—not the agent itself. Invest time upfront to be specific.

Step 1: Enter the Target URL

Start by entering the URL of the page you want to extract data from. This should be the specific page containing the data, not just the homepage.

Be specific: Use example.com/products not just example.com
Public pages: The agent cannot log in or bypass authentication
Static content: Works best on pages where data loads with the page, not infinite scroll

Step 2: Write Your Prompt

The prompt is the most important input. It tells the AI what data structure to generate and instructs the browser agent on what to extract. Be as specific and detailed as possible.

What Makes a Good Prompt

A good prompt answers these questions:

What data do you need? List every field explicitly (name, price, description, image URL)
What type is each field? Specify if something should be a number, boolean, or date
Where is the data located? Mention sections of the page (product grid, pricing table, sidebar)
What are the edge cases? Describe variations (items with/without discounts, optional fields)
What should be excluded? Mention what to skip (ads, related products, navigation)

Prompt Examples

Good prompt:

"Extract all products from the catalog page. For each product, get: product name (string), current price as a number (e.g., 29.99), original price if discounted (number, null if not discounted), stock status as a boolean (true if in stock), rating as a number from 0-5, number of reviews (integer), and the product image URL. Ignore the 'Recommended for you' section and any sponsored listings."

Bad prompt:

"Get all the product info"

Too vague. What is "info"? What fields? What types? The AI will guess, and it will likely miss fields you need or include fields you don't.

More Prompt Examples by Use Case

SaaS Pricing:
"Extract pricing plans from the /pricing page. For each plan: name, monthly price (number), yearly price (number), list of features as an array of strings, whether it's marked as 'Popular' or 'Recommended' (boolean), and any limits mentioned (e.g., '10 users', '50GB storage')."
Job Listings:
"Extract job postings from the careers page. Get: job title, department, location (city and country as separate fields), employment type (full-time/part-time/contract as an enum), salary range if shown (min and max as numbers), and the date posted in ISO format."
News Articles:
"Extract articles from the news homepage. For each: headline (string), summary/description (string), author name, publication date as ISO datetime, category/section, and the article URL. Limit to the main news grid, exclude sidebar widgets and advertisements."

Step 3: Generate the Schema

Click "Generate Schema" to create an OpenAPI schema from your prompt. This typically takes 10-30 seconds. You'll see the schema stream in real-time as the AI builds it.

What Happens During Generation

Input Cleaning: Your prompt is refined to remove filler words and clarify intent
Schema Generation: AI creates field names, types, descriptions, and validations
Validation: The schema is validated against OpenAPI 3.1.0 and Structured Outputs rules
Preview Data: Sample data is generated to show what your API will return

If Generation Fails

Schema generation can fail for several reasons:

Validation errors: The schema violated Structured Outputs rules. Try simplifying your prompt.
Ambiguous prompt: The AI couldn't determine what you wanted. Be more specific.
Complex nesting: Too many nested objects. Flatten your data structure.
Retry: Sometimes it's transient — click "Generate" again.

Step 4: Review the Generated Schema

Always review the generated schema before deploying. Check that:

All fields you need are present: The AI may have missed something from your prompt
Field types are correct: Prices should be numbers, not strings. Booleans for yes/no values.
Field names make sense: They should be descriptive and consistent (snake_case recommended)
Descriptions are accurate: The agent reads these to understand what to extract
No unnecessary fields: Remove fields you won't use — they add cost

Review the Preview Data

The preview panel shows sample data based on your schema. This is AI-generated example data, not actual extracted data, but it helps you visualize what your API response will look like.

Check that the data structure matches your expectations
Verify nested arrays and objects look correct
Ensure field types match what the site actually has

Step 5: Edit the Schema (Optional)

If the generated schema isn't quite right, you can edit it before deploying. You have two options:

Visual Mode (Recommended for Beginners)

A no-code interface where you can:

Click on fields to rename them or change their type
Add new fields using the "Add Field" button
Delete fields you don't need
Edit descriptions to give the agent better instructions
Add validations like enums, patterns, or min/max values

Code Mode (Advanced Users)

Direct JSON editing with syntax highlighting. Use this when you need:

Complex nested structures
Advanced OpenAPI features ($ref, allOf, anyOf)
Precise control over validations
Bulk editing (copy/paste from existing schemas)

Edit with AI

Type natural language instructions to modify your schema:

"Add a 'category' field with enum values: electronics, clothing, home"
"Change 'price' to a number type with minimum 0"
"Remove the 'description' field"
"Add a nested 'seller' object with 'name' and 'rating' fields"

Step 6: Configure Settings

Before deploying, configure how your warp will run:

Intelligence Tier

Choose the AI model that powers your extraction agent:

V1-Lite: For simple, straightforward pages.
V1-Base (recommended): For complex sites with dynamic content.

See Intelligence Tiers for details.

Refresh Schedule

Set how often the agent should run and extract fresh data:

Manual: Only runs when you trigger it. Best for testing or one-off extractions.
Hourly: 24 runs per day. For fast-changing data like stock prices or inventory.
Daily: Once per day. Most common choice for catalogs, listings, news.
Weekly: Once per week. For slowly-changing data like documentation or reports.

Step 7: Deploy Your Warp

Click "Deploy Warp". Several things happen:

Schema locks: Your schema becomes immutable to protect API consumers
First extraction starts: The agent immediately begins extracting data
API endpoint activates: Your unique Stargate URL becomes live
Schedule begins: If you set a schedule, the warp will run automatically

After deployment, you'll be redirected to the Monitor tab where you can watch the agent work in real-time.

Your API Endpoint

Your warp gets a unique URL: https://{warp-id}.stargate.warpstack.ai. This is a read-only endpoint that returns your extracted data as JSON. Endpoints are public by default, but you can secure them with access tokens from the Monitor page.

Schema Editor Guide

The schema editor is where you refine your data structure. Your schema defines both the API contract (what consumers will receive) and the agent instructions (what to extract). Getting it right is critical.

Schema Review Checklist

Before deploying, ensure your schema is optimized:

Precise field descriptions — Tell the agent exactly what to look for
Correct data types — Use number for prices, boolean for availability, etc.
Enums for known values — Constrain fields to valid options (e.g., sizes, categories)
Patterns for formats — Use regex for phone numbers, SKUs, dates
Nullable fields — Mark optional data as ["string", "null"]

Visual Mode

A no-code interface for editing schemas. Switch to Visual mode using the toggle at the top of the editor.

Adding a Field

Click "Add Field"

Click the Add Field button to create a new property in your schema.

Enter Field Name

Use snake_case (e.g., product_name).

Select Field Type

Choose from string, number, integer, boolean, array, or object.

Write Description

Write a clear description — the agent uses this to understand what to extract.

Add Validations (Optional)

Optionally add enum values, patterns, or min/max constraints.

Field Types

String: Text data (names, descriptions, URLs)
Number: Decimal numbers (prices, ratings, percentages)
Integer: Whole numbers (counts, quantities, years)
Boolean: True/false values (in_stock, is_featured, has_discount)
Array: Lists of items (features, images, tags)
Object: Nested structures (address with city/state/zip)

String Validations

Enum: Fixed list of allowed values (e.g., "monthly", "yearly")
Pattern: Regex validation (max 500 characters)
Format: One of the predefined formats (date-time, date, time, duration, email, hostname, ipv4, ipv6, uuid)

Number Validations

Minimum: Lowest allowed value (e.g., 0 for prices)
Maximum: Highest allowed value (e.g., 5 for ratings)
Multiple Of: Step size that fixes decimal precision (e.g., 0.01 so that a price such as 29.99 is valid)
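
As an illustration, the validations above map onto JSON Schema keywords roughly as follows. The field names and the SKU pattern are made up for this sketch; only the keywords themselves (`enum`, `pattern`, `minimum`, `maximum`, `multipleOf`) correspond to the options described in this section.

```python
import json

# Hypothetical property definitions, written as they would appear
# under "properties" in Code mode.
product_properties = {
    "billing_period": {
        "type": "string",
        "enum": ["monthly", "yearly"],
        "description": "Billing interval shown next to the price",
    },
    "sku": {
        "type": "string",
        "pattern": "^[A-Z]{3}-[0-9]{4}$",
        "description": "Stock keeping unit, e.g. ABC-1234",
    },
    "price": {
        "type": "number",
        "minimum": 0,
        "multipleOf": 0.01,
        "description": "Current price as a decimal without currency symbol",
    },
    "rating": {
        "type": "number",
        "minimum": 0,
        "maximum": 5,
        "description": "Average customer rating from 0 to 5",
    },
    "discount_price": {
        "type": ["number", "null"],
        "minimum": 0,
        "description": "Discounted price, or null if no discount is shown",
    },
}

print(json.dumps(product_properties, indent=2))
```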

Code Mode

Direct JSON editing for advanced users. The schema follows OpenAPI 3.1.0 specification.

Editor Features

Syntax highlighting for JSON
Real-time validation with error highlighting
Line numbers and bracket matching
Find and replace (Cmd/Ctrl + F)

Common Code Mode Tasks

Adding $ref for reusable component definitions
Using anyOf for union types (string or null)
Setting complex regex patterns
Bulk editing field descriptions

AI Schema Modification

Use natural language to edit your schema. Click "Edit with AI" and describe what you want to change:

"Add a 'currency' field with enum: USD, EUR, GBP"
"Change 'price' from string to number with 2 decimal precision"
"Make 'discount_price' nullable (can be null if no discount)"
"Add a nested 'dimensions' object with width, height, depth as numbers"
"Remove all fields related to shipping"

Writing Good Field Descriptions

Field descriptions are critical — the extraction agent reads them to understand what to look for. A good description tells the agent exactly what to extract and where to find it.

Description Examples

Good:
"The current sale price displayed in the product card, as a decimal number without currency symbol (e.g., 29.99)"
Bad:
"Price"
Good:
"Boolean indicating if the product is currently in stock. True if 'In Stock' or 'Available', false if 'Out of Stock' or 'Sold Out'"
Bad:
"Stock status"

Understanding Schemas

Your schema is the foundation of your warp. It defines two things simultaneously: the structure of your API (what developers consume) and the instructions for the extraction agent (what to look for). A well-designed schema leads to accurate, consistent extractions.

The Schema Is the Source of Truth

The agent follows your schema exactly. Every field description, every type constraint, every validation rule directly influences what gets extracted and how. Review your schema carefully before deploying.

Better schemas = lower costs + higher quality: Precise field descriptions reduce agent confusion (fewer retries, fewer tokens). Adding enums, patterns, and constraints ensures clean data. This is especially important for recurring scheduled jobs where small improvements multiply over hundreds of runs.

Schema Structure

Warpstack schemas follow the OpenAPI 3.1.0 specification. Here's the basic structure:

info.title: A descriptive name for your API
info.description: High-level instructions for the agent about what to extract
paths: Defines your API endpoint (always "/" for Warpstack)
components.schemas: Your data models with field definitions
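
Put together, a schema with these four parts might look like the sketch below, written as a Python dictionary that serializes to the JSON you would see in Code mode. The product fields, the `Product` component name, and the exact shape of the response object are assumptions for illustration; only the four top-level parts and the "/" path come from the list above.

```python
import json

warp_schema = {
    "openapi": "3.1.0",
    "info": {
        "title": "Product Catalog",
        "description": (
            "Extract every product in the main catalog grid. "
            "Skip sponsored listings and the 'Recommended for you' section."
        ),
    },
    "paths": {
        "/": {
            "get": {
                "responses": {
                    "200": {
                        "description": "Extracted products",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/Product"},
                                }
                            }
                        },
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "Product": {
                "type": "object",
                "properties": {
                    "product_name": {
                        "type": "string",
                        "description": "Product title shown on the card",
                    },
                    "price": {
                        "type": "number",
                        "description": "Current price without currency symbol",
                    },
                },
                "required": ["product_name", "price"],
                "additionalProperties": False,
            }
        }
    },
}

print(json.dumps(warp_schema, indent=2))
```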

The info.description Field

This is crucial — the agent reads this to understand its overall mission. Include:

What type of data you're extracting
Where on the page to look
What to include and exclude
How to handle edge cases
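
As a rough illustration of the structure described above, here is a minimal schema skeleton written as a Python dict and printed as JSON. The product fields, the descriptions, and the exact shape of the "/" path entry are assumptions made for this sketch; compare them against a schema generated in the warp editor before relying on them.

```python
import json

# Hypothetical product model; field names and descriptions are illustrative.
product = {
    "type": "object",
    "additionalProperties": False,
    "required": ["name", "price", "discount_price"],
    "properties": {
        "name": {
            "type": "string",
            "description": "Product name exactly as shown in the product card title",
        },
        "price": {
            "type": "number",
            "description": "Current sale price as a decimal number without currency symbol (e.g., 29.99)",
        },
        "discount_price": {
            # Listed in "required" but allowed to be null when no discount is shown.
            "anyOf": [{"type": "number"}, {"type": "null"}],
            "description": "Discounted price if one is displayed, otherwise null",
        },
    },
}

schema = {
    "openapi": "3.1.0",
    "info": {
        "title": "Product Catalog",
        # High-level mission for the agent: what, where, include/exclude, edge cases.
        "description": (
            "Extract every product card from the listing grid. "
            "Skip sponsored items. If a product has no visible price, skip it."
        ),
    },
    "paths": {
        "/": {
            "get": {
                "responses": {
                    "200": {
                        "description": "Extracted products",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/Product"},
                                }
                            }
                        },
                    }
                }
            }
        }
    },
    "components": {"schemas": {"Product": product}},
}

print(json.dumps(schema, indent=2))
```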

Structured Outputs Rules

Warpstack uses OpenAI's Structured Outputs feature, which has strict schema requirements. Violating any of these rules will cause validation errors.

Rule 1: All Properties Must Be Required

Every property you define must be listed in the required array. Optional fields are not supported — if a field might not exist, use anyOf with null.

Rule 2: additionalProperties Must Be False

Every object (root and nested) must include "additionalProperties": false. This prevents the agent from adding unexpected fields.

Rule 3: Arrays Must Define Items

Every array type must have an items property defining the element type.

Rule 4: Forbidden Keywords

Do not use these OpenAPI keywords — they're not supported by Structured Outputs:

example — use description instead
default — not supported
minLength / maxLength — use pattern instead
format: "uri" — use "hostname" or no format

Allowed String Formats

These are the only valid format values for strings:

date-time — ISO 8601 datetime (2025-01-15T10:30:00Z)
date — ISO 8601 date (2025-01-15)
time — ISO 8601 time (10:30:00)
duration — ISO 8601 duration (PT1H30M)
email — Email address
hostname — Domain name
ipv4 — IPv4 address
ipv6 — IPv6 address
uuid — UUID format
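
The rules and the format list above can be checked locally before deploying. The sketch below is not Warpstack's validator; it is a hypothetical helper (check_schema is an invented name) that walks a schema dict and reports the violations described in Rules 1 to 4 plus unsupported string formats.

```python
FORBIDDEN_KEYWORDS = {"example", "default", "minLength", "maxLength"}
ALLOWED_FORMATS = {
    "date-time", "date", "time", "duration", "email",
    "hostname", "ipv4", "ipv6", "uuid",
}


def check_schema(node, path="$"):
    """Return a list of rule violations found in a schema node (a dict)."""
    problems = []
    if not isinstance(node, dict):
        return problems

    # Rule 4: forbidden keywords on this node.
    for keyword in sorted(FORBIDDEN_KEYWORDS & node.keys()):
        problems.append(f"{path}: forbidden keyword '{keyword}'")

    # Only the listed string formats are accepted.
    if "format" in node and node["format"] not in ALLOWED_FORMATS:
        problems.append(f"{path}: unsupported format '{node['format']}'")

    if node.get("type") == "object":
        properties = node.get("properties", {})
        # Rule 1: every property must appear in "required".
        missing = sorted(set(properties) - set(node.get("required", [])))
        if missing:
            problems.append(f"{path}: properties not in required array: {missing}")
        # Rule 2: additionalProperties must be exactly false.
        if node.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        for name, child in properties.items():
            problems.extend(check_schema(child, f"{path}.{name}"))

    if node.get("type") == "array":
        # Rule 3: arrays must define items.
        if "items" not in node:
            problems.append(f"{path}: array must define items")
        else:
            problems.extend(check_schema(node["items"], f"{path}[]"))

    # Nullable fields are written as anyOf [..., {"type": "null"}].
    for i, option in enumerate(node.get("anyOf", [])):
        problems.extend(check_schema(option, f"{path}.anyOf[{i}]"))

    return problems


bad = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "tags": {"type": "array"},
        "url": {"type": "string", "format": "uri"},
    },
    "required": ["name"],
}
for problem in check_schema(bad):
    print(problem)
```

Running it on the deliberately broken schema prints one line per violation; a compliant schema returns an empty list.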

Schema Limits

Nesting: Maximum 5 levels deep (prefer 3 or fewer)
Properties per object: Maximum 100 (prefer under 50)
Total fields: Maximum 500
Regex patterns: Maximum 500 characters
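
A schema can be measured against these limits with a short script. This is a sketch under assumptions: the text does not say how nesting levels or total fields are counted, so the code below treats the root object as level 1 and leaves the total-field limit out. The function names are invented for illustration.

```python
MAX_DEPTH = 5
MAX_PROPERTIES_PER_OBJECT = 100
MAX_PATTERN_LENGTH = 500


def children_of(node):
    """Return the schema nodes directly below this one."""
    children = list(node.get("properties", {}).values())
    if "items" in node:
        children.append(node["items"])
    children.extend(node.get("anyOf", []))
    return [child for child in children if isinstance(child, dict)]


def nesting_depth(node):
    """Count object levels, with the root object as level 1 (an assumption)."""
    deepest_child = max((nesting_depth(c) for c in children_of(node)), default=0)
    own_level = 1 if node.get("type") == "object" else 0
    return own_level + deepest_child


def walk(node):
    """Yield every schema node reachable from the root."""
    yield node
    for child in children_of(node):
        yield from walk(child)


def check_limits(schema):
    """Return a list of limit violations for the given schema dict."""
    problems = []
    depth = nesting_depth(schema)
    if depth > MAX_DEPTH:
        problems.append(f"nesting depth {depth} exceeds {MAX_DEPTH}")
    for node in walk(schema):
        count = len(node.get("properties", {}))
        if count > MAX_PROPERTIES_PER_OBJECT:
            problems.append(f"object has {count} properties (max {MAX_PROPERTIES_PER_OBJECT})")
        if len(node.get("pattern", "")) > MAX_PATTERN_LENGTH:
            problems.append("regex pattern longer than 500 characters")
    return problems


product = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "dimensions": {
            "type": "object",
            "properties": {"width": {"type": "number"}},
        },
    },
}
print(nesting_depth(product), check_limits(product))
```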

Monitoring Your Warps

After deploying a warp, you can watch the extraction agent work in real-time. The Monitor tab provides full observability into what your agent is doing.

Live Agent View

See exactly what your agent sees. The browser view shows real-time screenshots as the agent navigates the target site:

Browser screenshots: Updated every few seconds during extraction
Highlighted actions: See what the agent is clicking or reading
Navigation steps: Watch the agent move through pages

Activity Log

The activity log shows the agent's reasoning and actions as it extracts data:

Step-by-step actions: "Navigating to URL", "Extracting products", "Found 24 items"
Agent reasoning: Why it made certain decisions
Timestamps: How long each step took
Errors: What went wrong if extraction fails

Execution Status

Key metrics displayed during and after extraction:

Status: Running, Completed, or Failed
Records: Number of items extracted
Duration: Total extraction time
Cost: Tokens consumed × price per token

Manage Access Token

From the Monitor page, you can secure your API endpoint with an access token. Click the link icon next to the API URL at the top of the page to open the access token manager.

No token configured: Your endpoint is publicly accessible (default)
Generate Access Token: Creates a secure token to require authentication
Copy token: Token is shown once—copy it immediately
Delete token: Revokes access and makes endpoint public again

See Access Tokens in the Stargate API section for complete usage details.

Understanding Token Usage

Token usage during extraction depends on several factors:

Page complexity: More content = more tokens to process
Schema size: More fields = more output tokens
Navigation depth: Multi-page extractions use more tokens
Visual analysis: Processing screenshots adds ~20-30% more tokens

Expect 20,000+ tokens for simple extractions. Complex sites with lots of content or multi-step navigation will use significantly more.

Managing Deployed Warps

Once a warp is deployed, the schema is locked to protect API consumers from breaking changes. You can still make changes, but you need to follow a specific process.

Schema Locking

When you deploy a warp, its schema becomes locked. This means:

Your API continues working: While you edit, consumers keep getting the old schema
No accidental changes: You can't accidentally break your API
Explicit deployment: Changes only go live when you click "Deploy Warp"

Editing an Active Warp

To modify a deployed warp's schema:

Click "Edit Schema"

In the warp editor, click Edit Schema to unlock the schema for modifications.

Make Your Changes

Edit the schema in visual or code mode as needed.

Preview Changes

Review the diff to understand what will change in your API.

Deploy the New Schema

Click "Deploy Warp" to deploy your updated schema.

Breaking Changes to Avoid

Some changes will break existing API consumers. Avoid these unless you're coordinating with all consumers:

Renaming fields: Old field names will no longer exist
Changing field types: Switching string to number breaks consumers that expect the old type
Removing fields: Consumers expecting them will error
Restructuring: Moving fields or changing nesting
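
Before deploying an edited schema, the old and new field definitions can be compared to spot the changes listed above. The helper below is a hypothetical sketch, not a Warpstack feature: it compares two "properties" dicts and flags removed (or renamed) fields and type changes. Fields declared with anyOf have no top-level type and are not compared here.

```python
def find_breaking_changes(old_props, new_props, prefix=""):
    """Compare two 'properties' dicts and list changes that break consumers."""
    changes = []
    for name, old in old_props.items():
        field = f"{prefix}{name}"
        if name not in new_props:
            # A rename looks the same as a removal from the consumer's side.
            changes.append(f"removed or renamed: {field}")
            continue
        new = new_props[name]
        if old.get("type") != new.get("type"):
            changes.append(
                f"type changed: {field} ({old.get('type')} -> {new.get('type')})"
            )
        elif old.get("type") == "object":
            # Recurse so restructured nested fields are reported too.
            changes.extend(
                find_breaking_changes(
                    old.get("properties", {}),
                    new.get("properties", {}),
                    prefix=f"{field}.",
                )
            )
    return changes


old = {
    "price": {"type": "string"},
    "sku": {"type": "string"},
    "name": {"type": "string"},
}
new = {
    "price": {"type": "number"},
    "name": {"type": "string"},
    "currency": {"type": "string"},
}
for change in find_breaking_changes(old, new):
    print(change)
```

Added fields such as currency are not reported, since existing consumers can ignore them.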

Manual Runs

You can trigger an extraction manually at any time, regardless of schedule. Click "Run Now" in the warp header. Use this for:

Testing after schema changes
Getting fresh data immediately
Debugging extraction issues
One-off data collection (when schedule is set to Manual)

Pausing a Warp

Pausing stops scheduled runs but keeps everything else active:

Your API endpoint remains available
Existing data is still served
No new extractions occur (no new costs)
Resume anytime to restart the schedule

Deleting a Warp

Deleting a warp is permanent:

Your API endpoint stops working immediately
Data is retained for 30 days, then permanently deleted
This action cannot be undone

Scheduling

Scheduling controls how often your warp runs and extracts fresh data. Choose based on how frequently the source data changes and your cost tolerance.

Optimize Before You Schedule

Recurring jobs amplify everything—good and bad. A small inefficiency in your prompt or schema becomes significant when multiplied by hundreds of runs.

Before setting up hourly or daily schedules: review your schema thoroughly, add field validations (enums, patterns, constraints), write precise field descriptions, and test manually until you're satisfied with the output quality. An extra hour optimizing upfront can save significant costs over time.

Available Schedules

Manual

No automatic runs. The warp only extracts data when you click "Run Now". Best for:

Testing and development
One-off data collection projects
Infrequently updated sources

Hourly

Runs every hour, 24 times per day. Best for:

Real-time price monitoring
Inventory tracking
Breaking news feeds
Stock data (during market hours)

Daily

Runs once per day at a consistent time. The most common choice, balancing freshness and cost. Best for:

Product catalogs
Pricing pages
Job listings
News aggregation

Weekly

Runs once per week. Best for:

Market research reports
Documentation pages
Reference data
Slowly changing content

Schedule Cost Impact

Schedule frequency is the biggest cost driver. Consider carefully:

Hourly: 720 runs/month (~24x the cost of daily)
Daily: 30 runs/month (baseline cost)
Weekly: ~4 runs/month (~1/7 the cost of daily)

Always choose the lowest frequency that meets your data freshness requirements.
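
The arithmetic behind these figures is simple enough to sketch. The snippet assumes a 30-day month and that every run costs roughly the same, which will not hold exactly when page content varies between runs.

```python
DAYS_PER_MONTH = 30  # simplifying assumption: a 30-day month

RUNS_PER_MONTH = {
    "hourly": 24 * DAYS_PER_MONTH,  # 720
    "daily": DAYS_PER_MONTH,        # 30
    "weekly": DAYS_PER_MONTH / 7,   # a little over 4
}


def cost_relative_to_daily(schedule):
    """Return how many times the daily cost a schedule incurs."""
    return RUNS_PER_MONTH[schedule] / RUNS_PER_MONTH["daily"]


for name, runs in RUNS_PER_MONTH.items():
    print(f"{name}: {runs:.1f} runs/month, {cost_relative_to_daily(name):.2f}x daily")
```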

Intelligence Tiers

Intelligence tiers determine which AI model powers your extraction agent. Different models have different capabilities and costs. The "per 1M tokens" price is a blended rate combining both input tokens (page content) and output tokens (extracted data).

Both tiers use vision AI to see and understand web pages like a human. The difference is in capability and cost.

V1-Lite

Fast and cost-efficient. Best for straightforward pages with clear structure.

Best For

Product listings and directories
Blog posts and articles
Simple contact pages
High-volume extraction where cost matters

V1-Base — Recommended

More powerful AI for complex sites. Handles dynamic content and tricky interfaces.

Best For

Dynamic JavaScript rendering
Infinite scroll and pagination
JavaScript-heavy applications
Complex layouts and navigation
Charts, dashboards, and visual data

Choosing the Right Tier

Start with V1-Base for most use cases. It's only 2x the cost of V1-Lite but handles a much wider range of sites. Only use V1-Lite if:

The target site has straightforward, simple pages
You've tested and V1-Lite provides acceptable accuracy
You're running high-volume extractions and need to minimize cost

Stargate Data API

API Overview

The Stargate API provides high-performance, read-only access to your extracted data with SQL-like querying capabilities.

Key Features

• Public by default, optionally secured with access tokens
• Sub-second response times
• SQL-like filtering, sorting, pagination
• Nested field access with dot notation
• JSON and CSV output formats
• Unique URL per warp: https://{warp-id}.stargate.warpstack.ai

Endpoints

Get Data

http

GET https://{warp-id}.stargate.warpstack.ai/

Returns your extracted data. Supports filtering, sorting, pagination, and column selection.

Response Structure

json

{
  "data": [...],              // Your data array
  "total": 156,               // Total record count
  "available_columns": [...], // All available column names
  "count": 25                 // Records in this response
}

Get Statistics

http

GET https://{warp-id}.stargate.warpstack.ai/stats

Returns dataset statistics and metadata.

Response Structure

json

{
  "total_records": 156,
  "storage_bytes": 524288,
  "total_sessions": 12,
  "total_sessions_duration_ms": 540000
}
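
As a small usage sketch, the statistics payload can be turned into more readable figures. The helper name is invented, and reading total_sessions as the number of extraction runs is an assumption the text above does not confirm.

```python
def summarize_stats(stats):
    """Derive a few readable figures from a /stats response body."""
    sessions = stats["total_sessions"]
    # Guard against division by zero for a warp that has never run.
    avg_ms = stats["total_sessions_duration_ms"] / sessions if sessions else 0.0
    return {
        "records": stats["total_records"],
        "storage_kib": stats["storage_bytes"] / 1024,
        "avg_session_seconds": avg_ms / 1000,
    }


# Sample values copied from the response structure shown above.
sample = {
    "total_records": 156,
    "storage_bytes": 524288,
    "total_sessions": 12,
    "total_sessions_duration_ms": 540000,
}
print(summarize_stats(sample))
```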

Access Tokens

By default, Stargate endpoints are publicly accessible. You can secure your endpoint by generating an access token, which requires all API requests to include authentication.

When to Use Access Tokens

Enable access tokens when your data is sensitive or you want to control who can access your API. For public datasets or internal testing, leaving endpoints public is often more convenient.

Generating an Access Token

Navigate to Monitor Page

Open your warp and go to the Monitor tab.

Click the Link Icon

Next to your API URL at the top, click the link icon (🔗) to open the access token manager.

Generate Access Token

Click "Generate Access Token" to create a new token. Your token will be displayed once—copy it immediately and store it securely.

Using Access Tokens

Once a token is configured, all requests to your endpoint must include authentication. There are two methods:

Option 1: Authorization Header (Recommended)

bash

curl https://{warp-id}.stargate.warpstack.ai \
  -H "Authorization: Bearer {your-token}"

Option 2: Query Parameter

http

GET https://{warp-id}.stargate.warpstack.ai?access_token={your-token}

Note: Using the header is preferred as query parameters may be logged in server access logs.
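
For Python clients, the header method might look like the sketch below. It uses only the standard library and builds the request without sending it; "my-warp" and "example-token" are placeholders, and build_request is a name invented for this example.

```python
import urllib.request


def build_request(warp_id, token=None):
    """Build (but do not send) a GET request for a warp's Stargate endpoint."""
    url = f"https://{warp_id}.stargate.warpstack.ai/"
    # Only attach the header when a token is configured for the warp.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_request("my-warp", token="example-token")
print(request.full_url)
print(request.get_header("Authorization"))

# To actually send it:
# with urllib.request.urlopen(request, timeout=30) as response:
#     payload = response.read()
```

Keeping the token in a header rather than in the URL also keeps it out of browser history and shared links.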

Revoking Access

To revoke an access token, open the access token manager and click the delete button. Once deleted, the endpoint becomes publicly accessible again.

Important Notes

• Token changes may take up to 10 minutes to propagate
• Tokens are shown once at creation—store them securely
• Each warp supports one active access token
• Requests without valid authentication return 401 Unauthorized

Column Selection

Use the select parameter to specify which columns to return. If select is omitted, the API returns the first 50 columns.

Basic Selection

http

?select=id,name,email

Returns only these three columns

Nested Field Selection

Access nested object fields using dot notation:

http

?select=user.name,user.email,address.city

Returns fields from within nested objects

Array Element Projection

Extract specific fields from array elements:

http

?select=items[].name,items[].price

Returns only name and price from each item in the array

Filtering

Filter data using filter=column:operator:value syntax. Combine multiple filters with commas (AND logic).

http

?filter=status:eq:active,price:lt:100

Available Operators

Operator | Description | Example
eq | Equals | status:eq:active
ne | Not equals | status:ne:deleted
gt | Greater than | price:gt:50
gte | Greater or equal | price:gte:50
lt | Less than | price:lt:100
lte | Less or equal | price:lte:100
like | Pattern (case-sensitive) | name:like:%Premium%
ilike | Pattern (case-insensitive) | email:ilike:%@gmail%

Filter Examples

Find active items under $100:

?filter=status:eq:active,price:lt:100

Find Gmail users:

?filter=email:ilike:%@gmail.com

Find products in stock with rating above 4:

?filter=in_stock:eq:true,rating:gte:4
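
When filters are built in code, two details are easy to get wrong: the operator name and URL encoding. A literal % in a like/ilike pattern has to be percent-encoded (as %25) when placed in a URL. The sketch below assumes the API decodes standard percent-encoding; make_filter is a hypothetical helper, and how to escape values that themselves contain commas or colons is not covered by the text above.

```python
from urllib.parse import urlencode

OPERATORS = {"eq", "ne", "gt", "gte", "lt", "lte", "like", "ilike"}


def make_filter(conditions):
    """Join (column, operator, value) triples into a filter string (AND logic)."""
    parts = []
    for column, operator, value in conditions:
        if operator not in OPERATORS:
            raise ValueError(f"unknown operator: {operator}")
        if isinstance(value, bool):
            value = str(value).lower()  # True -> "true", as in in_stock:eq:true
        parts.append(f"{column}:{operator}:{value}")
    return ",".join(parts)


filter_value = make_filter([
    ("in_stock", "eq", True),
    ("email", "ilike", "%@gmail.com"),
])
print(filter_value)

# urlencode percent-encodes '%', ':', ',' and '@' so the value survives transport.
print(urlencode({"filter": filter_value}))
```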

Sorting

Sort results using sort=column:direction. Direction is asc (ascending) or desc (descending).

Sort by price (lowest first):

?sort=price:asc

Sort by created date (newest first):

?sort=created_at:desc

Multi-column sort (price desc, then name asc):

?sort=price:desc,name:asc

Pagination

Control how many records to return. Default limit is 100, maximum is 10,000.

Offset-Based

http

?limit=25&offset=50

Skip first 50 records, return next 25.

Best for: Shallow pagination (pages 1-10)

Cursor-Based

http

?limit=25&cursor=eyJpZCI6MTIzfQ

Use cursor from previous response for next page.

Best for: Deep pagination, large datasets

Using Cursor Pagination

The response includes a `cursor` field for the next page:

json

{
  "data": [...],
  "cursor": "eyJpZCI6MTUwfQ",  // Use this for next request
  "total": 500
}

For the next page: ?limit=25&cursor=eyJpZCI6MTUwfQ
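
A cursor loop in client code could be sketched as follows. The text does not say how the last page is signaled, so this example assumes the cursor is missing or null (or the data array is empty) once there is nothing left. The fetch function is supplied by the caller; a fake one is used here so the loop runs offline.

```python
def iter_records(fetch_page, limit=25):
    """Yield every record by following the cursor from page to page.

    fetch_page(limit, cursor) is a caller-supplied function that performs
    the HTTP request and returns the decoded JSON body as a dict.
    """
    cursor = None
    while True:
        page = fetch_page(limit, cursor)
        yield from page["data"]
        cursor = page.get("cursor")
        # Assumption: the last page has no cursor (missing or null) or no data.
        if not cursor or not page["data"]:
            break


# Stand-in for the real API, keyed by the cursor that was sent.
FAKE_PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "cursor": "page-2", "total": 3},
    "page-2": {"data": [{"id": 3}], "cursor": None, "total": 3},
}


def fake_fetch(limit, cursor):
    """Return a canned page instead of calling the network."""
    return FAKE_PAGES[cursor]


print([record["id"] for record in iter_records(fake_fetch)])
```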

Output Formats

JSON (Default)

http

?output=json

Structured JSON with full nested data support.

Best for programmatic access, apps, dashboards

CSV

http

?output=csv

Spreadsheet-ready format. Nested data is flattened.

Best for Excel, Google Sheets, data analysis

Usage Examples

cURL

Basic query:

bash

curl "https://my-warp.stargate.warpstack.ai/?limit=10"

With filtering and sorting:

bash

curl "https://my-warp.stargate.warpstack.ai/?filter=price:lt:100&sort=price:asc&limit=20"

Export to CSV:

bash

curl "https://my-warp.stargate.warpstack.ai/?output=csv" > data.csv

JavaScript / Fetch API

javascript

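const response = await fetch(
  'https://my-warp.stargate.warpstack.ai/?limit=10&sort=created_at:desc'
);
// fetch() does not reject on HTTP errors, so check the status explicitly
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}
const { data, total } = await response.json();

console.log(`Retrieved ${data.length} of ${total} total records`);
data.forEach(item => console.log(item));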

Python

python

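import requests

# Query the ten most recent active records
response = requests.get(
    'https://my-warp.stargate.warpstack.ai/',
    params={
        'limit': 10,
        'sort': 'created_at:desc',
        'filter': 'status:eq:active'
    },
    timeout=30,  # fail instead of hanging on a stalled connection
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
data = response.json()

print(f"Retrieved {len(data['data'])} records")
for item in data['data']:
    print(item)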

Combining Parameters

You can combine multiple query features for powerful data access:

http

?select=name,price,rating&filter=price:lt:100,rating:gte:4&sort=price:asc&limit=20&offset=0

Returns name, price, and rating for items under $100 with rating ≥4, sorted by price ascending, first 20 results.
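
The combined query above can also be assembled programmatically. The builder below is a sketch, not an official client; build_query_url is an invented name, and it only covers the parameters documented in this section.

```python
from urllib.parse import urlencode


def build_query_url(warp_id, select=None, filters=None, sort=None,
                    limit=None, offset=None, output=None):
    """Assemble a Stargate query URL from the documented parameters."""
    params = {
        "select": ",".join(select) if select else None,
        "filter": ",".join(filters) if filters else None,
        "sort": ",".join(sort) if sort else None,
        "limit": limit,
        "offset": offset,
        "output": output,
    }
    # Drop unset parameters; keep ':' ',' '[' ']' readable and encode the rest.
    query = urlencode(
        {key: value for key, value in params.items() if value is not None},
        safe=":,[]",
    )
    base = f"https://{warp_id}.stargate.warpstack.ai/"
    return f"{base}?{query}" if query else base


print(build_query_url(
    "my-warp",
    select=["name", "price", "rating"],
    filters=["price:lt:100", "rating:gte:4"],
    sort=["price:asc"],
    limit=20,
    offset=0,
))
```

Checking for None rather than truthiness keeps offset=0 in the URL.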

Error Handling

400 Bad Request: Invalid query parameters.

Common causes:

• Unknown column name in select/filter/sort
• Malformed filter syntax
• Invalid operator

403 Forbidden: Path security violation or invalid permissions.

404 Not Found: Warp ID doesn't exist or schema not deployed. Check that your warp ID is correct and the warp is ACTIVE.

Workbench

Data Viewer

The Workbench is your data exploration tool. View, filter, sort, and export your extracted data.

Accessing Workbench

Click on any warp in your dashboard, then navigate to the "Workbench" tab to view your data in table format.

Features

• Table view with all extracted records
• Automatic column detection
• Nested data visualization
• Real-time data refresh
• Column-based filtering
• Multi-column sorting
• Pagination controls
• Export to CSV

Filtering & Sorting Data

Column Filters

Click the filter icon in any column header to filter by that column's values.

• Text columns: Contains/equals filters
• Number columns: Range filters (greater than, less than)
• Multiple filters: Combine across columns (AND logic)

Sorting

Click column headers to sort:

• First click: Sort ascending
• Second click: Sort descending
• Third click: Remove sort
• Hold Shift: Multi-column sort

Exporting Data

Export to CSV

Click the "Export" button to download your data as CSV.

• Exports current filtered/sorted view
• Nested objects are flattened (dot notation)
• Arrays are comma-separated
• Compatible with Excel, Google Sheets
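
To anticipate the column names a CSV export will contain, the flattening described above can be approximated locally. This is a sketch of the general technique, not Warpstack's exporter: the exact handling of arrays of objects and of null values is an assumption.

```python
def flatten(record, prefix=""):
    """Flatten nested objects into dot-notation keys; join arrays with commas."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested object: recurse and prefix child keys with "parent."
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Array: comma-separated string of its elements.
            flat[name] = ",".join(str(item) for item in value)
        else:
            flat[name] = value
    return flat


record = {
    "name": "Desk",
    "dimensions": {"width": 120, "depth": 60},
    "tags": ["oak", "office"],
}
print(flatten(record))
```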

Export via API

For programmatic exports, use the Stargate API directly:

bash

curl "https://my-warp.stargate.warpstack.ai/?output=csv&limit=10000" > export.csv

Metrics & Costs

Understanding Metrics

Warpstack provides detailed metrics to help you understand your usage and costs. The dashboard displays real-time data on extraction usage, storage consumption, and API serving across customizable time ranges.

Viewing Your Metrics

Access your metrics from the Usage section in the dashboard. You can view usage across different time periods:

Day: Last 24 hours of activity
Week: Rolling 7-day window
Month: Current billing cycle (30 days)
Total: All-time cumulative usage

Key Metrics Tracked

Warpstack tracks three categories of usage that contribute to your costs:

Extraction (Tokens): LLM tokens consumed during data extraction runs
Storage (GB): Total data stored across all your warps
Serving (Bandwidth): Data transferred via Stargate API calls

Cost Breakdown

All Warpstack costs are usage-based. You only pay for what you use, with no hidden fees or minimums.

Extraction Costs (Token-Based Pricing)

Extraction costs are based on LLM token consumption. The "per 1M tokens" price is a blended rate that combines both input tokens (the page content sent to the model) and output tokens (the extracted data returned). This single metric simplifies cost calculation.

Token Usage Expectations

Expect 20,000+ tokens per extraction for simple pages. Complex sites with more content, navigation, or multiple data points will use significantly more. Token usage scales with page complexity, schema size, and the amount of data being extracted.

Intelligence Tiers

V1-Lite: For simple, straightforward pages.
V1-Base: Recommended. For complex sites with dynamic content.

Extraction Usage Example

A simple product extraction using V1-Base:

~25,000 tokens per extraction
Token usage scales with page complexity and schema size
Request a demo to discuss pricing based on your volume requirements
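
To see how token counts translate into a monthly figure, the calculation can be sketched as below. The blended price used here is purely hypothetical, chosen only to make the arithmetic concrete; actual rates depend on your agreement.

```python
def extraction_cost(tokens_per_run, runs, price_per_million_tokens):
    """Cost = tokens consumed x blended price per token."""
    return tokens_per_run * runs * price_per_million_tokens / 1_000_000


# Hypothetical blended rate in dollars per 1M tokens; not a real price.
PRICE_PER_MILLION = 2.00

daily = extraction_cost(25_000, runs=30, price_per_million_tokens=PRICE_PER_MILLION)
hourly = extraction_cost(25_000, runs=720, price_per_million_tokens=PRICE_PER_MILLION)

print(f"daily schedule:  ${daily:.2f}/month")
print(f"hourly schedule: ${hourly:.2f}/month")
```

Whatever the real rate is, the ratio between the two schedules stays the same because both scale linearly with the number of runs.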

Storage

Storage is billed monthly based on the total data stored across all your warps. Extracted data is stored as optimized JSON and is typically very space-efficient.

100,000 records: ~100 MB
1 million records: ~1 GB
Included: Generous storage allocation in enterprise plans

Serving (API Bandwidth)

Serving is based on the bandwidth consumed when your applications fetch data via the Stargate API. This includes all GET requests to your warp endpoints.

Typical response size: 1-50 KB depending on data volume
Included: API bandwidth in enterprise plans

Cost Optimization

Optimize your Warpstack costs by following these strategies to reduce token usage and improve efficiency.

Choose the Right Intelligence Tier

Start with V1-Base for most sites. If the target is simple and you need to minimize cost, test V1-Lite. See Intelligence Tiers for details.

Optimize Refresh Frequency

The biggest cost lever is how often your warp runs. Choose the minimum frequency that meets your needs:

Hourly: Real-time monitoring, price tracking (highest cost)
Daily: Most common for catalog data, news, listings
Weekly: Slowly changing data, documentation, reference data
Manual: One-time extractions, on-demand updates (lowest cost)

Switching from hourly to daily reduces costs by ~24x. Daily to weekly reduces by another ~7x.

Write Targeted Prompts

Specific, focused prompts reduce token usage by helping the agent work more efficiently:

Include the exact URL path if known (e.g., "Extract from /pricing")
Specify exactly what data you need — avoid "get everything"
Mention where on the page the data appears (header, table, footer)
Describe edge cases upfront to avoid retries

Simplify Your Schema

Every field in your schema adds to extraction complexity and token usage. Only extract what you'll actually use:

Remove fields you don't need in your application
Avoid deeply nested structures when flat data works
Use simple types (string, number) over complex validations when possible

Monitor and Pause Unused Warps

Regularly review your active warps. Pause or delete warps that are no longer needed to stop accumulating extraction costs. You can always re-deploy a paused warp later.

Understanding Your Balance

The Usage section shows your current balance and projected runway:

Month-to-Date: Total spent so far this billing cycle
Burn Rate: Average daily spending based on recent usage
Runway: Days until balance depletes (Balance ÷ Daily Burn Rate)
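
The runway formula can be reproduced in a few lines. The spend figures and the length of the averaging window are assumptions for illustration; the dashboard may average over a different period.

```python
def burn_rate(recent_daily_spend):
    """Average daily spend over a recent window (window length is an assumption)."""
    if not recent_daily_spend:
        return 0.0
    return sum(recent_daily_spend) / len(recent_daily_spend)


def runway_days(balance, daily_burn_rate):
    """Runway = balance / daily burn rate; infinite when nothing is being spent."""
    if daily_burn_rate <= 0:
        return float("inf")
    return balance / daily_burn_rate


spend = [1.20, 0.80, 1.00, 1.00]  # hypothetical last four days, in dollars
rate = burn_rate(spend)
print(f"burn rate: ${rate:.2f}/day, runway: {runway_days(50.0, rate):.0f} days")
```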

Activity & Notifications

Activity Feed

Track all warp activity in one timeline. See what's happening across your account.

Run

Warp execution completed successfully

Deploy

Warp deployed or re-deployed with schema changes

Failure

Execution failed - check logs for details

Balance

Funds added or usage charges applied

Notification Center

Stay informed about important events. Access via the bell icon in the top navigation.

Notification Types

• Execution Failures: When a warp run fails
• Deployment Complete: When warp deployment finishes
• Low Balance: When your balance is running low
• System Updates: Platform news and feature releases

Managing Notifications

• Click to view details
• Mark as read individually
• "Mark all as read" button
• Dismiss to remove from list
• Configure email preferences in Settings

Settings

Profile Settings

Customize your account, security, notifications, and preferences.

Profile Settings

Manage your account information and contact details.

Editable Fields

• Full name
• Email address (requires re-verification)
• Company/Organization (optional)

Security Settings

Change your password from the Settings page:

Go to Settings → Security

Navigate to the Settings page and select the Security tab.

Enter Current Password

For security, you must verify your identity by entering your current password.

Enter New Password

Create a new password with at least 8 characters, then confirm it by entering it again.

Click "Change Password"

Your password is updated immediately. Current sessions remain active.

Notification Preferences

Control which email notifications you receive.

Execution Failures

Get notified when warp runs fail

Recommended

System Updates

Platform news, new features, maintenance

Product Updates

New feature announcements and improvements

Weekly Summary

Weekly digest of your warps and usage

User Preferences

Theme

Choose your interface appearance:

Dark

System (Auto)

Timezone

Select your timezone for accurate time displays. Affects when "daily" warps run.

Date Format

Choose how dates are displayed throughout the interface.

Language

Interface language selection.

Troubleshooting

Schema Validation Errors

These are the most common schema errors and how to fix them.

Error: "Properties not in required array"

This is the most common error. It means you defined a property but forgot to list it in the `required` array.

❌ Wrong:

json

{
  "properties": {
    "name": {...},
    "email": {...}
  },
  "required": ["name"]
}

✅ Fixed:

json

{
  "properties": {
    "name": {...},
    "email": {...}
  },
  "required": ["name", "email"]
}

Error: "Array must have items property"

Every array type must define what its elements are using the `items` property.

❌ Wrong:

json

{
  "tags": {
    "type": "array",
    "description": "Tags"
  }
}

✅ Fixed:

json

{
  "tags": {
    "type": "array",
    "description": "Tags",
    "items": {"type": "string"}
  }
}

Error: "Objects must have additionalProperties: false"

Every object (including nested objects) must explicitly set additionalProperties to false.

json

{
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {"type": "string"}
  },
  "additionalProperties": false  // ✅ Add this
}

Error: "Unsupported property: example" (or default, minLength, etc.)

Structured Outputs forbids certain OpenAPI properties.

Forbidden properties:

• `example` - Put examples in description instead
• `default` - Not supported
• `minLength`, `maxLength` - Use `pattern` with regex instead
• `format: "uri"` - Use `format: "hostname"` instead

Deployment Failures

Warp Stuck in "Deploying"

If your warp stays in DEPLOYING status for more than 2 minutes:

• Refresh the page - status might have updated
• Check for schema validation errors in the editor
• Try deleting and recreating the warp
• Contact support if issue persists

Deployment Error Message

If deployment fails with an error:

• Read the error message carefully - it usually indicates the exact problem
• Check schema validation (often the cause)
• Ensure all required fields are present
• Try simplifying your schema if very complex

Extraction Failures

Agent Couldn't Find Data

When the agent completes but extracts 0 records:

• Check target URL: Is it the correct page?
• Improve prompt: Be more specific about where data appears
• Enable Visual Analysis: Might help with dynamic content
• Check Live View: See what the agent is seeing
• Simplify schema: Start with fewer fields, expand gradually

CAPTCHA or Login Required

Warpstack cannot handle authentication-required pages or CAPTCHA challenges.

Workaround: Try to find public/non-authenticated pages with the same data, or wait for our authenticated browser feature (coming soon).

Timeout Errors

If extraction times out (usually after 5-10 minutes):

• Page might be too slow to load
• Try providing more direct URLs (skip homepage navigation)
• Simplify extraction task
• Use V1-Lite for faster extraction (if applicable)

Getting Help

Support Channels

• Email: support@warpstack.ai
• In-app: Help icon in navigation
• Response time: Usually within 24 hours

When Contacting Support

Include:

• Your warp ID
• Error messages (screenshots helpful)
• What you were trying to do
• Your target URL (if applicable)

Best Practices & Examples

The #1 Cause of Failed Extractions

Imprecise prompts and misconfigured schemas cause the vast majority of extraction failures—not site complexity or agent limitations. Warpstack agents are powerful, but like any AI tool, they need clear, specific instructions. This section teaches you how to set them up for success.

Writing Effective Prompts

Your prompt is the foundation of every warp. A well-written prompt produces a better schema, which produces better data. Spend an extra minute here to save hours of debugging later.

✓ Do This

• Be specific: "Extract product name, price, and rating" not "Get product info"
• Include URL: "From nike.com/shoes extract..." saves navigation tokens
• Mention structure: "For each product card" helps agent identify elements
• Note edge cases: "If discounted, capture both original and sale price"
• Add constraints: "Only products in stock" or "Skip sponsored items"

✗ Avoid This

• Too vague: "Get data from website"
• No context: "Extract prices" - prices of what?
• Ambiguous: "Get product details" - which details?
• Conversational filler: "Um, I need to like get some products..."

Schema Design Tips

Schema Design Best Practices

1. Keep It Simple

Start with essential fields only. Add complexity later if needed. Simpler schemas = faster extraction, lower costs, fewer errors.

2. Write Descriptive Field Descriptions

Field descriptions guide the AI agent. "Product price in USD as decimal (e.g., 29.99)" is better than "Price".

3. Use Appropriate Types

Prices → number with multipleOf: 0.01. Ratings → number with minimum/maximum. Status → enum with specific values.

4. Avoid Deep Nesting

Keep nesting to 2-3 levels max. Flat structures are easier to work with and query.

5. Test with Preview

Always generate and review preview data before deploying. Catches schema issues early.

6. Add Validations for Recurring Jobs

For scheduled warps, add enums, patterns, and constraints to ensure data quality. Validations catch extraction errors before they enter your API—critical when running hundreds of automated extractions.

7. Review Your Schema Before Deploy

Schedule:

Weekly (market trends)

Reference

Glossary

Warp: An autonomous data extraction and serving job. Configured with a prompt and optional target URL, runs on a schedule you define, and serves data via a production API.
Stargate: The high-performance API layer that serves your extracted data via unique URLs: https://{warp-id}.stargate.warpstack.ai
Schema: The data contract defining your API structure. Written in OpenAPI 3.1.0 format with Structured Outputs compliance.
Structured Outputs: OpenAI's strict JSON schema format requiring all fields to be required, no additional properties, and specific validation rules. Ensures reliable, structured AI responses.
Intelligence Tier: The AI model used for extraction. V1-Lite for simple pages, V1-Base for complex content.
Visual Analysis: Enables vision AI capabilities allowing the agent to "see" the page like a human. Critical for charts, dynamic dashboards, and visual content.
Extraction: The process of gathering data from web pages using autonomous browser agents.
Deployment: Making a warp active/live. Locks the schema and begins data extraction. Creates your Stargate API endpoint.
Schema Locking: Protection mechanism that prevents accidental schema changes to deployed warps. Ensures API stability for consumers.
Account Balance: Prepaid funds used for usage-based billing. Add funds to your balance; usage costs are deducted in real time.
Runway: Days until your balance runs out, calculated from current balance and daily burn rate. Helps plan when to add funds.
Burn Rate: Average daily spending based on recent usage patterns. Used to calculate runway.
Tokens: Units of AI model consumption. Roughly 4 characters = 1 token. Used to measure and bill extraction costs.

Frequently Asked Questions

General

What is Warpstack?

Warpstack is the Data Acquisition Cloud — built for alternative datasets. Add new data sources in hours, stop maintaining scrapers, and ship with full audit trails from day one. Our AI agents use vision to "see" pages like humans, making them resilient to layout changes.

How is this different from web scraping?

Traditional scrapers use fragile CSS selectors that break when sites update. Warpstack agents use vision and reasoning to understand page content semantically, making them much more robust. They can also handle dynamic content, Canvas/WebGL visualizations, and complex interactions.

What websites work with Warpstack?

Most public websites work well. Best results with: e-commerce sites, SaaS pricing pages, news sites, product catalogs, directories, and public dashboards. Sites with CAPTCHAs or login requirements don't work yet.

Pricing & Billing

How does billing work?

Enterprise billing is customized based on your agreement. Usage costs (extraction, storage, serving) are included in your monthly invoice.

How do I increase my usage limits?

Contact your account manager to discuss increasing your usage limits or adjusting your enterprise agreement.

What's included in my enterprise agreement?

Enterprise agreements include custom SLAs, dedicated support, compliance features, and usage-based billing. Contact your account manager for details.

How is pricing structured?

Enterprise pricing is customized based on your data volume and requirements. Contact your account manager for details on your specific agreement.

Technical

What is a schema?

A schema defines the structure of your data using OpenAPI 3.1.0 format. It tells the agent what fields to extract and defines your API's response format.

Do I need to verify my email?

Yes. Email verification is required to create warps and use the platform. This ensures account security and enables critical notifications.

How often does my warp run?

Based on your schedule setting: Manual (you trigger), Hourly (every hour), Daily (once per day), or Weekly (once per week). Set this when deploying or in the Monitor tab.

How long is data stored?

Data is stored indefinitely while your warp is active. After deleting a warp, data is retained for 30 days then permanently deleted.

Can I delete my data?

Yes. Deleting a warp removes its API endpoint immediately and schedules data for permanent deletion after 30 days.

Are Stargate APIs public?

By default, yes. Endpoints are publicly accessible to anyone with your warp ID. However, you can secure your endpoint by generating an access token from the Monitor page. Once configured, requests require a Bearer token or query parameter for authentication.

Can I extract data from pages behind login?

Not yet. Currently, Warpstack only works with public pages. Support for authenticated sessions is coming soon.

Troubleshooting

My warp failed, what now?

Check the Monitor tab → Activity Log to see error details. Common fixes:

• URL might be wrong or changed
• Page requires login/CAPTCHA
• Prompt too vague - agent couldn't find data
• Try enabling Visual Analysis
• Simplify schema and try again

Schema validation errors - help!

See the Schema Validation Errors section above. Most common: properties not in required array. Fix: add all properties to the required array.

Agent extracted 0 records - why?

The agent ran but couldn't find matching data. Try:

• Check Live Agent View - what did it see?
• Verify URL is correct
• Make prompt more specific about where data appears
• Ensure page loads publicly (no login required)

How do I get support?

Email support@warpstack.ai with your warp ID, error details, and what you're trying to achieve. Response time is usually within 24 hours (faster for paid plans).

Warpstack Documentation v1.0 Beta

Last updated: January 10, 2026
