Getting Started
Introduction
Warpstack is the Data Acquisition Cloud — built for alternative datasets. Add new data sources in hours, stop maintaining scrapers, and ship with full audit trails from day one.
The full stack, handled for you:
- Acquisition: AI agents connect to any source—websites, APIs, databases—and extract the data you need
- Structuring: Raw data is automatically cleaned, validated, and transformed to your schema
- Hosting: Your data lives in our cloud with low-latency access and automatic backups
- Serving: Production-ready REST APIs with filtering, sorting, and pagination—ready from day one
- Monitoring: Complete audit trails, observability, and adaptive handling when sources change
Our AI agents use vision to "see" pages like a human—reading charts, handling dynamic JavaScript, and adapting to layout changes without breaking. But Warpstack isn't just about extraction. It's about replacing your entire data stack with a single platform.
How It Works
From description to production API in minutes:
- Describe your data: Tell us what you need in plain English—no code, no selectors
- We generate your schema: AI creates a structured OpenAPI schema from your description
- Deploy: Click "Deploy Warp" and we handle acquisition, structuring, and hosting
- Access your API: Your data is immediately available via a production REST endpoint
- Set a schedule: Keep data fresh with automatic hourly, daily, or weekly refreshes
Why Warpstack?
Alternative data vendors spend 30-60% of engineering time maintaining scrapers. Warpstack eliminates that burden:
- Ship more datasets: Add new sources in hours, not months—expand coverage faster
- Zero maintenance: AI agents adapt when sites change—no more 3am pages
- Built-in compliance: Full audit trails for every record—answer "where did this data come from?" in seconds
- Production APIs included: Every dataset is a REST API with filtering, sorting, and pagination
- Managed hosting: We host your data with automatic backups and retention policies
Your Prompts and Schemas Matter
Quality in, quality out. Vague prompts or incorrect schemas lead to poor results. Like any AI system, Warpstack performs best when you're specific about what you want.
Take time to craft clear prompts and accurate schemas. See Best Practices for guidance.
Current Limitations
Warpstack works best with public data sources. Current limitations:
- Login-protected content: Cannot authenticate to gated pages (coming soon)
- CAPTCHA challenges: Cannot reliably solve CAPTCHAs
- Real-time streaming: Designed for periodic acquisition, not sub-second updates
- File processing: Cannot download or parse PDFs, spreadsheets, or documents
Core Concepts
Understanding these key concepts will help you get the most out of Warpstack.
Warp
A warp is an autonomous data extraction and serving job. Each warp is configured with:
- Target URL: The web page to extract data from
- Prompt: Natural language description of what data to extract
- Schema: OpenAPI definition of the data structure
- Schedule: How often to run (manual, hourly, daily, weekly)
- Intelligence tier: Which AI model powers the extraction
Schema
The schema is the most important part of your warp. It serves two purposes:
- API contract: Defines the JSON structure your API returns
- Extraction instructions: Field descriptions guide the agent on what to extract
Schemas are written in OpenAPI 3.1.0 format and follow Structured Outputs rules. The quality of your prompt directly determines the quality of your schema, which determines extraction accuracy.
Stargate API
Stargate is the data serving layer. Once your warp extracts data, it's instantly available via your unique API endpoint:
https://{your-warp-id}.stargate.warpstack.ai
Stargate supports SQL-like filtering, sorting, pagination, column selection, and multiple output formats (JSON, CSV). Endpoints are public by default, but can be secured with access tokens.
Warp States
Warps move through different states during their lifecycle:
- DRAFT: Being configured. Schema is editable, not yet deployed.
- DEPLOYING: Setting up infrastructure for the first extraction.
- ACTIVE: Deployed and serving data. Schema is locked.
- RUNNING: Currently executing an extraction job.
- SCHEDULED: Active with a recurring schedule configured.
- PAUSED: Temporarily stopped. API still serves existing data.
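The lifecycle above can be modeled as a small enumeration. This is an illustrative sketch only: the state names come from the list above, while the `WarpState` class and the `schema_is_editable` helper are hypothetical names. The rule that only DRAFT allows schema edits is inferred from the state descriptions, not from a documented API.

```python
from enum import Enum


class WarpState(Enum):
    """Lifecycle states named in the list above."""
    DRAFT = "DRAFT"
    DEPLOYING = "DEPLOYING"
    ACTIVE = "ACTIVE"
    RUNNING = "RUNNING"
    SCHEDULED = "SCHEDULED"
    PAUSED = "PAUSED"


def schema_is_editable(state: WarpState) -> bool:
    # Assumption: the schema is editable only while the warp is a draft;
    # every later state is treated as locked.
    return state is WarpState.DRAFT
```

For example, `schema_is_editable(WarpState.DRAFT)` is true, while the same call with `WarpState.ACTIVE` is false.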
Intelligence Tiers
Choose the AI model that powers your extraction:
- V1-Lite: For simple, straightforward pages.
- V1-Base: Recommended. For complex sites with dynamic content.
See Intelligence Tiers for details.
Quickstart
Get your first warp running in under 5 minutes. This guide walks you through the essential steps.
Step 1: Create a Warp
From your dashboard, click "Create Warp". Enter the URL of the page you want to extract data from.
Step 2: Write Your Prompt
Describe what data you want to extract in natural language. Be specific about the fields you need.
Example prompt:
"Extract all pricing plans from this page. For each plan, get the plan name, monthly price as a number, yearly price if shown, and the list of features included."
Step 3: Generate Schema
Click "Generate Schema". The AI will analyze your prompt and create an OpenAPI schema. This takes 10-30 seconds. Review the generated fields to ensure they match your needs.
Step 4: Deploy
Click "Deploy Warp". The first extraction starts immediately. You can watch the agent work in the Monitor tab.
Step 5: Access Your Data
Once extraction completes, your data is available via the Stargate API:
curl "https://{your-warp-id}.stargate.warpstack.ai/?limit=10"
That's it! Your data is now accessible as a REST API. Set a schedule to keep it fresh, or trigger manual runs when you need updates.
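The same request can be made from Python using only the standard library. This is a minimal sketch: `stargate_url` and `fetch_records` are hypothetical helper names, the only query parameter used is the `limit` shown in the curl command, and the code assumes a public endpoint (no access token) that returns JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def stargate_url(warp_id: str, limit: int = 10) -> str:
    """Build the endpoint URL from the curl example, with a `limit` parameter."""
    query = urlencode({"limit": limit})
    return f"https://{warp_id}.stargate.warpstack.ai/?{query}"


def fetch_records(warp_id: str, limit: int = 10):
    # Assumption: the endpoint is public and responds with a JSON body.
    with urlopen(stargate_url(warp_id, limit), timeout=30) as response:
        return json.load(response)
```

Call `fetch_records("your-warp-id", limit=10)` with your own warp ID once the first extraction has completed.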
What's Next
- Read the full User Guide for detailed step-by-step instructions
- Learn about schema design to improve extraction accuracy
- Explore Stargate API for filtering, sorting, and pagination
Usage & Billing
Warpstack offers custom enterprise pricing based on your data volume and requirements. Contact our sales team for a tailored quote.
Enterprise Plan
All enterprise agreements include full platform access, dedicated support, custom SLAs, and compliance features.
- Full audit trail: Chain of custody logging for every data point.
- PII detection & redaction: Automatic detection and redaction at the edge.
- Custom SLAs: Uptime guarantees and response time commitments.
- Dedicated support: Priority support with dedicated account manager.
Usage Components
Enterprise usage is charged based on three components. Rates are included in your custom enterprise agreement.
Extraction
Token-based usage during extraction. The rate is a blended measure combining input tokens (page content) and output tokens (extracted data).
- V1-Lite: For simple, straightforward pages
- V1-Base: For complex sites with dynamic content
Expect 20,000+ tokens per simple extraction. Complex pages use more.
Storage
Billed monthly based on total data stored across all warps.
- Included: Generous storage allocation in enterprise plans
Serving (Bandwidth)
Based on data transferred when your applications call the Stargate API.
- Included: API bandwidth in enterprise plans
How Enterprise Billing Works
- Custom agreements: Pricing tailored to your volume and requirements.
- Invoiced billing: Monthly invoicing with NET 30 terms.
- Usage tracking: Monitor your usage in the dashboard.
- Dedicated support: Contact your account manager for any billing inquiries.
Usage Example
Extracting products from an e-commerce site using V1-Base:
- Extraction: ~25,000 tokens
- Storage: ~1 MB per extraction
- Serving: 10 API calls, ~100 KB each
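The three components in this example can be tallied with a few lines of arithmetic. The sketch below only totals quantities, since rates are set in each enterprise agreement. `UsageEstimate` and `estimate_usage` are hypothetical names, and 1 MB is taken as 1,000 KB.

```python
from dataclasses import dataclass


@dataclass
class UsageEstimate:
    extraction_tokens: int
    storage_mb: float
    serving_mb: float


def estimate_usage(tokens: int, storage_mb: float,
                   api_calls: int, kb_per_call: float) -> UsageEstimate:
    """Total the extraction, storage, and serving quantities for one run."""
    # Serving volume is the number of calls times the payload size.
    serving_mb = api_calls * kb_per_call / 1000
    return UsageEstimate(tokens, storage_mb, serving_mb)


# Figures from the e-commerce example above.
example = estimate_usage(tokens=25_000, storage_mb=1.0,
                         api_calls=10, kb_per_call=100)
```

With these inputs, serving comes to 10 × 100 KB = 1,000 KB, or about 1 MB, alongside roughly 25,000 extraction tokens and 1 MB of storage.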
Request a demo to discuss pricing based on your volume requirements.
Intelligence Tier Comparison
- V1-Lite: For simple, straightforward pages.
- V1-Base: Recommended. For complex sites with dynamic content.
Start with V1-Base for new warps. See Intelligence Tiers for details.
Account & Authentication
Creating Your Account
Creating a Warpstack account takes less than a minute. Once registered, you'll have immediate access to the platform and can start building your first data acquisition agent right away.
Enterprise Onboarding
Enterprise accounts are provisioned by our sales team. Contact us to get started with custom pricing and dedicated onboarding support.
Getting Started
Request a Demo
Visit warpstack.ai and click "Request a Demo" to see the platform in action and discuss your requirements.
Enter Your Information
Fill out the registration form with your details:
Your display name used in the dashboard and for billing purposes.
Your mobile phone number for SMS verification. This is your primary verification method.
- Include country code (e.g., +1 for US, +44 for UK)
- Must be a mobile number capable of receiving SMS
- A 6-digit verification code will be sent via SMS
Your primary contact email for login, email verification, and important notifications like extraction failures or billing alerts.
Create a secure password for your account:
- Minimum 8 characters
- Mix of letters and numbers recommended
- Special characters allowed (!@#$%^&*)
Re-enter your password to confirm it matches.
Accept Terms & Create Account
Check the box to agree to our Terms of Service and Privacy Policy, then click "Create Account." Your account will be created immediately and an SMS verification code will be sent.
Verify Your Phone (Step 1)
You'll receive a 6-digit verification code via SMS within seconds. Enter this code to verify your phone number. This is the first of two verification steps.
Didn't receive it? You can request a new code after 2 minutes.
Verify Your Email (Step 2)
After phone verification, check your inbox for a verification email from Warpstack. Click the verification link to complete account setup.
Can't find it? Check your spam folder or use the "Resend Verification" option.
Two-Step Verification
Warpstack uses two-step verification to secure your account:
6-digit SMS code sent via Twilio. Confirms you have access to the mobile number.
Secure link sent to email. Confirms email ownership for notifications and recovery.
What Happens After Registration
SMS Code Sent
A 6-digit verification code is sent to your phone immediately. You'll be taken to the phone verification screen to enter this code.
Two-Step Verification
Complete phone verification first (via SMS), then email verification (via link). Both steps are required for full platform access.
Enterprise Account Active
Your enterprise account is provisioned with custom billing. Contact your account manager for usage details and billing inquiries.
Ready to Create
After both verifications are complete, you're ready to create your first warp. Head to the dashboard and start building your data acquisition agent.
Registration Errors
Common issues and how to resolve them:
- "Email already exists": An account with this email already exists. Try logging in or use password reset.
- "Phone number already exists": This phone number is already associated with another account.
- "Invalid phone number": Ensure you've entered a valid mobile number with country code (e.g., +1 for US).
- "Password too short": Password must be at least 8 characters long.
- "Passwords do not match": Ensure your password confirmation matches exactly.
Phone Verification
Phone verification is the first step of Warpstack's two-step verification process. A 6-digit verification code is sent via SMS to confirm you have access to the mobile number you provided during registration.
Powered by Twilio
Warpstack uses Twilio's Verify API for secure, reliable SMS delivery. Verification codes are automatically generated and validated, ensuring a seamless verification experience.
How Phone Verification Works
SMS Code Sent Automatically
Immediately after creating your account, a 6-digit verification code is sent to your phone via SMS. This typically arrives within seconds.
Enter the 6-Digit Code
Enter the verification code from the SMS into the verification screen. The code is validated automatically as you type.
Phone Verified
Once verified, your phone number is confirmed and you'll proceed to email verification. Your phone verification status is stored securely in your account.
Resending Verification Code
If you didn't receive the SMS or the code expired, you can request a new one:
Rate Limiting
You can request a new code every 2 minutes. This prevents abuse while ensuring you can always get a fresh code if needed. A countdown timer shows when you can resend.
Change Phone Number
Made a typo? You can update your phone number and receive a new code. Previous codes are automatically invalidated when you change your number or request a resend.
Phone Number Format
Phone numbers must be in E.164 international format.
Valid examples:
- +1 (555) 123-4567 (US)
- +44 7911 123456 (UK)
- +49 151 12345678 (Germany)
- +61 412 345 678 (Australia)
Invalid examples:
- 555-123-4567 (missing country code)
- 1234567890 (no + prefix)
- +1 (800) 555-1234 (toll-free not supported)
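A client-side check can catch most formatting mistakes before the form is submitted. The sketch below is not Warpstack's validator. It strips spaces, hyphens, and parentheses, then tests the general E.164 shape: a leading +, a non-zero first digit, and at most 15 digits in total. `normalize_phone` is a hypothetical helper name.

```python
import re
from typing import Optional

# General E.164 shape: "+", then 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def normalize_phone(raw: str) -> Optional[str]:
    """Strip display formatting and return the number if it has an E.164 shape."""
    compact = re.sub(r"[\s\-()]", "", raw)
    return compact if E164_PATTERN.fullmatch(compact) else None
```

This check covers the missing country code and the missing + prefix. It does not detect toll-free or landline numbers, which would need carrier-level information.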
Troubleshooting SMS Issues
- SMS not arriving: Ensure your phone has signal and isn't blocking unknown senders. Check that SMS from short codes is enabled.
- Wrong phone number: Use the "Change Phone Number" option to update and receive a new code.
- Code expired: Verification codes expire after 10 minutes. Request a new one using "Resend Code."
- International issues: Ensure your carrier supports international SMS. Some carriers may delay messages.
Email Verification
Email verification is the second step of Warpstack's two-step verification process, completed after phone verification. This confirms your email address for notifications, billing, and account recovery.
Why Verification is Required
Account Security
Confirms you own the email address, preventing unauthorized account creation with your email.
Critical Notifications
Enables alerts for extraction failures, schedule changes, and important account updates.
Billing Communications
Required for payment receipts, usage alerts, and subscription management notifications.
Support Access
Provides verified contact for account recovery, support tickets, and security updates.
Verification Process
Check Your Inbox
After registration, a verification email is sent immediately. Look for an email from noreply@warpstack.ai with the subject "Verify Your Email Address."
Click the Verification Link
The email contains a secure, one-time-use verification link. Click it to confirm your email address.
Confirmation
You'll see a success message and receive a confirmation email. Your account now has full access to all Warpstack features.
Resending the Verification Email
If you didn't receive the verification email or the link expired, you can request a new one:
Check spam/junk folder — Verification emails sometimes get filtered.
Log in to your account — You'll see a verification prompt if not yet verified.
Click "Resend Verification Email" — A new verification link will be sent. You can also visit /resend-verification directly.
Token Security
Verification links are single-use and securely hashed. Once used, the same link cannot be clicked again. Each resend generates a fresh token, invalidating any previous links.
Logging In
Access your Warpstack account using your registered email address and password. Sessions expire after 60 minutes of inactivity, after which you need to log in again.
How to Log In
Visit the Login Page
Go to warpstack.ai and click "Log In" or navigate directly to app.warpstack.ai/login.
Enter Your Credentials
Provide your registered email address and password. Optionally check "Remember me" to stay logged in longer.
Access Your Dashboard
Upon successful authentication, you'll be redirected to your dashboard where you can view and manage your warps.
Unverified Email Restriction
You can log in with an unverified email, but you'll be prompted to verify before creating warps or accessing most features. A banner will appear at the top of the dashboard with a link to resend verification.
Session Management
Session Duration
Sessions last 60 minutes. Activity extends your session automatically. After inactivity, you'll need to log in again.
Secure Tokens
Authentication uses JWT tokens stored securely in your browser. Tokens are refreshed automatically during active sessions.
Password Management
Keep your account secure by using a strong password and changing it periodically. Warpstack provides secure password change and reset functionality.
Password Requirements
- Minimum 8 characters
- Letters and numbers recommended
- Special characters allowed (!@#$%^&*)
- Avoid passwords similar to your email or name (recommended)
Changing Your Password
Change your password anytime from the Settings page while logged in:
Go to Settings → Security — Find password options in your account settings.
Enter your current password — This verifies you're the account owner.
Enter and confirm your new password — Make sure both fields match exactly.
Click "Change Password" — Your password is updated immediately. Current sessions remain active.
Forgot Your Password?
If you've forgotten your password, you can reset it via email:
Click "Forgot Password?"
On the login page, click the "Forgot password?" link below the password field.
Enter Your Email
Provide the email address associated with your account. For security, you'll receive a success message regardless of whether the email exists in our system.
Check Your Email
If an account exists, you'll receive an email with a secure reset link. This link expires after 30 minutes.
Set Your New Password
Click the link in the email, enter your new password (minimum 8 characters), confirm it, and submit. You'll receive a confirmation email once complete.
Password Reset Security
Password reset tokens are:
- Single-use: Can only be used once, then invalidated
- Time-limited: Expire after 30 minutes
- Securely stored: Only hashed versions are saved, never plain text
- User-specific: Each token is tied to a specific user ID
User Guide
Creating Your First Warp
A warp is an autonomous data extraction and serving job that acquires structured data from any website and serves it as a production REST API. The quality of your warp depends entirely on the quality of your schema, which depends on the quality of your prompt. This guide will walk you through each step in detail.
The Quality Chain — Most Failures Start Here
Your prompt → Your schema → Extracted data → Your API
Like any AI tool, Warpstack is only as good as the instructions you give it. A vague prompt produces a generic schema, which produces unreliable data. Most extraction failures trace back to imprecise prompts or misconfigured schemas—not the agent itself. Invest time upfront to be specific.
Step 1: Enter the Target URL
Start by entering the URL of the page you want to extract data from. This should be the specific page containing the data, not just the homepage.
- Be specific: Use example.com/products, not just example.com
- Public pages: The agent cannot log in or bypass authentication
- Static content: Works best on pages where data loads with the page, not infinite scroll
Step 2: Write Your Prompt
The prompt is the most important input. It tells the AI what data structure to generate and instructs the browser agent on what to extract. Be as specific and detailed as possible.
What Makes a Good Prompt
A good prompt answers these questions:
- What data do you need? List every field explicitly (name, price, description, image URL)
- What type is each field? Specify if something should be a number, boolean, or date
- Where is the data located? Mention sections of the page (product grid, pricing table, sidebar)
- What are the edge cases? Describe variations (items with/without discounts, optional fields)
- What should be excluded? Mention what to skip (ads, related products, navigation)
Prompt Examples
Good prompt:
"Extract all products from the catalog page. For each product, get: product name (string), current price as a number (e.g., 29.99), original price if discounted (number, null if not discounted), stock status as a boolean (true if in stock), rating as a number from 0-5, number of reviews (integer), and the product image URL. Ignore the 'Recommended for you' section and any sponsored listings."
Bad prompt:
"Get all the product info"
Too vague. What is "info"? What fields? What types? The AI will guess, and it will likely miss fields you need or include fields you don't.
More Prompt Examples by Use Case
SaaS Pricing:
"Extract pricing plans from the /pricing page. For each plan: name, monthly price (number), yearly price (number), list of features as an array of strings, whether it's marked as 'Popular' or 'Recommended' (boolean), and any limits mentioned (e.g., '10 users', '50GB storage')."
Job Listings:
"Extract job postings from the careers page. Get: job title, department, location (city and country as separate fields), employment type (full-time/part-time/contract as an enum), salary range if shown (min and max as numbers), and the date posted in ISO format."
News Articles:
"Extract articles from the news homepage. For each: headline (string), summary/description (string), author name, publication date as ISO datetime, category/section, and the article URL. Limit to the main news grid, exclude sidebar widgets and advertisements."
Step 3: Generate the Schema
Click "Generate Schema" to create an OpenAPI schema from your prompt. This typically takes 10-30 seconds. You'll see the schema stream in real-time as the AI builds it.
What Happens During Generation
- Input Cleaning: Your prompt is refined to remove filler words and clarify intent
- Schema Generation: AI creates field names, types, descriptions, and validations
- Validation: The schema is validated against OpenAPI 3.1.0 and Structured Outputs rules
- Preview Data: Sample data is generated to show what your API will return
If Generation Fails
Schema generation can fail for several reasons:
- Validation errors: The schema violated Structured Outputs rules. Try simplifying your prompt.
- Ambiguous prompt: The AI couldn't determine what you wanted. Be more specific.
- Complex nesting: Too many nested objects. Flatten your data structure.
- Retry: Sometimes it's transient — click "Generate" again.
Step 4: Review the Generated Schema
Always review the generated schema before deploying. Check that:
- All fields you need are present: The AI may have missed something from your prompt
- Field types are correct: Prices should be numbers, not strings. Booleans for yes/no values.
- Field names make sense: They should be descriptive and consistent (snake_case recommended)
- Descriptions are accurate: The agent reads these to understand what to extract
- No unnecessary fields: Remove fields you won't use — they add cost
Review the Preview Data
The preview panel shows sample data based on your schema. This is AI-generated example data, not actual extracted data, but it helps you visualize what your API response will look like.
- Check that the data structure matches your expectations
- Verify nested arrays and objects look correct
- Ensure field types match what the site actually has
Step 5: Edit the Schema (Optional)
If the generated schema isn't quite right, you can edit it before deploying. You have two options:
Visual Mode (Recommended for Beginners)
A no-code interface where you can:
- Click on fields to rename them or change their type
- Add new fields using the "Add Field" button
- Delete fields you don't need
- Edit descriptions to give the agent better instructions
- Add validations like enums, patterns, or min/max values
Code Mode (Advanced Users)
Direct JSON editing with syntax highlighting. Use this when you need:
- Complex nested structures
- Advanced OpenAPI features ($ref, allOf, anyOf)
- Precise control over validations
- Bulk editing (copy/paste from existing schemas)
Edit with AI
Type natural language instructions to modify your schema:
- "Add a 'category' field with enum values: electronics, clothing, home"
- "Change 'price' to a number type with minimum 0"
- "Remove the 'description' field"
- "Add a nested 'seller' object with 'name' and 'rating' fields"
Step 6: Configure Settings
Before deploying, configure how your warp will run:
Intelligence Tier
Choose the AI model that powers your extraction agent:
- V1-Lite: For simple, straightforward pages.
- V1-Base (recommended): For complex sites with dynamic content.
See Intelligence Tiers for details.
Refresh Schedule
Set how often the agent should run and extract fresh data:
- Manual: Only runs when you trigger it. Best for testing or one-off extractions.
- Hourly: 24 runs per day. For fast-changing data like stock prices or inventory.
- Daily: Once per day. Most common choice for catalogs, listings, news.
- Weekly: Once per week. For slowly-changing data like documentation or reports.
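Schedule choice has a direct effect on token consumption, which a quick calculation makes concrete. The sketch below is a rough estimate only. `RUNS_PER_WEEK` and `weekly_tokens` are hypothetical names, and the per-run figure is whatever you observe for your own warp (the documentation cites 20,000+ tokens for a simple extraction).

```python
# Scheduled runs per week for each refresh option; manual runs are not counted.
RUNS_PER_WEEK = {"manual": 0, "hourly": 24 * 7, "daily": 7, "weekly": 1}


def weekly_tokens(schedule: str, tokens_per_run: int, weeks: int = 1) -> int:
    """Approximate tokens consumed by scheduled runs over a number of weeks."""
    return RUNS_PER_WEEK[schedule] * tokens_per_run * weeks
```

At 20,000 tokens per run, an hourly schedule uses about 3,360,000 tokens per week, a daily schedule 140,000, and a weekly schedule 20,000.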
Step 7: Deploy Your Warp
Click "Deploy Warp". Several things happen:
- Schema locks: Your schema becomes immutable to protect API consumers
- First extraction starts: The agent immediately begins extracting data
- API endpoint activates: Your unique Stargate URL becomes live
- Schedule begins: If you set a schedule, the warp will run automatically
After deployment, you'll be redirected to the Monitor tab where you can watch the agent work in real-time.
Your API Endpoint
Your warp gets a unique URL: https://{warp-id}.stargate.warpstack.ai. This is a read-only endpoint that returns your extracted data as JSON. Endpoints are public by default, but you can secure them with access tokens from the Monitor page.
Schema Editor Guide
The schema editor is where you refine your data structure. Your schema defines both the API contract (what consumers will receive) and the agent instructions (what to extract). Getting it right is critical.
Schema Review Checklist
Before deploying, ensure your schema is optimized:
- Precise field descriptions: Tell the agent exactly what to look for
- Correct data types: Use number for prices, boolean for availability, etc.
- Enums for known values: Constrain fields to valid options (e.g., sizes, categories)
- Patterns for formats: Use regex for phone numbers, SKUs, dates
- Nullable fields: Mark optional data as ["string", "null"]
Visual Mode
A no-code interface for editing schemas. Switch to Visual mode using the toggle at the top of the editor.
Adding a Field
Click "Add Field"
Click the Add Field button to create a new property in your schema.
Enter Field Name
Use snake_case (e.g., product_name).
Select Field Type
Choose from string, number, integer, boolean, array, or object.
Write Description
Write a clear description — the agent uses this to understand what to extract.
Add Validations (Optional)
Optionally add enum values, patterns, or min/max constraints.
Field Types
- String: Text data (names, descriptions, URLs)
- Number: Decimal numbers (prices, ratings, percentages)
- Integer: Whole numbers (counts, quantities, years)
- Boolean: True/false values (in_stock, is_featured, has_discount)
- Array: Lists of items (features, images, tags)
- Object: Nested structures (address with city/state/zip)
String Validations
- Enum: Fixed list of allowed values (e.g., "monthly", "yearly")
- Pattern: Regex validation (max 500 characters)
- Format: Predefined formats — date-time, date, time, email, hostname, ipv4, ipv6, uuid
Number Validations
- Minimum: Lowest allowed value (e.g., 0 for prices)
- Maximum: Highest allowed value (e.g., 5 for ratings)
- Multiple Of: Step size that controls decimal precision (e.g., 0.01 allows prices such as 29.99)
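To see how these three constraints behave together, here is a small sketch that checks a value against them. It is illustrative only and not part of the platform. `satisfies` is a hypothetical helper, and values are passed as strings so that `Decimal` can avoid binary floating-point error when testing multiples of 0.01.

```python
from decimal import Decimal
from typing import Optional


def satisfies(value: str, minimum: Optional[str] = None,
              maximum: Optional[str] = None,
              multiple_of: Optional[str] = None) -> bool:
    """Check a numeric value against minimum, maximum, and multipleOf."""
    number = Decimal(value)
    if minimum is not None and number < Decimal(minimum):
        return False
    if maximum is not None and number > Decimal(maximum):
        return False
    # A value is a valid multiple when the remainder is exactly zero.
    if multiple_of is not None and number % Decimal(multiple_of) != 0:
        return False
    return True
```

For a price field with a minimum of 0 and a multiple of 0.01, the value 29.99 passes, while 29.995 and -1 are rejected.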
Code Mode
Direct JSON editing for advanced users. The schema follows OpenAPI 3.1.0 specification.
Editor Features
- Syntax highlighting for JSON
- Real-time validation with error highlighting
- Line numbers and bracket matching
- Find and replace (Cmd/Ctrl + F)
Common Code Mode Tasks
- Adding $ref for reusable component definitions
- Using anyOf for union types (string or null)
- Setting complex regex patterns
- Bulk editing field descriptions
AI Schema Modification
Use natural language to edit your schema. Click "Edit with AI" and describe what you want to change:
- "Add a 'currency' field with enum: USD, EUR, GBP"
- "Change 'price' from string to number with 2 decimal precision"
- "Make 'discount_price' nullable (can be null if no discount)"
- "Add a nested 'dimensions' object with width, height, depth as numbers"
- "Remove all fields related to shipping"
Writing Good Field Descriptions
Field descriptions are critical — the extraction agent reads them to understand what to look for. A good description tells the agent exactly what to extract and where to find it.
Description Examples
Good:
"The current sale price displayed in the product card, as a decimal number without currency symbol (e.g., 29.99)"
Bad:
"Price"
Good:
"Boolean indicating if the product is currently in stock. True if 'In Stock' or 'Available', false if 'Out of Stock' or 'Sold Out'"
Bad:
"Stock status"
Understanding Schemas
Your schema is the foundation of your warp. It defines two things simultaneously: the structure of your API (what developers consume) and the instructions for the extraction agent (what to look for). A well-designed schema leads to accurate, consistent extractions.
The Schema Is the Source of Truth
The agent follows your schema exactly. Every field description, every type constraint, every validation rule directly influences what gets extracted and how. Review your schema carefully before deploying.
Better schemas = lower costs + higher quality: Precise field descriptions reduce agent confusion (fewer retries, fewer tokens). Adding enums, patterns, and constraints ensures clean data. This is especially important for recurring scheduled jobs where small improvements multiply over hundreds of runs.
Schema Structure
Warpstack schemas follow the OpenAPI 3.1.0 specification. Here's the basic structure:
- info.title: A descriptive name for your API
- info.description: High-level instructions for the agent about what to extract
- paths: Defines your API endpoint (always "/" for Warpstack)
- components.schemas: Your data models with field definitions
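The four parts listed above fit together roughly as follows. This is a hand-written sketch built as a Python dictionary so it can be printed as JSON. The `Plan` model, its fields, and the response layout under `paths` are illustrative assumptions, not the exact output of the schema generator.

```python
import json

schema = {
    "openapi": "3.1.0",
    "info": {
        "title": "Pricing Plans API",
        "version": "1.0.0",
        # High-level instructions for the agent.
        "description": "Extract every plan from the pricing table. "
                       "Ignore the FAQ and footer.",
    },
    "paths": {
        "/": {
            "get": {
                "responses": {
                    "200": {
                        "description": "Extracted pricing plans",
                        "content": {"application/json": {"schema": {
                            "type": "array",
                            "items": {"$ref": "#/components/schemas/Plan"},
                        }}},
                    }
                }
            }
        }
    },
    "components": {"schemas": {"Plan": {
        "type": "object",
        "properties": {
            "name": {"type": "string",
                     "description": "Plan name as shown in the pricing table"},
            "monthly_price": {"type": "number", "minimum": 0,
                              "description": "Monthly price without currency symbol"},
            "yearly_price": {"type": ["number", "null"],
                             "description": "Yearly price, or null if not shown"},
        },
        # Every property is listed as required; nullable types cover missing data.
        "required": ["name", "monthly_price", "yearly_price"],
        "additionalProperties": False,
    }}},
}

print(json.dumps(schema, indent=2))
```

The printed JSON can be pasted into Code Mode as a starting point and then adjusted to match your own fields.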
The info.description Field
This is crucial — the agent reads this to understand its overall mission. Include:
- What type of data you're extracting
- Where on the page to look
- What to include and exclude
- How to handle edge cases
Structured Outputs Rules
Warpstack uses OpenAI's Structured Outputs feature, which has strict schema requirements. Violating any of these rules will cause validation errors.
Rule 1: All Properties Must Be Required
Every property you define must be listed in the required array. Optional fields are not supported — if a field might not exist, use anyOf with null.
Rule 2: additionalProperties Must Be False
Every object (root and nested) must include "additionalProperties": false. This prevents the agent from adding unexpected fields.
Rule 3: Arrays Must Define Items
Every array type must have an items property defining the element type.
Rule 4: Forbidden Keywords
Do not use these OpenAPI keywords — they're not supported by Structured Outputs:
- example: use description instead
- default: not supported
- minLength / maxLength: use pattern instead
- format "uri": use "hostname" or no format
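Rules 1 through 4 are mechanical enough to check before you paste a schema into the editor. The sketch below is a simplified local checker, not the platform's validator. `check_schema` is a hypothetical name, and it inspects a single data model (one entry under components.schemas) passed as a Python dictionary.

```python
from typing import Any, Dict, List

FORBIDDEN_KEYWORDS = {"example", "default", "minLength", "maxLength"}


def check_schema(node: Dict[str, Any], path: str = "$") -> List[str]:
    """Return readable violations of Rules 1-4 for one schema node."""
    problems: List[str] = []
    # Rule 4: forbidden keywords and the unsupported "uri" format.
    for keyword in sorted(FORBIDDEN_KEYWORDS.intersection(node)):
        problems.append(f"{path}: forbidden keyword '{keyword}'")
    if node.get("format") == "uri":
        problems.append(f"{path}: format 'uri' is not supported")

    declared = node.get("type")
    types = declared if isinstance(declared, list) else [declared]
    if "object" in types:
        props = node.get("properties", {})
        # Rule 1: every property must appear in "required".
        missing = sorted(set(props) - set(node.get("required", [])))
        if missing:
            problems.append(f"{path}: properties not in 'required': {missing}")
        # Rule 2: additionalProperties must be exactly false.
        if node.get("additionalProperties") is not False:
            problems.append(f"{path}: 'additionalProperties' must be false")
        for name, child in props.items():
            problems.extend(check_schema(child, f"{path}.{name}"))
    if "array" in types:
        # Rule 3: arrays must define their element type.
        if "items" not in node:
            problems.append(f"{path}: array is missing 'items'")
        else:
            problems.extend(check_schema(node["items"], f"{path}[]"))
    for index, option in enumerate(node.get("anyOf", [])):
        problems.extend(check_schema(option, f"{path}.anyOf[{index}]"))
    return problems
```

An empty list means no violations of these four rules were found. It does not guarantee that the schema passes every check applied during generation or deployment.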
Allowed String Formats
These are the only valid format values for strings:
- date-time: ISO 8601 datetime (2025-01-15T10:30:00Z)
- date: ISO 8601 date (2025-01-15)
- time: ISO 8601 time (10:30:00)
- duration: ISO 8601 duration (PT1H30M)
- email: Email address
- hostname: Domain name
- ipv4: IPv4 address
- ipv6: IPv6 address
- uuid: UUID format
Schema Limits
- Nesting: Maximum 5 levels deep (prefer 3 or fewer)
- Properties per object: Maximum 100 (prefer under 50)
- Total fields: Maximum 500
- Regex patterns: Maximum 500 characters
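These limits can be estimated locally for a data model before deploying. The sketch below rests on assumptions about how the limits are counted: the root object is level 1, each nested object adds one level, and arrays are looked through to their element schema. `measure` and `within_limits` are hypothetical names.

```python
from typing import Any, Dict

MAX_DEPTH = 5
MAX_PROPERTIES_PER_OBJECT = 100
MAX_TOTAL_FIELDS = 500


def _has_type(node: Dict[str, Any], name: str) -> bool:
    declared = node.get("type")
    return name in declared if isinstance(declared, list) else declared == name


def measure(node: Dict[str, Any], level: int = 1) -> Dict[str, int]:
    """Return nesting depth, widest object, and total fields for an object schema."""
    props = node.get("properties", {})
    stats = {"depth": level, "widest": len(props), "fields": len(props)}
    for child in props.values():
        # Look through arrays to the schema of their elements.
        while _has_type(child, "array") and isinstance(child.get("items"), dict):
            child = child["items"]
        if _has_type(child, "object"):
            nested = measure(child, level + 1)
            stats["depth"] = max(stats["depth"], nested["depth"])
            stats["widest"] = max(stats["widest"], nested["widest"])
            stats["fields"] += nested["fields"]
    return stats


def within_limits(schema: Dict[str, Any]) -> bool:
    stats = measure(schema)
    return (stats["depth"] <= MAX_DEPTH
            and stats["widest"] <= MAX_PROPERTIES_PER_OBJECT
            and stats["fields"] <= MAX_TOTAL_FIELDS)
```

A product model with one nested seller object measures as depth 2, which is comfortably inside the preferred range of 3 levels or fewer.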
Monitoring Your Warps
After deploying a warp, you can watch the extraction agent work in real-time. The Monitor tab provides full observability into what your agent is doing.
Live Agent View
See exactly what your agent sees. The browser view shows real-time screenshots as the agent navigates the target site:
- Browser screenshots: Updated every few seconds during extraction
- Highlighted actions: See what the agent is clicking or reading
- Navigation steps: Watch the agent move through pages
Activity Log
The activity log shows the agent's reasoning and actions as it extracts data:
- Step-by-step actions: "Navigating to URL", "Extracting products", "Found 24 items"
- Agent reasoning: Why it made certain decisions
- Timestamps: How long each step took
- Errors: What went wrong if extraction fails
Execution Status
Key metrics displayed during and after extraction:
- Status: Running, Completed, or Failed
- Records: Number of items extracted
- Duration: Total extraction time
- Cost: Tokens consumed × price per token
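As a worked example of the Cost metric, the sketch below multiplies tokens by a per-token price. The rate of $2.00 per 1M tokens is a placeholder chosen for the arithmetic, not a Warpstack price.

```python
def extraction_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one run: tokens consumed × price per token."""
    return tokens * (price_per_million / 1_000_000)


# Hypothetical blended rate of $2.00 per 1M tokens (placeholder, not a real price).
cost = extraction_cost(25_000, 2.00)
print(f"${cost:.4f} per run")
```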
Manage Access Token
From the Monitor page, you can secure your API endpoint with an access token. Click the link icon next to the API URL at the top of the page to open the access token manager.
- No token configured: Your endpoint is publicly accessible (default)
- Generate Access Token: Creates a secure token to require authentication
- Copy token: Token is shown once—copy it immediately
- Delete token: Revokes access and makes endpoint public again
See Access Tokens in the Stargate API section for complete usage details.
Understanding Token Usage
Token usage during extraction depends on several factors:
- Page complexity: More content = more tokens to process
- Schema size: More fields = more output tokens
- Navigation depth: Multi-page extractions use more tokens
- Visual analysis: Processing screenshots adds ~20-30% more tokens
Expect 20,000+ tokens for simple extractions. Complex sites with lots of content or multi-step navigation will use significantly more.
Managing Deployed Warps
Once a warp is deployed, the schema is locked to protect API consumers from breaking changes. You can still make changes, but you need to follow a specific process.
Schema Locking
When you deploy a warp, its schema becomes locked. This means:
- Your API continues working: While you edit, consumers keep getting the old schema
- No accidental changes: You can't accidentally break your API
- Explicit deployment: Changes only go live when you click "Deploy Warp"
Editing an Active Warp
To modify a deployed warp's schema:
Click "Edit Schema"
In the warp editor, click Edit Schema to unlock the schema for modifications.
Make Your Changes
Edit the schema in visual or code mode as needed.
Preview Changes
Review the diff to understand what will change in your API.
Deploy the New Schema
Click "Deploy Warp" to deploy your updated schema.
Breaking Changes to Avoid
Some changes will break existing API consumers. Avoid these unless you're coordinating with all consumers:
- Renaming fields: Old field names will no longer exist
- Changing field types: String to number will break JSON parsing
- Removing fields: Consumers expecting them will error
- Restructuring: Moving fields or changing nesting
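If you keep copies of the old and new schema, a quick comparison can flag some of these breaking changes before you deploy. This is a simplified sketch that only compares top-level properties; the function name is hypothetical and it is separate from the diff shown in the Preview Changes step. A rename shows up as a removal, because the old name is gone.

```python
def breaking_changes(old_props: dict, new_props: dict) -> list:
    """List removed (or renamed) fields and fields whose type changed."""
    changes = []
    for name, old in old_props.items():
        if name not in new_props:
            changes.append(f"removed or renamed: {name}")
        elif old.get("type") != new_props[name].get("type"):
            old_type, new_type = old.get("type"), new_props[name].get("type")
            changes.append(f"type changed: {name} ({old_type} -> {new_type})")
    return changes


old = {
    "name": {"type": "string"},
    "price": {"type": "string"},
    "sku": {"type": "string"},
}
new = {
    "title": {"type": "string"},  # "name" was renamed
    "price": {"type": "number"},  # string became number
}
changes = breaking_changes(old, new)
for change in changes:
    print(change)
```

Newly added fields such as title are not reported, since adding a field does not remove anything existing consumers rely on.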
Manual Runs
You can trigger an extraction manually at any time, regardless of schedule. Click "Run Now" in the warp header. Use this for:
- Testing after schema changes
- Getting fresh data immediately
- Debugging extraction issues
- One-off data collection (when schedule is set to Manual)
Pausing a Warp
Pausing stops scheduled runs but keeps everything else active:
- Your API endpoint remains available
- Existing data is still served
- No new extractions occur (no new costs)
- Resume anytime to restart the schedule
Deleting a Warp
Deleting a warp is permanent:
- Your API endpoint stops working immediately
- Data is retained for 30 days, then permanently deleted
- This action cannot be undone
Scheduling
Scheduling controls how often your warp runs and extracts fresh data. Choose based on how frequently the source data changes and your cost tolerance.
Optimize Before You Schedule
Recurring jobs amplify everything—good and bad. A small inefficiency in your prompt or schema becomes significant when multiplied by hundreds of runs.
Before setting up hourly or daily schedules: review your schema thoroughly, add field validations (enums, patterns, constraints), write precise field descriptions, and test manually until you're satisfied with the output quality. An extra hour optimizing upfront can save significant costs over time.
Available Schedules
Manual
No automatic runs. The warp only extracts data when you click "Run Now". Best for:
- Testing and development
- One-off data collection projects
- Infrequently updated sources
Hourly
Runs every hour, 24 times per day. Best for:
- Real-time price monitoring
- Inventory tracking
- Breaking news feeds
- Stock data (during market hours)
Daily
Runs once per day at a consistent time. The most common choice, balancing freshness and cost. Best for:
- Product catalogs
- Pricing pages
- Job listings
- News aggregation
Weekly
Runs once per week. Best for:
- Market research reports
- Documentation pages
- Reference data
- Slowly changing content
Schedule Cost Impact
Schedule frequency is the biggest cost driver. Consider carefully:
- Hourly: 720 runs/month (24 runs/day × 30 days), ~24x the cost of daily
- Daily: 30 runs/month, the baseline cost
- Weekly: ~4 runs/month, ~1/7 the cost of daily
Always choose the lowest frequency that meets your data freshness requirements.
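The run counts above can be turned into a quick monthly estimate. In this sketch the cost per run ($0.05) is a made-up placeholder, and a month is treated as 30 days.

```python
DAYS_PER_MONTH = 30
RUNS_PER_MONTH = {
    "hourly": 24 * DAYS_PER_MONTH,  # 720 runs
    "daily": DAYS_PER_MONTH,        # 30 runs
    "weekly": DAYS_PER_MONTH / 7,   # about 4.3 runs
}


def monthly_cost(schedule: str, cost_per_run: float) -> float:
    """Estimated monthly extraction cost for a schedule."""
    return RUNS_PER_MONTH[schedule] * cost_per_run


COST_PER_RUN = 0.05  # hypothetical cost of a single extraction, in dollars
for schedule, runs in RUNS_PER_MONTH.items():
    relative = runs / RUNS_PER_MONTH["daily"]
    cost = monthly_cost(schedule, COST_PER_RUN)
    print(f"{schedule}: {runs:.0f} runs, ${cost:.2f}/month, {relative:.2f}x daily")
```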
Intelligence Tiers
Intelligence tiers determine which AI model powers your extraction agent. Different models have different capabilities and costs. The "per 1M tokens" price is a blended rate combining both input tokens (page content) and output tokens (extracted data).
Both tiers use vision AI to see and understand web pages like a human. The difference is in capability and cost.
V1-Lite
Fast and cost-efficient. Best for straightforward pages with clear structure.
Best For
- Product listings and directories
- Blog posts and articles
- Simple contact pages
- High-volume extraction where cost matters
V1-Base — Recommended
More powerful AI for complex sites. Handles dynamic content and tricky interfaces.
Best For
- Dynamic JavaScript rendering
- Infinite scroll and pagination
- JavaScript-heavy applications
- Complex layouts and navigation
- Charts, dashboards, and visual data
Choosing the Right Tier
Start with V1-Base for most use cases. It's only 2x the cost of V1-Lite but handles a much wider range of sites. Only use V1-Lite if:
- The target site has straightforward, simple pages
- You've tested and V1-Lite provides acceptable accuracy
- You're running high-volume extractions and need to minimize cost
Stargate Data API
API Overview
The Stargate API provides high-performance, read-only access to your extracted data with SQL-like querying capabilities.
Key Features
- Public by default, optionally secured with access tokens
- Sub-second response times
- SQL-like filtering, sorting, pagination
- Nested field access with dot notation
- JSON and CSV output formats
- Unique URL per warp: https://{warp-id}.stargate.warpstack.ai
Endpoints
Get Data
GET https://{warp-id}.stargate.warpstack.ai/
Returns your extracted data. Supports filtering, sorting, pagination, and column selection.
Response Structure
{
  "data": [...],               // Your data array
  "total": 156,                // Total record count
  "available_columns": [...],  // All available column names
  "count": 25                  // Records in this response
}
Get Statistics
GET https://{warp-id}.stargate.warpstack.ai/stats
Returns dataset statistics and metadata.
Response Structure
{
  "total_records": 156,
  "storage_bytes": 524288,
  "total_sessions": 12,
  "total_sessions_duration_ms": 540000
}
Access Tokens
By default, Stargate endpoints are publicly accessible. You can secure your endpoint by generating an access token, which requires all API requests to include authentication.
When to Use Access Tokens
Enable access tokens when your data is sensitive or you want to control who can access your API. For public datasets or internal testing, leaving endpoints public is often more convenient.
Generating an Access Token
Navigate to Monitor Page
Open your warp and go to the Monitor tab.
Click the Link Icon
Next to your API URL at the top, click the link icon (🔗) to open the access token manager.
Generate Access Token
Click "Generate Access Token" to create a new token. Your token will be displayed once—copy it immediately and store it securely.
Using Access Tokens
Once a token is configured, all requests to your endpoint must include authentication. There are two methods:
Option 1: Authorization Header (Recommended)
curl https://{warp-id}.stargate.warpstack.ai \
  -H "Authorization: Bearer {your-token}"
Option 2: Query Parameter
GET https://{warp-id}.stargate.warpstack.ai?access_token={your-token}
Note: Using the header is preferred because query parameters may be logged in server access logs.
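In Python, the header option might look like the sketch below. It uses only the standard library so it runs without extra packages; build_request is a hypothetical helper, and "my-warp" and "example-token" are placeholders for your own warp ID and token.

```python
import urllib.request
from typing import Optional


def build_request(warp_id: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for a warp's Stargate endpoint."""
    request = urllib.request.Request(f"https://{warp_id}.stargate.warpstack.ai/")
    if token:
        # Option 1: send the token in the Authorization header.
        request.add_header("Authorization", f"Bearer {token}")
    return request


request = build_request("my-warp", token="example-token")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen sends it. If the token is missing or invalid, urlopen raises urllib.error.HTTPError, whose code attribute holds the HTTP status.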
Revoking Access
To revoke an access token, open the access token manager and click the delete button. Once deleted, the endpoint becomes publicly accessible again.
Important Notes
- Token changes may take up to 10 minutes to propagate
- Tokens are shown once at creation; store them securely
- Each warp supports one active access token
- Requests without valid authentication return 401 Unauthorized
Column Selection
Use the select parameter to specify which columns to return. If select is omitted, the response includes the first 50 columns.
Basic Selection
?select=id,name,email
Returns only these three columns.
Nested Field Selection
Access nested object fields using dot notation:
?select=user.name,user.email,address.city
Returns fields from within nested objects.
Array Element Projection
Extract specific fields from array elements:
?select=items[].name,items[].price
Returns only name and price from each item in the array.
Filtering
Filter data using filter=column:operator:value syntax. Combine multiple filters with commas (AND logic).
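?filter=status:eq:active,price:lt:100
Available Operators
The examples in this section use eq (equals), lt (less than), gte (greater than or equal to), and ilike (case-insensitive pattern match, with % as a wildcard).
Filter Examples
Find active items under $100:
?filter=status:eq:active,price:lt:100
Find Gmail users (when building the URL by hand, percent-encode % as %25):
?filter=email:ilike:%@gmail.com
Find products in stock with a rating of 4 or higher:
?filter=in_stock:eq:true,rating:gte:4
Sorting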
Sort results using sort=column:direction. Direction is asc (ascending) or desc (descending).
Sort by price (lowest first):
?sort=price:asc
Sort by created date (newest first):
?sort=created_at:desc
Multi-column sort (price desc, then name asc):
?sort=price:desc,name:asc
Pagination
Control how many records to return. Default limit is 100, maximum is 10,000.
Offset-Based
?limit=25&offset=50
Skips the first 50 records and returns the next 25.
Best for: Shallow pagination (pages 1-10)
Cursor-Based
?limit=25&cursor=eyJpZCI6MTIzfQ
Use the cursor from the previous response to fetch the next page.
Best for: Deep pagination, large datasets
Using Cursor Pagination
The response includes a `cursor` field for the next page:
{
  "data": [...],
  "cursor": "eyJpZCI6MTUwfQ",  // Use this for next request
  "total": 500
}
For the next page: ?limit=25&cursor=eyJpZCI6MTUwfQ
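A loop over cursor pages could be structured as in the sketch below. It relies on one assumption that this page does not state: the last page omits the cursor field or sets it to null. The fetch function is injected so the loop can be demonstrated offline with fake pages; in practice it would call your Stargate endpoint with limit and cursor parameters.

```python
from typing import Callable, Iterator, Optional


def iter_records(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record, following the cursor from one page to the next."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("cursor")
        if not cursor:  # Assumption: no cursor means this was the last page.
            break


# Fake pages standing in for real API responses.
PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "cursor": "page-2", "total": 3},
    "page-2": {"data": [{"id": 3}], "total": 3},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return PAGES[cursor]


records = list(iter_records(fake_fetch))
print(records)
```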
Output Formats
JSON (Default)
?output=json
Structured JSON with full nested data support.
Best for programmatic access, apps, dashboards
CSV
?output=csv
Spreadsheet-ready format. Nested data is flattened.
Best for Excel, Google Sheets, data analysis
Usage Examples
cURL
Basic query:
curl "https://my-warp.stargate.warpstack.ai/?limit=10"
With filtering and sorting:
curl "https://my-warp.stargate.warpstack.ai/?filter=price:lt:100&sort=price:asc&limit=20"
Export to CSV:
curl "https://my-warp.stargate.warpstack.ai/?output=csv" > data.csv
JavaScript / Fetch API
const response = await fetch(
  'https://my-warp.stargate.warpstack.ai/?limit=10&sort=created_at:desc'
);
const { data, total } = await response.json();
console.log(`Retrieved ${data.length} of ${total} total records`);
data.forEach(item => console.log(item));
Python
import requests

# Basic query: the ten most recent active records
response = requests.get(
    'https://my-warp.stargate.warpstack.ai/',
    params={
        'limit': 10,
        'sort': 'created_at:desc',
        'filter': 'status:eq:active'
    }
)
response.raise_for_status()  # Raise on HTTP errors such as 401 Unauthorized
data = response.json()
print(f"Retrieved {len(data['data'])} records")
for item in data['data']:
    print(item)
Combining Parameters
You can combine multiple query features for powerful data access:
?select=name,price,rating&filter=price:lt:100,rating:gte:4&sort=price:asc&limit=20&offset=0
Returns name, price, and rating for items under $100 with rating ≥4, sorted by price ascending, first 20 results.
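When a query has several parts, building the string in code is less error-prone than concatenating it by hand. The helper below is a hypothetical sketch, not an official client. It keeps commas and colons unescaped so the output matches the syntax shown above; whether values that themselves contain commas or colons need extra escaping is not covered on this page.

```python
from urllib.parse import urlencode


def build_query(select=None, filters=None, sort=None, limit=None, offset=None) -> str:
    """Assemble a Stargate query string from structured arguments."""
    params = {}
    if select:
        params["select"] = ",".join(select)
    if filters:
        # Each filter is a (column, operator, value) tuple; commas mean AND.
        params["filter"] = ",".join(f"{col}:{op}:{val}" for col, op, val in filters)
    if sort:
        # Each sort entry is a (column, direction) tuple.
        params["sort"] = ",".join(f"{col}:{direction}" for col, direction in sort)
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return "?" + urlencode(params, safe=",:")


query = build_query(
    select=["name", "price", "rating"],
    filters=[("price", "lt", 100), ("rating", "gte", 4)],
    sort=[("price", "asc")],
    limit=20,
    offset=0,
)
print(query)
```

Other characters are still percent-encoded, so a pattern such as %@gmail.com is sent as %25%40gmail.com.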
Error Handling
Invalid query parameters.
Common causes:
- Unknown column name in select/filter/sort
- Malformed filter syntax
- Invalid operator
Path security violation or invalid permissions.
Warp ID doesn't exist or schema not deployed.
Check your warp ID is correct and warp is ACTIVE.
Workbench
Data Viewer
The Workbench is your data exploration tool. View, filter, sort, and export your extracted data.
Accessing Workbench
Click on any warp in your dashboard, then navigate to the "Workbench" tab to view your data in table format.
Features
- Table view with all extracted records
- Automatic column detection
- Nested data visualization
- Real-time data refresh
- Column-based filtering
- Multi-column sorting
- Pagination controls
- Export to CSV
Filtering & Sorting Data
Column Filters
Click the filter icon in any column header to filter by that column's values.
- Text columns: Contains/equals filters
- Number columns: Range filters (greater than, less than)
- Multiple filters: Combine across columns (AND logic)
Sorting
Click column headers to sort:
- First click: Sort ascending
- Second click: Sort descending
- Third click: Remove sort
- Hold Shift: Multi-column sort
Exporting Data
Export to CSV
Click the "Export" button to download your data as CSV.
- Exports current filtered/sorted view
- Nested objects are flattened (dot notation)
- Arrays are comma-separated
- Compatible with Excel, Google Sheets
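To picture what the flattening rules above do to a record, here is an illustrative sketch. It is not the exporter's actual code, and how arrays of objects are rendered is not specified on this page, so this version simply joins the string form of each element.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dot-notation columns; join arrays with commas."""
    flat = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, f"{column}."))
        elif isinstance(value, list):
            flat[column] = ",".join(str(item) for item in value)
        else:
            flat[column] = value
    return flat


record = {
    "name": "Widget",
    "address": {"city": "Austin", "geo": {"lat": 30.27}},
    "tags": ["new", "sale"],
}
row = flatten(record)
print(row)
```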
Export via API
For programmatic exports, use the Stargate API directly:
curl "https://my-warp.stargate.warpstack.ai/?output=csv&limit=10000" > export.csv
Metrics & Costs
Understanding Metrics
Warpstack provides detailed metrics to help you understand your usage and costs. The dashboard displays real-time data on extraction usage, storage consumption, and API serving across customizable time ranges.
Viewing Your Metrics
Access your metrics from the Usage section in the dashboard. You can view usage across different time periods:
- Day: Last 24 hours of activity
- Week: Rolling 7-day window
- Month: Current billing cycle (30 days)
- Total: All-time cumulative usage
Key Metrics Tracked
Warpstack tracks three categories of usage that contribute to your costs:
- Extraction (Tokens): LLM tokens consumed during data extraction runs
- Storage (GB): Total data stored across all your warps
- Serving (Bandwidth): Data transferred via Stargate API calls
Cost Breakdown
All Warpstack costs are usage-based. You only pay for what you use, with no hidden fees or minimums.
Extraction Costs (Token-Based Pricing)
Extraction costs are based on LLM token consumption. The "per 1M tokens" price is a blended rate that combines both input tokens (the page content sent to the model) and output tokens (the extracted data returned). This single metric simplifies cost calculation.
Token Usage Expectations
Expect 20,000+ tokens per extraction for simple pages. Complex sites with more content, navigation, or multiple data points will use significantly more. Token usage scales with page complexity, schema size, and the amount of data being extracted.
Intelligence Tiers
- V1-Lite: For simple, straightforward pages.
- V1-Base: Recommended. For complex sites with dynamic content.
Extraction Usage Example
A simple product extraction using V1-Base:
- ~25,000 tokens per extraction
- Token usage scales with page complexity and schema size
- Request a demo to discuss pricing based on your volume requirements
Storage
Storage is billed monthly based on the total data stored across all your warps. Extracted data is stored as optimized JSON and is typically very space-efficient.
- 100,000 records: ~100 MB
- 1 million records: ~1 GB
- Included: Generous storage allocation in enterprise plans
Serving (API Bandwidth)
Serving is based on the bandwidth consumed when your applications fetch data via the Stargate API. This includes all GET requests to your warp endpoints.
- Typical response size: 1-50 KB depending on data volume
- Included: API bandwidth in enterprise plans
Cost Optimization
Optimize your Warpstack costs by following these strategies to reduce token usage and improve efficiency.
Choose the Right Intelligence Tier
Start with V1-Base for most sites. If the target is simple and you need to minimize cost, test V1-Lite. See Intelligence Tiers for details.
Optimize Refresh Frequency
The biggest cost lever is how often your warp runs. Choose the minimum frequency that meets your needs:
- Hourly: Real-time monitoring, price tracking (highest cost)
- Daily: Most common for catalog data, news, listings
- Weekly: Slowly changing data, documentation, reference data
- Manual: One-time extractions, on-demand updates (lowest cost)
Switching from hourly to daily reduces costs by ~24x. Daily to weekly reduces by another ~7x.
Write Targeted Prompts
Specific, focused prompts reduce token usage by helping the agent work more efficiently:
- Include the exact URL path if known (e.g., "Extract from /pricing")
- Specify exactly what data you need — avoid "get everything"
- Mention where on the page the data appears (header, table, footer)
- Describe edge cases upfront to avoid retries
Simplify Your Schema
Every field in your schema adds to extraction complexity and token usage. Only extract what you'll actually use:
- Remove fields you don't need in your application
- Avoid deeply nested structures when flat data works
- Use simple types (string, number) over complex validations when possible
Monitor and Pause Unused Warps
Regularly review your active warps. Pause or delete warps that are no longer needed to stop accumulating extraction costs. You can always re-deploy a paused warp later.
Understanding Your Balance
The Usage section shows your current balance and projected runway:
- Month-to-Date: Total spent so far this billing cycle
- Burn Rate: Average daily spending based on recent usage
- Runway: Days until balance depletes (Balance ÷ Daily Burn Rate)
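The runway formula is a single division. The sketch below uses made-up numbers and treats a zero burn rate as an unlimited runway, which is an assumption about an edge case the dashboard description does not cover.

```python
def runway_days(balance: float, daily_burn_rate: float) -> float:
    """Runway = Balance ÷ Daily Burn Rate."""
    if daily_burn_rate <= 0:
        return float("inf")  # Assumption: no spending means the balance never depletes.
    return balance / daily_burn_rate


days = runway_days(balance=150.0, daily_burn_rate=5.0)
print(f"{days:.0f} days of runway")
```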
Activity & Notifications
Activity Feed
Track all warp activity in one timeline. See what's happening across your account.
Warp execution completed successfully
Warp deployed or re-deployed with schema changes
Execution failed - check logs for details
Funds added or usage charges applied
Notification Center
Stay informed about important events. Access via the bell icon in the top navigation.
Notification Types
- Execution Failures: When a warp run fails
- Deployment Complete: When warp deployment finishes
- Low Balance: When your balance is running low
- System Updates: Platform news and feature releases
Managing Notifications
- Click to view details
- Mark as read individually
- "Mark all as read" button
- Dismiss to remove from list
- Configure email preferences in Settings
Settings
Profile Settings
Customize your account, security, notifications, and preferences.
Profile Settings
Manage your account information and contact details.
Editable Fields
- Full name
- Email address (requires re-verification)
- Company/Organization (optional)
Security Settings
Change your password from the Settings page:
Go to Settings → Security
Navigate to the Settings page and select the Security tab.
Enter Current Password
For security, you must verify your identity by entering your current password.
Enter New Password
Create a new password with at least 8 characters, then confirm it by entering it again.
Click "Change Password"
Your password is updated immediately. Current sessions remain active.
Notification Preferences
Control which email notifications you receive.
Execution Failures
Get notified when warp runs fail
System Updates
Platform news, new features, maintenance
Product Updates
New feature announcements and improvements
Weekly Summary
Weekly digest of your warps and usage
User Preferences
Theme
Choose your interface appearance:
Timezone
Select your timezone for accurate time displays. Affects when "daily" warps run.
Date Format
Choose how dates are displayed throughout the interface.
Language
Interface language selection.
Troubleshooting
Schema Validation Errors
These are the most common schema errors and how to fix them.
Error: "Properties not in required array"
This is the #1 most common error. It means you defined a property but forgot to list it in the `required` array.
❌ Wrong:
{
  "properties": {
    "name": {...},
    "email": {...}
  },
  "required": ["name"]
}
✅ Fixed:
{
  "properties": {
    "name": {...},
    "email": {...}
  },
  "required": ["name", "email"]
}
Error: "Array must have items property"
Every array type must define what its elements are using the `items` property.
❌ Wrong:
{
  "tags": {
    "type": "array",
    "description": "Tags"
  }
}
✅ Fixed:
{
  "tags": {
    "type": "array",
    "description": "Tags",
    "items": {"type": "string"}
  }
}
Error: "Objects must have additionalProperties: false"
Every object (including nested objects) must explicitly set additionalProperties to false.
{
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {"type": "string"}
  },
  "additionalProperties": false  // ✅ Add this
}
Error: "Unsupported property: example" (or default, minLength, etc.)
Structured Outputs forbids certain OpenAPI properties.
Forbidden properties:
- `example` - Put examples in description instead
- `default` - Not supported
- `minLength`, `maxLength` - Use `pattern` with regex instead
- `format: "uri"` - Use `format: "hostname"` instead
Deployment Failures
Warp Stuck in "Deploying"
If your warp stays in DEPLOYING status for more than 2 minutes:
- Refresh the page - status might have updated
- Check for schema validation errors in the editor
- Try deleting and recreating the warp
- Contact support if the issue persists
Deployment Error Message
If deployment fails with an error:
- Read the error message carefully - it usually indicates the exact problem
- Check schema validation (often the cause)
- Ensure all required fields are present
- Try simplifying your schema if it is very complex
Extraction Failures
Agent Couldn't Find Data
When the agent completes but extracts 0 records:
- Check target URL: Is it the correct page?
- Improve prompt: Be more specific about where data appears
- Enable Visual Analysis: Might help with dynamic content
- Check Live View: See what the agent is seeing
- Simplify schema: Start with fewer fields, expand gradually
CAPTCHA or Login Required
Warpstack cannot handle authentication-required pages or CAPTCHA challenges.
Workaround: Try to find public/non-authenticated pages with the same data, or wait for our authenticated browser feature (coming soon).
Timeout Errors
If extraction times out (usually after 5-10 minutes):
- Page might be too slow to load
- Try providing more direct URLs (skip homepage navigation)
- Simplify extraction task
- Use V1-Lite for faster extraction (if applicable)
Getting Help
Support Channels
- Email: support@warpstack.ai
- In-app: Help icon in navigation
- Response time: Usually within 24 hours
When Contacting Support
Include:
- Your warp ID
- Error messages (screenshots helpful)
- What you were trying to do
- Your target URL (if applicable)
Best Practices & Examples
The #1 Cause of Failed Extractions
Imprecise prompts and misconfigured schemas cause the vast majority of extraction failures—not site complexity or agent limitations. Warpstack agents are powerful, but like any AI tool, they need clear, specific instructions. This section teaches you how to set them up for success.
Writing Effective Prompts
Your prompt is the foundation of every warp. A well-written prompt produces a better schema, which produces better data. Spend an extra minute here to save hours of debugging later.
✓ Do This
- Be specific: "Extract product name, price, and rating" not "Get product info"
- Include URL: "From nike.com/shoes extract..." saves navigation tokens
- Mention structure: "For each product card" helps the agent identify elements
- Note edge cases: "If discounted, capture both original and sale price"
- Add constraints: "Only products in stock" or "Skip sponsored items"
✗ Avoid This
- Too vague: "Get data from website"
- No context: "Extract prices" - prices of what?
- Ambiguous: "Get product details" - which details?
- Conversational filler: "Um, I need to like get some products..."
Schema Design Tips
Schema Design Best Practices
1. Keep It Simple
Start with essential fields only. Add complexity later if needed. Simpler schemas = faster extraction, lower costs, fewer errors.
2. Write Descriptive Field Descriptions
Field descriptions guide the AI agent. "Product price in USD as decimal (e.g., 29.99)" is better than "Price".
3. Use Appropriate Types
Prices → number with multipleOf: 0.01. Ratings → number with minimum/maximum. Status → enum with specific values.
4. Avoid Deep Nesting
Keep nesting to 2-3 levels max. Flat structures are easier to work with and query.
5. Test with Preview
Always generate and review preview data before deploying. Catches schema issues early.
6. Add Validations for Recurring Jobs
For scheduled warps, add enums, patterns, and constraints to ensure data quality. Validations catch extraction errors before they enter your API—critical when running hundreds of automated extractions.
7. Review Your Schema Before Deploy
The schema is what the agent sees—not your original prompt. After generation, review each field: Is the description clear? Is the type correct? Would you understand what to extract? Treat it like code review.
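Several of the tips above can be combined in one small schema object. The following sketch is written as a Python dictionary and printed as JSON. The field names, descriptions, and enum values are invented for illustration. It follows the Structured Outputs rules: every property is required, the optional value uses anyOf with null, and additionalProperties is false.

```python
import json

product_schema = {
    "type": "object",
    "description": "One product card from a listing page",
    "properties": {
        "name": {
            "type": "string",
            "description": "Product name as shown on the card",
        },
        "price": {
            "type": "number",
            "multipleOf": 0.01,
            "description": "Current price in USD as decimal (e.g., 29.99)",
        },
        "original_price": {
            # Optional values are expressed as anyOf with null, never by omission.
            "anyOf": [{"type": "number", "multipleOf": 0.01}, {"type": "null"}],
            "description": "Pre-discount price in USD; null when not discounted",
        },
        "rating": {
            "type": "number",
            "minimum": 0,
            "maximum": 5,
            "description": "Star rating on a 0-5 scale",
        },
        "status": {
            "type": "string",
            "enum": ["in_stock", "out_of_stock"],  # hypothetical values
            "description": "Stock status shown on the card",
        },
    },
    "required": ["name", "price", "original_price", "rating", "status"],
    "additionalProperties": False,
}

print(json.dumps(product_schema, indent=2))
```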
Use Case Examples
E-Commerce: Product Monitoring
Goal: Track competitor product pricing and availability
Example Prompt:
"Extract products from bestbuy.com/laptops including product name, brand, current price, original price if discounted, star rating (0-5 scale), number of reviews, and stock status. For each product, also get the product URL."
Intelligence:
V1-Base (complex layout, images)
Schedule:
Daily (prices change daily)
SaaS: Pricing Intelligence
Goal: Monitor competitor pricing changes
Example Prompt:
"From linear.app/pricing, extract all pricing tiers including plan name, monthly price, annual price, user limits, storage limits, feature list, and any 'Most Popular' or 'Recommended' badges. Handle pricing toggles if present."
Intelligence:
V1-Base (toggle interactions)
Schedule:
Weekly (prices change rarely)
Company Research
Goal: Gather company contact information
Example Prompt:
"From company website, extract: company legal name, headquarters address (street, city, state, zip), phone number with country code, primary email address, number of employees (if listed), and founded year. Check /about, /contact pages and footer."
Intelligence:
V1-Lite (mostly text)
Schedule:
Manual (one-time research)
News Monitoring
Goal: Track industry news and articles
Example Prompt:
"From techcrunch.com, extract latest articles including headline, author name, publication date, article summary/excerpt, category tags, and full article URL. Extract from homepage grid and 'Latest' section."
Intelligence:
V1-Lite (article lists)
Schedule:
Hourly (breaking news)
Job Market Data
Goal: Aggregate salary data for market research
Example Prompt:
"From job boards, extract Software Engineer positions including: job title, company name, salary range (min and max in USD), location, remote/hybrid/onsite designation, required years of experience, and key required skills. Focus on postings from last 30 days."
Intelligence:
V1-Base (varied layouts)
Schedule:
Weekly (market trends)
Reference
Glossary
- Warp
- An autonomous data extraction and serving job. Configured with a prompt and optional target URL, runs on a schedule you define, and serves data via a production API.
- Stargate
- The high-performance API layer that serves your extracted data via unique URLs: https://{warp-id}.stargate.warpstack.ai
- Schema
- The data contract defining your API structure. Written in OpenAPI 3.1.0 format with Structured Outputs compliance.
- Structured Outputs
- OpenAI's strict JSON schema format requiring all fields to be required, no additional properties, and specific validation rules. Ensures reliable, structured AI responses.
- Intelligence Tier
- The AI model used for extraction. V1-Lite for simple pages, V1-Base for complex content.
- Visual Analysis
- Enables vision AI capabilities allowing the agent to "see" the page like a human. Critical for charts, dynamic dashboards, and visual content.
- Extraction
- The process of gathering data from web pages using autonomous browser agents.
- Deployment
- Making a warp active/live. Locks the schema and begins data extraction. Creates your Stargate API endpoint.
- Schema Locking
- Protection mechanism that prevents accidental schema changes to deployed warps. Ensures API stability for consumers.
- Account Balance
- Prepaid funds used for usage-based billing. Add funds to your balance; usage costs are deducted in real time.
- Runway
- Days until your balance runs out, calculated from current balance and daily burn rate. Helps plan when to add funds.
- Burn Rate
- Average daily spending based on recent usage patterns. Used to calculate runway.
- Tokens
- Units of AI model consumption. Roughly 4 characters = 1 token. Used to measure and bill extraction costs.
Frequently Asked Questions
General
What is Warpstack?
Warpstack is the Data Acquisition Cloud — built for alternative datasets. Add new data sources in hours, stop maintaining scrapers, and ship with full audit trails from day one. Our AI agents use vision to "see" pages like humans, making them resilient to layout changes.
How is this different from web scraping?
Traditional scrapers use fragile CSS selectors that break when sites update. Warpstack agents use vision and reasoning to understand page content semantically, making them much more robust. They can also handle dynamic content, Canvas/WebGL visualizations, and complex interactions.
What websites work with Warpstack?
Most public websites work well. Best results with: e-commerce sites, SaaS pricing pages, news sites, product catalogs, directories, and public dashboards. Sites with CAPTCHAs or login requirements don't work yet.
Pricing & Billing
How does billing work?
Enterprise billing is customized based on your agreement. Usage costs (extraction, storage, serving) are included in your monthly invoice.
How do I increase my usage limits?
Contact your account manager to discuss increasing your usage limits or adjusting your enterprise agreement.
What's included in my enterprise agreement?
Enterprise agreements include custom SLAs, dedicated support, compliance features, and usage-based billing. Contact your account manager for details.
How is pricing structured?
Enterprise pricing is customized based on your data volume and requirements. Contact your account manager for details on your specific agreement.
Technical
What is a schema?
A schema defines the structure of your data using OpenAPI 3.1.0 format. It tells the agent what fields to extract and defines your API's response format.
Do I need to verify my email?
Yes. Email verification is required to create warps and use the platform. This ensures account security and enables critical notifications.
How often does my warp run?
Based on your schedule setting: Manual (you trigger), Hourly (every hour), Daily (once per day), or Weekly (once per week). Set this when deploying or in the Monitor tab.
How long is data stored?
Data is stored indefinitely while your warp is active. After deleting a warp, data is retained for 30 days then permanently deleted.
Can I delete my data?
Yes. Deleting a warp removes its API endpoint immediately and schedules data for permanent deletion after 30 days.
Are Stargate APIs public?
By default, yes. Endpoints are publicly accessible to anyone with your warp ID. However, you can secure your endpoint by generating an access token from the Monitor page. Once configured, requests require a Bearer token or query parameter for authentication.
Can I extract data from pages behind login?
Not yet. Currently, Warpstack only works with public pages. Support for authenticated sessions is coming soon.
Troubleshooting
My warp failed, what now?
Check the Monitor tab → Activity Log to see error details. Common fixes:
- URL might be wrong or changed
- Page requires login/CAPTCHA
- Prompt too vague - agent couldn't find data
- Try enabling Visual Analysis
- Simplify schema and try again
Schema validation errors - help!
See the Schema Validation Errors section above. Most common: properties not in required array. Fix: add all properties to the required array.
Agent extracted 0 records - why?
The agent ran but couldn't find matching data. Try:
- Check Live Agent View - what did it see?
- Verify URL is correct
- Make prompt more specific about where data appears
- Ensure page loads publicly (no login required)
How do I get support?
Email support@warpstack.ai with your warp ID, error details, and what you're trying to achieve. Response time is usually within 24 hours (faster for paid plans).
