Turn any career page into structured job data — automatically.
CrawlWell crawls employers' job listing pages, extracts structured fields with AI, and delivers clean, consistent data through a simple API — so you can run a job platform without maintaining a zoo of scrapers.
Built for niche and regional job portals. Multi-tenant, secure, and ready for production.
How it works
Three steps from career page to your platform.
1. Add organisation URLs
   Paste the career page link. CrawlWell auto-detects the organisation name.
2. AI discovers and crawls jobs
   Listing pages are scanned for job links. Each job page is crawled for title, content, and structured attributes.
3. Pull via API or review first
   Approved jobs are available through the REST API with incremental sync. Alternatively, consume the crawled output directly.
AI-Powered Job Crawling
Point CrawlWell at any employer's career page. Our AI-backed extraction discovers job listings and pulls structured data — no brittle selectors, no per-site scrapers to maintain.
| Feature | Included |
|---|---|
| Automatic job link discovery from listing pages | ✓ |
| Markdown + structured field extraction per job | ✓ |
| Configurable extraction attributes per customer | ✓ |
| Resilient to site layout changes | ✓ |
| Batch and on-demand crawling | ✓ |
Review & Approval Pipeline
Every crawled job flows through a clear status pipeline: incoming, crawled, approved, or rejected. Your team can review AI-extracted data before it reaches your platform, or you can skip the gate entirely.
| Feature | Included |
|---|---|
| Status pipeline: incoming → crawled → approved / rejected | ✓ |
| Bulk review with filters by status, organisation, search | ✓ |
| Full markdown preview and extracted attributes | ✓ |
| Optional approval — configurable per customer | ✓ |
| Audit trail on every status change | ✓ |
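
To make the pipeline concrete, here is a minimal sketch of how the four statuses and the audit trail could be modelled. The transition rules, class names, and field names are assumptions for illustration; the page only names the statuses and their order, and this is not CrawlWell's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class JobStatus(Enum):
    INCOMING = "incoming"
    CRAWLED = "crawled"
    APPROVED = "approved"
    REJECTED = "rejected"


# Assumed transition rules: incoming -> crawled -> approved / rejected.
ALLOWED_TRANSITIONS = {
    JobStatus.INCOMING: {JobStatus.CRAWLED},
    JobStatus.CRAWLED: {JobStatus.APPROVED, JobStatus.REJECTED},
    JobStatus.APPROVED: set(),
    JobStatus.REJECTED: set(),
}


@dataclass
class Job:
    """Hypothetical job record; only the status field is named on the page."""
    title: str
    status: JobStatus = JobStatus.INCOMING
    audit_trail: list = field(default_factory=list)

    def transition(self, new_status: JobStatus, actor: str) -> None:
        """Move the job to new_status and record who changed it and when."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        self.audit_trail.append(
            (datetime.now(timezone.utc), actor, self.status, new_status)
        )
        self.status = new_status
```

A job would then move through the pipeline with calls such as `job.transition(JobStatus.CRAWLED, "crawler")` followed by `job.transition(JobStatus.APPROVED, "reviewer")`, with each call leaving one audit entry.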
Customer API
A clean, documented REST API lets your platform pull approved jobs on demand. Bearer-token auth, pagination, and incremental sync keep your data store current without re-fetching the full dataset on every run.
| Feature | Included |
|---|---|
| OpenAPI 3.0 specification with Swagger UI | ✓ |
| Bearer token authentication per customer | ✓ |
| Incremental sync via approved_since | ✓ |
| Organisation and job endpoints with pagination | ✓ |
| Tenant-isolated — each API key sees only its own data | ✓ |
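
As a rough illustration of incremental sync, the sketch below pages through approved jobs using bearer-token auth and the `approved_since` parameter. Only `approved_since` and the bearer scheme come from the feature list above; the base URL, the `/jobs` path, the `page` parameter, and the `jobs` / `next_page` response fields are assumptions, so check the OpenAPI specification for the real names.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.example.com/v1"  # hypothetical base URL


def build_jobs_request(token: str, approved_since: Optional[str], page: int) -> Request:
    """Build an authenticated request for one page of approved jobs."""
    params = {"page": page}  # "page" is an assumed pagination parameter
    if approved_since:
        params["approved_since"] = approved_since
    url = f"{BASE_URL}/jobs?{urlencode(params)}"  # "/jobs" is an assumed path
    return Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_page(request: Request) -> dict:
    """Perform the HTTP call and decode the JSON body."""
    with urlopen(request) as response:
        return json.load(response)


def sync_jobs(
    token: str,
    approved_since: Optional[str],
    fetch: Callable[[Request], dict] = fetch_page,
) -> Iterator[dict]:
    """Yield every job approved since the given timestamp, page by page."""
    page = 1
    while True:
        payload = fetch(build_jobs_request(token, approved_since, page))
        yield from payload["jobs"]           # assumed response field
        if not payload.get("next_page"):     # assumed response field
            break
        page = payload["next_page"]
```

A consumer would store the timestamp of its last successful run and pass it as `approved_since` on the next one, so each sync transfers only newly approved jobs.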
Multi-Tenant by Design
Each customer is a fully isolated tenant. Row-level security in PostgreSQL, scoped API keys, and application-level guards ensure data never leaks between customers.
| Feature | Included |
|---|---|
| PostgreSQL Row-Level Security on all customer tables | ✓ |
| Per-customer API keys with automatic scoping | ✓ |
| Superadmin impersonation for support and ops | ✓ |
| Customer-admin self-service user management | ✓ |
| Defence-in-depth: app-level + DB-level isolation | ✓ |
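
The application-level half of that isolation can be pictured with a small sketch: every read resolves the API key to a tenant first and filters by it. All class, field, and key names here are hypothetical, and the in-memory list stands in for a database where a PostgreSQL Row-Level Security policy would enforce the same tenant predicate a second time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JobRow:
    """Hypothetical stored job; tenant_id marks the owning customer."""
    tenant_id: str
    job_id: int
    title: str


class TenantScopedJobs:
    """App-level guard: callers can only read rows of their own tenant."""

    def __init__(self, rows: List[JobRow], api_keys: Dict[str, str]) -> None:
        self._rows = rows
        self._api_keys = api_keys  # maps API key -> tenant id (assumed scheme)

    def list_jobs(self, api_key: str) -> List[JobRow]:
        tenant_id = self._api_keys.get(api_key)
        if tenant_id is None:
            raise PermissionError("unknown API key")
        # The tenant filter is applied on every query, never left to the caller.
        return [row for row in self._rows if row.tenant_id == tenant_id]
```

Because the tenant is derived from the key rather than passed in by the caller, a request cannot ask for another customer's rows even by mistake.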