Import Methods

• File Upload: CSV, Excel, JSON files up to 100MB
• API Integration: Real-time data import via REST API
• Database Connection: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB
• Third-Party: Google Sheets, Airtable, webhooks

File Upload

Navigate to your dataset → Import Data → File Upload.

Supported Formats

• CSV: Best for structured data; UTF-8 encoding recommended; 100MB limit.
• Excel: .xlsx/.xls files; multiple sheets supported; formulas converted to values.
• JSON: Array of objects or line-delimited JSON; automatic schema detection.
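
Since both JSON shapes are accepted, a minimal sketch for converting an array-of-objects file to line-delimited JSON, using only the standard library (the file names are illustrative):

import json

# Read a top-level JSON array of objects...
with open("records.json", encoding="utf-8") as src:
    records = json.load(src)

# ...and write one object per line (line-delimited JSON).
with open("records.jsonl", "w", encoding="utf-8") as dst:
    for record in records:
        dst.write(json.dumps(record) + "\n")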

Upload Process

  1. Select File: Browse or drag-and-drop your file
  2. Configure: System detects headers, delimiters, and encoding automatically
  3. Map Columns: Match file columns to dataset columns, create new columns if needed
  4. Handle Issues: System flags missing headers, type mismatches, duplicates, invalid dates
  5. Choose Import Mode:
    • Append: Add to existing data
    • Replace: Replace all data
    • Update: Update existing records by key column
  6. Execute Import: Review summary and start processing

Best Practices

• File Preparation: Use clean headers, consistent formatting, and UTF-8 encoding; remove empty rows.
• Large Files: Split files over 100MB, remove unnecessary columns, and test with a sample first (see the sketch below).
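
A minimal sketch for splitting an oversized CSV into parts under the limit, using only the standard library. The input file name and the 50,000-row chunk size are illustrative; tune the chunk size to your row width so each part stays under 100MB:

import csv

CHUNK_ROWS = 50_000  # rows per output file; adjust until each part is under 100MB

with open("large_export.csv", newline="", encoding="utf-8") as src:
    reader = csv.reader(src)
    header = next(reader)
    part, out, writer = 0, None, None
    rows_in_part = CHUNK_ROWS  # force a new file on the first data row
    for row in reader:
        if rows_in_part >= CHUNK_ROWS:
            if out:
                out.close()
            part += 1
            out = open(f"large_export_part{part}.csv", "w", newline="", encoding="utf-8")
            writer = csv.writer(out)
            writer.writerow(header)  # repeat the header so each part imports cleanly
            rows_in_part = 0
        writer.writerow(row)
        rows_in_part += 1
    if out:
        out.close()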

API Integration

Use the REST API for real-time or automated imports.

Setup

  1. Get API Key: Workspace Settings → API Keys → Create with datasets:write permission
  2. Use Endpoint: POST /api/v1/workspaces/{workspace_id}/datasets/{dataset_id}/records
  3. Authentication: Bearer token in Authorization header

Example

import requests

# Configuration
api_key = "your_api_key_here"
workspace_id = "your_workspace_id"
dataset_id = "your_dataset_id"
base_url = "https://api.radicalwhale.com"

# Headers
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Data to import
data = [
    {
        "company_name": "Example Corp",
        "website": "https://example.com",
        "industry": "Technology"
    }
]

# Make the request
url = f"{base_url}/api/v1/workspaces/{workspace_id}/datasets/{dataset_id}/records"
response = requests.post(url, headers=headers, json=data)

if response.status_code == 201:
    result = response.json()
    print(f"Successfully imported {result['imported_count']} records")
else:
    print(f"Import failed: {response.status_code} - {response.text}")

• Batch Processing: Use 100-1000 records per batch for large datasets (see the sketch below).
• Validation: Include validate_schema, skip_duplicates, and update_existing in the request options.
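
A minimal batching sketch that reuses the url and headers from the example above. Wrapping each batch in a body with records and options keys is an assumption based on the options named here; check the API Reference for the exact request shape:

import requests

BATCH_SIZE = 500  # within the suggested 100-1000 records per batch

def import_in_batches(records, url, headers):
    """POST records in chunks, attaching the validation options named above."""
    for start in range(0, len(records), BATCH_SIZE):
        payload = {
            "records": records[start:start + BATCH_SIZE],
            # Assumed request shape; see the API Reference for the exact format.
            "options": {
                "validate_schema": True,
                "skip_duplicates": True,
                "update_existing": False,
            },
        }
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()  # stop on the first failed batch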

Database Connections

Navigate to Workspace Settings → Integrations → Add Database Connection.

• Supported: PostgreSQL (9.6+), MySQL (5.7+), SQL Server (2017+), Oracle (12c+), MongoDB (4.0+)
• Setup: Enter host, port, database name, credentials, and SSL settings; test the connection before saving.
• Import: Select tables/views, map columns, filter with WHERE clauses, and schedule automatic syncs.
• Custom SQL: Use custom queries for advanced data selection and transformation (an illustrative query follows below).
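
As an illustration, an incremental custom query might select only rows changed since the last sync; the table and column names here are hypothetical:

SELECT id, company_name, website, industry, updated_at
FROM companies
WHERE updated_at > '2025-01-01 00:00:00'  -- replace with the last sync timestamp
ORDER BY updated_at;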

Best Practices

• Performance: Use indexes, limit results with WHERE clauses, schedule syncs during off-hours, and prefer incremental updates.
• Security: Use read-only database accounts, SSL/TLS connections, and IP whitelisting.

Third-Party Integrations

Google Sheets

Connect your Google account → Select spreadsheet → Configure range → Set sync schedule.

Tips: Put headers in the first row, keep data types consistent, avoid merged cells, and remove empty rows.

Airtable

Enter Airtable API key → Select base and tables → Map fields → Configure attachment and linked record handling.

Webhooks

Get your webhook endpoint URL → Configure security tokens → Define the payload format → Test with sample data. Webhooks provide real-time import with automatic validation, error logging, and rate limiting.
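
A minimal sketch for testing a webhook with sample data. The endpoint URL, token header name, and payload shape are all assumptions; use the values shown in your workspace:

import requests

# All values below are placeholders; copy the real ones from your workspace.
webhook_url = "https://api.radicalwhale.com/webhooks/your_webhook_id"
headers = {
    "Content-Type": "application/json",
    "X-Webhook-Token": "your_security_token",  # assumed header name
}
sample = [
    {"company_name": "Example Corp", "website": "https://example.com"}
]

response = requests.post(webhook_url, headers=headers, json=sample)
print(response.status_code, response.text)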

Next Steps

• Managing Columns: Configure and optimize dataset columns
• Working with Records: Edit, filter, and manage imported data
• Creating Agents: Set up AI agents to process imported data
• API Reference: Technical documentation for import APIs