Import Methods

File Upload

CSV, Excel, JSON files up to 100MB

API Integration

Real-time data import via REST API

Database Connection

PostgreSQL, MySQL, SQL Server, Oracle, MongoDB

Third-Party

Google Sheets, Airtable, webhooks

File Upload

Navigate to your dataset → Import Data → File Upload.

Supported Formats

CSV: Best for structured data. UTF-8 encoding recommended, 100MB limit.

Excel: .xlsx/.xls files, multiple sheets supported, formulas converted to values.

JSON: Array of objects or line-delimited JSON, automatic schema detection.
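
For the JSON format specifically, both accepted shapes can be produced with the Python standard library. A minimal sketch (the records and file names are illustrative):

import json

records = [
    {"company_name": "Example Corp", "website": "https://example.com"},
    {"company_name": "Acme Inc", "website": "https://acme.example"},
]

# Array-of-objects JSON: one top-level list containing every record
with open("companies.json", "w", encoding="utf-8") as f:
    json.dump(records, f)

# Line-delimited JSON: one record per line, no enclosing array
with open("companies.ndjson", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")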

Upload Process

  1. Select File: Browse or drag-and-drop your file
  2. Configure: System detects headers, delimiters, and encoding automatically
  3. Map Columns: Match file columns to dataset columns, create new columns if needed
  4. Handle Issues: System flags missing headers, type mismatches, duplicates, invalid dates
  5. Choose Import Mode:
    • Append: Add to existing data
    • Replace: Replace all data
    • Update: Update existing records by key column
  6. Execute Import: Review summary and start processing

Best Practices

File Preparation: Use clean headers, consistent formatting, and UTF-8 encoding, and remove empty rows.

Large Files: Split files over 100MB, remove unnecessary columns, and test with a sample first.
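
If a file exceeds the 100MB limit, it can be split into parts before upload. A minimal sketch using Python's csv module (the chunk size and file names are illustrative; pick a row count that keeps each part under the limit):

import csv

CHUNK_ROWS = 500_000  # illustrative; choose a size that keeps each part under 100MB

with open("companies.csv", newline="", encoding="utf-8") as src:
    reader = csv.reader(src)
    header = next(reader)
    part, writer, out = 0, None, None
    for i, row in enumerate(reader):
        if i % CHUNK_ROWS == 0:
            if out:
                out.close()
            part += 1
            out = open(f"companies_part{part}.csv", "w", newline="", encoding="utf-8")
            writer = csv.writer(out)
            writer.writerow(header)  # repeat the header in every part
        writer.writerow(row)
    if out:
        out.close()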

API Integration

Use the REST API for real-time or automated imports.

Setup

  1. Get API Key: Workspace Settings → API Keys → Create with datasets:write permission
  2. Use Endpoint: POST /api/v1/workspaces/{workspace_id}/datasets/{dataset_id}/records
  3. Authentication: Bearer token in Authorization header

Example

import requests
import json

# Configuration
api_key = "your_api_key_here"
workspace_id = "your_workspace_id"
dataset_id = "your_dataset_id"
base_url = "https://api.radicalwhale.com"

# Headers
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Data to import
data = [
    {
        "company_name": "Example Corp",
        "website": "https://example.com",
        "industry": "Technology"
    }
]

# Make the request
url = f"{base_url}/api/v1/workspaces/{workspace_id}/datasets/{dataset_id}/records"
response = requests.post(url, headers=headers, json=data)

if response.status_code == 201:
    result = response.json()
    print(f"Successfully imported {result['imported_count']} records")
else:
    print(f"Import failed: {response.status_code} - {response.text}")

Batch Processing: Use 100-1000 records per batch for large datasets. Validation: Include validate_schema, skip_duplicates, update_existing in options.
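
A minimal batching sketch that builds on the configuration from the example above. Note that sending each batch under a top-level records key with a sibling options object is an assumption about the payload shape, not something this page confirms; check the API reference for the canonical format:

import requests

# "url" and "headers" are built exactly as in the example above.
# NOTE: wrapping each batch under a "records" key with a sibling "options"
# object is an assumed payload shape; confirm it against the API reference.
def import_in_batches(url, headers, records, batch_size=500):
    for start in range(0, len(records), batch_size):
        payload = {
            "records": records[start:start + batch_size],
            "options": {
                "validate_schema": True,
                "skip_duplicates": True,
                "update_existing": False,
            },
        }
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 201:
            print(f"Batch at offset {start} failed: "
                  f"{response.status_code} - {response.text}")

# Example: import_in_batches(url, headers, data, batch_size=500)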

Database Connections

Workspace Settings → Integrations → Add Database Connection.

Supported: PostgreSQL (9.6+), MySQL (5.7+), SQL Server (2017+), Oracle (12c+), MongoDB (4.0+).

Setup: Enter host, port, database name, credentials, and SSL settings. Test the connection before saving.

Import: Select tables/views, map columns, add filtering with WHERE clauses, schedule automatic syncs.

Custom SQL: Use custom queries for advanced data selection and transformation.

Best Practices

Performance: Use indexes, limit results with WHERE clauses, schedule during off-hours, use incremental updates. Security: Read-only database accounts, SSL/TLS connections, IP whitelisting.

Third-Party Integrations

Google Sheets

Connect Google account → Select spreadsheet → Configure range → Set sync schedule. Tips: Use headers in first row, consistent data types, avoid merged cells, remove empty rows.

Airtable

Enter Airtable API key → Select base and tables → Map fields → Configure attachment and linked record handling.

Webhooks

Get webhook endpoint URL → Configure security tokens → Define payload format → Test with sample data. Real-time import with automatic validation, error logging, and rate limiting.
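
A minimal sketch for testing a webhook with sample data. The endpoint URL path, the Bearer-token header, and the payload fields below are illustrative assumptions; use the actual endpoint URL, security token, and payload format configured for your webhook:

import requests

# Illustrative values: substitute the endpoint URL and security token generated
# when the webhook was created, and match the payload format you defined for it.
webhook_url = "https://api.radicalwhale.com/webhooks/your_webhook_id"  # placeholder path
webhook_token = "your_webhook_token"

# Sample record matching the earlier examples
sample_payload = {
    "company_name": "Example Corp",
    "website": "https://example.com",
    "industry": "Technology",
}

# Bearer-token auth is an assumption; use whatever token scheme the webhook expects
response = requests.post(
    webhook_url,
    headers={"Authorization": f"Bearer {webhook_token}"},
    json=sample_payload,
)
print(response.status_code, response.text)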

Next Steps