Engineering · Wednesday, April 1, 2026 · 12 min read

Building a Multi-Tenant Data Importer: Per-Customer Schemas and White-Labeling

When every customer has a different data model, a single static import schema breaks. Here's how to build a multi-tenant importer that supports per-customer field configs, white-label branding, and isolated data pipelines.


Most data importer documentation assumes a single schema. One set of fields. One column mapping configuration. One set of validators. This works for simple cases, but breaks down the moment you have customers with different data models. A CRM that serves both real estate agencies and law firms has two very different contact schemas. A project management tool used by both software teams and construction companies needs different field sets for each. Building a multi-tenant importer that handles per-customer schemas, isolated data pipelines, and optional white-label branding requires design decisions up front that are expensive to retrofit later.

1. The Core Problem: Schema Variability Across Tenants

In a single-tenant importer, your schema is hardcoded or configuration-file-driven. You define fields once and every import uses the same mapping. In a multi-tenant context, different customers need different target fields. Customer A is importing contact records with a `lead_score` field. Customer B doesn't use lead scoring but has a `territory` field. Customer C is on a plan that enables custom fields. None of these customers should see each other's schemas, and each schema needs to map correctly to each customer's isolated database tables or tenant partition.

The schema variability problem has three dimensions: field set (which fields exist), validation rules (what constraints apply), and output destination (where the data lands). A well-designed multi-tenant importer handles all three independently per tenant.
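The second dimension is the easiest to get wrong, because validation logic tends to get hardcoded. A minimal sketch of per-tenant validation, where the same function enforces different rules for different tenants purely through the field definitions it receives (field shapes here are illustrative, not a real Xlork structure):

```javascript
// Per-tenant row validation sketch: each tenant supplies its own field
// definitions, so one function enforces a different rule set per tenant.
function validateRow(row, fields) {
  const errors = [];
  for (const field of fields) {
    const value = row[field.key];
    if (field.required && (value === undefined || value === null || value === '')) {
      errors.push({ key: field.key, message: `${field.label} is required` });
      continue;
    }
    if (value != null && value !== '' && field.type === 'number' && Number.isNaN(Number(value))) {
      errors.push({ key: field.key, message: `${field.label} must be a number` });
    }
    if (value && field.type === 'email' && !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(value)) {
      errors.push({ key: field.key, message: `${field.label} must be a valid email` });
    }
  }
  return errors;
}

// Tenant A validates lead_score; Tenant B never sees that rule,
// because Tenant B's field list simply doesn't contain it.
const tenantAFields = [
  { key: 'email', label: 'Email', type: 'email', required: true },
  { key: 'lead_score', label: 'Lead Score', type: 'number', required: false },
];
```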

2. Designing the Tenant Schema Model

Store schemas in your database, not in code. A code-level schema works until your first customer-specific customization request. A database-backed schema model scales to hundreds of tenants each with unique configurations.

PostgreSQL schema for tenant import configurations
-- Tenant import schema definitions
CREATE TABLE tenant_import_schemas (
  id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id    UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
  name         TEXT NOT NULL, -- e.g. 'contacts', 'products'
  display_name TEXT NOT NULL, -- shown in the UI
  created_at   TIMESTAMPTZ DEFAULT NOW(),
  updated_at   TIMESTAMPTZ DEFAULT NOW(),
  UNIQUE (tenant_id, name)
);

-- Per-tenant field definitions
CREATE TABLE tenant_import_fields (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  schema_id       UUID NOT NULL REFERENCES tenant_import_schemas(id) ON DELETE CASCADE,
  key             TEXT NOT NULL,  -- internal field name
  label           TEXT NOT NULL,  -- display label
  type            TEXT NOT NULL,  -- 'string' | 'number' | 'boolean' | 'date' | 'email' | 'phone'
  required        BOOLEAN DEFAULT FALSE,
  aliases         TEXT[] DEFAULT '{}', -- known synonyms for AI mapping
  validators      JSONB DEFAULT '[]',  -- validation rules
  sort_order      INT DEFAULT 0,
  UNIQUE (schema_id, key)
);

-- Optional: per-tenant branding overrides
CREATE TABLE tenant_importer_branding (
  tenant_id     UUID PRIMARY KEY REFERENCES tenants(id),
  primary_color TEXT,  -- hex color for the importer UI
  logo_url      TEXT,  -- custom logo shown in the importer header
  button_label  TEXT DEFAULT 'Import Data',
  remove_xlork_branding BOOLEAN DEFAULT FALSE -- requires Pro/Scale tier
);

This model lets you add, remove, or modify fields per tenant without a code deploy. When you add custom fields support to your product, you insert rows into `tenant_import_fields`. When a customer deactivates a field, you delete the row. The import pipeline reads the schema from the database at request time.

3. Building the Schema API Endpoint

Your frontend importer needs to fetch the schema for the current tenant at initialization time. Build an authenticated API endpoint that returns the tenant's schema in the format your importer SDK expects.

Schema API endpoint (Express + Postgres)
const express = require('express');
const { Pool } = require('pg');

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const router = express.Router();

// GET /api/import/schema/:schemaName
router.get('/schema/:schemaName', requireAuth, async (req, res) => {
  const { schemaName } = req.params;
  const tenantId = req.user.tenantId; // Set by your auth middleware

  const { rows: fields } = await pool.query(
    `SELECT
       f.key, f.label, f.type, f.required, f.aliases, f.validators
     FROM tenant_import_fields f
     JOIN tenant_import_schemas s ON s.id = f.schema_id
     WHERE s.tenant_id = $1 AND s.name = $2
     ORDER BY f.sort_order ASC`,
    [tenantId, schemaName]
  );

  if (fields.length === 0) {
    return res.status(404).json({ error: 'Schema not found' });
  }

  // Map to Xlork column format
  const columns = fields.map((f) => ({
    key: f.key,
    label: f.label,
    type: f.type,
    required: f.required,
    aliases: f.aliases,
    validators: f.validators,
  }));

  res.json({ columns });
});

module.exports = router;

4. Initializing the Importer with Dynamic Schemas

On the frontend, fetch the tenant's schema before rendering the importer. Pass the schema to the importer component. When the schema changes (new fields added, tenant upgraded to a plan with more fields), a page refresh picks up the new configuration without any code change.

React component with dynamic tenant schema
import { useState, useEffect } from 'react';
import { XlorkImporter } from '@xlork/react';

export function TenantImporter({ schemaName, onImportComplete }) {
  const [columns, setColumns] = useState(null);
  const [branding, setBranding] = useState({});
  const [error, setError] = useState(null);

  useEffect(() => {
    async function loadSchema() {
      try {
        const [schemaRes, brandingRes] = await Promise.all([
          fetch(`/api/import/schema/${schemaName}`, { credentials: 'include' }),
          fetch('/api/import/branding', { credentials: 'include' }),
        ]);

        if (!schemaRes.ok) throw new Error('Failed to load import schema');

        const { columns } = await schemaRes.json();
        const brandingData = brandingRes.ok ? await brandingRes.json() : {};

        setColumns(columns);
        setBranding(brandingData);
      } catch (err) {
        setError(err.message);
      }
    }
    loadSchema();
  }, [schemaName]);

  if (error) return <div className="error">{error}</div>;
  if (!columns) return <div className="loading">Loading importer...</div>;

  return (
    <XlorkImporter
      apiKey={process.env.NEXT_PUBLIC_XLORK_API_KEY}
      columns={columns}
      theme={{
        primaryColor: branding.primaryColor ?? '#6366f1',
        logoUrl: branding.logoUrl,
        buttonLabel: branding.buttonLabel ?? 'Import Data',
        removeBranding: branding.removeXlorkBranding ?? false,
      }}
      onComplete={onImportComplete}
    />
  );
}

💡 Pro tip

Loading the schema from your API on every importer initialization ensures tenants always see their current field configuration, including any fields added or removed since their last import session. Cache the schema response on the client for the duration of the import session, but always re-fetch on page load.
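One way to implement that session-scoped cache is to memoize the schema request per schema name, with the fetcher injected so the caching logic stays testable (the function names here are illustrative, not part of any SDK):

```javascript
// Session-scoped schema cache sketch: fetch each schema once, reuse it
// for the rest of the import session. The cache lives in memory, so a
// page load naturally starts fresh with the tenant's current fields.
function createSchemaLoader(fetcher) {
  const cache = new Map();

  return async function loadSchema(schemaName) {
    if (cache.has(schemaName)) return cache.get(schemaName);
    // Cache the in-flight promise so concurrent callers share one request
    const pending = fetcher(schemaName);
    cache.set(schemaName, pending);
    try {
      return await pending;
    } catch (err) {
      cache.delete(schemaName); // don't cache failures
      throw err;
    }
  };
}

// Browser usage against the endpoint from section 3:
// const loadSchema = createSchemaLoader((name) =>
//   fetch(`/api/import/schema/${name}`, { credentials: 'include' }).then((r) => r.json())
// );
```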

5. Routing Imported Data to the Right Tenant Partition

When the import completes, the `onComplete` callback fires with the parsed, validated rows. Your webhook handler or callback function needs to route this data to the correct tenant partition. This sounds obvious, but it's a place where subtle bugs introduce data isolation failures.

Webhook handler with tenant isolation
// Server-side webhook handler for completed imports
app.post('/api/import/webhook', async (req, res) => {
  // 1. Verify the webhook signature first, always.
  // Run HMAC verification over the raw request body bytes where possible:
  // re-serializing a parsed body can break verification if key order or
  // whitespace differs. (req.rawBody assumes your body parser captures it.)
  const isValid = xlork.webhooks.verify({
    payload: req.rawBody ?? JSON.stringify(req.body),
    signature: req.headers['x-xlork-signature'],
    secret: process.env.XLORK_WEBHOOK_SECRET,
  });
  if (!isValid) return res.status(401).json({ error: 'Invalid signature' });

  // Acknowledge immediately
  res.status(200).json({ received: true });

  setImmediate(async () => {
    const { rows, metadata } = req.body;
    const { tenantId, schemaName, userId } = metadata;

    // CRITICAL: always re-validate tenantId server-side
    // Never trust tenantId from the client payload without verification
    const tenant = await getTenantById(tenantId);
    if (!tenant) {
      console.error('Import webhook: unknown tenant', tenantId);
      return;
    }

    // Get the destination table for this schema.
    // NOTE: tenant.slug and schemaName are interpolated into an identifier,
    // so both must come from trusted server-side records, never raw client
    // input, or you open an identifier-injection hole.
    const destinationTable = `tenant_${tenant.slug}_${schemaName}s`;
    // Or: use row-level security with a tenant_id column instead:
    // INSERT INTO contacts (tenant_id, ...) VALUES ($1, ...)

    await batchInsert(destinationTable, rows, tenantId);
    await notifyUser(userId, { type: 'import_complete', count: rows.length });
  });
});

async function batchInsert(table, rows, tenantId) {
  const BATCH_SIZE = 250;
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    for (let i = 0; i < rows.length; i += BATCH_SIZE) {
      const batch = rows.slice(i, i + BATCH_SIZE);
      // Build parameterized INSERT with tenant_id
      const values = batch.map((row) => ({ ...row, tenant_id: tenantId }));
      await insertBatch(client, table, values);
    }
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

The single most important rule in multi-tenant import handling: re-derive the tenant context from your authenticated session or a server-side-verified token, never from the webhook payload alone. The webhook payload can be verified for integrity via HMAC signature, but the `tenantId` in the metadata comes from what the client passed when initiating the import. Always cross-reference against your auth system.
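A sketch of that cross-check, assuming you persist an import session record server-side when the importer is launched and look it up when the webhook arrives (the session store and function names are assumptions for illustration, not Xlork APIs):

```javascript
// Cross-reference the tenantId claimed in webhook metadata against a
// server-side record created when the import was initiated.
// `findImportSession` is a hypothetical lookup, e.g. by import ID.
async function resolveTenantContext(metadata, findImportSession) {
  const session = await findImportSession(metadata.importId);
  if (!session) {
    throw new Error(`Unknown import session: ${metadata.importId}`);
  }
  if (session.tenantId !== metadata.tenantId) {
    // The claimed tenant doesn't match the tenant that started the import.
    // Treat this as a potential isolation breach; never "fix it up" silently.
    throw new Error('Tenant mismatch between webhook payload and import session');
  }
  return { tenantId: session.tenantId, userId: session.userId };
}
```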

6. White-Label Branding: Removing Xlork from the User Experience

If you're building a product that resells import functionality to your customers — or if you want your import UI to appear completely native to your product — white-labeling is a requirement. This means replacing the Xlork logo with your customer's logo, applying their brand colors, and removing 'Powered by Xlork' attribution.

Xlork's Pro and Scale tiers support custom branding via the `theme` prop. Pass `removeBranding: true` along with your customer's `logoUrl` and `primaryColor`. The importer renders with your branding and no Xlork attribution. The end user sees a seamlessly branded import experience that matches the rest of your product.

White-label configuration
<XlorkImporter
  apiKey={process.env.NEXT_PUBLIC_XLORK_API_KEY}
  columns={tenantColumns}
  theme={{
    primaryColor: tenant.brandColor,       // e.g. '#0ea5e9'
    logoUrl: tenant.logoUrl,                // shown in importer header
    buttonLabel: tenant.importButtonLabel,  // e.g. 'Upload Customer Data'
    removeBranding: true,                   // requires Pro/Scale tier
    fontFamily: 'inherit',                  // inherits from your app's CSS
  }}
  onComplete={handleImportComplete}
/>
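Because `removeBranding` is tier-gated, it's safer to resolve the effective theme server-side so a free-tier tenant can't flip the flag from the client. A minimal sketch, reusing the branding columns and plan names from earlier sections (the function itself is illustrative):

```javascript
// Resolve the effective importer theme server-side, honoring branding
// removal only on plans that include it (Pro/Scale in this article).
function resolveTheme(branding, plan) {
  const canRemoveBranding = plan === 'pro' || plan === 'scale';
  return {
    primaryColor: branding.primaryColor ?? '#6366f1',
    logoUrl: branding.logoUrl ?? null,
    buttonLabel: branding.buttonLabel ?? 'Import Data',
    removeBranding: canRemoveBranding && Boolean(branding.removeXlorkBranding),
  };
}
```

Return this resolved object from your branding endpoint and pass it straight into the `theme` prop, so the client never decides what its own tier allows.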

7. Plan-Gating Import Features

In a multi-tenant context, different subscription tiers get different import capabilities. A free-tier tenant might get basic CSV import with no custom fields. A paid tenant gets Excel, Google Sheets, custom field mapping, and advanced validation. Implementing this means your schema API and importer configuration need to be tier-aware.

Tier-aware schema filtering
// In your schema API endpoint
const tenantPlan = await getTenantPlan(tenantId); // 'free' | 'growth' | 'pro' | 'scale'

// Filter fields based on plan
const allowedFieldTypes = tenantPlan === 'free'
  ? ['string', 'number', 'boolean']
  : ['string', 'number', 'boolean', 'date', 'email', 'phone'];

const fields = allFields.filter((f) => allowedFieldTypes.includes(f.type));

// Filter allowed file formats
const allowedFormats = tenantPlan === 'free'
  ? ['csv']
  : ['csv', 'xlsx', 'xls', 'xml', 'json', 'google_sheets'];

res.json({ columns: fields, allowedFormats });

8. Import History and Audit Logging

In a multi-tenant product, import history is a feature — not an afterthought. Customers want to know who imported what, when, and how many records landed. Log every import event with enough detail to reconstruct the audit trail.

Import audit log table
CREATE TABLE import_events (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id     UUID NOT NULL REFERENCES tenants(id),
  user_id       UUID NOT NULL REFERENCES users(id),
  schema_name   TEXT NOT NULL,
  file_name     TEXT,
  file_format   TEXT, -- 'xlsx' | 'csv' | etc.
  total_rows    INT,
  inserted_rows INT,
  updated_rows  INT,
  failed_rows   INT,
  status        TEXT DEFAULT 'pending', -- 'pending' | 'processing' | 'completed' | 'failed'
  error_summary JSONB,
  started_at    TIMESTAMPTZ DEFAULT NOW(),
  completed_at  TIMESTAMPTZ
);

-- Index for per-tenant history queries
CREATE INDEX ON import_events (tenant_id, started_at DESC);

Audit logs are a feature your customers didn't ask for — until the day their data looks wrong and they need to understand why. Build the logging from day one; retrofitting it is significantly harder.
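Tying the table back to the webhook handler from section 5: create the event as `processing` when the webhook fires, then finalize it with counts once the batch insert settles. The counting itself can be a pure function, with persistence (a parameterized `UPDATE` on `import_events`) left to the caller; the outcome shape below is an assumption for illustration:

```javascript
// Build the completion record for an import_events row from per-row
// outcomes ({ status: 'inserted' | 'updated' | 'failed', ... }).
function summarizeImport(outcomes) {
  const summary = { inserted_rows: 0, updated_rows: 0, failed_rows: 0, error_summary: [] };
  for (const o of outcomes) {
    if (o.status === 'inserted') summary.inserted_rows++;
    else if (o.status === 'updated') summary.updated_rows++;
    else {
      summary.failed_rows++;
      summary.error_summary.push({ row: o.rowIndex, error: o.error });
    }
  }
  return {
    ...summary,
    total_rows: outcomes.length,
    // Mark the whole import failed only if every row failed; partial
    // failures complete with a populated error_summary for the UI.
    status: outcomes.length > 0 && summary.failed_rows === outcomes.length ? 'failed' : 'completed',
    completed_at: new Date().toISOString(),
  };
}
```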

9. Summary

Multi-tenant data importers require database-backed schema storage per tenant, a schema API endpoint that filters by authenticated tenant context, dynamic schema loading on the frontend, and server-side tenant re-verification in webhook handlers. White-label branding, plan-gated features, and per-tenant audit logging round out a production-quality implementation. None of these are difficult to implement once you have the data model and routing right — and getting them right from the start is far less expensive than retrofitting them after launch.

💡 Pro tip

Xlork's React SDK supports dynamic column configurations, custom theming, and white-label branding out of the box. You bring the per-tenant schema; Xlork handles the file parsing, AI mapping, validation UI, and clean data delivery. See the configuration API at xlork.com/docs.

#csv-import #data-engineering #best-practices #engineering
