Merge customer data from multiple sources into enriched profiles using parallel Render Workflows.
This demo showcases:
- Parallel processing: 10 shards processed simultaneously
- Multi-source merge: CRM + Billing + Product + Support → Enriched profiles
- High throughput: 400K records processed in seconds
- Both Python and TypeScript: Identical implementations in both languages
```
┌───────────────────────────────────────────────────────────────────┐
│                        FRONTEND (Next.js)                         │
│                      UI - Trigger & Monitor                       │
└───────────────────────────────────────────────────────────────────┘
                                  │
                 ┌────────────────┴────────────────┐
                 ▼                                 ▼
    ┌─────────────────────────┐       ┌─────────────────────────┐
    │       Python API        │       │     TypeScript API      │
    │        (FastAPI)        │       │        (Fastify)        │
    └─────────────────────────┘       └─────────────────────────┘
                 │                                 │
                 ▼                                 ▼
    ┌─────────────────────────┐       ┌─────────────────────────┐
    │     Python Workflow     │       │   TypeScript Workflow   │
    │      (render_sdk)       │       │     (@renderinc/sdk)    │
    └─────────────────────────┘       └─────────────────────────┘
                 │                                 │
                 └────────────────┬────────────────┘
                                  ▼
┌───────────────────────────────────────────────────────────────────┐
│                            SAMPLE DATA                            │
│    crm.csv │ billing.csv │ product.csv │ support.csv (100K each)  │
└───────────────────────────────────────────────────────────────────┘
```
The workflow uses hash-based sharding to ensure deterministic routing:

- Load: Read all 4 CSV source files
- Route: Hash each `customer_id` to assign records to 10 shards
- Process: Spawn 10 parallel subtasks (one per shard)
- Merge: Each shard merges its customers' data from all sources
- Enrich: Calculate `health_score`, `churn_risk`, `expansion_potential`
- Aggregate: Combine all shard results into the final output
```
customer_id → hash(customer_id) % 10 → shard_id
```

The same customer always routes to the same shard across all files.
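The routing rule can be sketched in a few lines of Python. Note that Python's built-in `hash()` is salted per process for strings, so a content hash such as MD5 is needed to keep assignments stable across runs (and across the Python and TypeScript implementations). This is a sketch of the idea, not the repo's actual `sharding.py`:

```python
import hashlib

NUM_SHARDS = 10

def shard_for(customer_id: str) -> int:
    """Deterministically map a customer_id to one of NUM_SHARDS shards."""
    digest = hashlib.md5(customer_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# The same ID always lands on the same shard, whichever source file it came from.
assert shard_for("CUST-00042") == shard_for("CUST-00042")
```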
- Python 3.11+
- Node.js 20+
- Render CLI 2.11.0+ (`brew install render` on macOS)
- A Render account with Workflows access
1. Generate sample data:

   ```bash
   cd scripts
   python generate_data.py --rows 1000      # Small dataset for testing
   # python generate_data.py --rows 100000  # Full 100K dataset
   ```

2. Start the local workflow server (pick Python or TypeScript). The Render CLI runs a local task server on port 8120.

   Python:

   ```bash
   cd python/workflows
   pip install -r requirements.txt
   render workflows dev -- python main.py
   ```

   TypeScript:

   ```bash
   cd typescript/workflows
   npm install
   render workflows dev -- npx tsx src/main.ts
   ```

   Verify the tasks registered:

   ```bash
   render workflows list --local
   ```

3. Start the matching API (pick one). Set `RENDER_USE_LOCAL_DEV=true` so the API triggers the local workflow server instead of Render's API.

   Python (default, runs on http://localhost:8001):

   ```bash
   cd python/api
   pip install -r requirements.txt
   RENDER_USE_LOCAL_DEV=true python main.py
   ```

   TypeScript (runs on http://localhost:8002):

   ```bash
   cd typescript/api
   npm install
   RENDER_USE_LOCAL_DEV=true npm run dev
   ```

   If using the TypeScript API, also set `NEXT_PUBLIC_API_URL=http://localhost:8002` before starting the frontend.

4. Start the frontend:

   ```bash
   cd frontend
   npm install
   npm run dev   # Runs on http://localhost:3000
   ```

5. Open http://localhost:3000 and click Run Workflow.
| Variable | Default | Used by |
|---|---|---|
| `RENDER_API_KEY` | (required for deployed services) | API services |
| `RENDER_USE_LOCAL_DEV` | `false` | API services (set `true` for local dev) |
| `WORKFLOW_SLUG` | `data-processor-workflows-py` / `data-processor-workflows-ts` | API services |
| `DATA_DIR` | `../../sample_data` | Workflow services |
| `NEXT_PUBLIC_API_URL` | `http://localhost:8001` | Frontend |
The Blueprint (`render.yaml`) deploys the frontend and the Python API by default. If you prefer TypeScript, edit `render.yaml` to uncomment the TypeScript API and comment out the Python one (see the instructions in the file).
Or manually:
- Push this repo to GitHub/GitLab
- In Render Dashboard: New → Blueprint
- Connect your repo and deploy
Workflows are not yet supported in Blueprints. Create them manually.

Python workflow:

1. In Render Dashboard: New → Workflow
2. Connect your repo
3. Settings:
   - Name: `data-processor-workflows-py`
   - Root Directory: `python/workflows`
   - Build Command: `pip install -r requirements.txt`
   - Start Command: `python main.py`
4. Deploy

TypeScript workflow:

1. In Render Dashboard: New → Workflow
2. Connect your repo
3. Settings:
   - Name: `data-processor-workflows-ts`
   - Root Directory: `typescript/workflows`
   - Build Command: `npm install && npm run build`
   - Start Command: `npm start`
4. Deploy
On each API service, set:

- `RENDER_API_KEY`: Your Render API key (create one at Dashboard → Account → API Keys)
- `WORKFLOW_SLUG`: The workflow service name (the API appends `/merge_customer_data` automatically), e.g.:
  - Python: `data-processor-workflows-py`
  - TypeScript: `data-processor-workflows-ts`
```
/
├── frontend/                    # Next.js brutalist UI
│   ├── app/
│   │   ├── page.tsx             # Main demo page
│   │   └── how-it-works/
│   │       └── page.tsx         # Workflow visualizer
│   ├── components/
│   │   ├── WorkflowTrigger.tsx  # Run button
│   │   ├── EventLog.tsx         # Terminal-style log
│   │   ├── DataPreview.tsx      # Before/after view
│   │   └── ResultsSummary.tsx   # Stats and shard timings
│   └── lib/
│       ├── api.ts               # API client
│       └── workflow-config.ts   # Visualizer config
│
├── python/
│   ├── api/                     # FastAPI service
│   │   └── main.py              # Trigger endpoints
│   └── workflows/               # Render Workflow
│       ├── main.py              # Task definitions
│       ├── sharding.py          # Hash-based routing
│       └── enrichment.py        # Score calculations
│
├── typescript/
│   ├── api/                     # Fastify service
│   │   └── src/index.ts         # Trigger endpoints
│   └── workflows/               # Render Workflow
│       └── src/
│           ├── main.ts          # Task definitions
│           ├── sharding.ts      # Hash-based routing
│           └── enrichment.ts    # Score calculations
│
├── sample_data/                 # Generated CSVs
├── scripts/
│   └── generate_data.py         # Data generator
│
├── render.yaml                  # Blueprint (frontend + APIs)
└── README.md
```
`crm.csv`:

```
customer_id,email,company_name,industry,employee_count,deal_stage,deal_value,sales_owner,last_contact
```

`billing.csv`:

```
customer_id,email,plan,mrr,payment_status,subscription_start,last_payment
```

`product.csv`:

```
customer_id,email,signup_date,last_active,total_sessions,features_used,usage_pct,account_status
```

`support.csv`:

```
customer_id,email,total_tickets,open_tickets,avg_resolution_hrs,last_ticket_date,nps_score,csat_score
```

The output contains all fields merged, plus calculated fields:

- `health_score`: 0-100, based on usage, payments, NPS, and support tickets
- `churn_risk`: LOW / MEDIUM / HIGH
- `expansion_potential`: LOW / MEDIUM / HIGH
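A simplified sketch of the merge-and-enrich step for one customer. The field names come from the schemas above, but the scoring weights here are invented for illustration only — the real formulas live in `enrichment.py` / `enrichment.ts`:

```python
def merge_customer(customer_id, crm, billing, product, support):
    """Combine one customer's rows from all four sources into a single profile."""
    profile = {"customer_id": customer_id}
    for source in (crm, billing, product, support):
        profile.update(source.get(customer_id, {}))

    # Illustrative scoring only -- the demo's actual weights live in enrichment.py.
    usage = float(profile.get("usage_pct", 0))
    nps = float(profile.get("nps_score", 0))
    open_tickets = int(profile.get("open_tickets", 0))
    score = round(0.6 * usage + 4 * nps - 5 * open_tickets)
    profile["health_score"] = max(0, min(100, score))
    profile["churn_risk"] = (
        "HIGH" if profile["health_score"] < 40
        else "MEDIUM" if profile["health_score"] < 70
        else "LOW"
    )
    return profile

row = merge_customer(
    "CUST-00001",
    crm={"CUST-00001": {"company_name": "Acme", "deal_value": "12000"}},
    billing={"CUST-00001": {"plan": "pro", "mrr": "499"}},
    product={"CUST-00001": {"usage_pct": "80"}},
    support={"CUST-00001": {"nps_score": "9", "open_tickets": "0"}},
)
# With these inputs and these made-up weights: health_score 84, churn_risk "LOW".
```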
With 100K rows per source (400K total records):
| Metric | Value |
|---|---|
| Total records | 400,000 |
| Shards | 10 |
| Parallel tasks | 10 |
| Estimated time | 2-5 seconds |
| Sequential estimate | ~20+ seconds |
| Speedup | ~5-10x |
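The fan-out/fan-in shape behind these numbers can be approximated locally with an ordinary worker pool. This is a structural sketch only — the demo itself spawns Render Workflow subtasks, not threads:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 10

def process_shard(shard_id: int) -> dict:
    # Stand-in for the per-shard merge + enrich subtask.
    return {"shard_id": shard_id, "records": 0}

# Fan out: one task per shard, all in flight at once.
with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
    results = list(pool.map(process_shard, range(NUM_SHARDS)))

# Fan in (aggregate): combine shard results into the final output.
assert len(results) == NUM_SHARDS
```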
Edit `NUM_SHARDS` in:

- `python/workflows/sharding.py`
- `typescript/workflows/src/sharding.ts`

Edit the calculation functions in:

- `python/workflows/enrichment.py`
- `typescript/workflows/src/enrichment.ts`
Regenerate the data at a different size:

```bash
python scripts/generate_data.py --rows 10000     # 10K rows
python scripts/generate_data.py --rows 1000000   # 1M rows
```

If the workflow fails to trigger:

- Check that `WORKFLOW_SLUG` matches the workflow service name in the Dashboard
- Ensure the workflow deployed successfully in the Dashboard
- Verify `RENDER_API_KEY` is set correctly
- Make sure the API key has Workflows permissions

If data files are missing:

- Check the `DATA_DIR` environment variable
- Ensure the CSVs are accessible from the workflow runtime