Serverless Cold Start Optimization: From 25s to 500ms Using Free Tools
I was checking my portfolio's API when I noticed something weird. A simple health check was taking 25 seconds.
Twenty. Five. Seconds.
For context, my Express.js service? 800ms. But my Python FastAPI backends? A painful 10-25 seconds. All deployed on Vercel's free tier, all getting pinged every 10 minutes to "stay warm."
Clearly, something was broken.
The Real Problem
Turns out, serverless functions aren't always "instantly ready." They go cold when idle, and restarting isn't free:
```
Node.js cold start:  ~1-2 seconds   (lightweight runtime)
Python cold start:   ~8-12 seconds  (heavier interpreter)
Python + ML libs:    ~20-30 seconds (my portfolio's RAG chatbot)
```
My 10-minute cron schedule? Perfect for Node.js (stays warm 10-15 min). Terrible for Python (goes cold after 3-7 min).
I needed different strategies for different runtimes.
The Fix: Smart Keep-Alive with Cron-Job.org
Instead of guessing, I built a Python script using binary search to find the optimal ping interval for each service.
The idea: Test progressively shorter intervals until the function stays warm.
```python
def find_optimal_interval():
    # Start with 9 minutes, then 4, then 2...
    # If it's cold, go shorter. If warm, try longer.
    # Binary search finds the answer in 4-5 tests.
    ...
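Fleshed out, the idea looks roughly like this (a simplified sketch, not the full script; the `is_warm` helper and its 2-second warm/cold threshold are illustrative placeholders, tune both per service):

```python
import time
import requests  # pip install requests

def is_warm(url: str, wait_minutes: int, threshold_s: float = 2.0) -> bool:
    """Warm the function, idle for wait_minutes, then check if it still responds fast."""
    requests.get(url, timeout=60)           # warm it up
    time.sleep(wait_minutes * 60)           # let it sit idle
    start = time.perf_counter()
    requests.get(url, timeout=60)
    return time.perf_counter() - start < threshold_s

def find_optimal_interval(url: str, lo: int = 1, hi: int = 9) -> int:
    """Binary search for the longest ping interval (in minutes) that stays warm."""
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_warm(url, mid):
            best, lo = mid, mid + 1         # still warm: try a longer interval
        else:
            hi = mid - 1                    # went cold: try a shorter one
    return best

print(find_optimal_interval("https://python-service.vercel.app/"))
```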
Results:
| Service | Framework | Old Response | New Interval | New Response | Improvement |
|---------|-----------|--------------|--------------|--------------|-------------|
| AI Hub X | Express.js | 800ms | `*/10` (kept) | 380ms | 2.1x |
| My Talent Flow | FastAPI | 10,000ms | `*/2` (changed) | 567ms | **17.6x** 🚀 |
| Portfolio API | FastAPI+RAG | 25,000ms | `*/2` (changed) | 587ms | **42.6x** 🚀 |
From 25 seconds to 587 milliseconds. Problem solved.
The Setup
Cron-job.org is perfect for this:
- ✅ Free (50 jobs, unlimited runs)
- ✅ Reliable (99.9% uptime)
- ✅ Built-in monitoring
I configured three jobs:
Express.js service:
```
URL: https://express-service.vercel.app/api/health
Schedule: */10 * * * * (every 10 minutes)
```
Python services:
```
URL: https://python-service.vercel.app/
Schedule: */2 * * * * (every 2 minutes)
```
The Python services need more frequent pings because they cool down faster.
Cost Check
"Wait, doesn't that eat into Vercel's free tier?"
My usage:
```
Total monthly invocations: 47,520
Vercel free tier limit:    100,000
Usage:                     47.5% ✅
```
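That 47,520 isn't a guess; here's the arithmetic (assuming a 30-day month and both Python services on the `*/2` schedule):

```python
DAYS = 30
express = (60 // 10) * 24 * DAYS   # */10 → 4,320 pings/month
fastapi = (60 // 2) * 24 * DAYS    # */2  → 21,600 pings/month each
total = express + 2 * fastapi      # two FastAPI services
print(total)                       # 47520, i.e. 47.5% of 100,000
```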
Still 52.5% headroom for actual user traffic. And it costs $0.
What I Learned
1. Runtimes Matter
Node.js is naturally serverless-friendly. Python needs more babysitting.
2. Imports = Startup Time
My Portfolio API loads LangChain, ChromaDB, and pandas: that's 15+ seconds of imports, all for a health check that returns `{"status": "ok"}`.
Better approach:
```python
from fastapi import FastAPI

app = FastAPI()

# Lightweight health check
@app.get("/health")
def health():
    return {"status": "ok"}  # No heavy imports

# Heavy endpoint loads libraries only when needed
@app.post("/query")
def query():
    import langchain  # Lazy load
    # RAG logic here
```
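The lazy import only costs time once: Python caches modules in `sys.modules`, so the first `/query` call pays the price and every call after that skips the import entirely.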
3. Measure, Don't Assume
I thought "10 minutes should work everywhere." Data said otherwise. Building a small testing tool saved hours of guessing.
4. Free Tier ≠ Bad
With smart engineering, free tools can be production-grade. No need to upgrade just because "that's what you do."
The Monitoring
Cron-job.org shows me exactly what's happening:
```
Recent execution history:

Time  | Response | Status
─────────────────────────────────
16:40 | 0.95s    | ✅ 200 OK
16:30 | 0.91s    | ✅ 200 OK
16:20 | 0.87s    | ✅ 200 OK
16:10 | 0.94s    | ✅ 200 OK

Average: 0.90s
Success: 100%
```
All services consistently under 1 second. Email alerts if anything breaks.
Try It Yourself
If you're running serverless functions:
1. Test your cold start time:

```bash
curl -w "\nTime: %{time_total}s\n" https://your-api.vercel.app/health
# Wait 10 minutes
curl -w "\nTime: %{time_total}s\n" https://your-api.vercel.app/health
```

If the second call is slow, you're hitting cold starts. Reduce your cron interval.

2. Set up cron-job.org:
- Sign up (free)
- Add your endpoint
- Start with `*/5 * * * *`
- Adjust based on response times

3. Monitor and tweak: Check execution history weekly, adjust if needed.
The Bottom Line
Serverless optimization isn't about throwing money at the problem. It's about understanding how different runtimes behave and configuring them accordingly.
Results:
- ✅ 23x faster overall (35.8s of combined response time down to 1.5s)
- ✅ Sub-second responses consistently
- ✅ Still on free tier (47.5% usage)
- ✅ $0 monthly cost
Not bad for a weekend project.
Running serverless on free tier? Check your cold start times. You might be surprised what's hiding there.
Resources
- Cron-Job.org - Free cron scheduler
- Vercel Functions Docs - Official documentation
- My optimizer script: GitHub
