Engineering Case Study: Scaling to 40k Users on Edge Networks

Project: Moi Social Learning | Stack: Next.js PWA, Redis, NestJS, BullMQ | Reading Time: 6 mins

The Executive Summary

In developed markets, we take “always-on” connectivity for granted. But for Moi Social Learning, an ed-tech platform serving 40,000+ children across Chile and Ecuador, a dropped packet wasn’t just an annoyance; it was a barrier to education.

At OpenMind, we specialize in Offline-First Architectures. We build Progressive Web Apps (PWAs) that assume the internet will fail, and design systems that handle that failure gracefully.

Here is the architecture that allowed us to process real-time learning data and payments in regions with unstable 3G connections.

1. The Challenge: “The 30-Second Latency”

Our mandate was to build a play-based learning platform for thousands of simultaneous users. The constraints were brutal:

Intermittent Connectivity: Students often lost signal mid-session.
Device Fragmentation: 80% of our users were on low-end Android devices with limited RAM.
Data Costs: Every megabyte counted. Heavy JS bundles literally cost the user money.

2. The Solution: “Optimistic UI” & Async Queues

We couldn’t rely on standard request/response cycles. Instead, we built a “Hub and Spoke” architecture that treats the server as a “sync point” rather than a real-time dependency.

A. The “Optimistic” Frontend (Next.js PWA)

When a student completes a lesson, the interface updates instantly, regardless of network status.

Local-First State: We write the action to localStorage immediately.
Background Sync: A Service Worker attempts to push this data to the server in the background. If the network is down, it retries with exponential backoff until the connection returns.
User Impact: The student never sees a “Loading…” spinner.
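
The local-first write and the backoff retry described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the names KeyValueStore, PendingAction, QUEUE_KEY, enqueueAction, backoffDelayMs and flushQueue are all hypothetical, and the 1-second base delay and 60-second cap are assumed values. The sketch runs in the page context, where window.localStorage satisfies the KeyValueStore interface; Service Workers cannot read localStorage, so a worker-side version would need a store such as IndexedDB instead.

```typescript
// Minimal storage contract. In the page context, window.localStorage fits it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical shape of a queued user action.
interface PendingAction {
  id: string;
  type: string;
  payload: unknown;
  attempts: number;
}

const QUEUE_KEY = "pendingActions"; // assumed storage key

function readQueue(store: KeyValueStore): PendingAction[] {
  const raw = store.getItem(QUEUE_KEY);
  return raw ? (JSON.parse(raw) as PendingAction[]) : [];
}

function writeQueue(store: KeyValueStore, queue: PendingAction[]): void {
  store.setItem(QUEUE_KEY, JSON.stringify(queue));
}

// Record the action locally first, so the UI can update without the network.
function enqueueAction(
  store: KeyValueStore,
  action: Omit<PendingAction, "attempts">
): void {
  const queue = readQueue(store);
  queue.push({ ...action, attempts: 0 });
  writeQueue(store, queue);
}

// Exponential backoff: the delay doubles per failed attempt, up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Try to send each queued action once. Delivered actions are removed;
// failed ones stay queued with their attempt counter incremented.
// Returns the number of actions still waiting.
async function flushQueue(
  store: KeyValueStore,
  send: (action: PendingAction) => Promise<void>
): Promise<number> {
  for (const action of readQueue(store)) {
    let delivered = true;
    try {
      await send(action);
    } catch {
      delivered = false;
    }
    // Re-read so actions enqueued while we were awaiting are not overwritten.
    const latest = readQueue(store);
    writeQueue(
      store,
      delivered
        ? latest.filter((a) => a.id !== action.id)
        : latest.map((a) =>
            a.id === action.id ? { ...a, attempts: a.attempts + 1 } : a
          )
    );
  }
  return readQueue(store).length;
}
```

A caller would invoke flushQueue whenever connectivity appears to return, and if it reports remaining actions, schedule the next attempt after backoffDelayMs of the highest attempts value left in the queue.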

B. The “Resilient” Backend (NestJS + BullMQ)

On the server side, we couldn’t afford to block the Node.js event loop on slow work.

Example: When a heavy calculation or progress report was triggered, the server acknowledged receipt instantly, but the actual processing happened asynchronously in a worker queue.
Result: The API never timed out, even if the database was under heavy load from 5,000 concurrent students.
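
The acknowledge-now, process-later pattern can be sketched as below. To keep the example self-contained, an in-memory queue stands in for BullMQ; in the real stack, BullMQ's Queue and Worker classes, backed by Redis, would play the producer and consumer roles. The names ReportJob, InMemoryJobQueue and handleReportRequest are hypothetical, and replying with HTTP 202 (Accepted) is an assumption about how “acknowledged receipt” might be expressed.

```typescript
// Hypothetical job payload for a progress report.
interface ReportJob {
  id: number;
  studentId: string;
}

type JobStatus = "queued" | "completed" | "failed";

// In-memory stand-in for a durable job queue (illustration only).
class InMemoryJobQueue {
  private nextId = 1;
  private pending: ReportJob[] = [];
  readonly status = new Map<number, JobStatus>();

  // Producer side: a cheap enqueue that returns a ticket immediately.
  add(studentId: string): ReportJob {
    const job: ReportJob = { id: this.nextId++, studentId };
    this.pending.push(job);
    this.status.set(job.id, "queued");
    return job;
  }

  // Worker side: process queued jobs one at a time, off the request path.
  async drain(processor: (job: ReportJob) => Promise<void>): Promise<void> {
    let job: ReportJob | undefined;
    while ((job = this.pending.shift()) !== undefined) {
      try {
        await processor(job);
        this.status.set(job.id, "completed");
      } catch {
        this.status.set(job.id, "failed");
      }
    }
  }
}

// Request handler: enqueue and acknowledge without waiting for the report.
function handleReportRequest(
  queue: InMemoryJobQueue,
  studentId: string
): { statusCode: number; jobId: number } {
  const job = queue.add(studentId);
  return { statusCode: 202, jobId: job.id }; // 202 Accepted (assumed)
}
```

Because the handler only enqueues, its response time does not depend on how long the report takes or how loaded the database is; the client can use the returned jobId to check on the result later.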

3. Solving the “Data Heavy” Problem

Educational apps are asset-heavy. To respect user data caps, we implemented aggressive optimization:

Code Splitting: We used Next.js dynamic imports to ensure users only downloaded the code for the specific lesson they were viewing.
Asset Caching: Core UI assets were cached on the device after the first load, so repeat visits did not download them again.
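
Both optimizations reduce to “download once, and only when needed.” The sketch below illustrates the idea with hypothetical names (AssetStore, cacheFirst, createLessonLoader). It is library-agnostic: in the real app, the per-lesson loaders would be dynamic import() calls (for example via next/dynamic), and the asset store would be the browser's Cache Storage managed by the Service Worker.

```typescript
// Minimal async cache contract (a hypothetical stand-in for Cache Storage).
interface AssetStore<T> {
  get(url: string): Promise<T | undefined>;
  put(url: string, asset: T): Promise<void>;
}

// Cache-first: serve from the device if present, otherwise fetch once
// and store the result for every later request.
async function cacheFirst<T>(
  url: string,
  store: AssetStore<T>,
  fetchFromNetwork: (url: string) => Promise<T>
): Promise<T> {
  const cached = await store.get(url);
  if (cached !== undefined) return cached;
  const fresh = await fetchFromNetwork(url);
  await store.put(url, fresh);
  return fresh;
}

// Lazy lesson loading: each lesson's code is requested only when asked for,
// and concurrent or repeated requests share a single download.
function createLessonLoader<T>(
  loaders: Record<string, () => Promise<T>>
): (lessonId: string) => Promise<T> {
  const loaded = new Map<string, Promise<T>>();
  return (lessonId: string): Promise<T> => {
    const existing = loaded.get(lessonId);
    if (existing) return existing;
    const loader = loaders[lessonId];
    if (!loader) {
      return Promise.reject(new Error(`Unknown lesson: ${lessonId}`));
    }
    // On failure, forget the entry so a later call can retry the download.
    const promise = loader().catch((err: unknown) => {
      loaded.delete(lessonId);
      throw err;
    });
    loaded.set(lessonId, promise);
    return promise;
  };
}
```

With real dynamic imports, a loaders entry might look like fractions: () => import("./lessons/fractions"), where the path is purely illustrative.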

4. Technical Impact

99.9% Uptime: Maintained during peak school hours.
300ms Time-to-Interactive: Achieved on low-end Android hardware via PWA caching.

Why OpenMind?

We respect the code that pays the bills. Our architecture allows mature businesses to innovate on the frontend today, while safely maintaining the backend systems that have worked for a decade.