
Amina Diallo
AI Research Lead
Amina heads Kavod's AI research lab, building adaptive learning systems and NLP models for African languages.
The RunnerStack Challenge
RunnerStack is Kavod Technologies' marketplace platform — a place where African sellers can set up shop and reach buyers across the continent. At peak traffic (typically Friday evenings and the days leading up to holidays), we serve 50,000 concurrent sellers managing their storefronts and 300,000 concurrent buyers browsing, searching, and purchasing.
The engineering challenge is significant: each seller expects a dashboard that responds in real time (inventory updates, order notifications, analytics), while each buyer expects fast search results, instant add-to-cart, and smooth checkout, all while the system processes thousands of orders per minute.
This post covers the key architectural decisions that make RunnerStack work at scale.
Seller Onboarding Pipeline
The Funnel
Seller onboarding is a critical conversion funnel. Every extra step or minute of friction costs us sellers. Our onboarding flow is designed to get a seller from "I want to sell" to "my first product is live" in under 15 minutes.
The pipeline:
- Account creation (30 seconds) — Phone number + OTP, name, business name
- Identity verification (2–5 minutes) — Government ID upload + liveness check (same system as GrandEstate KYC, shared across Kavod platforms)
- Store setup (3–5 minutes) — Store name, category, logo upload, basic description. We auto-generate a professional-looking storefront from these inputs using templates
- First product listing (3–5 minutes) — Photo upload (our AI auto-generates background-removed product images), title, description, price, inventory count
- Payment setup (1–2 minutes) — Connect mobile money or bank account for payouts
Automated Product Enhancement
When a seller uploads a product photo, our product enhancement pipeline automatically:
- Removes the background using a fine-tuned U-Net segmentation model
- Color-corrects the image (white balance, exposure normalization)
- Generates additional angles (using a 3D reconstruction model for simple products like shoes and bags — experimental)
- Suggests a category based on the image (using a ResNet-50 classifier trained on our product taxonomy)
- Generates a description draft using an LLM fine-tuned on high-performing product listings in the same category
```python
import asyncio
import time


async def enhance_product(image: bytes, seller_category: str) -> EnhancedProduct:
    started = time.perf_counter()
    # Run the independent enhancement tasks concurrently
    bg_removed, description, category, color_corrected = await asyncio.gather(
        remove_background(image),
        generate_description(image, seller_category),
        predict_category(image),
        color_correct(image),
    )
    return EnhancedProduct(
        images=[color_corrected, bg_removed],
        suggested_description=description,
        suggested_category=category,
        # Wall-clock time for the whole enhancement call, in milliseconds
        processing_time_ms=int((time.perf_counter() - started) * 1000),
    )
```

This pipeline processes a product image in under 4 seconds (P95), enabling a smooth "upload and review" experience for sellers.
Inventory Management
The Consistency Challenge
In a marketplace, inventory consistency is critical. When a product has 1 unit left and two buyers try to purchase it simultaneously, exactly one should succeed and the other should see an "out of stock" message. Getting this wrong leads to overselling, one of the most damaging experiences for both buyers and sellers.
We use a distributed counter with optimistic locking backed by Redis:
```typescript
async function reserveInventory(productId: string, quantity: number): Promise<boolean> {
  const key = `inventory:${productId}`;
  // WATCH + MULTI/EXEC for atomic decrement with floor check
  const result = await redis.executeIsolated(async (client) => {
    await client.watch(key);
    const current = parseInt((await client.get(key)) || "0", 10);
    if (current < quantity) {
      await client.unwatch();
      return false;
    }
    const tx = client.multi();
    tx.decrBy(key, quantity);
    const execResult = await tx.exec();
    // exec returns null if WATCH detected a change (race condition).
    // Some clients (for example node-redis v4) throw a WatchError instead.
    return execResult !== null;
  });
  if (result) {
    // Hand the reservation to the queue; a consumer persists it to the database
    await queue.publish("inventory.reserved", { productId, quantity });
  }
  return result;
}
```

Redis gives us the speed we need (sub-millisecond operations), and the WATCH/MULTI/EXEC pattern gives us the consistency guarantee. The durable inventory record is persisted to PostgreSQL asynchronously, while Redis serves as the real-time source of truth during active selling.
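The reservation function reports failure both when stock is insufficient and when a concurrent write invalidates the transaction. The following is a minimal in-memory sketch, in Python, of the same check-then-commit pattern with a bounded retry on conflict. `VersionedCounter`, `reserve`, and `max_retries` are hypothetical names used only for illustration, and a plain object stands in for Redis; this is not RunnerStack's implementation.

```python
from dataclasses import dataclass


@dataclass
class VersionedCounter:
    """In-memory stand-in for a watched Redis key (illustrative only)."""
    value: int
    version: int = 0

    def read(self) -> tuple[int, int]:
        return self.value, self.version

    def commit(self, expected_version: int, new_value: int) -> bool:
        # Mirrors EXEC after WATCH: the write is applied only if nobody
        # changed the key since it was read.
        if self.version != expected_version:
            return False
        self.value = new_value
        self.version += 1
        return True


def reserve(counter: VersionedCounter, quantity: int, max_retries: int = 3) -> bool:
    """Try to reserve `quantity` units, retrying when a conflict is detected."""
    for _ in range(max_retries):
        current, version = counter.read()
        if current < quantity:
            return False  # genuinely out of stock
        if counter.commit(version, current - quantity):
            return True  # reservation applied atomically
        # Conflict: another buyer changed the count; re-read and try again.
    return False


stock = VersionedCounter(value=1)
first = reserve(stock, 1)   # takes the last unit
second = reserve(stock, 1)  # finds the counter at zero
```

Retrying on conflict separates "someone else touched the key" from "there is no stock left", so a buyer is only told a product is out of stock when the count really is too low.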
Inventory Sync for Multi-Channel Sellers
Many RunnerStack sellers also sell on other platforms (Instagram, WhatsApp, physical stores). We provide an inventory sync API that lets sellers connect external channels:
- Webhook-based sync: When inventory changes on RunnerStack (sale, restock, manual adjustment), we fire a webhook to connected channels
- Pull-based sync: External systems can poll our API for current inventory levels
- Bulk import/export: CSV upload for sellers managing inventory in spreadsheets
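As an illustration of the bulk import path, the sketch below parses an uploaded CSV into inventory updates and collects row-level errors instead of rejecting the whole file. The column names `sku` and `quantity`, the function name, and the sample rows are assumptions made for this example; the actual file format is not described here.

```python
import csv
import io


def parse_inventory_csv(text: str) -> tuple[dict[str, int], list[str]]:
    """Return (valid sku -> quantity updates, human-readable row errors)."""
    updates: dict[str, int] = {}
    errors: list[str] = []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sku = (row.get("sku") or "").strip()
        raw_qty = (row.get("quantity") or "").strip()
        if not sku:
            errors.append(f"line {line_no}: missing sku")
            continue
        try:
            qty = int(raw_qty)
        except ValueError:
            errors.append(f"line {line_no}: quantity {raw_qty!r} is not an integer")
            continue
        if qty < 0:
            errors.append(f"line {line_no}: quantity cannot be negative")
            continue
        updates[sku] = qty  # a later row for the same sku overwrites an earlier one
    return updates, errors


# Made-up sample file: two valid rows, one bad quantity, one missing sku.
sample = "sku,quantity\nANK-001,12\nANK-002,abc\n,5\nBAG-010,0\n"
updates, errors = parse_inventory_csv(sample)
```

Reporting errors per line lets a seller fix a few rows in their spreadsheet and re-upload, rather than guessing why an import failed.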
Search Indexing
The Search Challenge
RunnerStack's catalog contains 2.3 million active product listings across 15,000 categories. Buyers expect search results to be fast, relevant, and personalized. A buyer searching for "ankara dress" should see relevant results within 200 ms, ranked by a combination of relevance, seller rating, price, and geographic proximity.
Search Architecture
We use Elasticsearch as our search engine, with a custom ranking pipeline:
```text
Product Created/Updated ──> Kafka ──> Indexer Service ──> Elasticsearch
                                                                │
Buyer Search Query ──> Search API ──────────────────────> Elasticsearch ──> Re-ranker ──> Results
```

The indexer service listens for product events on Kafka and updates the Elasticsearch index in near real-time (median indexing latency: 1.2 seconds from product update to searchable).
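One way an indexer can keep up with a stream of product events is to batch them into a single Elasticsearch Bulk API request. The sketch below only builds the newline-delimited request body; it does not talk to Kafka or Elasticsearch. The event shape (`type`, `product_id`, `document`), the index name `products`, and the sample events are assumptions for illustration.

```python
import json


def build_bulk_body(events: list[dict], index: str = "products") -> str:
    """Convert product events into an Elasticsearch Bulk API request body (NDJSON)."""
    lines: list[str] = []
    for event in events:
        meta = {"_index": index, "_id": event["product_id"]}
        if event["type"] == "deleted":
            # A delete action has no document line after it.
            lines.append(json.dumps({"delete": meta}))
        else:
            # "index" creates the document or replaces the existing one.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(event["document"]))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Made-up events, shaped as this sketch assumes they arrive from the event stream.
events = [
    {"type": "updated", "product_id": "p1", "document": {"title": "Ankara dress", "price": 45}},
    {"type": "deleted", "product_id": "p2"},
]
body = build_bulk_body(events)
```

Using the product ID as the document `_id` makes re-processing an event idempotent: replaying the same update overwrites the same document instead of creating a duplicate.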
Ranking Model
Our search ranking combines Elasticsearch's built-in BM25 text relevance with a learning-to-rank (LTR) model that incorporates:
- Text relevance (BM25 score from Elasticsearch)
- Seller quality score (composite of rating, response time, order fulfillment rate)
- Price competitiveness (how the product's price compares to similar products)
- Geographic proximity (buyers prefer sellers in the same city for faster delivery)
- Click-through rate (historical CTR for this product in response to similar queries)
- Conversion rate (historical purchase rate after viewing)
- Freshness (recently listed products get a small boost)
```typescript
// LTR feature vector for a search result
interface LTRFeatures {
  bm25Score: number;
  sellerQualityScore: number;        // 0-100
  pricePercentile: number;           // 0-1, lower = more competitive
  distanceKm: number;                // Buyer-to-seller distance
  historicalCTR: number;             // 0-1
  historicalConversionRate: number;  // 0-1
  daysSinceListed: number;
  hasVerifiedSeller: boolean;
  imageCount: number;
  reviewCount: number;
  averageRating: number;             // 1-5
}
```

The LTR model is a LambdaMART model (an ensemble of gradient-boosted decision trees) trained on click and purchase signals. We retrain weekly on the latest behavioral data.
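The trained LambdaMART model is not reproduced here. To show roughly where the re-ranker sits in the flow, the sketch below re-orders first-stage candidates using a hand-weighted linear score as a stand-in for the learned model. The `Candidate` type, the weights, and the sample values are all invented for illustration and carry no information about the production ranking.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product_id: str
    bm25_score: float
    seller_quality_score: float  # 0-100
    price_percentile: float      # 0-1, lower = more competitive
    distance_km: float
    historical_ctr: float        # 0-1


def stand_in_score(c: Candidate) -> float:
    """Hand-weighted placeholder for the learned model; weights are invented."""
    proximity = 1.0 / (1.0 + c.distance_km)  # decays toward 0 with distance
    return (
        1.0 * c.bm25_score
        + 0.05 * c.seller_quality_score
        + 2.0 * (1.0 - c.price_percentile)
        + 1.0 * proximity
        + 3.0 * c.historical_ctr
    )


def rerank(candidates: list[Candidate], top_k: int = 10) -> list[Candidate]:
    """Re-order the candidates returned by the first-stage text search."""
    return sorted(candidates, key=stand_in_score, reverse=True)[:top_k]


candidates = [
    Candidate("a", bm25_score=8.0, seller_quality_score=40, price_percentile=0.9,
              distance_km=300, historical_ctr=0.01),
    Candidate("b", bm25_score=7.5, seller_quality_score=95, price_percentile=0.2,
              distance_km=4, historical_ctr=0.08),
]
ranked = rerank(candidates)
```

In this made-up example, candidate `b` has a slightly lower text score than `a` but ranks first because of its stronger seller, price, and proximity signals, which is the kind of trade-off a second-stage ranker exists to make.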
Autocomplete and Suggestions
In addition to full search, we provide:
- Autocomplete: As the buyer types, we suggest completions using a prefix trie built from popular queries (updated daily). Response time target: 50 ms
- "Did you mean?": Typo correction using edit distance + frequency-weighted suggestions
- Related searches: After displaying results, we suggest related queries based on co-occurrence patterns in search logs
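The autocomplete idea can be sketched with a small prefix trie that stores each past query with its frequency and returns the most frequent completions for a prefix. `QueryTrie`, its methods, and the sample queries and counts are hypothetical; the daily rebuild and any per-node caching a production system would likely need are left out.

```python
import heapq


class QueryTrie:
    """Prefix trie over past queries, each stored with its frequency."""

    def __init__(self) -> None:
        self.children: dict[str, "QueryTrie"] = {}
        self.frequency = 0  # > 0 only where a complete query ends

    def insert(self, query: str, frequency: int) -> None:
        node = self
        for ch in query:
            node = node.children.setdefault(ch, QueryTrie())
        node.frequency = frequency

    def suggest(self, prefix: str, limit: int = 5) -> list[str]:
        node = self
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        # Collect every completed query below the prefix node.
        found: list[tuple[int, str]] = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.frequency > 0:
                found.append((current.frequency, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Most frequent first; ties broken alphabetically.
        best = heapq.nsmallest(limit, found, key=lambda item: (-item[0], item[1]))
        return [text for _, text in best]


# Made-up query log counts.
trie = QueryTrie()
for query, count in [("ankara dress", 900), ("ankara fabric", 650),
                     ("ankle boots", 400), ("android phone", 1200)]:
    trie.insert(query, count)
suggestions = trie.suggest("ank", limit=2)
```

This version walks the whole subtree under the prefix on every call. To meet a tight latency target, a common refinement is to precompute the top completions at each node when the trie is built.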
Order Processing at Scale
The Order Pipeline
When a buyer clicks "Purchase," the following happens in rapid succession:
1. Inventory reservation (Redis, < 5 ms)
2. Order creation (PostgreSQL, with Kafka event)
3. Payment initiation (forwarded to Karat Dollar via event)
4. Payment confirmation (async, via Karat Dollar callback event)
5. Seller notification (push notification + dashboard update)
6. Fulfillment tracking (order enters the logistics pipeline)
Steps 1–3 complete synchronously within 800 ms (P95). Steps 4–6 are asynchronous.
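The split between the synchronous and asynchronous parts of the pipeline can be sketched as follows. This is a single-threaded toy: a dict stands in for the Redis inventory counter, `InMemoryBus` stands in for the Kafka producer, and the topic names `order.created` and `payment.requested` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBus:
    """Stand-in for the Kafka producer; records published events in order."""
    published: list[tuple[str, dict]] = field(default_factory=list)

    def publish(self, topic: str, payload: dict) -> None:
        self.published.append((topic, payload))


def place_order(stock: dict[str, int], bus: InMemoryBus,
                order_id: str, product_id: str, quantity: int) -> str:
    # Step 1: reserve inventory (Redis in production, a dict here).
    if stock.get(product_id, 0) < quantity:
        return "out_of_stock"
    stock[product_id] -= quantity
    # Step 2: create the order record and emit its event.
    order = {"order_id": order_id, "product_id": product_id, "quantity": quantity}
    bus.publish("order.created", order)
    # Step 3: hand off payment initiation as an event.
    bus.publish("payment.requested", {"order_id": order_id})
    # Steps 4-6 run later in consumers of these events, off the request path.
    return "confirmed"


stock = {"p1": 1}
bus = InMemoryBus()
first = place_order(stock, bus, "o-1", "p1", 1)
second = place_order(stock, bus, "o-2", "p1", 1)
```

Only the first three steps sit on the buyer's request path, so the latency the buyer sees does not depend on how quickly payment confirmation, notifications, or logistics complete.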
Handling Peak Traffic
During peak events (Black Friday, end-of-Ramadan sales), traffic can spike to 5x normal levels within minutes. Our scaling strategy:
- Pre-scaling: Before known events, we pre-scale all services to estimated peak capacity
- Auto-scaling: Kubernetes Horizontal Pod Autoscaler monitors request rate and CPU. New pods spin up within 30 seconds of a threshold breach
- Queue buffering: The Kafka-based event pipeline naturally absorbs traffic spikes. Order processing consumers can temporarily lag behind producers without affecting the buyer experience (the buyer sees "order confirmed" as soon as the event is published)
- Read replicas: During peak, we spin up additional PostgreSQL read replicas and Elasticsearch nodes to handle increased query load
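For the auto-scaling item, the Kubernetes Horizontal Pod Autoscaler documents its core calculation as the current replica count multiplied by the ratio of the observed metric to its target, rounded up, with no change when that ratio is within a tolerance of 1.0 (10% by default). The sketch below reproduces that arithmetic; the replica counts, CPU figures, and limits are illustrative values, not RunnerStack's configuration.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 100,
                     tolerance: float = 0.1) -> int:
    """Replica count the HPA formula would ask for, clamped to the configured bounds."""
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# 20 pods at 90% average CPU against a 60% target.
scale_up = desired_replicas(20, current_metric=90, target_metric=60)
# 63% against a 60% target is inside the tolerance band.
steady = desired_replicas(20, current_metric=63, target_metric=60)
# A 5x spike asks for 100 pods but is capped by max_replicas.
capped = desired_replicas(20, current_metric=300, target_metric=60, max_replicas=80)
```

The capped case shows why pre-scaling and queue buffering matter alongside auto-scaling: reactive scaling is bounded by `max_replicas` and by how quickly new pods become ready.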
Peak Performance Metrics
During our most recent peak event (February 2026 Valentine's Day sale):
- Peak concurrent sellers: 52,400
- Peak concurrent buyers: 341,000
- Peak orders per minute: 8,200
- Search P99 latency: 180 ms
- Checkout P99 latency: 1.1 seconds
- Zero overselling incidents
- System availability: 100%
What's Next
We're investing in three major areas for 2026:
- AI-powered seller tools: Automated pricing suggestions, demand forecasting, and ad campaign optimization
- Same-day delivery: Partnering with Buslyft for intra-city deliveries, with real-time tracking integrated into the buyer experience
- Cross-border commerce: Enabling sellers in Nigeria to sell to buyers in Kenya (and vice versa) with automated customs documentation and currency conversion via Karat Dollar
Explore RunnerStack at runnerstack.com.


