
James Mwangi
Head of Security
James leads security across the Kavod platform suite, implementing zero-trust architectures that protect millions of users.
Why Zero Trust?
Kavod Technologies operates a diverse portfolio of platforms — from financial services (Karat Dollar) to ride-hailing (Buslyft) to streaming (BantuStream). Each platform handles sensitive user data: financial transactions, location history, personal identifiers, and payment credentials.
The traditional perimeter-based security model — "trust everything inside the network" — simply doesn't work at our scale. With over 400 microservices running across multiple cloud providers and regions, the concept of a single network perimeter is meaningless. Any service could be compromised, and a breach in one platform should never cascade to others.
We adopted a zero-trust architecture based on the principle: never trust, always verify. Every request — whether it originates from a user's phone, from another internal service, or from an admin's laptop — must be authenticated, authorized, and encrypted.
Identity as the New Perimeter
User Identity
All Kavod platforms share a unified identity service (internally called KavodID) built on top of Keycloak. KavodID provides:
- Multi-factor authentication (TOTP, WebAuthn/FIDO2, SMS fallback)
- Social sign-in (Google, Apple, with plans for African identity providers)
- Biometric verification (fingerprint and face, delegated to the device's secure enclave)
- Risk-based authentication — login attempts from new devices, unusual locations, or at unusual times trigger step-up authentication
User ──> KavodID ──> JWT (short-lived, 15 min)
                     │
                     ├── Platform claims (which platforms the user has access to)
                     ├── Role claims (admin, user, artist, driver, etc.)
                     └── Device trust score

JWTs are short-lived (15 minutes) with rotating refresh tokens. This limits the blast radius of a stolen token. Refresh tokens are bound to a specific device fingerprint and are revoked immediately if the device trust score drops.
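A minimal sketch of device-bound refresh-token rotation, to make the token rules concrete. It is an illustration only, not KavodID's implementation: the `RefreshTokenStore` class, the in-memory dictionary, and the `MIN_DEVICE_TRUST` threshold are all assumptions. Only the 15-minute access-token lifetime comes from the text.

```python
import hashlib
import secrets
import time

ACCESS_TTL_SECONDS = 15 * 60   # 15-minute access-token lifetime, from the text
MIN_DEVICE_TRUST = 0.5         # hypothetical threshold, not from the article


class RefreshTokenStore:
    """In-memory stand-in for a refresh-token store (illustrative only)."""

    def __init__(self):
        # Maps SHA-256(token) -> device fingerprint the token was issued to.
        self._tokens = {}

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, device_fingerprint):
        token = secrets.token_urlsafe(32)
        self._tokens[self._digest(token)] = device_fingerprint
        return token

    def rotate(self, token, device_fingerprint, device_trust, now=None):
        """Exchange a refresh token for a new one plus an access-token expiry.

        The presented token is consumed whether or not the exchange succeeds,
        so a replayed or stolen token cannot be used a second time.
        """
        bound_device = self._tokens.pop(self._digest(token), None)
        if bound_device is None:
            raise PermissionError("unknown or already-used refresh token")
        if not secrets.compare_digest(bound_device.encode(),
                                      device_fingerprint.encode()):
            raise PermissionError("refresh token presented from another device")
        if device_trust < MIN_DEVICE_TRUST:
            raise PermissionError("device trust score too low")
        now = time.time() if now is None else now
        return self.issue(device_fingerprint), now + ACCESS_TTL_SECONDS
```

Storing only a hash of each refresh token means a leaked store does not directly expose usable tokens. A production system would also persist the store and sign the access token itself, both of which are omitted here.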
Service Identity
Every microservice has a cryptographic identity issued by our internal certificate authority (built on HashiCorp Vault PKI). Service identities are:
- Provisioned automatically at deployment via a Kubernetes admission controller
- Short-lived (24-hour certificates, auto-rotated)
- Scoped to a specific service name, namespace, and cluster
# Example service identity cert fields
Subject: CN=payment-service.karat-dollar.prod
SAN: payment-service.karat-dollar.svc.cluster.local
Issuer: Kavod Internal CA v2
Not After: 2026-02-16T15:00:00Z # 24-hour lifetime
Mutual TLS Everywhere
All service-to-service communication is encrypted and authenticated using mutual TLS (mTLS). Both the client and server present certificates, and both verify the other's identity against the internal CA.
We implement mTLS at the service mesh layer using Istio with a custom authorization policy engine. The mesh handles certificate rotation, TLS termination, and policy enforcement transparently — application code doesn't need to know about any of it.
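For readers who have not configured mutual TLS by hand, the sketch below shows the settings that "both sides present and verify certificates" implies, using Python's standard `ssl` module. It is an illustration, not the mesh's actual configuration: the function names are hypothetical, and the file paths are optional only so the sketch can be exercised without real certificates.

```python
import ssl


def build_server_context(cert_file=None, key_file=None, ca_file=None):
    """Server side of mTLS: present a certificate and require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail if the client sends no certificate
    # or one that does not chain to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)   # trust only the internal CA
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def build_client_context(cert_file=None, key_file=None, ca_file=None):
    """Client side of mTLS: verify the server and present our own certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

With a service mesh, the sidecar proxy applies the equivalent of these settings on the application's behalf, which is why the application code can stay unaware of them.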
Authorization Policies
mTLS tells us who is calling. Authorization policies tell us whether they're allowed to. We define fine-grained policies in a declarative format:
apiVersion: security.kavod.io/v1
kind: ServicePolicy
metadata:
  name: payment-service-access
spec:
  target: payment-service.karat-dollar
  rules:
    - from:
        - service: order-service.runnerstack
        - service: ride-service.buslyft
        - service: subscription-service.bantustream
      methods: ["POST /v1/charges", "GET /v1/charges/*"]
    - from:
        - service: admin-gateway.kavod-internal
      methods: ["GET /v1/charges/*", "GET /v1/reports/*"]
      conditions:
        - header: "X-Admin-Role"
          values: ["finance-admin", "super-admin"]

This policy says: only the order service (RunnerStack), ride service (Buslyft), and subscription service (BantuStream) can create charges via the payment service. The admin gateway can read charges and reports, but only if the request carries a valid admin role header. All other traffic is denied by default.
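The default-deny semantics of the policy above can be sketched in a few lines of Python. This is an assumption about how such a policy might be evaluated, not the actual policy engine: the `POLICY_RULES` structure and the `is_allowed` function are hypothetical, and shell-style wildcard matching via `fnmatchcase` is a stand-in for whatever path matching the real engine uses.

```python
from fnmatch import fnmatchcase

# The ServicePolicy rules from the example, expressed as plain Python data.
POLICY_RULES = [
    {
        "from": {
            "order-service.runnerstack",
            "ride-service.buslyft",
            "subscription-service.bantustream",
        },
        "methods": ["POST /v1/charges", "GET /v1/charges/*"],
    },
    {
        "from": {"admin-gateway.kavod-internal"},
        "methods": ["GET /v1/charges/*", "GET /v1/reports/*"],
        "conditions": [
            {"header": "X-Admin-Role", "values": {"finance-admin", "super-admin"}},
        ],
    },
]


def is_allowed(caller, method, path, headers=None, rules=POLICY_RULES):
    """Return True only if some rule explicitly permits the request."""
    headers = headers or {}
    request = f"{method} {path}"
    for rule in rules:
        if caller not in rule["from"]:
            continue
        if not any(fnmatchcase(request, pattern) for pattern in rule["methods"]):
            continue
        conditions = rule.get("conditions", [])
        # Every condition must hold; a missing header never matches.
        if all(headers.get(c["header"]) in c["values"] for c in conditions):
            return True
    return False  # default deny: no rule matched
```

Note that `*` in `fnmatchcase` also matches `/`, so `GET /v1/charges/*` would match nested paths as well. A real engine may treat wildcards more strictly.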
Threat Detection and Response
Real-Time Anomaly Detection
We run a security event pipeline that ingests logs, metrics, and traces from all platforms and feeds them into a detection engine. The pipeline processes approximately 2 billion events per day.
Our detection model combines rule-based detection (for known attack patterns) with ML-based anomaly detection (for novel threats):
Rule-based examples:
- More than 5 failed login attempts from the same IP in 60 seconds → temporary IP block
- Service-to-service call to an endpoint not in the authorization policy → alert + block
- JWT with claims that don't match the originating service → alert + revoke
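The first rule, more than 5 failed logins from one IP within 60 seconds, is a sliding-window counter. A minimal sketch follows, assuming an in-memory tracker; the `FailedLoginTracker` class is hypothetical, and a real pipeline would keep this state in the stream processor rather than in a Python dictionary.

```python
from collections import defaultdict, deque

MAX_FAILURES = 5       # threshold from the rule above
WINDOW_SECONDS = 60    # window length from the rule above


class FailedLoginTracker:
    """Tracks failed-login timestamps per IP in a sliding window."""

    def __init__(self):
        self._failures = defaultdict(deque)

    def record_failure(self, ip, timestamp):
        """Record a failed login; return True if the IP should be blocked."""
        window = self._failures[ip]
        window.append(timestamp)
        # Drop failures that have aged out of the 60-second window.
        while window and timestamp - window[0] >= WINDOW_SECONDS:
            window.popleft()
        return len(window) > MAX_FAILURES


tracker = FailedLoginTracker()
results = [tracker.record_failure("203.0.113.7", t) for t in range(6)]
print(results)  # [False, False, False, False, False, True]
```

The sixth failure inside the window is the first to exceed the threshold, so that is the one that triggers the block.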
ML-based anomaly detection:
- We train an isolation forest on normal traffic patterns (request rates, response times, error rates, payload sizes) per service pair. Observations whose anomaly score falls more than 3 standard deviations from the baseline trigger an investigation
- A sequence model (LSTM) analyzes the order of API calls in user sessions. Unusual sequences (e.g., a user who normally only uses BantuStream suddenly making API calls to the Karat Dollar admin endpoint) trigger step-up authentication
All Services ──> Fluentd ──> Kafka ──> Flink (enrichment + windowing)
                                         │
                        ┌────────────────┼────────────────┐
                        ▼                ▼                ▼
                   Rule Engine   Isolation Forest    LSTM Model
                        │                │                │
                        └────────────────┴────────────────┘
                                         │
                            Alert / Block / Investigate

Incident Response Automation
When the detection engine fires a high-confidence alert, our SOAR (Security Orchestration, Automation, and Response) system automatically:
- Isolates the affected service — the Istio mesh is reconfigured to drop all traffic to/from the service except for the investigation pipeline
- Captures forensic data — memory dumps, recent logs, network flows
- Notifies the security on-call via PagerDuty with full context
- Begins automated playbook — depending on the alert type, the system may automatically rotate credentials, revoke tokens, or scale down the affected deployment
Data Protection
Encryption at Rest
All data stores across the Kavod platform suite use AES-256 encryption at rest with keys managed by HashiCorp Vault. We implement envelope encryption: each data record is encrypted with a unique Data Encryption Key (DEK), and the DEK itself is encrypted with a Key Encryption Key (KEK) stored in Vault.
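Envelope encryption is easier to follow in code. The sketch below assumes the third-party `cryptography` package and uses AES-256-GCM for both layers. It is not the production setup: here the KEK is a local byte string, whereas the text keeps it in Vault, and the `encrypt_record` and `decrypt_record` names and the envelope layout are hypothetical.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_record(plaintext: bytes, kek: bytes) -> dict:
    """Encrypt one record with a fresh DEK, then wrap the DEK with the KEK."""
    dek = AESGCM.generate_key(bit_length=256)      # unique key per record
    data_nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(data_nonce, plaintext, None)

    key_nonce = os.urandom(12)
    wrapped_dek = AESGCM(kek).encrypt(key_nonce, dek, None)

    # Only the wrapped DEK is stored next to the data; the plaintext DEK is discarded.
    return {
        "ciphertext": ciphertext,
        "data_nonce": data_nonce,
        "wrapped_dek": wrapped_dek,
        "key_nonce": key_nonce,
    }


def decrypt_record(envelope: dict, kek: bytes) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the record."""
    dek = AESGCM(kek).decrypt(envelope["key_nonce"], envelope["wrapped_dek"], None)
    return AESGCM(dek).decrypt(envelope["data_nonce"], envelope["ciphertext"], None)
```

One benefit of this layout is that rotating the KEK only requires re-wrapping the small DEKs, not re-encrypting every record.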
Data Classification
We classify all data fields into four sensitivity tiers:
| Tier | Examples | Protection |
|---|---|---|
| Public | Platform names, blog posts | Standard TLS |
| Internal | Non-PII analytics, aggregated metrics | Encrypted at rest |
| Confidential | User email, phone, location history | Encrypted at rest + field-level encryption + access logging |
| Restricted | Payment credentials, government IDs, biometrics | All above + hardware security module (HSM) key storage + data residency controls |
Cross-Platform Data Isolation
Even though all Kavod platforms share infrastructure, user data is strictly isolated between platforms. A user's BantuStream viewing history is never accessible to Buslyft, and vice versa. This is enforced at multiple levels:
- Database level: Each platform uses separate database clusters with distinct credentials
- Service mesh level: Cross-platform data access requires explicit authorization policies (see above)
- Application level: Our shared libraries include tenant-context middleware that tags every database query with the originating platform, and a query interceptor rejects any query that crosses platform boundaries
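The application-level check can be sketched as a context variable plus an interceptor. This is a simplified illustration, not the shared library itself: `tenant_context`, `intercept_query`, and `CrossPlatformAccessError` are hypothetical names, and passing the target platform explicitly stands in for however the real interceptor works out which platform a query touches.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Platform on whose behalf the current request is running (None outside a request).
_current_platform = ContextVar("current_platform", default=None)


class CrossPlatformAccessError(PermissionError):
    """Raised when a query would cross a platform boundary."""


@contextmanager
def tenant_context(platform):
    """Middleware stand-in: set the originating platform for this request."""
    token = _current_platform.set(platform)
    try:
        yield
    finally:
        _current_platform.reset(token)


def intercept_query(sql, target_platform):
    """Tag the query with its originating platform, or reject it."""
    origin = _current_platform.get()
    if origin is None:
        raise CrossPlatformAccessError("query issued outside a tenant context")
    if origin != target_platform:
        raise CrossPlatformAccessError(
            f"{origin} may not query data owned by {target_platform}"
        )
    return f"/* platform={origin} */ {sql}"


with tenant_context("bantustream"):
    print(intercept_query("SELECT title FROM history", "bantustream"))
    # /* platform=bantustream */ SELECT title FROM history
```

Using a `ContextVar` rather than a global keeps the platform tag correct when many requests are handled concurrently in the same process.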
Compliance and Auditing
We maintain compliance with:
- NDPR (Nigeria Data Protection Regulation)
- POPIA (South Africa Protection of Personal Information Act)
- Kenya Data Protection Act
- GDPR (for European users accessing our platforms)
Our audit log captures every access to sensitive data — who accessed what, when, from where, and why. Audit logs are written to an append-only, tamper-evident log (backed by Amazon QLDB) and retained for 7 years.
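To show what "tamper-evident" means in practice, here is a minimal hash-chain sketch. It illustrates the general idea only and is not QLDB's API or data model; the `append_entry` and `verify_log` functions and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash, record):
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append_entry(log, record):
    """Append a record whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, record),
    })


def verify_log(log):
    """Recompute the chain; any edited or removed entry breaks it."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"actor": "alice", "resource": "charges/42", "reason": "refund"})
append_entry(audit_log, {"actor": "bob", "resource": "reports/q1", "reason": "audit"})
print(verify_log(audit_log))  # True
```

Because each hash covers the previous one, changing an old entry invalidates every entry after it, which is what makes silent edits detectable.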
Measuring Security Posture
We track security health through a custom Security Posture Score that aggregates metrics across all platforms:
- Certificate health: % of services with valid, non-expired certificates (target: 100%)
- Policy coverage: % of service-to-service communication paths covered by explicit authorization policies (current: 98.7%)
- Vulnerability SLA: Mean time to remediate critical CVEs (current: 4.2 hours)
- Incident response time: Mean time from alert to containment (current: 8 minutes)
These metrics are reviewed weekly by the security team and monthly by the executive team.

