Zero-Trust Security Across the Kavod Platform Suite

James Mwangi

Head of Security

James leads security across the Kavod platform suite, implementing zero-trust architectures that protect millions of users.

Why Zero Trust?

Kavod Technologies operates a diverse portfolio of platforms — from financial services (Karat Dollar) to ride-hailing (Buslyft) to streaming (BantuStream). Each platform handles sensitive user data: financial transactions, location history, personal identifiers, and payment credentials.

The traditional perimeter-based security model — "trust everything inside the network" — simply doesn't work at our scale. With over 400 microservices running across multiple cloud providers and regions, the concept of a single network perimeter is meaningless. Any service could be compromised, and a breach in one platform should never cascade to others.

We adopted a zero-trust architecture based on the principle: never trust, always verify. Every request — whether it originates from a user's phone, from another internal service, or from an admin's laptop — must be authenticated, authorized, and encrypted.

Identity as the New Perimeter

User Identity

All Kavod platforms share a unified identity service (internally called KavodID) built on top of Keycloak. KavodID provides:

Multi-factor authentication (TOTP, WebAuthn/FIDO2, SMS fallback)
Social sign-in (Google, Apple, with plans for African identity providers)
Biometric verification (fingerprint and face, delegated to the device's secure enclave)
Risk-based authentication — login attempts from new devices, unusual locations, or at unusual times trigger step-up authentication
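
As a rough illustration of the risk-based rule in the last item, the sketch below checks the three signals named there (new device, unusual location, unusual time). The class names, field names, and the default "usual hours" range are assumptions made for the example, not KavodID's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class LoginProfile:
    """Hypothetical per-user login history; all fields are illustrative."""
    known_devices: set = field(default_factory=set)
    usual_countries: set = field(default_factory=set)
    earliest_usual_hour: int = 6   # assumed default, local time
    latest_usual_hour: int = 22    # assumed default, local time


@dataclass
class LoginAttempt:
    device_id: str
    country: str
    local_hour: int


def risk_signals(attempt: LoginAttempt, profile: LoginProfile) -> list:
    """Return the names of the risk signals raised by this attempt."""
    signals = []
    if attempt.device_id not in profile.known_devices:
        signals.append("new_device")
    if attempt.country not in profile.usual_countries:
        signals.append("unusual_location")
    if not (profile.earliest_usual_hour <= attempt.local_hour <= profile.latest_usual_hour):
        signals.append("unusual_time")
    return signals


def requires_step_up(attempt: LoginAttempt, profile: LoginProfile) -> bool:
    """Any single signal triggers step-up authentication in this sketch."""
    return bool(risk_signals(attempt, profile))
```

A real system would more likely combine weighted signals into a score; treating every signal as an immediate trigger keeps the example short.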

User ──> KavodID ──> JWT (short-lived, 15 min)
                         │
                         ├── Platform claims (which platforms the user has access to)
                         ├── Role claims (admin, user, artist, driver, etc.)
                         └── Device trust score

JWTs are short-lived (15 minutes) with rotating refresh tokens. This limits the blast radius of a stolen token. Refresh tokens are bound to a specific device fingerprint and are revoked immediately if the device trust score drops.
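
To make the 15-minute lifetime concrete, here is a minimal sketch of issuing and verifying a short-lived token using only the Python standard library. It assumes HS256 (a shared secret) purely to stay self-contained; a Keycloak-based deployment would typically sign with an asymmetric key, and the function names here are hypothetical.

```python
import base64
import hashlib
import hmac
import json

TOKEN_TTL_SECONDS = 15 * 60  # 15-minute lifetime, as described above


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, claims: dict, now: float) -> str:
    """Sign the claims plus iat/exp timestamps (HS256 is an assumption)."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = dict(claims, iat=int(now), exp=int(now) + TOKEN_TTL_SECONDS)
    signing_input = (_b64url(json.dumps(header).encode("utf-8")) + "."
                     + _b64url(json.dumps(payload).encode("utf-8")))
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def verify_token(secret: bytes, token: str, now: float) -> dict:
    """Return the payload, or raise ValueError on a bad signature or expiry."""
    signing_input, _, sig_part = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_part)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if now >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

Passing `now` explicitly rather than reading the clock inside the functions makes the expiry behaviour easy to test.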

Service Identity

Every microservice has a cryptographic identity issued by our internal certificate authority (built on HashiCorp Vault PKI). Service identities are:

Provisioned automatically at deployment via a Kubernetes admission controller
Short-lived (24-hour certificates, auto-rotated)
Scoped to a specific service name, namespace, and cluster

# Example service identity cert fields
Subject: CN=payment-service.karat-dollar.prod
SAN: payment-service.karat-dollar.svc.cluster.local
Issuer: Kavod Internal CA v2
Not After: 2026-02-16T15:00:00Z  # 24-hour lifetime
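
With 24-hour certificates, the rotation decision reduces to a comparison against the certificate's expiry time. The sketch below assumes renewal after two-thirds of the lifetime has elapsed; that fraction and the function name are illustrative assumptions, not values taken from our Vault configuration.

```python
from datetime import datetime, timedelta, timezone

CERT_LIFETIME = timedelta(hours=24)  # from the text above
ROTATE_AT_FRACTION = 2 / 3           # assumption: renew after two-thirds of the lifetime


def should_rotate(not_after: datetime, now: datetime) -> bool:
    """Return True once the certificate has passed its assumed renewal point."""
    not_before = not_after - CERT_LIFETIME
    rotate_at = not_before + CERT_LIFETIME * ROTATE_AT_FRACTION
    return now >= rotate_at


# Expiry taken from the example certificate above.
not_after = datetime(2026, 2, 16, 15, 0, tzinfo=timezone.utc)
early = datetime(2026, 2, 16, 6, 0, tzinfo=timezone.utc)
late = datetime(2026, 2, 16, 8, 0, tzinfo=timezone.utc)
print(should_rotate(not_after, early), should_rotate(not_after, late))
```

Renewing well before expiry leaves room for retries if the certificate authority is briefly unavailable.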

Mutual TLS Everywhere

All service-to-service communication is encrypted and authenticated using mutual TLS (mTLS). Both the client and server present certificates, and both verify the other's identity against the internal CA.

We implement mTLS at the service mesh layer using Istio with a custom authorization policy engine. The mesh handles certificate rotation, TLS termination, and policy enforcement transparently — application code doesn't need to know about any of it.

Authorization Policies

mTLS tells us who is calling. Authorization policies tell us whether they're allowed to. We define fine-grained policies in a declarative format:

apiVersion: security.kavod.io/v1
kind: ServicePolicy
metadata:
  name: payment-service-access
spec:
  target: payment-service.karat-dollar
  rules:
    - from:
        - service: order-service.runnerstack
        - service: ride-service.buslyft
        - service: subscription-service.bantustream
      methods: ["POST /v1/charges", "GET /v1/charges/*"]
    - from:
        - service: admin-gateway.kavod-internal
      methods: ["GET /v1/charges/*", "GET /v1/reports/*"]
      conditions:
        - header: "X-Admin-Role"
          values: ["finance-admin", "super-admin"]

This policy says: only the order service (RunnerStack), ride service (Buslyft), and subscription service (BantuStream) can create and read charges via the payment service. The admin gateway can read charges and reports, but only if the request carries an X-Admin-Role header set to finance-admin or super-admin. All other traffic is denied by default.
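
The evaluation logic can be sketched in a few lines of Python. This is not our policy engine; it is a minimal default-deny evaluator over the same policy, expressed as a dictionary. It assumes that `*` in a method pattern has glob semantics, and the function name `is_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase

# The ServicePolicy above, transcribed into a plain dictionary.
POLICY = {
    "target": "payment-service.karat-dollar",
    "rules": [
        {
            "from": ["order-service.runnerstack", "ride-service.buslyft",
                     "subscription-service.bantustream"],
            "methods": ["POST /v1/charges", "GET /v1/charges/*"],
        },
        {
            "from": ["admin-gateway.kavod-internal"],
            "methods": ["GET /v1/charges/*", "GET /v1/reports/*"],
            "conditions": [{"header": "X-Admin-Role",
                            "values": ["finance-admin", "super-admin"]}],
        },
    ],
}


def is_allowed(policy, caller, method, path, headers=None):
    """Allow only if some rule matches caller, method/path, and all conditions."""
    headers = headers or {}
    request = f"{method} {path}"
    for rule in policy["rules"]:
        if caller not in rule["from"]:
            continue
        if not any(fnmatchcase(request, pattern) for pattern in rule["methods"]):
            continue
        conditions = rule.get("conditions", [])
        if all(headers.get(c["header"]) in c["values"] for c in conditions):
            return True
    return False  # default deny: no rule matched
```

For example, `is_allowed(POLICY, "ride-service.buslyft", "POST", "/v1/charges")` is allowed by the first rule, while the same caller requesting `GET /v1/reports/daily` falls through to the default deny.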

Threat Detection and Response

Real-Time Anomaly Detection

We run a security event pipeline that ingests logs, metrics, and traces from all platforms and feeds them into a detection engine. The pipeline processes approximately 2 billion events per day.

Our detection model combines rule-based detection (for known attack patterns) with ML-based anomaly detection (for novel threats):

Rule-based examples:

More than 5 failed login attempts from the same IP in 60 seconds → temporary IP block
Service-to-service call to an endpoint not in the authorization policy → alert + block
JWT with claims that don't match the originating service → alert + revoke
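
The first rule is a sliding-window counter. The sketch below shows one way to express it in memory; the production rule runs in the streaming pipeline, so the class name and the in-process storage are assumptions for illustration only.

```python
from collections import defaultdict, deque

MAX_FAILURES = 5      # "more than 5" failures triggers a block
WINDOW_SECONDS = 60


class FailedLoginTracker:
    """Tracks recent failed logins per IP in a 60-second sliding window."""

    def __init__(self):
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def record_failure(self, ip: str, now: float) -> bool:
        """Record a failed login; return True if the IP should be blocked."""
        window = self._failures[ip]
        window.append(now)
        # Drop failures that have fallen out of the window.
        while window and window[0] <= now - WINDOW_SECONDS:
            window.popleft()
        return len(window) > MAX_FAILURES
```

The sixth failure inside any 60-second span returns `True`; failures from other IP addresses are counted separately.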

ML-based anomaly detection:

We train an isolation forest on normal traffic patterns (request rates, response times, error rates, payload sizes) per service pair. Deviations of more than 3 standard deviations from that baseline trigger an investigation.
A sequence model (LSTM) analyzes the order of API calls in user sessions. Unusual sequences (e.g., a user who normally only uses BantuStream suddenly making API calls to the Karat Dollar admin endpoint) trigger step-up authentication.
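
The sketch below is not an isolation forest or an LSTM. It is a plain z-score check, included only to illustrate what a "3 standard deviations" threshold means for a single value (a raw metric or a model's anomaly score) compared against a baseline. The function name and sample numbers are made up for the example.

```python
from statistics import mean, stdev

THRESHOLD_SIGMAS = 3.0


def is_anomalous(baseline: list, observed: float) -> bool:
    """Flag values more than 3 sample standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline points
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > THRESHOLD_SIGMAS


# Hypothetical requests-per-second samples for one service pair.
baseline_rps = [100, 102, 98, 101, 99, 100]
print(is_anomalous(baseline_rps, 103), is_anomalous(baseline_rps, 110))
```

An isolation forest generalizes this idea to several features at once by scoring how easily a point can be separated from the rest of the data.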

All Services ──> Fluentd ──> Kafka ──> Flink (enrichment + windowing)
                                            │
                              ┌──────────────┼──────────────┐
                              ▼              ▼              ▼
                        Rule Engine    Isolation Forest   LSTM Model
                              │              │              │
                              └──────────────┴──────────────┘
                                            │
                                    Alert / Block / Investigate

Incident Response Automation

When the detection engine fires a high-confidence alert, our SOAR (Security Orchestration, Automation, and Response) system automatically:

Isolates the affected service — the Istio mesh is reconfigured to drop all traffic to/from the service except for the investigation pipeline
Captures forensic data — memory dumps, recent logs, network flows
Notifies the security on-call via PagerDuty with full context
Begins automated playbook — depending on the alert type, the system may automatically rotate credentials, revoke tokens, or scale down the affected deployment

Data Protection

Encryption at Rest

All data stores across the Kavod platform suite use AES-256 encryption at rest with keys managed by HashiCorp Vault. We implement envelope encryption: each data record is encrypted with a unique Data Encryption Key (DEK), and the DEK itself is encrypted with a Key Encryption Key (KEK) stored in Vault.

Data Classification

We classify all data fields into four sensitivity tiers:

| Tier | Examples | Protection |
|---|---|---|
| Public | Platform names, blog posts | Standard TLS |
| Internal | Non-PII analytics, aggregated metrics | Encrypted at rest |
| Confidential | User email, phone, location history | Encrypted at rest + field-level encryption + access logging |
| Restricted | Payment credentials, government IDs, biometrics | All above + hardware security module (HSM) key storage + data residency controls |
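
One way to make the tiers machine-checkable is to tag each field with a tier and look up the required controls. The sketch below is illustrative: the field names are hypothetical, "All above" is read as "everything required for Confidential", and defaulting unknown fields to the strictest tier is an assumption.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


PROTECTIONS = {
    Tier.PUBLIC: ["standard TLS"],
    Tier.INTERNAL: ["encryption at rest"],
    Tier.CONFIDENTIAL: ["encryption at rest", "field-level encryption",
                        "access logging"],
}
# Interpretation: "All above" means the Confidential controls plus two more.
PROTECTIONS[Tier.RESTRICTED] = PROTECTIONS[Tier.CONFIDENTIAL] + [
    "HSM key storage", "data residency controls"]

# Hypothetical field-to-tier tagging.
FIELD_TIERS = {
    "blog_post_body": Tier.PUBLIC,
    "email": Tier.CONFIDENTIAL,
    "card_number": Tier.RESTRICTED,
}


def protections_for(field_name: str) -> list:
    """Unknown fields fall back to the strictest tier (an assumption)."""
    return PROTECTIONS[FIELD_TIERS.get(field_name, Tier.RESTRICTED)]
```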

Cross-Platform Data Isolation

Even though all Kavod platforms share infrastructure, user data is strictly isolated between platforms. A user's BantuStream viewing history is never accessible to Buslyft, and vice versa. This is enforced at multiple levels:

Database level: Each platform uses separate database clusters with distinct credentials
Service mesh level: Cross-platform data access requires explicit authorization policies (see above)
Application level: Our shared libraries include tenant-context middleware that tags every database query with the originating platform, and a query interceptor rejects any query that crosses platform boundaries
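
The application-level check can be sketched with Python's `contextvars`. This is a simplified stand-in for the shared middleware: the function names, the exception class, and the SQL comment used for tagging are all assumptions.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Holds the platform on whose behalf the current request is running.
_current_platform = ContextVar("current_platform", default=None)


class CrossPlatformQueryError(Exception):
    """Raised when a query would cross a platform boundary."""


@contextmanager
def platform_context(platform: str):
    """Middleware stand-in: set the originating platform for a request."""
    token = _current_platform.set(platform)
    try:
        yield
    finally:
        _current_platform.reset(token)


def check_query(target_platform: str, sql: str) -> str:
    """Interceptor stand-in: reject cross-platform queries, tag the rest."""
    origin = _current_platform.get()
    if origin is None:
        raise CrossPlatformQueryError("query issued without a platform context")
    if origin != target_platform:
        raise CrossPlatformQueryError(f"{origin} may not query {target_platform} data")
    # Tag the query so the originating platform is visible in database logs.
    return f"/* platform={origin} */ {sql}"
```

Inside `with platform_context("bantustream"):`, a query against BantuStream data is tagged and passed through, while a query against Buslyft data raises `CrossPlatformQueryError`.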

Compliance and Auditing

We maintain compliance with:

NDPR (Nigeria Data Protection Regulation)
POPIA (South Africa Protection of Personal Information Act)
Kenya Data Protection Act
GDPR (for European users accessing our platforms)

Our audit log captures every access to sensitive data — who accessed what, when, from where, and why. Audit logs are written to an append-only, tamper-evident log (backed by Amazon QLDB) and retained for 7 years.
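
The tamper-evident property can be illustrated with a simple hash chain, in which every entry commits to the hash of the entry before it. This sketch does not use or imitate the QLDB API; the record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a record, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"record": record, "prev_hash": prev_hash,
                "hash": _entry_hash(prev_hash, record)})


def verify_log(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing any field of an earlier record makes `verify_log` return `False`, because the stored hash no longer matches the recomputed one.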

Measuring Security Posture

We track security health through a custom Security Posture Score that aggregates metrics across all platforms:

Certificate health: % of services with valid, non-expired certificates (target: 100%)
Policy coverage: % of service-to-service communication paths covered by explicit authorization policies (current: 98.7%)
Vulnerability SLA: Mean time to remediate critical CVEs (current: 4.2 hours)
Incident response time: Mean time from alert to containment (current: 8 minutes)

These metrics are reviewed weekly by the security team and monthly by the executive team.

#security #zero-trust #infrastructure #authentication
