
Kofi Mensah
Full-Stack Engineer
Kofi builds end-to-end marketplace and social platforms at Kavod, from real-time APIs to polished mobile-first UIs.
The Pricing Problem in African Automotive Markets
Buying or selling a vehicle in most African countries is an exercise in information asymmetry. There is no Kelley Blue Book, no standardized pricing database, and no centralized sales history. Prices vary wildly between cities, dealerships, and even days of the week. Sellers overprice because they expect haggling; buyers lowball because they cannot verify fair value.
Automovil was built to solve this with data. Our ML-powered pricing engine ingests market signals from across the continent to produce accurate, real-time vehicle valuations. This post details how we built it, from data collection to model architecture to the dealer tools that bring it all together.
Market Data Collection at Scale
The first challenge was assembling a training dataset in markets where structured automotive data barely exists.
Data Sources
We collect pricing signals from multiple channels:
- Online classifieds scraping: We monitor over 40 automotive classifieds platforms across 15 countries, extracting listing price, vehicle details, location, listing duration, and price changes over time. Our scrapers handle wildly inconsistent data formats, local naming conventions (a "Tokunbo" in Nigeria means a used import from abroad), and duplicate detection across platforms.
- Dealer network: Over 3,000 registered Automovil dealers submit transaction data including actual sale prices (not just asking prices), trade-in valuations, and inventory costs.
- Import records: We partner with customs data providers in several countries to track import volumes, landed costs, and duty rates for popular vehicle models.
- Insurance valuations: Anonymized insurance valuation data provides another anchor point for vehicle values.
- Auction data: Wholesale auction results from Japan, UAE, and European markets (the primary sources of used vehicle imports to Africa) give us the supply-side cost basis.
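To illustrate the "listing duration" and "price changes over time" signals from the classifieds channel, here is a minimal sketch that derives both from repeated scrapes of a single listing. The `Snapshot` record and the `summarize_listing` helper are hypothetical names chosen for this example, not part of Automovil's scrapers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One scrape of a listing (hypothetical record shape)."""
    scraped_on: date
    price: float


def summarize_listing(snapshots: list[Snapshot]) -> dict:
    """Derive listing duration and price movement from scrape history."""
    if not snapshots:
        raise ValueError("need at least one snapshot")
    ordered = sorted(snapshots, key=lambda s: s.scraped_on)
    first, last = ordered[0], ordered[-1]
    # Count how often the asking price moved between consecutive scrapes.
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a.price != b.price)
    return {
        "days_listed": (last.scraped_on - first.scraped_on).days,
        "price_changes": changes,
        "net_change_pct": (last.price - first.price) / first.price * 100,
    }


history = [
    Snapshot(date(2024, 3, 1), 10_000.0),
    Snapshot(date(2024, 3, 15), 10_000.0),
    Snapshot(date(2024, 4, 1), 9_000.0),
]
summary = summarize_listing(history)
```

A listing that sits for weeks and then drops in price is a useful hint that the original asking price was above what the market would pay.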
Data Cleaning and Normalization
Raw automotive data in Africa is messy. The same vehicle might be listed as "Toyota Camry 2018", "Camry 2018 XLE", "Toyota Camry (Muscle) 2018", or "Tokunbo Camry 018". Our normalization pipeline:
- Entity resolution: A custom NER model extracts make, model, year, trim, and variant from free-text listings with 94% accuracy.
- VIN decoding: Where VINs are available (increasingly common on our platform), we decode them to get definitive vehicle specifications.
- Deduplication: Listings for the same vehicle across multiple platforms are identified using a combination of image perceptual hashing, text similarity, and phone number matching.
- Currency normalization: All prices are converted to USD at the transaction-date exchange rate, then stored alongside the local currency amount.
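The deduplication step combines three signals, and a small sketch can show how they might be combined. This is an assumption-laden illustration rather than Automovil's pipeline: the two-of-three voting rule, the `max_hash_bits` cutoff, the default country code, and the dictionary shape of a listing are all hypothetical, and the perceptual hashes are assumed to be precomputed 64-bit integers.

```python
import re
from difflib import SequenceMatcher


def normalize_phone(raw: str, default_country_code: str = "234") -> str:
    """Reduce a phone number to digits with a country code (234 is only an example)."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("00"):
        # International prefix written as 00 instead of +.
        digits = digits[2:]
    elif digits.startswith("0"):
        # Local format: swap the leading trunk zero for the country code.
        digits = default_country_code + digits[1:]
    return digits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two perceptual hashes stored as ints."""
    return bin(hash_a ^ hash_b).count("1")


def is_duplicate(a: dict, b: dict, text_threshold: float = 0.85,
                 max_hash_bits: int = 6) -> bool:
    """Flag two listings as the same vehicle when at least two signals agree."""
    same_phone = normalize_phone(a["phone"]) == normalize_phone(b["phone"])
    text_sim = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    similar_image = hamming_distance(a["image_hash"], b["image_hash"]) <= max_hash_bits
    return sum([same_phone, text_sim >= text_threshold, similar_image]) >= 2


listing_a = {"title": "Toyota Camry 2018 XLE", "phone": "+234 803 123 4567",
             "image_hash": 0xF0F0F0F0F0F0F0F0}
listing_b = {"title": "Camry 2018 XLE", "phone": "0803-123-4567",
             "image_hash": 0xF0F0F0F0F0F0F0F1}
listing_c = {"title": "Honda Accord 2015", "phone": "0805 999 0000",
             "image_hash": 0x0F0F0F0F0F0F0F0F}
```

Requiring agreement between signals keeps one noisy signal, such as a dealer reusing a single phone number across many cars, from merging unrelated listings.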

    class VehicleNormalizer:
        def __init__(self):
            self.ner_model = AutoMotiveNER.load("automovil-ner-v4")
            self.vin_decoder = VINDecoder(markets=["africa", "japan", "uae", "eu"])
            self.dedup_engine = LSHDeduplicator(threshold=0.85)

        def normalize(self, raw_listing: RawListing) -> NormalizedVehicle:
            entities = self.ner_model.extract(raw_listing.title + " " + raw_listing.description)
            specs = self.vin_decoder.decode(raw_listing.vin) if raw_listing.vin else None
            return NormalizedVehicle(
                make=specs.make if specs else entities.make,
                model=specs.model if specs else entities.model,
                year=specs.year if specs else entities.year,
                trim=specs.trim if specs else entities.trim,
                mileage_km=self.parse_mileage(raw_listing),
                price_local=raw_listing.price,
                price_usd=self.convert_currency(raw_listing.price, raw_listing.currency),
                location=self.geocode(raw_listing.location),
                condition_signals=entities.condition_tags,
                source=raw_listing.source,
            )

The Pricing Model
With clean data assembled, we built a pricing model that accounts for the unique dynamics of African automotive markets.
Feature Engineering
Our model uses over 120 features grouped into categories:
- Vehicle attributes: Make, model, year, trim, engine size, transmission, fuel type, body style, drive type
- Condition signals: Mileage, accident history (where available), listed condition (new, foreign-used, locally-used), number of previous owners
- Market context: Country, city, current exchange rate, fuel price, import duty rate for the specific vehicle class, time of year (prices spike before holidays)
- Supply-demand indicators: Number of similar vehicles currently listed in the market, average days-on-market for the model, import volume trends
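As a rough illustration of the supply-demand indicators, the sketch below counts similar active listings and averages days-on-market over listings that have already left the market. The `Listing` fields, the `year_window` parameter, and the definition of "similar" (same make, model, and city within two model years) are assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Listing:
    """Hypothetical listing record used only for this example."""
    make: str
    model: str
    year: int
    city: str
    days_on_market: int
    is_active: bool


def supply_demand_features(target: Listing, market: list[Listing],
                           year_window: int = 2) -> dict:
    """Compute simple supply and demand features for one target vehicle."""
    similar = [
        item for item in market
        if item.make == target.make
        and item.model == target.model
        and item.city == target.city
        and abs(item.year - target.year) <= year_window
    ]
    active = [item for item in similar if item.is_active]
    closed = [item for item in similar if not item.is_active]
    return {
        # Supply: how many comparable vehicles compete for buyers right now.
        "similar_active_count": len(active),
        # Demand proxy: how quickly comparable vehicles left the market.
        "avg_days_on_market": mean(i.days_on_market for i in closed) if closed else None,
    }


market = [
    Listing("Toyota", "Camry", 2018, "Lagos", 12, True),
    Listing("Toyota", "Camry", 2017, "Lagos", 30, False),
    Listing("Toyota", "Camry", 2019, "Lagos", 50, False),
    Listing("Toyota", "Camry", 2010, "Lagos", 90, False),  # outside the year window
    Listing("Honda", "Accord", 2018, "Lagos", 20, True),   # different model
]
target = Listing("Toyota", "Camry", 2018, "Lagos", 0, True)
features = supply_demand_features(target, market)
```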
Model Architecture
We use a gradient-boosted decision tree ensemble (LightGBM) as our primary pricing model. We evaluated deep learning alternatives, but tree-based models consistently outperformed them on our tabular dataset while being faster to train, easier to interpret, and more robust to the distribution shifts common in our data.
Key design decisions:
- Hierarchical training: We train a base model on all continental data, then fine-tune country-specific and city-specific sub-models. This lets rare vehicles benefit from global data while common vehicles get locally-calibrated predictions.
- Quantile regression: Instead of predicting a single price point, we predict the 10th, 25th, 50th, 75th, and 90th percentile prices. This gives users a price range that reflects genuine market variability.
- Temporal decay: Training examples are weighted by recency, with a half-life of 90 days. This ensures the model tracks rapid price shifts caused by currency movements or policy changes.
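The temporal decay and quantile ideas can be made concrete without a model library. The sketch below is a simplified stand-in, not the production setup: it computes an exponential recency weight with a 90-day half-life, defines the pinball loss that quantile regression minimizes, and uses a weighted quantile over a handful of made-up prices in place of a trained model. In LightGBM itself, one would typically train one model per quantile with the `quantile` objective and its `alpha` parameter, passing the recency weights as sample weights.

```python
def recency_weight(age_days: float, half_life_days: float = 90.0) -> float:
    """Weight that halves every `half_life_days` (90 days in the post)."""
    return 0.5 ** (age_days / half_life_days)


def pinball_loss(actual: float, predicted: float, q: float) -> float:
    """Loss minimized by quantile regression for a quantile q in (0, 1)."""
    diff = actual - predicted
    return max(q * diff, (q - 1) * diff)


def weighted_quantile(values: list, weights: list, q: float):
    """Smallest value whose cumulative share of total weight reaches q."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]  # guards against floating-point shortfall


# Illustrative data only: four sale prices observed 0, 90, 180, and 270 days ago.
prices_usd = [9000, 10000, 11000, 12000]
ages_days = [0, 90, 180, 270]
weights = [recency_weight(a) for a in ages_days]  # [1.0, 0.5, 0.25, 0.125]

recent_median = weighted_quantile(prices_usd, weights, 0.5)
flat_median = weighted_quantile(prices_usd, [1.0] * 4, 0.5)
```

With these numbers the recency-weighted median is 9000 while the unweighted median is 10000, which shows how decay pulls the estimate toward the most recent sale.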
Condition Assessment via Photos
Many listings include photos but minimal text description of the vehicle's condition. We built a visual condition assessment model that analyzes listing photos to detect:
- Exterior damage: Dents, scratches, rust, mismatched paint panels
- Interior wear: Torn seats, dashboard cracks, worn steering wheel
- Tire condition: Tread depth estimation from side-angle tire photos
- Modification detection: Aftermarket wheels, body kits, tinted windows
The visual condition score is fed as a feature into the pricing model and significantly improves accuracy for vehicles where textual condition descriptions are absent or unreliable.

    interface PricingResult {
      vehicleId: string;
      predictedPrice: {
        p10: number; // likely minimum (good deal)
        p25: number; // below average
        p50: number; // fair market value
        p75: number; // above average
        p90: number; // likely maximum (overpriced)
        currency: string;
      };
      confidence: number;
      comparables: ComparableVehicle[];
      priceDrivers: {
        feature: string;
        impact: number; // SHAP value
        direction: "increases" | "decreases";
      }[];
      marketInsight: {
        daysOnMarketAvg: number;
        supplyTrend: "increasing" | "stable" | "decreasing";
        priceTrend30d: number; // percentage
      };
    }

Dealer Tools
Automovil is not just a consumer pricing tool. We provide a comprehensive dealer platform:
- Inventory management: Dealers list vehicles with guided data entry that ensures consistent, complete information. Our photo upload flow prompts dealers to capture specific angles and automatically tags detected features.
- Appraisal tool: When a customer brings a vehicle for trade-in, the dealer can run an instant appraisal using the pricing model, complete with comparable recent transactions and a confidence interval.
- Market intelligence dashboard: Dealers see real-time demand signals for vehicle models in their area, import cost trends, and competitor pricing analysis.
- Financing integration: We integrate with 12 auto-finance lenders, allowing dealers to present financing options to customers at the point of sale with real-time pre-approval.
Accuracy and Impact
Our pricing model achieves a median absolute percentage error (MdAPE) of 8.3% against actual transaction prices across our dealer network. For high-volume models like the Toyota Camry and Honda Accord, the error drops to 5.1%. This is comparable to the accuracy achieved in established markets with far more available data.
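For readers who want to reproduce this kind of metric on their own data, here is a minimal sketch of the calculation. It reports both the median and the mean of the absolute percentage errors, since the two are easy to confuse; the prices are made-up illustrative values, not Automovil results.

```python
from statistics import mean, median


def absolute_percentage_errors(actual: list, predicted: list) -> list:
    """Per-vehicle |predicted - actual| / actual, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [abs(p - a) / a * 100 for a, p in zip(actual, predicted)]


# Illustrative numbers only.
actual_prices = [10_000, 8_000, 12_000]
predicted_prices = [10_500, 7_600, 12_000]

errors = absolute_percentage_errors(actual_prices, predicted_prices)
median_ape = median(errors)  # robust to a few badly mispriced vehicles
mean_ape = mean(errors)      # sensitive to outliers
```

The median is often preferred for price data because a handful of rare or mislabeled vehicles can dominate the mean.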
Since launching the public pricing tool, we have processed over 6 million valuation requests. Dealers report that transparent pricing has actually accelerated their sales cycles, as buyers arrive pre-informed and negotiations are shorter and more productive. The information asymmetry that defined African automotive markets is dissolving, and Automovil is leading that change.


