Coffee recommendation systems typically use either basic filtering (too simple) or LLM-powered suggestions (expensive, non-deterministic, with unpredictable latency). Coffee Sommelier needed recommendations that feel personalized while remaining fast, predictable, and free of per-request model costs at scale.
Built a deterministic scoring engine using weighted cosine similarity between user preference vectors and product feature vectors. User vectors encode preferences for roast level, origin region, brew method, and flavor notes; product vectors are pre-computed from cafe menu data. To prevent recommendation homogeneity, implemented Maximal Marginal Relevance (MMR) diversification: each successive recommendation is penalized for its similarity to already-selected items. The scoring weights are configurable via the admin dashboard, so business operators can tune recommendation behavior. Geolocation filtering uses the haversine formula to pre-filter cafes within a configurable radius before scoring.
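The pipeline described above (haversine pre-filter, weighted cosine scoring, MMR selection) can be sketched in a few functions. This is a minimal illustration, not the project's actual code: the function names, the dictionary layout with `id`, `vec`, and `loc` keys, the `lam` trade-off parameter, and the default radius are all assumptions made for the example. It also assumes one plausible reading of "weighted cosine similarity", where each feature dimension is scaled by a non-negative weight.

```python
import math


def weighted_cosine(u, p, w):
    """Cosine similarity with per-dimension weights (assumed non-negative)."""
    dot = sum(wi * ui * pi for wi, ui, pi in zip(w, u, p))
    norm_u = math.sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_p = math.sqrt(sum(wi * pi * pi for wi, pi in zip(w, p)))
    if norm_u == 0 or norm_p == 0:
        return 0.0  # an all-zero vector carries no preference signal
    return dot / (norm_u * norm_p)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def recommend(user_vec, user_loc, products, weights,
              radius_km=5.0, k=3, lam=0.5):
    """Return up to k product ids: radius filter, then MMR selection.

    lam=1.0 ranks purely by relevance; lower values penalize similarity
    to items that were already selected more heavily.
    """
    # Step 1: keep only products whose location is within the radius.
    nearby = [p for p in products
              if haversine_km(*user_loc, *p["loc"]) <= radius_km]

    # Step 2: relevance of each remaining product to the user vector.
    relevance = {p["id"]: weighted_cosine(user_vec, p["vec"], weights)
                 for p in nearby}

    # Step 3: greedy MMR. Each round picks the candidate with the best
    # relevance-minus-redundancy score.
    selected, candidates = [], list(nearby)
    while candidates and len(selected) < k:
        def mmr_score(p):
            redundancy = max(
                (weighted_cosine(p["vec"], s["vec"], weights)
                 for s in selected),
                default=0.0,
            )
            return lam * relevance[p["id"]] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return [p["id"] for p in selected]
```

Because every step is pure arithmetic over pre-computed vectors, the same inputs always produce the same ranking, which is the determinism property the design relies on. Changing `weights` or `lam` alters the ranking without touching the code, which is how dashboard-configured values could be passed in.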
Zero LLM costs with sub-50ms recommendation latency. The MMR diversification ensures users see variety rather than a cluster of similar cafes. Configurable weights mean the business can A/B test different scoring strategies without code changes. The multi-frontend architecture (consumer, admin, widget, B2B) allows the recommendation engine to serve different contexts through a single API.