Mind Mirror needs real-time sentiment analysis for every journal entry, but the Hugging Face Inference API has rate limits, occasional downtime, and per-request costs that scale with usage. The system had to stay resilient to API failures while maintaining acceptable sentiment quality.
Implemented a three-tier sentiment pipeline:

1. Check the LRU cache: an OrderedDict-based cache holding 256 entries with a 30-minute TTL. The cache key is the normalized journal text, so identical or near-identical entries resolve instantly.
2. Call the Hugging Face Inference API (cardiffnlp/twitter-roberta-base-sentiment-latest), retrying 429 and 5xx responses with exponential backoff.
3. Fall back to TextBlob for offline sentiment analysis.

Trigger analysis uses keyword-based classification (Work, Fatigue, Social, and Health categories) applied after sentiment scoring to correlate mood patterns with life domains. The dual deployment strategy (a standard Uvicorn server plus AWS Lambda via Mangum) required careful handling of cold starts: the MongoDB connection pool is initialized during the FastAPI lifespan event, so it is created once per cold start and reused across warm invocations.
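The cache tier can be sketched as a small OrderedDict-based LRU with TTL expiry; class and method names here are illustrative, not taken from the actual codebase:

```python
import time
from collections import OrderedDict

class SentimentCache:
    """Sketch of the cache tier, assuming the described design: an
    OrderedDict-based LRU capped at 256 entries with a 30-minute TTL."""

    def __init__(self, max_size: int = 256, ttl_seconds: int = 30 * 60):
        self._store: "OrderedDict[str, tuple[float, dict]]" = OrderedDict()
        self.max_size = max_size
        self.ttl = ttl_seconds

    @staticmethod
    def normalize(text: str) -> str:
        # The cache key is the normalized journal text, so entries that
        # differ only in case or whitespace resolve to the same slot.
        return " ".join(text.lower().split())

    def get(self, text: str):
        key = self.normalize(text)
        item = self._store.get(key)
        if item is None:
            return None
        stored_at, value = item
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: treat as a miss
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return value

    def put(self, text: str, value: dict) -> None:
        key = self.normalize(text)
        self._store[key] = (time.monotonic(), value)
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used
```

Normalizing before hashing is what lets "slightly different" entries (extra whitespace, different casing) share a cache slot.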
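The API-with-retry and offline-fallback tiers might look like the following sketch. The callables are injected so the example stays self-contained; `call_with_retry`, `api_call`, and `fallback` are illustrative names, with TextBlob standing behind `fallback` in the real pipeline:

```python
import random
import time

def call_with_retry(call, max_retries: int = 3, base_delay: float = 1.0):
    """Retry `call` on 429 and 5xx responses with exponential backoff
    plus a little jitter. `call` returns a (status_code, payload) pair;
    that signature is an assumption for this sketch."""
    for attempt in range(max_retries):
        status, payload = call()
        if status == 200:
            return payload
        if status == 429 or 500 <= status < 600:
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
            continue
        raise RuntimeError(f"non-retryable status {status}")
    return None  # retries exhausted; caller falls through to the fallback

def analyze_sentiment(text, cache_get, cache_put, api_call, fallback,
                      max_retries: int = 3, base_delay: float = 1.0):
    """Three-tier pipeline: cache, then API with retries, then fallback."""
    cached = cache_get(text)
    if cached is not None:
        return cached  # tier 1: cache hit
    result = call_with_retry(lambda: api_call(text), max_retries, base_delay)
    if result is None:
        result = fallback(text)  # tier 3: e.g. TextBlob polarity
    cache_put(text, result)
    return result
```

Because the fallback never raises, a caller always gets a sentiment result even when the API is down for the whole retry window.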
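The post-sentiment trigger classification can be sketched as simple keyword matching. The keyword lists below are illustrative placeholders, since the real lexicons are not shown in this write-up:

```python
# Illustrative keyword lists; the actual lexicons are an assumption.
TRIGGER_KEYWORDS = {
    "Work": {"deadline", "meeting", "boss", "overtime"},
    "Fatigue": {"tired", "exhausted", "drained", "sleepless"},
    "Social": {"friend", "party", "lonely", "argument"},
    "Health": {"headache", "sick", "doctor", "workout"},
}

def classify_triggers(text: str) -> list[str]:
    """Return every life-domain category whose keywords appear in the
    entry; run after sentiment scoring to correlate mood with domains."""
    words = set(text.lower().split())
    return [category for category, keywords in TRIGGER_KEYWORDS.items()
            if words & keywords]
```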
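The cold-start handling follows FastAPI's lifespan pattern: open the MongoDB pool once when the process starts, so warm Lambda invocations reuse it. This sketch substitutes a stub client to stay self-contained; in the real app this would be an async MongoDB client:

```python
import asyncio
from contextlib import asynccontextmanager

class StubMongoClient:
    """Stand-in for the real MongoDB client; illustrative only."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

state: dict = {}

@asynccontextmanager
async def lifespan(app):
    # Startup: runs once per process. On Lambda (via Mangum) that means
    # once per cold start; warm invocations reuse the same pool.
    state["mongo"] = StubMongoClient()
    yield
    # Shutdown: close the pool when the server process exits.
    state["mongo"].close()

# With FastAPI this would be wired up as: app = FastAPI(lifespan=lifespan)
```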
A cache hit rate of about 40% for regular journalers significantly reduces API calls. The fallback chain means sentiment analysis never goes down: users never see an error, only, at worst, slightly lower-quality scores. TextBlob quality is acceptable as a fallback (roughly 70% accuracy versus roughly 85% for the RoBERTa model). The day-key migration system preserved backward compatibility as the entry schema evolved, preventing data loss for existing users.
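A day-key migration of this kind might look like the following hypothetical sketch; the actual field names and old schema are not shown in this write-up, so both are assumptions here:

```python
def migrate_entry(doc: dict) -> dict:
    """Hypothetical migration: suppose older entries stored a full
    timestamp under `date` while newer ones are keyed by a `day_key`
    (YYYY-MM-DD). Idempotent, so it is safe to run on every read."""
    if "day_key" in doc:
        return doc  # already on the new schema; nothing to do
    migrated = dict(doc)  # never mutate the stored document in place
    migrated["day_key"] = migrated.pop("date")[:10]  # keep the date part
    return migrated
```

Running the migration lazily on read, rather than as a one-shot batch job, is what keeps old entries loadable without any downtime.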