CacheFlow AI - AI API Cost Optimizer
Cut AI API costs by 60-85%. Smart caching, free API routing, local models. Works with OpenAI, Claude, Gemini, Cursor. One-time buy.
Overview
CacheFlow AI is a local proxy that sits between your app and AI APIs (OpenAI, Claude, Gemini), automatically reducing your costs by 60-85% without changing your code or sacrificing quality.
How it works:
1. Smart Caching — Same question = instant answer at $0. SQLite-based, zero dependencies.
2. Free API Routing — Simple tasks auto-route to Groq, Cerebras, OpenRouter (70B+ models at $0).
3. Local Model Support — Ollama integration with auto hardware detection (NVIDIA, AMD, Apple Silicon).
4. Prompt Compression — 10-30% token reduction on every request.
5. Real-Time Dashboard — Beautiful dark-themed UI with live savings counter and request logs.
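The compression step (item 4) is easy to picture. A hedged sketch in plain Node.js: CacheFlow's actual compressor is not shown here, and this illustrative version only normalizes whitespace, but it demonstrates how tokens can be shaved off a prompt before it ever reaches the provider.

```javascript
// Illustrative prompt compression: strip redundant whitespace so fewer
// tokens are billed. This is a sketch of the idea, not CacheFlow's code.
function compressPrompt(text) {
  return text
    .split("\n")
    .map((line) => line.trim())          // trim each line
    .filter((line) => line.length > 0)   // drop blank lines
    .join("\n")
    .replace(/[ \t]+/g, " ");            // collapse runs of spaces/tabs
}

const verbose = "  Summarize   this:\n\n\n   The   quick   brown fox.  ";
console.log(compressPrompt(verbose)); // "Summarize this:\nThe quick brown fox."
```

Whitespace is a cheap win because it carries no meaning for the model; real compressors go further (deduplicating context, shortening boilerplate), which is where the 10-30% range comes from.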
Setup: npm install → npx cacheflow init → npx cacheflow start. Then change one line: baseURL: "http://127.0.0.1:4747/v1"
Includes 30 source files, 8 AI provider integrations, real-time WebSocket dashboard, CLI with init wizard, auto hardware detection, SQLite-based caching + analytics. Node.js 18+. MIT License.
Works with OpenAI SDK, Anthropic SDK, Cursor, LangChain, and any OpenAI-compatible tool.
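Because the proxy speaks the OpenAI wire format, no SDK is strictly required. A minimal sketch using Node 18's built-in fetch; the model name "gpt-4o-mini" is just an example, and the request-building helper is split out purely for illustration:

```javascript
// Talk to the CacheFlow proxy directly over HTTP. Only the base URL differs
// from calling OpenAI itself; everything else is the standard wire format.
const BASE_URL = "http://127.0.0.1:4747/v1";

// Pure helper: assemble the URL and fetch options for a chat request.
function buildChatRequest(model, messages, apiKey) {
  return {
    url: `${BASE_URL}/chat/completions`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

async function chat(model, messages) {
  const { url, options } = buildChatRequest(
    model,
    messages,
    process.env.OPENAI_API_KEY ?? "",
  );
  const res = await fetch(url, options); // global fetch, Node 18+
  return res.json();
}
```

With the proxy running, `chat("gpt-4o-mini", [{ role: "user", content: "hi" }])` goes through the cache and routing layers transparently.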
Features
- Smart caching (SQLite-based exact match) — duplicate requests served instantly at $0
- Free API routing to Groq, Cerebras, OpenRouter, Gemini Free — 70B+ models at $0
- Local model support via Ollama — auto-detects NVIDIA GPU, AMD GPU, Apple Silicon
- Prompt compression — 10-30% token reduction per request
- Real-time dashboard with live savings counter, request timeline, provider breakdown
- OpenAI-compatible API — /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models
- Anthropic Messages API compatibility (/v1/messages)
- 8 provider integrations: OpenAI, Anthropic, Gemini, Groq, Cerebras, Ollama, OpenRouter, Gemini Free
- CLI: init wizard (auto-detects hardware + API keys), start, stop, status, stats, demo
- Cost estimation with per-model pricing for 15+ models (GPT-4o, Claude Sonnet, Gemini Pro, etc.)
- Streaming support (SSE) with analytics tracking
- Request analytics with detailed stats API
- YAML configuration with sensible defaults
- Node.js 18+ with ES Modules
- Full test suite included
- .env.example with all configuration options
- MIT License — use however you want
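The exact-match caching above can be pictured as hashing the request into a lookup key. A hedged sketch, assuming the key covers the fields that determine the answer; the field names and hashing scheme here are illustrative, not CacheFlow's actual schema:

```javascript
import { createHash } from "node:crypto";

// Derive a deterministic cache key from the parts of a request that
// determine its answer. Identical requests produce identical keys, so the
// second occurrence can be served from SQLite at $0.
function cacheKey({ model, messages, temperature = 0 }) {
  const canonical = JSON.stringify({ model, messages, temperature });
  return createHash("sha256").update(canonical).digest("hex");
}

const a = cacheKey({ model: "gpt-4o", messages: [{ role: "user", content: "hi" }] });
const b = cacheKey({ model: "gpt-4o", messages: [{ role: "user", content: "hi" }] });
// a === b, so the second request is a cache hit
```

Exact matching is deliberately conservative: it never returns a stale or merely similar answer, which is why it costs nothing in quality.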
Requirements
- Node.js 18 or higher
- npm or yarn
- No external database needed (SQLite built-in)
- Optional: Ollama for local models
- Optional: Free API keys (Groq, Cerebras — get at groq.com)
- Optional: Paid API keys (OpenAI, Anthropic, Google)
Instructions
1. Extract the ZIP file
2. cd Source_Code
3. npm install (or use included node_modules)
4. npx cacheflow init (auto-detects your hardware and API keys)
5. npx cacheflow start (proxy starts on localhost:4747, dashboard on :4748)
6. Change your app's base URL to http://127.0.0.1:4747/v1
7. Open http://localhost:4748 for the real-time dashboard
8. Run: npx cacheflow demo (sends test requests to verify everything works)
9. Run: npx cacheflow status (shows live stats and savings)
For OpenAI SDK: new OpenAI({ baseURL: "http://127.0.0.1:4747/v1" })
For Anthropic SDK: new Anthropic({ baseURL: "http://127.0.0.1:4747/v1" })
For Cursor: Settings → API → Custom API URL → http://127.0.0.1:4747/v1
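The savings that `npx cacheflow status` reports rest on per-model cost estimation. A hedged sketch of how such an estimate works; the prices below are illustrative placeholders, not CacheFlow's actual pricing table:

```javascript
// Illustrative per-1M-token prices in USD (placeholders, not real rates).
const PRICES_PER_1M_TOKENS = {
  "gpt-4o":      { input: 2.5,  output: 10.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

// Estimate the dollar cost of one request from its token counts.
function estimateCost(model, inputTokens, outputTokens) {
  const p = PRICES_PER_1M_TOKENS[model];
  if (!p) return null; // unknown model: no estimate
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// 1,000 input + 500 output tokens on the cheaper model:
console.log(estimateCost("gpt-4o-mini", 1000, 500)); // ≈ $0.00045
```

Summing the avoided estimates for every cache hit and every request routed to a free provider yields the live savings counter on the dashboard.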
| Category | Scripts & Code / NodeJS |
| First release | 30 March 2026 |
| Last update | 30 March 2026 |
| Files included | .css, .html, JavaScript (.js) |
| Tags | cursor, NodeJS, developer tools, openai, llm, gemini, claude, groq, ollama, ai api proxy, smart caching, cost optimizer, token saver, api gateway, langchain |
