Discover why token usage spikes during Lovable debugging and learn expert tips to cut waste with best practices for optimized sessions.

Token usage spikes during Lovable debugging sessions because every interactive edit, preview, regenerate, or pasted log typically resends the full conversation context, file contents, and long system instructions to the model, and repeated trial-and-error multiplies those calls. Large inputs (stack traces, whole files), extra model calls triggered by previews and autocomplete, and high-capacity model choices all raise the tokens per request, so small iterative debugging steps can balloon total usage surprisingly fast.
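To see why iteration compounds, here is a back-of-envelope sketch. The chars/4 estimator and the 4,000-characters-per-turn figure are illustrative assumptions, not measurements; the point is the growth curve when each turn resends the whole history.

```typescript
// Rough illustration: each debugging turn resends the entire conversation so far,
// so cumulative input tokens grow roughly quadratically with the number of turns.
function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4); // crude chars-per-token approximation
}

function cumulativeInputTokens(turns: number, charsPerTurn: number): number {
  let historyChars = 0;
  let total = 0;
  for (let t = 0; t < turns; t++) {
    historyChars += charsPerTurn;          // the new message joins the history...
    total += estimateTokens(historyChars); // ...and the whole history is resent
  }
  return total;
}

console.log(cumulativeInputTokens(5, 4000));  // 5 turns of ~1k-token messages → 15000
console.log(cumulativeInputTokens(20, 4000)); // 4x the turns → ~14x the input tokens (210000)
```

Four times as many debugging turns costs roughly fourteen times the input tokens, which is why trimming context matters more than any single-call optimization.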
// Prompt for Lovable to implement non-breaking token-usage instrumentation.
// Goal: collect per-LLM-call token counts so we can see what operations spike usage.
// Do not change feature behavior; add a debug-only wrapper and a read-only debug endpoint.
// 1) Create src/lib/llm-wrapper.ts
// - Export async function callLLM(opts) that accepts the same params as current OpenAI call site.
// - Internally call the existing OpenAI client used in this repo.
// - Capture request/response token counts from the OpenAI response (use response.usage if available).
// - Append a small record to a file logs/token-usage.log (JSON lines) including:
// { timestamp, routeOrFeature, model, promptTokens, completionTokens, totalTokens, note }
// - Keep behavior identical otherwise.
// 2) Find every file that directly calls the OpenAI client (search for "openai", "OpenAI", "chat.completions", or the repo's existing client import).
// - Replace those call sites to import and use callLLM(...) from src/lib/llm-wrapper.ts.
// - At each replacement add a short string for routeOrFeature (e.g., "api/chat", "preview/render", "code-action") so entries are identifiable.
// 3) Create a debug API endpoint at pages/api/_debug/token-usage.ts (or src/pages/api/_debug/token-usage.ts depending on project):
// - Returns the last 100 JSON lines from logs/token-usage.log parsed as JSON.
// - This endpoint must be gated by an environment variable DEBUG_TOKEN_LOG=true; if not set, return 403.
// 4) Add small README at debug/TOKEN_DEBUG_README.md explaining how to enable the debug endpoint via Lovable Secrets UI:
// - Ask the developer to set DEBUG_TOKEN_LOG=true in Lovable Secrets to enable reading the endpoint.
// 5) Do not change production behavior when DEBUG_TOKEN_LOG is not set.
// - Ensure log writes are best-effort and failures do not break requests (try/catch around logging).
// After making changes, run the repo's tests or open a Preview (behavior should be unchanged) so we can see the first token log entries.
This prompt helps an AI assistant understand your setup and guide you through the fix step by step, without assuming technical knowledge.
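As a concrete sketch of the record from step 1: the field names below come from the prompt above, and the `usage` shape mirrors OpenAI's `response.usage` (`prompt_tokens`, `completion_tokens`, `total_tokens`); verify the exact shape against your SDK version.

```typescript
// Build one JSON-lines record for logs/token-usage.log.
// `usage` mirrors the OpenAI response.usage shape; fall back to zeros if absent.
type Usage = { prompt_tokens?: number; completion_tokens?: number; total_tokens?: number };

function buildTokenLogLine(
  routeOrFeature: string,
  model: string,
  usage: Usage | undefined,
  note = ''
): string {
  const promptTokens = usage?.prompt_tokens ?? 0;
  const completionTokens = usage?.completion_tokens ?? 0;
  return JSON.stringify({
    timestamp: new Date().toISOString(),
    routeOrFeature,
    model,
    promptTokens,
    completionTokens,
    totalTokens: usage?.total_tokens ?? promptTokens + completionTokens,
    note,
  });
}

const line = buildTokenLogLine('api/chat', 'gpt-4o-mini', { prompt_tokens: 812, completion_tokens: 64 });
console.log(line); // one JSON line, ready to append to logs/token-usage.log
```

Appending the line should be wrapped in try/catch, per step 5, so a failed write never breaks the request.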
Trim the conversation context with a small context-trimmer in your server code, add a dev toggle that silences verbose debug payloads, use a lightweight “mini-repro” route with small mocked data, and store the debug flag in Lovable Secrets so you can switch it off without re-editing code. Together, these changes stop your app from sending huge histories, large payloads, or entire files to the model while debugging.
// Prompt 1: Add trimContext and use it in the chat handler
// Paste this into Lovable chat (Chat Mode) and ask it to apply as a patch/diff
Please update my project to trim the LLM conversation context before calling the model.
Create file: src/lib/trimContext.ts
Content:
// Trim oldest non-system messages until the approximate character budget fits
export function trimContext(messages: Array<{ role: string; content: string }>, charBudget = 12000) {
  const total = messages.reduce((s, m) => s + m.content.length, 0);
  if (total <= charBudget) return messages;
  // Keep all system messages, then add the most recent messages (newest first)
  // until the budget is spent or we hold 12 messages (~6 user/assistant exchanges).
  const system = messages.filter(m => m.role === 'system');
  const rest = messages.filter(m => m.role !== 'system');
  let used = system.reduce((s, m) => s + m.content.length, 0);
  const keep: Array<{ role: string; content: string }> = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    // Always keep the newest message, even if it alone exceeds the budget.
    if (keep.length > 0 && (keep.length >= 12 || used + rest[i].content.length > charBudget)) break;
    keep.unshift(rest[i]);
    used += rest[i].content.length;
  }
  return [...system, ...keep];
}
Then edit the file where you build the messages sent to the LLM. Search for the place that assembles "messages" or "history" (common files: src/server/llm.ts, src/server/chatHandler.ts, src/pages/api/chat.ts). Replace direct usage of full history with:
import { trimContext } from '../lib/trimContext';
const safeMessages = trimContext(messages, 12000); // adjust budget if needed
// then call LLM with safeMessages instead of messages
If file paths differ, apply the same change where the model call is made.
// Prompt 2: Add Dev Settings toggle in the UI to disable verbose inputs
// Paste this into Lovable chat and ask to create files and integrate into App
Create file: src/components/DevSettings.tsx
Content:
// A small panel to toggle sending full state to the LLM during development
import React from 'react';
export default function DevSettings() {
  const [sendVerbose, setSendVerbose] = React.useState(
    () => localStorage.getItem('sendVerbose') !== 'false'
  );
  React.useEffect(() => {
    localStorage.setItem('sendVerbose', String(sendVerbose));
  }, [sendVerbose]);
  return (
    <div style={{ position: 'fixed', right: 10, top: 10, background: '#fff', padding: 8, border: '1px solid #ddd', zIndex: 999 }}>
      <label>
        <input type="checkbox" checked={sendVerbose} onChange={e => setSendVerbose(e.target.checked)} />
        <span style={{ marginLeft: 6 }}>Send verbose LLM input</span>
      </label>
    </div>
  );
}
Edit src/App.tsx (or main layout) to import and render <DevSettings /> at the top level.
Then update server-side code that builds LLM messages to respect the toggle:
- When calls come from the browser include a header 'x-send-verbose' with value 'true' or 'false' from localStorage.
- In the handler, if header === 'false', strip large state pieces (full file content, long logs) before building messages.
If adding a header is hard, alternatively read process.env.LLM_DEBUG and use Secrets UI (next prompt).
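One way the handler-side gating could look. The `x-send-verbose` header name comes from the steps above; the 4,000-character cap is an invented example threshold you would tune:

```typescript
type Msg = { role: string; content: string };

// When the client sends 'x-send-verbose: false', truncate any oversized message
// bodies (pasted logs, whole files) before they reach the model.
function stripVerbose(messages: Msg[], sendVerbose: boolean, maxChars = 4000): Msg[] {
  if (sendVerbose) return messages;
  return messages.map(m =>
    m.content.length > maxChars
      ? { ...m, content: m.content.slice(0, maxChars) + '\n…[truncated for debug mode]' }
      : m
  );
}

// In the handler:
// const sendVerbose = req.headers['x-send-verbose'] !== 'false';
// const safe = stripVerbose(messages, sendVerbose);
```

Truncating rather than dropping keeps the message sequence intact, so the model still sees that a log existed without paying for all of it.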
// Prompt 3: Add a mini-repro route that uses mocked small data
// Paste into Lovable chat to create a tiny reproduction endpoint
Create file: src/pages/mini-repro.tsx (or src/routes/mini-repro.jsx depending on your stack)
Content:
// A tiny page to reproduce issues with small payloads
import React from 'react';
export default function MiniRepro() {
  async function run() {
    const res = await fetch('/api/mini-repro-run');
    const json = await res.json();
    alert('Result: ' + JSON.stringify(json).slice(0, 200));
  }
  return (
    <div>
      <h3>Mini Repro</h3>
      <button onClick={run}>Run small repro</button>
    </div>
  );
}
Create API: src/pages/api/mini-repro-run.ts
Content:
// Return a small mocked conversation and call the same handler path but with tiny data
export default async function handler(req, res) {
  const smallMessages = [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Small repro: summarize "hello world" in one sentence.' }
  ];
  // Call your internal LLM wrapper here with smallMessages,
  // or stub it out for offline testing.
  res.json({ ok: true, smallMessages });
}
// Prompt 4: Set a secret in Lovable (Secrets UI) and read it in server code
// Instruction to perform in Lovable UI: go to Secrets (Lovable Cloud) and add key LLM_DEBUG value false
// Then update server code where you check debug flags:
// In server (e.g., src/server/config.ts)
export const LLM_DEBUG = process.env.LLM_DEBUG === 'true';
// In the chat handler, gate verbose behavior:
if (!LLM_DEBUG) {
  // ensure we do not include full logs or full file contents in messages
}
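Reading flags through one small helper keeps the Secrets convention ('true' means on, anything else means off) in a single place. This is a minimal sketch; `summarizeForDebug` in the usage comment is a hypothetical helper you would implement:

```typescript
// Centralize flag reads so one change in the Secrets UI flips behavior everywhere.
// Only the exact string 'true' enables a flag; unset or any other value means "off".
function readFlag(name: string, env: Record<string, string | undefined> = process.env): boolean {
  return env[name] === 'true';
}

// Example gate in the chat handler (summarizeForDebug is hypothetical):
// const messages = readFlag('LLM_DEBUG') ? fullMessages : summarizeForDebug(fullMessages);

console.log(readFlag('LLM_DEBUG', { LLM_DEBUG: 'true' })); // true
console.log(readFlag('LLM_DEBUG', {}));                    // false
```

The strict string comparison avoids the classic trap where any non-empty value (including 'false') is truthy.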
Direct answer: use a small, explicit debug mode toggled from Lovable’s Secrets UI, route all calls through a single token-guard wrapper, use a cheap model or a local mock for iterative debugging, sample logs instead of logging every request, and trim or summarize conversation context automatically. Implement these as Lovable-native changes: create a small wrapper file and a mock client, update your API route(s), set debug flags in Lovable Secrets, and use Preview to test. This keeps token usage predictable and controllable without needing a terminal.
// Please create a new file at src/lib/aiClient.ts with the following content.
// This wrapper reads runtime flags and provides a single export `callAI`
// so all server-side code uses it. It uses a simple token estimator (chars/4),
// supports a DEBUG_MODE and DEBUG_DRYRUN flag read from process.env (set via Lovable Secrets UI),
// enforces max tokens, optionally switches to a cheap model, and does sampled logging.
export type AIOptions = {
  model?: string
  maxTokens?: number
  dryRun?: boolean
  sampleLogRate?: number // 0..1
}

function estimateTokens(text: string) {
  // very cheap estimator: average 4 chars per token
  return Math.ceil(text.length / 4)
}

async function realCallToAPI(payload: any) {
  // Replace this with your actual provider call (OpenAI, Anthropic, etc.).
  // Keep calls centralized here so model/maxTokens can change in one place.
  throw new Error('Implement provider call inside realCallToAPI when integrating provider credentials.')
}

async function maybeSampleLog(req: any, usage: any, sampleRate = 0.02) {
  // Simple random sampling to avoid logging every request
  if (Math.random() < sampleRate) {
    // Lovable: use your existing logger; this is just a placeholder.
    console.log('[AI SAMPLE LOG]', { req, usage })
  }
}

export async function callAI(messages: Array<{ role: string; content: string }>, opts: AIOptions = {}) {
  const DEBUG_MODE = process.env.DEBUG_MODE === 'true'
  const DEBUG_DRYRUN = opts.dryRun === true || process.env.DEBUG_DRYRUN === 'true'
  const DEBUG_MAX_TOKENS = Number(process.env.DEBUG_MAX_TOKENS || opts.maxTokens || 512)
  const SAMPLE_LOG_RATE = Number(process.env.DEBUG_SAMPLE_LOG_RATE || opts.sampleLogRate || 0.02)
  const fallbackModel = process.env.DEBUG_MODEL || opts.model || 'gpt-3.5-turbo'

  // Trim messages to stay under the token budget (cheap estimator).
  // Note: this mutates the caller's array; pass a copy if that matters.
  let tokenCount = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  while (tokenCount > DEBUG_MAX_TOKENS && messages.length > 1) {
    // drop the oldest non-system message (index 1) to shrink context
    messages.splice(1, 1)
    tokenCount = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  }

  const payload = {
    model: DEBUG_MODE ? fallbackModel : opts.model || fallbackModel,
    messages,
    max_tokens: DEBUG_MAX_TOKENS
  }

  if (DEBUG_DRYRUN || (DEBUG_MODE && process.env.USE_LOCAL_MOCK === 'true')) {
    // In dry-run or local mock mode, return a small canned response to avoid token spend
    const mock = { content: 'DEBUG MOCK: short response to conserve tokens.' }
    await maybeSampleLog(payload, { tokens: estimateTokens(mock.content) }, SAMPLE_LOG_RATE)
    return {
      text: mock.content,
      usage: {
        prompt_tokens: estimateTokens(messages.map(m => m.content).join(' ')),
        completion_tokens: estimateTokens(mock.content)
      }
    }
  }

  // Real call (centralized). Integrate your provider here.
  const result = await realCallToAPI(payload)
  await maybeSampleLog(payload, result.usage, SAMPLE_LOG_RATE)
  return { text: result.text, usage: result.usage }
}
// Create src/mocks/aiMock.ts
// Used by the wrapper when USE_LOCAL_MOCK is true. Keep responses short.
// This file is safe to create inside Lovable and doesn't require external deps.
export async function mockAI(messages: Array<{ role: string; content: string }>) {
  // Return a concise, deterministic debug response
  const text = 'MOCK RESPONSE: reply truncated for debugging.'
  const promptChars = messages.map(m => m.content).join(' ').length
  return {
    text,
    usage: { prompt_tokens: Math.ceil(promptChars / 4), completion_tokens: Math.ceil(text.length / 4) }
  }
}
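A self-contained usage sketch of the mock (re-declaring `mockAI` inline so it runs standalone; the estimate is based on message content lengths, and the sample message is invented):

```typescript
type Msg = { role: string; content: string };

// Deterministic mock: fixed reply text, usage estimated at ~4 chars per token.
async function mockAI(messages: Msg[]) {
  const text = 'MOCK RESPONSE: reply truncated for debugging.';
  const promptChars = messages.map(m => m.content).join(' ').length;
  return {
    text,
    usage: { prompt_tokens: Math.ceil(promptChars / 4), completion_tokens: Math.ceil(text.length / 4) }
  };
}

async function main() {
  const msgs: Msg[] = [{ role: 'user', content: 'ping' }]; // 4 chars → ~1 prompt token
  const r = await mockAI(msgs);
  console.log(r.usage); // { prompt_tokens: 1, completion_tokens: 12 }
}
main();
```

Because the response is deterministic and token counts are computed rather than billed, you can iterate on handler logic as many times as you like at zero cost.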
// Update or create server API route at src/pages/api/chat.ts (or your existing route file).
// Replace direct provider calls with an import from src/lib/aiClient.ts and call callAI(...).
import { callAI } from '../../lib/aiClient'

export default async function handler(req, res) {
  // Only accept POST requests
  if (req.method !== 'POST') return res.status(405).end()
  const { messages } = req.body
  // Choose defaults here; the wrapper enforces any debug overrides.
  const result = await callAI(messages, { model: 'gpt-4', maxTokens: 800 })
  res.json(result)
}