
Reducing Token Usage While Debugging in Lovable

Discover why token usage spikes during Lovable debugging and learn expert tips to cut waste with best practices for optimized sessions.

Matt Graham, CEO of Rapid Developers

Book a call with an Expert

Starting a new venture? Need to upgrade your web app? RapidDev builds applications with your growth in mind.

Book a free No-Code consultation

Why Token Usage Spikes During Lovable Debugging Sessions

Token usage spikes during Lovable debugging sessions because every interactive edit, Preview run, regenerate, or pasted log typically resends the conversation context, file contents, and long system instructions to the model, and repeated trial-and-error multiplies those calls. Large inputs (stack traces, whole files), extra model calls from previews and autocomplete, and high-capacity model choices all increase tokens per request, so even small iterative debugging steps can rapidly balloon total token usage.

 

Why this happens (detailed)

 

  • Full conversation + context every turn: Lovable’s chat-driven edits include prior messages and the current file context. That growing context is sent to the model each time you ask for a new change.
  • Repeated trial-and-error: Hitting “Regenerate”, applying many small edits, or running Preview repeatedly creates many separate LLM calls that each cost tokens.
  • Large pasted artifacts: Debug sessions often paste stack traces, logs, or whole files into the chat. Those get tokenized and counted on top of the conversation context.
  • Preview / background LLM calls: Preview, code suggestions, or any automated assistant checks can trigger extra model calls you might not notice, adding up fast.
  • Verbose system or role prompts: Long fixed instructions or templates (for tests, linters, reviewers) add the same overhead to every call.
  • Model choice and response size: Higher-capacity models and long model responses both increase token spend per call.
  • Conversation growth multiplies cost: As the chat grows, each subsequent call includes more tokens of history, so cost per call increases over time.
  • Multiple features touch the model: Semantic code search, embeddings, or summarization during debugging spawn extra calls (and embedding costs are separate).
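The compounding effect described above can be seen with a quick back-of-envelope calculation. This sketch uses the common rough estimate of ~4 characters per token; the per-turn size of 2,000 characters is purely illustrative:

```typescript
// Rough illustration: each turn resends all prior history, so total
// tokens across a session grow quadratically, not linearly.
function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4); // common rough estimate: ~4 chars/token
}

// Assume each debugging turn adds ~2000 characters (question + answer).
function sessionTokens(turns: number, charsPerTurn = 2000): number {
  let total = 0;
  let history = 0;
  for (let t = 0; t < turns; t++) {
    history += charsPerTurn;          // history grows every turn
    total += estimateTokens(history); // each call resends the whole history
  }
  return total;
}

// 4x the turns costs far more than 4x the tokens:
console.log(sessionTokens(5));  // 7500 tokens
console.log(sessionTokens(20)); // 105000 tokens
```

Twenty turns cost 14x what five turns cost, which is why trimming history (below) matters more the longer a debugging session runs.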

 

Lovable prompt to audit token usage (paste into Lovable chat)

 

// Prompt for Lovable to implement non-breaking token-usage instrumentation.
// Goal: collect per-LLM-call token counts so we can see what operations spike usage.
// Do not change feature behavior; add a debug-only wrapper and a read-only debug endpoint.

// 1) Create src/lib/llm-wrapper.ts
//    - Export async function callLLM(opts) that accepts the same params as current OpenAI call site.
//    - Internally call the existing OpenAI client used in this repo.
//    - Capture request/response token counts from the OpenAI response (use response.usage if available).
//    - Append a small record to a file logs/token-usage.log (JSON lines) including:
//         { timestamp, routeOrFeature, model, promptTokens, completionTokens, totalTokens, note }
//    - Keep behavior identical otherwise.

// 2) Find every file that directly calls the OpenAI client (search for "openai", "OpenAI", "chat.completions", or the repo's existing client import).
//    - Replace those call sites to import and use callLLM(...) from src/lib/llm-wrapper.ts.
//    - At each replacement add a short string for routeOrFeature (e.g., "api/chat", "preview/render", "code-action") so entries are identifiable.

// 3) Create a debug API endpoint at pages/api/_debug/token-usage.ts (or src/pages/api/_debug/token-usage.ts depending on project):
//    - Returns the last 100 JSON lines from logs/token-usage.log parsed as JSON.
//    - This endpoint must be gated by an environment variable DEBUG_TOKEN_LOG=true; if not set, return 403.

// 4) Add small README at debug/TOKEN_DEBUG_README.md explaining how to enable the debug endpoint via Lovable Secrets UI:
//    - Ask the developer to set DEBUG_TOKEN_LOG=true in Lovable Secrets to enable reading the endpoint.

// 5) Do not change production behavior when DEBUG_TOKEN_LOG is not set.
//    - Ensure log writes are best-effort and failures do not break requests (try/catch around logging).

// After making changes, run the repo's test or preview (no code changes in behavior) so we can see initial token log entries.
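For reference, the gated endpoint from step 3 might come out looking roughly like this. This is a sketch assuming a Next.js-style `pages/api` handler; the exact file path, response shape, and logger depend on your project:

```typescript
import { promises as fs } from 'fs';
import path from 'path';

// Pure helper so the parsing is testable: parse the last n JSON lines.
export function lastEntries(raw: string, n = 100): unknown[] {
  return raw.trim().split('\n').filter(Boolean).slice(-n).map(l => JSON.parse(l));
}

// Gated read-only endpoint: 403 unless DEBUG_TOKEN_LOG=true is set in Secrets.
export default async function handler(req: any, res: any) {
  if (process.env.DEBUG_TOKEN_LOG !== 'true') {
    return res.status(403).json({ error: 'debug endpoint disabled' });
  }
  try {
    const logPath = path.join(process.cwd(), 'logs', 'token-usage.log');
    res.json(lastEntries(await fs.readFile(logPath, 'utf8')));
  } catch {
    res.json([]); // log file missing yet: return an empty list, not an error
  }
}
```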

 

Quick tips about the audit prompt

 

  • Use the debug endpoint only for investigation — gate it via Lovable Secrets.
  • This prompt edits files via Lovable Chat Mode. If you need terminal operations (e.g., to rotate logs off the instance), export to GitHub and continue locally.


How to Reduce Token Waste When Debugging in Lovable

Trim conversation context, add a small context-trimmer in your server code, add a dev toggle to stop verbose debug messages, use a lightweight “mini-repro” route with mocked small data, and put the debug flag in Lovable Secrets so you can switch it off without re-editing code. These changes stop your app from sending huge histories, large payloads, or full files to the model while debugging.

 

Quick Lovable prompts to paste (do these in Chat Mode)

 

  • Add a context-trimmer and wire it into your chat handler — reduces how many messages you send to the LLM by trimming older messages by token/character estimate.
  • Create a Dev Settings UI toggle — lets you turn off verbose LLM inputs in the browser while debugging.
  • Add a mini-repro route that uses mocked/small data — reproduce bugs with tiny payloads instead of full datasets.
  • Use Lovable Secrets — add LLM_DEBUG (false) and read it from server code so you can flip behavior without editing code.

 

// Prompt 1: Add trimContext and use it in the chat handler
// Paste this into Lovable chat (Chat Mode) and ask it to apply as a patch/diff
Please update my project to trim the LLM conversation context before calling the model.

Create file: src/lib/trimContext.ts
Content:
// Trim oldest messages until the approximate character budget fits
export function trimContext(messages: Array<{role:string, content:string}>, charBudget = 12000) {
  let total = messages.reduce((s, m) => s + m.content.length, 0);
  if (total <= charBudget) return messages;
  // always keep system messages, plus as many of the most recent
  // messages as fit (at most the last 12, i.e. 6 exchanges)
  const system = messages.filter(m => m.role === 'system');
  const rest = messages.filter(m => m.role !== 'system');
  const systemLen = system.reduce((s, m) => s + m.content.length, 0);
  const keep: Array<{role:string, content:string}> = [];
  let keepLen = 0;
  for (let i = rest.length - 1; i >= 0; i--) {
    if (keep.length >= 12) break;
    // stop before exceeding the budget, but always keep the newest message
    if (keep.length > 0 && systemLen + keepLen + rest[i].content.length > charBudget) break;
    keep.unshift(rest[i]);
    keepLen += rest[i].content.length;
  }
  return [...system, ...keep];
}

Then edit the file where you build the messages sent to the LLM. Search for the place that assembles "messages" or "history" (common files: src/server/llm.ts, src/server/chatHandler.ts, src/pages/api/chat.ts). Replace direct usage of full history with:
import { trimContext } from '../lib/trimContext';
const safeMessages = trimContext(messages, 12000); // adjust budget if needed
// then call LLM with safeMessages instead of messages

If file paths differ, apply the same change where the model call is made.

 

// Prompt 2: Add Dev Settings toggle in the UI to disable verbose inputs
// Paste this into Lovable chat and ask to create files and integrate into App

Create file: src/components/DevSettings.tsx
Content:
// A small panel to toggle sending full state to the LLM during development
import React from 'react';
export default function DevSettings() {
  const [sendVerbose, setSendVerbose] = React.useState(() => localStorage.getItem('sendVerbose') !== 'false');
  React.useEffect(()=>{ localStorage.setItem('sendVerbose', String(sendVerbose)); }, [sendVerbose]);
  return (
    <div style={{position:'fixed',right:10,top:10,background:'#fff',padding:8,border:'1px solid #ddd',zIndex:999}}>
      <label>
        <input type="checkbox" checked={sendVerbose} onChange={e=>setSendVerbose(e.target.checked)} />
        <span style={{marginLeft:6}}>Send verbose LLM input</span>
      </label>
    </div>
  );
}

Edit src/App.tsx (or main layout) to import and render <DevSettings /> at the top level.

Then update server-side code that builds LLM messages to respect the toggle:
- When calls come from the browser include a header 'x-send-verbose' with value 'true' or 'false' from localStorage.
- In the handler, if header === 'false', strip large state pieces (full file content, long logs) before building messages.

If adding a header is hard, alternatively read process.env.LLM_DEBUG and use Secrets UI (next prompt).
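As a sketch of the header approach, the client can forward the DevSettings toggle and the server can strip oversized content when it is off. The `x-send-verbose` header and the 2,000-character threshold come from the steps above; the `/api/chat` path and `sendChat` helper are assumptions to adapt to your routes:

```typescript
type Msg = { role: string; content: string };

// Client side: forward the DevSettings toggle as a request header.
// (globalThis cast avoids requiring DOM typings in server-side builds.)
async function sendChat(messages: Msg[]) {
  return fetch('/api/chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-send-verbose': (globalThis as any).localStorage?.getItem('sendVerbose') ?? 'true',
    },
    body: JSON.stringify({ messages }),
  });
}

// Server side: when the toggle is off, truncate anything that looks like a
// pasted log or full file (> 2000 chars) before building LLM messages.
export function stripVerbose(messages: Msg[], verboseHeader: string | undefined): Msg[] {
  if (verboseHeader !== 'false') return messages;
  return messages.map(m =>
    m.content.length > 2000
      ? { ...m, content: m.content.slice(0, 2000) + '\n[truncated for debugging]' }
      : m
  );
}
```

In the handler, call `stripVerbose(messages, req.headers['x-send-verbose'])` before assembling the final prompt.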

 

// Prompt 3: Add a mini-repro route that uses mocked small data
// Paste into Lovable chat to create a tiny reproduction endpoint

Create file: src/pages/mini-repro.tsx (or src/routes/mini-repro.jsx depending on your stack)
Content:
// A tiny page to reproduce issues with small payloads
import React from 'react';
export default function MiniRepro() {
  async function run() {
    const res = await fetch('/api/mini-repro-run');
    const json = await res.json();
    alert('Result: ' + JSON.stringify(json).slice(0,200));
  }
  return (<div><h3>Mini Repro</h3><button onClick={run}>Run small repro</button></div>);
}

Create API: src/pages/api/mini-repro-run.ts
Content:
// Return a small mocked conversation and call the same handler path but with tiny data
export default async function handler(req, res) {
  const smallMessages = [
    {role:'system', content:'You are a helpful assistant.'},
    {role:'user', content:'Small repro: summarize "hello world" in one sentence.'}
  ];
  // call internal LLM wrapper or stub it for offline testing
  // If you have a server function to call, call it here with smallMessages
  res.json({ok:true, smallMessages});
}

 

// Prompt 4: Set a secret in Lovable (Secrets UI) and read it in server code
// Instruction to perform in Lovable UI: go to Secrets (Lovable Cloud) and add key LLM_DEBUG value false
// Then update server code where you check debug flags:

// In server (e.g., src/server/config.ts)
export const LLM_DEBUG = process.env.LLM_DEBUG === 'true';

// In chat handler, gate verbose behavior:
if (LLM_DEBUG === false) {
  // ensure we do not include full logs or full file contents in messages
}


Best Practices for Managing Token Usage During Debugging in Lovable

Direct answer: Use a small, explicit debug mode that you toggle from Lovable’s Secrets UI, route calls through a single token-guard wrapper in your app, use a cheap model or a local mock for iterative debugging, sample logs instead of logging every request, and trim or summarize conversation context automatically. Implement these as Lovable-native changes: create a small wrapper file, a mock client, and update your API route(s); set debug flags in Lovable Secrets; and use Preview to test. This keeps token usage predictable and controllable without needing a terminal.

 

Practical Lovable prompts to implement token-usage management (paste each into Lovable chat)

 

  • Create a token-guard AI client wrapper (central place to control model, max tokens, sampling, and dry-run):
// Please create a new file at src/lib/aiClient.ts with the following content.
// This wrapper reads runtime flags and provides a single export `callAI`
// so all server-side code uses it. It uses a simple token estimator (chars/4),
// supports a DEBUG_MODE and DEBUG_DRYRUN flag read from process.env (set via Lovable Secrets UI),
// enforces max tokens, optionally switches to a cheap model, and does sampled logging.

export type AIOptions = {
  model?: string
  maxTokens?: number
  dryRun?: boolean
  sampleLogRate?: number // 0..1
}

function estimateTokens(text: string) {
  // very cheap estimator: average 4 chars per token
  return Math.ceil(text.length / 4)
}

async function realCallToAPI(payload: any) {
  // // Replace this comment with your actual provider call (OpenAI, Anthropic, etc.)
  // // Keep calls centralized here so we can change model/maxTokens in one place.
  throw new Error('Implement provider call inside realCallToAPI when integrating provider credentials.')
}

async function maybeSampleLog(req: any, usage: any, sampleRate = 0.02) {
  // // Simple random sampling to avoid logging every request
  if (Math.random() < sampleRate) {
    // // Lovable: use your existing logger; this is just a placeholder.
    console.log('[AI SAMPLE LOG]', { req, usage })
  }
}

export async function callAI(messages: Array<{ role: string; content: string }>, opts: AIOptions = {}) {
  const DEBUG_MODE = process.env.DEBUG_MODE === 'true'
  const DEBUG_DRYRUN = process.env.DEBUG_DRYRUN === 'true'
  const DEBUG_MAX_TOKENS = Number(process.env.DEBUG_MAX_TOKENS || opts.maxTokens || 512)
  const SAMPLE_LOG_RATE = Number(process.env.DEBUG_SAMPLE_LOG_RATE || opts.sampleLogRate || 0.02)
  const fallbackModel = process.env.DEBUG_MODEL || opts.model || 'gpt-3.5-turbo'

  // trim messages to stay under token budget (cheap estimator)
  let tokenCount = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  while (tokenCount > DEBUG_MAX_TOKENS && messages.length > 1) {
    // drop the oldest non-system message to shrink context
    messages.splice(1, 1)
    tokenCount = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  }

  const payload = {
    model: DEBUG_MODE ? fallbackModel : opts.model || fallbackModel,
    messages,
    max_tokens: DEBUG_MAX_TOKENS
  }

  if (DEBUG_DRYRUN || (DEBUG_MODE && process.env.USE_LOCAL_MOCK === 'true')) {
    // // If in dry-run or local mock mode, return a small canned response to avoid token spend
    const mock = { content: 'DEBUG MOCK: short response to conserve tokens.' }
    await maybeSampleLog(payload, { tokens: estimateTokens(mock.content) }, SAMPLE_LOG_RATE)
    return { text: mock.content, usage: { prompt_tokens: estimateTokens(messages.map(m=>m.content).join(' ')), completion_tokens: estimateTokens(mock.content) } }
  }

  // // Real call (centralized). Integrate your provider here.
  const result = await realCallToAPI(payload)
  await maybeSampleLog(payload, result.usage, SAMPLE_LOG_RATE)
  return { text: result.text, usage: result.usage }
}

 

  • Add a local mock client to use for fast iterative debugging without spending tokens:
// Create src/mocks/aiMock.ts
// Used by the wrapper when USE_LOCAL_MOCK is true. Keep responses short.
// This file is safe to create inside Lovable and doesn't require external deps.

export async function mockAI(messages: Array<{ role: string; content: string }>) {
  // // return a concise, deterministic debug response
  const text = 'MOCK RESPONSE: reply truncated for debugging.'
  return { text, usage: { prompt_tokens: Math.ceil(messages.map(m => m.content).join(' ').length / 4), completion_tokens: Math.ceil(text.length / 4) } }
}

 

  • Update your server route to use callAI (example file path; update your route if different):
// Update or create server API route at src/pages/api/chat.ts (or your existing route file).
// Replace direct provider calls with an import from src/lib/aiClient.ts and call callAI(...).

import { callAI } from '../../lib/aiClient'

export default async function handler(req, res) {
  // // Ensure this route only runs server-side
  if (req.method !== 'POST') return res.status(405).end()

  const { messages } = req.body
  const result = await callAI(messages, { model: 'gpt-4', maxTokens: 800 }) // choose defaults; wrapper will enforce debug overrides
  res.json(result)
}

 

Secrets & toggles you must set in Lovable (use Lovable Secrets UI)

 

  • Open Lovable → Settings → Secrets (or Secrets UI) and create these keys:
  • DEBUG_MODE = "true" or "false" — flip this while debugging to enable debug behavior
  • DEBUG_DRYRUN = "true" — returns mock short responses and avoids real API calls
  • DEBUG_MODEL = "gpt-3.5-turbo" — a cheaper model for debugging
  • DEBUG_MAX_TOKENS = "512" — global cap while debugging
  • DEBUG_SAMPLE_LOG_RATE = "0.02" — sample 2% of requests for full logs
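On the server side, these keys could be read once into a single config object so every feature sees the same flags (a minimal sketch; the key names match the list above, and the defaults apply when a secret is unset):

```typescript
// Centralized debug config, read once at startup from Lovable Secrets
// (exposed to server code as environment variables).
export const debugConfig = {
  mode: process.env.DEBUG_MODE === 'true',
  dryRun: process.env.DEBUG_DRYRUN === 'true',
  model: process.env.DEBUG_MODEL ?? 'gpt-3.5-turbo',
  maxTokens: Number(process.env.DEBUG_MAX_TOKENS ?? 512),
  sampleLogRate: Number(process.env.DEBUG_SAMPLE_LOG_RATE ?? 0.02),
};
```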

 

Short tips and testing workflow inside Lovable

 

  • Always use Preview to exercise the debug mode before Publish; flip DEBUG_DRYRUN to true for cheap iteration.
  • Centralize changes — ensure every server-side call goes through src/lib/aiClient.ts so toggles work universally.
  • Sample logs to avoid logging every message; use DEBUG_SAMPLE_LOG_RATE to inspect occasional requests only.
  • Trim conversation context automatically in the wrapper rather than relying on the UI to do it manually.
  • If you need terminal-only tooling (like tiktoken for exact token counts), export to GitHub from Lovable and run locally/GitHub Actions — label that as outside Lovable (terminal required).

Client trust and success are our top priorities

When it comes to serving you, we sweat the little things. That’s why our work makes a big impact.

Rapid Dev was an exceptional project management organization and the best development collaborators I've had the pleasure of working with. They do complex work on extremely fast timelines and effectively manage the testing and pre-launch process to deliver the best possible product. I'm extremely impressed with their execution ability.

CPO, Praction - Arkady Sokolov

May 2, 2023

Working with Matt was comparable to having another co-founder on the team, but without the commitment or cost. He has a strategic mindset and willing to change the scope of the project in real time based on the needs of the client. A true strategic thought partner!

Co-Founder, Arc - Donald Muir

Dec 27, 2022

Rapid Dev are 10/10, excellent communicators - the best I've ever encountered in the tech dev space. They always go the extra mile, they genuinely care, they respond quickly, they're flexible, adaptable and their enthusiasm is amazing.

Co-CEO, Grantify - Mat Westergreen-Thorne

Oct 15, 2022

Rapid Dev is an excellent developer for no-code and low-code solutions.
We’ve had great success since launching the platform in November 2023. In a few months, we’ve gained over 1,000 new active users. We’ve also secured several dozen bookings on the platform and seen about 70% new user month-over-month growth since the launch.

Co-Founder, Church Real Estate Marketplace - Emmanuel Brown

May 1, 2024 

Matt’s dedication to executing our vision and his commitment to the project deadline were impressive. 
This was such a specific project, and Matt really delivered. We worked with a really fast turnaround, and he always delivered. The site was a perfect prop for us!

Production Manager, Media Production Company - Samantha Fekete

Sep 23, 2022