Algorithmia as a standalone platform is deprecated — it was acquired by DataRobot in 2021 and its API endpoints were shut down. For ML model marketplace functionality in Bolt.new, use Hugging Face Inference API (the modern replacement with 500,000+ models), Replicate (run any ML model via API), or RapidAPI for pre-built ML endpoints. All three are HTTP-based and work seamlessly in Bolt's WebContainer through a Next.js API route.
ML Model Marketplace APIs in Bolt.new — Hugging Face and Replicate
Algorithmia was pioneering when it launched in 2014 — a marketplace where data scientists published models as serverless API endpoints, and developers paid per-call to run them. At its peak, Algorithmia hosted 4,500+ algorithms across natural language processing, computer vision, data processing, and financial analysis. DataRobot acquired Algorithmia in January 2021 and integrated it into DataRobot's MLOps enterprise platform. The standalone Algorithmia.com API and algorithm marketplace were shut down, with existing customers migrated to DataRobot's enterprise offering. If you have existing code using the algorithmia Python or JavaScript client libraries, those API calls no longer work.
The ML model marketplace concept Algorithmia pioneered has since been eclipsed by two platforms that are meaningfully better in scope, quality, and developer experience. Hugging Face hosts 500,000+ models as of 2025 — text classification, summarization, translation, image captioning, audio transcription, embeddings, and more — accessible via a simple Inference API. Replicate runs any model from its registry on GPU infrastructure and returns results via HTTP, covering image generation (Stable Diffusion, FLUX), video generation, audio, code, and 3D models. Both platforms offer free tiers and pay-per-use pricing, and both work perfectly in Bolt.new through server-side Next.js API routes.
For Bolt.new developers who were interested in Algorithmia, the mapping is straightforward: use Hugging Face Inference API for the tasks Algorithmia's NLP and classification models covered, and Replicate for generation tasks (image synthesis, audio processing, code generation) that require significant GPU compute. Both provide better models than what Algorithmia offered in 2020, at competitive pricing, with active development communities and modern API designs. The rest of this guide covers a practical Hugging Face integration you can build in under 20 minutes.
Integration method
The Algorithmia platform's independent API endpoints are deprecated following DataRobot's 2021 acquisition. This page covers the modern ML model marketplace alternatives that Bolt.new developers use instead: Hugging Face Inference API for open-source models, Replicate for GPU-accelerated generation models, and RapidAPI for pre-built ML APIs. All are HTTP-based and called through Next.js API routes with API keys stored server-side in .env.
Prerequisites
- A Hugging Face account at huggingface.co — free tier includes 30,000 free API calls per month for serverless inference
- A Hugging Face API token (Settings → Access Tokens → New token with 'Inference API' permission)
- Optionally, a Replicate account at replicate.com for GPU-accelerated generation models (free credits on signup)
- A Replicate API token (Account → API tokens) if using Replicate models
- A Bolt.new project using Next.js (request Next.js when creating the project for API route support)
Step-by-step guide
Understand Why Algorithmia Is Deprecated and What to Use Instead
Algorithmia's platform history is important context for making the right architectural choice today. Algorithmia launched in 2014 as the first commercial ML model marketplace — data scientists could publish Python, R, Java, or Scala algorithms as REST endpoints, and developers could call them per-use at fractions of a cent. By 2020, Algorithmia hosted 4,500+ algorithms and processed 2 billion API calls. DataRobot acquired Algorithmia in January 2021 and integrated its enterprise MLOps features (model monitoring, deployment pipelines, governance) into DataRobot's platform for large enterprises. The consumer-facing algorithm marketplace and API were sunset, with existing customers migrated to DataRobot enterprise contracts. The Algorithmia JavaScript client library (@algorithmia/algorithmia or algorithmia npm package) still exists on npm but the API endpoints it calls are offline — do not use it for new projects. The Python algorithmia package is similarly defunct for independent use. For Bolt.new developers who need what Algorithmia offered — pre-trained models as HTTP endpoints — the landscape in 2025-2026 is dramatically better. Hugging Face provides the most comprehensive open-model catalog: 500,000+ models, most available via their free Inference API, with a clean REST interface that returns JSON predictions. Replicate focuses on compute-intensive generation models (image, video, audio) and provides a polling-based API that starts GPU instances on demand. RapidAPI aggregates hundreds of third-party ML APIs under one marketplace with unified billing. All three work perfectly in Bolt's WebContainer through standard HTTP calls proxied through Next.js API routes.
Set up Hugging Face Inference API credentials in my Bolt project. Create a .env file with HUGGING_FACE_API_KEY as a placeholder. Create a lib/huggingface.ts utility that exports a huggingfaceInfer function accepting modelId (string) and inputs (string or object). The function calls https://api-inference.huggingface.co/models/{modelId} with POST method, Authorization: Bearer header with HUGGING_FACE_API_KEY, and inputs in the request body. Handle the case where the model is loading by retrying after 20 seconds if the response includes 'estimated_time'. Return the parsed JSON response.
Paste this in Bolt.new chat
```typescript
// lib/huggingface.ts
const HF_API_BASE = 'https://api-inference.huggingface.co/models';

export async function huggingfaceInfer<T>(
  modelId: string,
  inputs: string | Record<string, unknown>,
  retryOnLoading = true
): Promise<T> {
  const apiKey = process.env.HUGGING_FACE_API_KEY;
  if (!apiKey) throw new Error('HUGGING_FACE_API_KEY not set in .env');

  const response = await fetch(`${HF_API_BASE}/${modelId}`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ inputs }),
  });

  if (response.status === 503) {
    // Model is loading — HF cold-starts models that haven't been used recently
    const errorBody = await response.json();
    const waitSeconds = errorBody.estimated_time ?? 20;
    if (retryOnLoading) {
      console.log(`HF model ${modelId} loading, retrying in ${waitSeconds}s...`);
      await new Promise((res) => setTimeout(res, waitSeconds * 1000));
      return huggingfaceInfer<T>(modelId, inputs, false); // retry once
    }
    throw new Error(`Model ${modelId} is loading. Try again in ${waitSeconds} seconds.`);
  }

  if (!response.ok) {
    const error = await response.text();
    throw new Error(`Hugging Face API error ${response.status}: ${error}`);
  }

  return response.json() as Promise<T>;
}
```

Pro tip: Hugging Face's free Inference API cold-starts models that haven't been used recently — the first request after inactivity returns a 503 with an estimated_time field. The helper above retries automatically. For production, consider using Hugging Face's Inference Endpoints (dedicated instances) to eliminate cold starts.
Expected result: The huggingfaceInfer helper is available for all API routes. It handles model cold-start retries automatically and throws descriptive errors for API failures.
Build a Text Analysis API Route with Hugging Face
Hugging Face's Inference API follows a simple pattern: each model has a specific input format and output format documented on its model card at huggingface.co/{model-id}. Text classification models (sentiment, topic, intent) accept inputs as a string or array of strings and return an array of { label, score } objects. Summarization models accept inputs as a string and return [{ summary_text: '...' }]. Zero-shot classification accepts { inputs: 'text', parameters: { candidate_labels: ['label1', 'label2'] } }. Named entity recognition (NER) returns an array of entity objects with word, entity_group, score, start, and end fields. Feature extraction (embeddings) returns a nested array of floats — a 768-dimensional vector for most BERT-based models.

When choosing a model ID, prefer models with 'text-classification', 'feature-extraction', or 'summarization' pipeline tags on Hugging Face, sorted by downloads. (GGUF-tagged models are quantized weights for local runtimes, not the serverless Inference API, so skip them here.) The most reliable free-tier models are: distilbert-base-uncased-finetuned-sst-2-english for positive/negative sentiment, facebook/bart-large-cnn for news-style summarization, sentence-transformers/all-MiniLM-L6-v2 for embeddings, and dslim/bert-base-NER for named entity recognition. Avoid very large models (>7B parameters) on the free Inference API — they time out. Stick to models under 1B parameters for reliable responses. The response format varies by task — always check the model's API page on Hugging Face for the exact response structure before building your display logic.
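Because the response shape differs by task (and sometimes by model), a small normalizer keeps display logic simple. A minimal sketch, assuming classification responses arrive either flat or nested per input; normalizeClassification and topPrediction are illustrative names, not part of any Hugging Face client:

```typescript
// Illustrative shape of a Hugging Face text-classification prediction
interface ClassificationPrediction { label: string; score: number; }

// Some models return [{label, score}, ...] and others return
// [[{label, score}, ...]] (one inner array per input) — accept both shapes
function normalizeClassification(raw: unknown): ClassificationPrediction[] {
  const arr = raw as unknown[];
  const inner = Array.isArray(arr[0]) ? (arr[0] as unknown[]) : arr;
  return inner as ClassificationPrediction[];
}

// Pick the top-scoring label regardless of which shape the model returned
function topPrediction(raw: unknown): ClassificationPrediction | undefined {
  return [...normalizeClassification(raw)].sort((a, b) => b.score - a.score)[0];
}

// Both response shapes produce the same result:
const flat = [{ label: 'POSITIVE', score: 0.98 }, { label: 'NEGATIVE', score: 0.02 }];
const nested = [flat];
console.log(topPrediction(flat)?.label);   // 'POSITIVE'
console.log(topPrediction(nested)?.label); // 'POSITIVE'
```

Sorting defensively before taking the first element means the helper still works if a model returns labels in a different order than score-descending.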
Create a text analysis API route at app/api/analyze-text/route.ts. It accepts POST with body containing text (string) and type ('sentiment' | 'summarize' | 'entities' | 'embeddings'). For sentiment: call huggingfaceInfer with model 'cardiffnlp/twitter-roberta-base-sentiment-latest' and return label and score. For summarize: call 'facebook/bart-large-cnn' and return the summary_text. For entities: call 'dslim/bert-base-NER' and return the entities array. For embeddings: call 'sentence-transformers/all-MiniLM-L6-v2' and return the vector (first element of nested array). Add input validation and return appropriate error messages.
Paste this in Bolt.new chat
```typescript
// app/api/analyze-text/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { huggingfaceInfer } from '@/lib/huggingface';

type AnalysisType = 'sentiment' | 'summarize' | 'entities' | 'embeddings';

const MODELS: Record<AnalysisType, string> = {
  sentiment: 'cardiffnlp/twitter-roberta-base-sentiment-latest',
  summarize: 'facebook/bart-large-cnn',
  entities: 'dslim/bert-base-NER',
  embeddings: 'sentence-transformers/all-MiniLM-L6-v2',
};

export async function POST(request: NextRequest) {
  const body = await request.json();
  const { text, type }: { text: string; type: AnalysisType } = body;

  if (!text?.trim()) {
    return NextResponse.json({ error: 'text is required' }, { status: 400 });
  }

  if (!MODELS[type]) {
    return NextResponse.json({ error: `Invalid type. Use: ${Object.keys(MODELS).join(', ')}` }, { status: 400 });
  }

  try {
    const raw = await huggingfaceInfer(MODELS[type], text);

    let result;
    switch (type) {
      case 'sentiment': {
        // Text classification may return [{label, score}] or [[{label, score}]]
        // (nested per input) — unwrap the outer array if present
        const outer = raw as unknown[];
        const preds = (Array.isArray(outer[0]) ? outer[0] : outer) as Array<{ label: string; score: number }>;
        result = { label: preds[0]?.label, score: preds[0]?.score };
        break;
      }
      case 'summarize': {
        const summary = raw as Array<{ summary_text: string }>;
        result = { summary: summary[0]?.summary_text };
        break;
      }
      case 'entities': {
        result = { entities: raw }; // array of {word, entity_group, score, start, end}
        break;
      }
      case 'embeddings': {
        // Feature extraction returns nested array [[...floats]]
        const vectors = raw as number[][];
        result = { vector: vectors[0], dimensions: vectors[0]?.length };
        break;
      }
    }

    return NextResponse.json({ type, result });
  } catch (error) {
    return NextResponse.json(
      { error: error instanceof Error ? error.message : 'Analysis failed' },
      { status: 500 }
    );
  }
}
```

Pro tip: The free Hugging Face Inference API includes 30,000 requests/month and rate-limits to 1,000 requests/day per token. For production apps with higher volume, use Hugging Face Inference Endpoints (dedicated infrastructure starting at ~$0.03/hour) or switch specific high-volume tasks to OpenAI's API.
Expected result: POSTing to /api/analyze-text with text and type='sentiment' returns the sentiment label and confidence score. Switching type to 'summarize' returns a concise summary. 'entities' returns named entities. 'embeddings' returns a 384-dimensional vector.
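Embedding vectors become useful once you compare them. A minimal sketch of cosine similarity for the vectors the embeddings route returns; cosineSimilarity is a hypothetical helper, not part of any SDK:

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Values near 1 mean semantically similar texts; near 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1, orthogonal vectors score 0
console.log(cosineSimilarity([1, 0, 1], [1, 0, 1])); // 1
console.log(cosineSimilarity([1, 0], [0, 1]));       // 0
```

Fetch embeddings for two texts from /api/analyze-text with type='embeddings', then pass the two vectors to this function to rank search results or deduplicate content.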
Add Replicate for Image Generation
Replicate is the preferred platform for GPU-intensive generation tasks that Hugging Face's free Inference API doesn't support well — image generation (FLUX, Stable Diffusion, Ideogram), video generation, audio synthesis, upscaling, and code models requiring large VRAM. Replicate's API uses an asynchronous polling pattern: you POST to /v1/predictions to start the job, receive a prediction object with an id and status of 'starting', then poll the prediction URL until status changes to 'succeeded' with an output field. For image generation, output is an array of image URLs (hosted on Replicate's CDN for 24 hours). Poll at roughly one-second intervals, backing off gradually for long-running models to avoid rate limits. Alternatively, Replicate supports webhooks — POST your deployed URL as the webhook parameter, and Replicate calls it when the prediction completes, sending the full prediction object. Note that webhooks require a deployed URL; Bolt's WebContainer cannot receive incoming HTTP traffic during development. For development, the polling approach works without any deployment. Model IDs on Replicate follow the format 'owner/model-name:version-hash' — for example, 'black-forest-labs/flux-schnell' uses the latest version, or 'stability-ai/sdxl:39ed52f2319f9b2321e1be19d0c3095f22e6e956...' for a specific version. The FLUX schnell model is the fastest option (typically 1-3 seconds), while FLUX dev provides higher quality at 15-30 seconds per image. Always use the model's version hash for production to ensure reproducibility.
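The "poll, then back off" pattern can be made explicit with a tiny schedule generator. A sketch under assumed parameters (pollDelays, the 1.5x growth factor, and the 5-second cap are all illustrative choices, not Replicate requirements):

```typescript
// Generate poll delays: start at `baseMs`, grow by `factor`, never exceed `capMs`.
// Gentle backoff keeps early polls responsive while avoiding rate limits on long jobs.
function pollDelays(baseMs: number, factor: number, capMs: number, count: number): number[] {
  const delays: number[] = [];
  let current = baseMs;
  for (let i = 0; i < count; i++) {
    delays.push(Math.min(current, capMs));
    current = Math.round(current * factor);
  }
  return delays;
}

// First polls are fast; later polls settle at the cap
console.log(pollDelays(1000, 1.5, 5000, 6)); // [1000, 1500, 2250, 3375, 5000, 5000]
```

A polling loop would await each delay in turn between GET requests to the prediction URL, stopping as soon as status leaves 'starting'/'processing'.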
Create an image generation API route at app/api/generate-image/route.ts using the Replicate API. Accept POST with prompt (string) and optional negative_prompt (string). Call https://api.replicate.com/v1/predictions to start a flux-schnell prediction with an Authorization: Bearer REPLICATE_API_TOKEN header from .env. Poll the prediction URL every 1.5 seconds until status is 'succeeded' (max 90 seconds timeout). Return the image URL from the output array. Also return the prediction ID and generation time. Build an ImageGenerator React component with a prompt textarea, Generate button, and a results area that shows the generated image with a download button.
Paste this in Bolt.new chat
```typescript
// app/api/generate-image/route.ts
import { NextRequest, NextResponse } from 'next/server';

const REPLICATE_API_TOKEN = process.env.REPLICATE_API_TOKEN;
const REPLICATE_BASE = 'https://api.replicate.com/v1';
const FLUX_MODEL = 'black-forest-labs/flux-schnell';
const POLL_INTERVAL_MS = 1500;
const MAX_WAIT_MS = 90_000;

async function createPrediction(prompt: string, negativePrompt?: string) {
  const response = await fetch(`${REPLICATE_BASE}/models/${FLUX_MODEL}/predictions`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${REPLICATE_API_TOKEN}`,
      'Content-Type': 'application/json',
      Prefer: 'wait', // ask Replicate to wait up to 60s before returning
    },
    body: JSON.stringify({
      input: {
        prompt,
        negative_prompt: negativePrompt ?? '',
        num_inference_steps: 4, // schnell is optimized for 4 steps
        width: 1024,
        height: 1024,
      },
    }),
  });
  return response.json();
}

async function pollPrediction(predictionUrl: string): Promise<string> {
  const startTime = Date.now();
  while (Date.now() - startTime < MAX_WAIT_MS) {
    const res = await fetch(predictionUrl, {
      headers: { Authorization: `Bearer ${REPLICATE_API_TOKEN}` },
    });
    const prediction = await res.json();

    if (prediction.status === 'succeeded') {
      return prediction.output?.[0] ?? prediction.output;
    }
    if (prediction.status === 'failed' || prediction.status === 'canceled') {
      throw new Error(`Prediction ${prediction.status}: ${prediction.error ?? 'Unknown error'}`);
    }
    await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
  }
  throw new Error('Generation timed out after 90 seconds');
}

export async function POST(request: NextRequest) {
  const { prompt, negative_prompt } = await request.json();

  if (!prompt?.trim()) {
    return NextResponse.json({ error: 'prompt is required' }, { status: 400 });
  }

  const startTime = Date.now();
  try {
    const prediction = await createPrediction(prompt, negative_prompt);

    // Replicate's 'Prefer: wait' header returns synchronously if fast enough
    let imageUrl: string;
    if (prediction.status === 'succeeded') {
      imageUrl = prediction.output?.[0] ?? prediction.output;
    } else {
      imageUrl = await pollPrediction(prediction.urls.get);
    }

    return NextResponse.json({
      imageUrl,
      predictionId: prediction.id,
      generationTimeMs: Date.now() - startTime,
    });
  } catch (error) {
    return NextResponse.json(
      { error: error instanceof Error ? error.message : 'Generation failed' },
      { status: 500 }
    );
  }
}
```

Pro tip: Replicate's 'Prefer: wait' header asks the API to wait synchronously for up to 60 seconds before returning. For fast models like FLUX schnell (1-4 seconds), this eliminates the need for polling. The code above handles both the synchronous and polling paths to cover all model speeds.
Expected result: Submitting a prompt to /api/generate-image starts a Replicate FLUX prediction and polls until complete, returning an image URL. The React component displays the generated image within 3-5 seconds for FLUX schnell.
Deploy and Compare Your Model Marketplace Options
With both Hugging Face and Replicate integrations working in development, deploying to Netlify or Bolt Cloud is the final step. Both API providers work identically in production — the same outbound HTTP calls your Next.js API routes make in development work after deployment. Add your API keys to Netlify's environment variables (HUGGING_FACE_API_KEY, REPLICATE_API_TOKEN) and redeploy. When deciding which ML model marketplace to use for future features, the following comparison helps.
- Hugging Face Inference API: best for NLP tasks (sentiment, classification, summarization, translation, NER, embeddings); generous free tier (30,000 req/month); unmatched model selection (500,000+ models); 0.5-3 second latency for most text models; response format varies by model task.
- Replicate: best for generation tasks (images, video, audio, code); pay-per-second of GPU compute (~$0.001-0.01 per image); 5-60 second latency depending on model and GPU type; asynchronous polling pattern; supports webhooks for production (deployed URL required).
- RapidAPI: good for specialized domain APIs (financial data ML, real estate APIs, sports prediction) where neither HF nor Replicate have a relevant model; unified billing across multiple providers.
For a Bolt app that started looking at Algorithmia, the most likely replacement is Hugging Face for text-based ML tasks. Most of what Algorithmia's algorithm marketplace offered in text processing, classification, and data transformation has a direct equivalent in Hugging Face's model hub at better quality and comparable or lower cost.
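The comparison boils down to a task-to-provider lookup. A sketch (the task names and the recommendProvider helper are illustrative, not from any SDK):

```typescript
type Provider = 'huggingface' | 'replicate' | 'rapidapi';

// Map common ML task categories to the provider this guide recommends
const TASK_PROVIDERS: Record<string, Provider> = {
  sentiment: 'huggingface',
  classification: 'huggingface',
  summarization: 'huggingface',
  translation: 'huggingface',
  embeddings: 'huggingface',
  'image-generation': 'replicate',
  'video-generation': 'replicate',
  'audio-transcription': 'replicate',
  'financial-ml': 'rapidapi',
  'sports-prediction': 'rapidapi',
};

function recommendProvider(task: string): Provider {
  // Default to Hugging Face: broadest catalog and the closest Algorithmia analogue
  return TASK_PROVIDERS[task] ?? 'huggingface';
}

console.log(recommendProvider('summarization'));    // 'huggingface'
console.log(recommendProvider('image-generation')); // 'replicate'
```

Centralizing the choice in one table makes it easy to migrate a single task later (say, moving high-volume summarization to a dedicated endpoint) without touching call sites.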
Build a model comparison dashboard that tests both Hugging Face and Replicate integrations. Create an AnalysisDashboard React component with two sections: (1) Text Analysis section with a textarea and buttons for each analysis type (Sentiment, Summarize, Entities) that calls /api/analyze-text and shows results in a styled card — include the model name that was used and the response time. (2) Image Generation section with a prompt input and Generate button that calls /api/generate-image and shows the generated image with download link and generation time. Add a provider badge (Hugging Face or Replicate) to each results card. Show API call costs as informational text: 'Free tier (Hugging Face)' and approximate cost for Replicate.
Paste this in Bolt.new chat
```typescript
// components/AnalysisDashboard.tsx
'use client';
import { useState } from 'react';

interface TextResult { type: string; result: Record<string, unknown>; responseTimeMs?: number; }
interface ImageResult { imageUrl: string; generationTimeMs: number; }

export function AnalysisDashboard() {
  const [text, setText] = useState('');
  const [textResult, setTextResult] = useState<TextResult | null>(null);
  const [textLoading, setTextLoading] = useState<string | null>(null);

  const [prompt, setPrompt] = useState('');
  const [imageResult, setImageResult] = useState<ImageResult | null>(null);
  const [imageLoading, setImageLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const analyzeText = async (type: string) => {
    setTextLoading(type);
    setError(null);
    const start = Date.now();
    try {
      const res = await fetch('/api/analyze-text', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text, type }),
      });
      const data = await res.json();
      if (data.error) throw new Error(data.error);
      setTextResult({ ...data, responseTimeMs: Date.now() - start });
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Failed');
    } finally {
      setTextLoading(null);
    }
  };

  const generateImage = async () => {
    setImageLoading(true);
    setError(null);
    try {
      const res = await fetch('/api/generate-image', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt }),
      });
      const data = await res.json();
      if (data.error) throw new Error(data.error);
      setImageResult(data);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Failed');
    } finally {
      setImageLoading(false);
    }
  };

  return (
    <div className="max-w-2xl mx-auto p-6 space-y-8">
      <div className="space-y-3">
        <div className="flex items-center gap-2">
          <h2 className="text-xl font-bold">Text Analysis</h2>
          <span className="text-xs bg-yellow-100 text-yellow-800 px-2 py-0.5 rounded-full">Hugging Face · Free tier</span>
        </div>
        <textarea value={text} onChange={(e) => setText(e.target.value)}
          placeholder="Enter text to analyze..." rows={4}
          className="w-full border rounded p-3 text-sm" />
        <div className="flex gap-2 flex-wrap">
          {['sentiment', 'summarize', 'entities'].map((type) => (
            <button key={type} onClick={() => analyzeText(type)}
              disabled={!text.trim() || textLoading === type}
              className="px-4 py-1.5 bg-blue-600 text-white rounded text-sm disabled:opacity-50 capitalize">
              {textLoading === type ? 'Analyzing...' : type}
            </button>
          ))}
        </div>
        {textResult && (
          <div className="bg-gray-50 border rounded p-3 text-sm">
            <div className="flex justify-between text-xs text-gray-400 mb-2">
              <span>Type: {textResult.type}</span>
              <span>{textResult.responseTimeMs}ms</span>
            </div>
            <pre className="whitespace-pre-wrap">{JSON.stringify(textResult.result, null, 2)}</pre>
          </div>
        )}
      </div>

      <div className="space-y-3">
        <div className="flex items-center gap-2">
          <h2 className="text-xl font-bold">Image Generation</h2>
          <span className="text-xs bg-purple-100 text-purple-800 px-2 py-0.5 rounded-full">Replicate · ~$0.003/image</span>
        </div>
        <textarea value={prompt} onChange={(e) => setPrompt(e.target.value)}
          placeholder="Describe an image to generate..." rows={2}
          className="w-full border rounded p-3 text-sm" />
        <button onClick={generateImage} disabled={!prompt.trim() || imageLoading}
          className="w-full bg-purple-600 text-white py-2 rounded disabled:opacity-50">
          {imageLoading ? 'Generating (3-10s)...' : 'Generate with FLUX'}
        </button>
        {imageResult && (
          <div className="space-y-2">
            <img src={imageResult.imageUrl} alt={prompt} className="w-full rounded-lg" />
            <div className="flex justify-between text-xs text-gray-400">
              <span>Generated in {(imageResult.generationTimeMs / 1000).toFixed(1)}s</span>
              <a href={imageResult.imageUrl} download className="text-blue-600">Download</a>
            </div>
          </div>
        )}
      </div>

      {error && <p className="text-red-600 text-sm">{error}</p>}
    </div>
  );
}
```

Pro tip: Replicate image URLs expire after 24 hours — for any application that needs to persist images, download them immediately after generation and upload to your own storage (Supabase Storage, S3, or Cloudflare R2) rather than storing the Replicate URL.
Expected result: The dashboard shows both Hugging Face text analysis and Replicate image generation working in the same interface, with response times and provider badges visible. Both work in the WebContainer preview without deployment.
Common use cases
Text Analysis with Hugging Face Inference API
Add sentiment analysis, zero-shot classification, named entity recognition, or text summarization to a Bolt app using Hugging Face's hosted inference API. Choose from thousands of specialized models — a sentiment model trained on product reviews, a classifier fine-tuned on support tickets, or a summarization model for long-form content. No GPU infrastructure required.
Build a text analysis dashboard using the Hugging Face Inference API. Create a Next.js API route at app/api/analyze/route.ts that accepts text and an analysis_type parameter (sentiment, summarize, or classify). For sentiment, call https://api-inference.huggingface.co/models/cardiffnlp/twitter-roberta-base-sentiment-latest with the HUGGING_FACE_API_KEY from .env. For summarize, call facebook/bart-large-cnn. For classify, use zero-shot-classification with candidate_labels ['positive', 'negative', 'neutral']. Build a TextAnalyzer React component with a textarea, analysis type tabs, and results showing labels and confidence scores.
Copy this prompt to try it in Bolt.new
Image Generation with Replicate
Add AI image generation to a Bolt app using Replicate's hosted Stable Diffusion or FLUX models. Users enter a text prompt, click Generate, and see an AI-generated image appear. Replicate handles the GPU compute; your Next.js API route manages the polling or webhook pattern to retrieve the result when ready.
Build an AI image generator using Replicate. Create a Next.js API route at app/api/generate-image/route.ts that accepts a text prompt, calls the Replicate API at https://api.replicate.com/v1/predictions to start a FLUX schnell generation (model: black-forest-labs/flux-schnell), polls the prediction status URL until complete, and returns the image URL. Use REPLICATE_API_TOKEN from .env. Build a React ImageGenerator component with a prompt input, Generate button, loading spinner, and the generated image displayed when ready.
Copy this prompt to try it in Bolt.new
Audio Transcription with Replicate Whisper
Build a file upload tool that transcribes audio or video files to text using OpenAI Whisper hosted on Replicate. Users upload a .mp3 or .wav file, the app sends it to Replicate for transcription, and the transcript appears in a copy-ready text area. Useful for meeting notes, podcast transcription, or accessibility features.
Build an audio transcription tool using Replicate's Whisper model. Create a Next.js API route at app/api/transcribe/route.ts that accepts a base64-encoded audio file, creates a Replicate prediction using the openai/whisper model, polls for completion, and returns the transcription text. Use REPLICATE_API_TOKEN from .env. Build a TranscriptionTool React component with a file upload for audio/video files, a Transcribe button, a loading state showing 'Transcribing...' with estimated wait time, and the resulting transcript in a scrollable text area with a Copy to Clipboard button.
Copy this prompt to try it in Bolt.new
Troubleshooting
Hugging Face API returns 503 with 'Model is currently loading' even after waiting
Cause: The serverless Hugging Face Inference API cold-starts models that haven't been used recently. For large models (>1GB), loading can take 30-120 seconds. If the model is also in high demand, it may stay in queue longer.
Solution: The huggingfaceInfer helper auto-retries once after the estimated_time from the error response. If the model is consistently slow, switch to a smaller equivalent (distilbert instead of bert-large) or use Hugging Face's dedicated Inference Endpoints for guaranteed warm instances. Free tier warm-up delays are expected for infrequently used models.
```typescript
// Switch to a smaller, faster model that stays warm on the free tier
// Instead of bert-large (1.3GB) use distilbert (66MB)
const SENTIMENT_MODEL = 'distilbert-base-uncased-finetuned-sst-2-english'; // stays warm, fast
// Instead of t5-large (3GB) for summarization:
const SUMMARIZE_MODEL = 'facebook/bart-large-cnn'; // ~1.6GB, popular = stays warm
```

Replicate prediction status stays at 'starting' or 'processing' indefinitely
Cause: GPU resources may be temporarily unavailable (queue is full), the model is being pulled for the first time (can take 5-15 minutes for large models on cold start), or your API token has insufficient credits.
Solution: Check your Replicate account's credit balance at replicate.com/account/billing. For the first run of a model, wait up to 15 minutes for the initial pull. Monitor prediction status in real time at replicate.com/predictions. If consistently timing out, switch to a smaller or faster model version.
```typescript
// Increase timeout for models that are slow to start
const MAX_WAIT_MS = 180_000; // 3 minutes for larger models
// Or check prediction status in the Replicate dashboard:
// https://replicate.com/predictions/{predictionId}
```

Algorithmia npm package fails with 'Network error' or 'API unavailable'
Cause: The Algorithmia platform's API endpoints (api.algorithmia.com) are offline following DataRobot's acquisition and platform sunset in 2021-2022. The algorithmia JavaScript and Python client libraries make calls to these defunct endpoints.
Solution: Algorithmia is permanently deprecated — do not use the algorithmia npm package for new projects. Migrate to Hugging Face Inference API (for NLP and classification tasks), Replicate (for generation tasks), or DataRobot's enterprise platform (for existing enterprise Algorithmia customers). Remove the algorithmia package from your project's dependencies.
```typescript
// Remove Algorithmia dependency
// npm uninstall algorithmia
// or remove from package.json:
// "algorithmia": "x.x.x" ← delete this line

// Replace with Hugging Face (for text/classification tasks)
import { huggingfaceInfer } from '@/lib/huggingface';
const result = await huggingfaceInfer('distilbert-base-uncased-finetuned-sst-2-english', 'My text here');
```

Best practices
- Use Hugging Face Inference API for NLP tasks (sentiment, classification, summarization, NER, embeddings) — it has the widest model selection and generous free tier for most text-based ML needs
- Use Replicate for GPU-intensive generation (images, video, audio) where Hugging Face's free tier times out or doesn't support the specific model architecture needed
- Always proxy Hugging Face and Replicate calls through Next.js API routes — never call these APIs with secret keys from client-side code
- Download and re-host Replicate-generated images immediately — Replicate CDN URLs expire after 24 hours, so any application persisting generated content must save images to Supabase Storage, S3, or similar
- Choose Hugging Face models under 1GB for reliable response times on the free tier — larger models cold-start slowly and time out frequently on shared serverless infrastructure
- Never attempt to use the Algorithmia npm package or algorithmia.com API endpoints — the platform is shut down and all API calls will fail
- Test Hugging Face and Replicate calls from Bolt's WebContainer during development — both outbound API calls work without deployment, giving you real model results during prototyping
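The re-hosting practice above can be sketched as a small helper. Everything here is illustrative: persistImage, storageKey, and the injected uploadFn are hypothetical names, and the actual upload call depends on your storage backend (Supabase Storage, S3, or Cloudflare R2):

```typescript
// Derive a stable object key from the prediction ID and the CDN URL's file extension
function storageKey(predictionId: string, url: string): string {
  const tail = url.split('.').pop() ?? '';
  const ext = /^(png|jpg|jpeg|webp)$/.test(tail) ? tail : 'png';
  return `generated/${predictionId}.${ext}`;
}

// Download the image before the 24-hour Replicate URL expires, then hand the
// bytes to any storage backend. `uploadFn` is injected so the same helper works
// with Supabase Storage, S3, or R2 — it should return the permanent URL.
async function persistImage(
  predictionId: string,
  imageUrl: string,
  uploadFn: (key: string, data: ArrayBuffer) => Promise<string>
): Promise<string> {
  const res = await fetch(imageUrl);
  if (!res.ok) throw new Error(`Image download failed: ${res.status}`);
  return uploadFn(storageKey(predictionId, imageUrl), await res.arrayBuffer());
}

console.log(storageKey('pred_abc123', 'https://replicate.delivery/xyz/out-0.webp'));
// 'generated/pred_abc123.webp'
```

Call persistImage from the generate-image route right after the prediction succeeds, and store the URL your uploadFn returns instead of the Replicate CDN URL.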
Alternatives
OpenAI's API provides the highest-quality language models and GPT-4 Vision, best for tasks requiring complex reasoning or multimodal input rather than simple classification or generation.
TensorFlow.js runs pre-trained models directly in the browser with no API cost or key — the right choice for vision and text tasks that work within the capabilities of browser-runnable model sizes.
DeepAI provides focused APIs for specific tasks (colorization, super-resolution, text summarization) with simple API key authentication and predictable per-request pricing.
RapidAPI aggregates hundreds of specialized ML APIs under unified billing — useful when you need domain-specific APIs (financial ML, sports prediction) not available on Hugging Face or Replicate.
Frequently asked questions
Is Algorithmia still active and can I use it in Bolt.new?
No — Algorithmia as an independent platform is deprecated. DataRobot acquired Algorithmia in January 2021 and shut down the standalone algorithm marketplace and API. The algorithmia npm package and Python client library no longer function. Do not use Algorithmia for new projects. Use Hugging Face Inference API (for text/classification tasks) or Replicate (for generation tasks) as modern equivalents.
What is the best Algorithmia replacement for Bolt.new?
For text and NLP tasks (sentiment analysis, classification, summarization, translation, embeddings), Hugging Face Inference API is the direct replacement — 500,000+ models, free tier with 30,000 requests/month, and simple REST API. For image generation, audio, and compute-intensive models, Replicate is the best option. Both work seamlessly in Bolt.new through Next.js API routes.
How do I connect Bolt.new to Hugging Face Inference API?
Create a free Hugging Face account, generate an API token in Settings → Access Tokens, and add it to your Bolt project's .env file as HUGGING_FACE_API_KEY. Create a Next.js API route that calls https://api-inference.huggingface.co/models/{model-id} with Authorization: Bearer header. React components call your /api/ route — never the Hugging Face API directly, since the API key must stay server-side.
Do Hugging Face API calls work in Bolt's WebContainer during development?
Yes — your Next.js API route makes outbound HTTPS calls to Hugging Face's servers, which works fine in Bolt's WebContainer. You'll get real model predictions during development without deploying. The only limitation is that models may cold-start with a 503 response if they haven't been used recently — the helper code in this guide retries automatically after the estimated wait time.
What does Replicate cost for image generation in Bolt.new?
Replicate charges by GPU compute time. FLUX schnell (the fastest image generation model) costs approximately $0.003-0.005 per image. Stable Diffusion XL costs ~$0.02 per image. New accounts receive free credits. Pricing updates regularly — check replicate.com/pricing for current rates. Unlike Algorithmia's per-algorithm pricing, Replicate charges for actual GPU time consumed.
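A rough per-image estimate follows directly from GPU-seconds consumed times the per-second rate. A sketch with placeholder numbers (estimateImageCost is a hypothetical helper; the rates are illustrative, not Replicate's actual pricing — check replicate.com/pricing):

```typescript
// Estimate cost: GPU seconds consumed * per-second GPU rate (both assumed inputs).
// Rounds to the nearest $0.0001 for display.
function estimateImageCost(gpuSeconds: number, ratePerSecondUsd: number): number {
  return Math.round(gpuSeconds * ratePerSecondUsd * 10000) / 10000;
}

// e.g. ~3 GPU-seconds at a hypothetical $0.001/s
console.log(estimateImageCost(3, 0.001)); // 0.003
```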
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation