| name | nano-banana-builder |
| description | Build full-stack web applications powered by Google Gemini's Nano Banana & Nano Banana Pro image generation APIs. Use when creating Next.js image generators, editors, galleries, or any web app that integrates gemini-2.5-flash-image or gemini-3-pro-image-preview models. Covers React components, server actions, API routes, storage, rate limiting, and production deployment patterns.
|
Nano Banana Builder
Build production-ready web applications powered by Google's Nano Banana image generation APIs, from simple text-to-image generators to sophisticated iterative editors with multi-turn conversation.
CRITICAL: Exact Model Names
Use ONLY these exact model strings. Do not invent, guess, or add date suffixes.
| Model String (use exactly) | Alias | Use Case |
|---|---|---|
| gemini-2.5-flash-image | Nano Banana | Fast iterations, drafts, high volume |
| gemini-3-pro-image-preview | Nano Banana Pro | Quality output, text rendering, 2K |
Common mistakes to avoid:
- ❌ gemini-2.5-flash-preview-05-20: wrong, date suffixes are for text models
- ❌ gemini-2.5-pro-image: wrong, 2.5 Pro doesn't do image generation
- ❌ gemini-3-flash-image: wrong, doesn't exist
- ❌ gemini-pro-vision: wrong, that's for image input, not generation
The only valid image generation models are gemini-2.5-flash-image and gemini-3-pro-image-preview.
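One way to enforce the rule above in application code is to keep the two strings in a single constant and reject everything else. This is a minimal sketch; the names `IMAGE_MODELS`, `isImageModel`, and `assertImageModel` are hypothetical and not part of any SDK.

```typescript
// The two model strings from the table above, defined once so no other file spells them out.
const IMAGE_MODELS = ['gemini-2.5-flash-image', 'gemini-3-pro-image-preview'] as const

type ImageModel = (typeof IMAGE_MODELS)[number]

// Type guard: true only for an exact match, so guessed or date-suffixed names are rejected.
function isImageModel(value: string): value is ImageModel {
  return (IMAGE_MODELS as readonly string[]).includes(value)
}

// Hypothetical helper: fail early instead of letting a bad name reach the API.
function assertImageModel(value: string): ImageModel {
  if (!isImageModel(value)) {
    throw new Error(`Unknown image model "${value}". Use one of: ${IMAGE_MODELS.join(', ')}`)
  }
  return value
}
```

Calling `assertImageModel` on any model name that comes from user input or configuration keeps an invalid string from turning into a confusing API error later.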
Philosophy: Conversational Image Generation
Nano Banana isn't just another image API: it's conversational by design. The core insight is that image generation works best as a dialogue, not a one-shot prompt.
Think of it as working with an AI art director:
- Iterative refinement: Build up images through conversation, not perfection in one prompt
- Context awareness: The model "remembers" previous generations and edits
- Natural language editing: Describe changes conversationally, not with parameters
Before Building, Ask
- What's the primary use case? Text-to-image generation? Image editing? Multi-image composition? Style transfer?
- Which model fits the need? Nano Banana (speed/iterations) or Nano Banana Pro (quality/complex prompts)?
- What's the user journey? Single generation? Iterative refinement? Gallery browsing?
- What are production constraints? Rate limits? Storage? Cost per image? User volume?
Core Principles
- Conversation over configuration: Leverage Nano Banana's iterative editing rather than complex parameter UIs
- Model selection matters: Use gemini-2.5-flash-image for speed/iterations, gemini-3-pro-image-preview for quality/complexity
- State as conversation history: Track generations as chat messages to enable multi-turn editing
- Rate limit awareness: Image generation has strict quotas, so implement queuing and caching
- Storage strategy: Store generated images (Vercel Blob/S3), not just inline base64
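The "state as conversation history" principle can be sketched as a small data structure. The `Turn` and `Conversation` shapes and both helper functions are assumptions made for illustration; a real app would more likely reuse the message type of its chat library.

```typescript
// Hypothetical message shape: user turns carry text, assistant turns carry a stored image URL.
type Turn =
  | { role: 'user'; text: string }
  | { role: 'assistant'; imageUrl: string }

type Conversation = { id: string; turns: Turn[] }

// Returns a new conversation instead of mutating, which suits React state updates.
function addTurn(conversation: Conversation, turn: Turn): Conversation {
  return { ...conversation, turns: [...conversation.turns, turn] }
}

// The most recent generated image, i.e. the one the next edit instruction refers to.
function latestImage(conversation: Conversation): string | undefined {
  for (let i = conversation.turns.length - 1; i >= 0; i--) {
    const turn = conversation.turns[i]
    if (turn && turn.role === 'assistant') return turn.imageUrl
  }
  return undefined
}
```

Storing a URL rather than inline base64 in each assistant turn keeps the history small enough to resend on every edit.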
Model Selection Framework
Choose based on use case:
| Use Case | Model | Why |
|---|---|---|
| Rapid iterations, drafts | gemini-2.5-flash-image | Fast (2-5s), lower cost per image |
| Final output, quality | gemini-3-pro-image-preview | Superior quality, thinking, text rendering |
| Text-heavy images | gemini-3-pro-image-preview | Best typography, 2K resolution |
| Multi-turn editing | Either | Both support conversational editing |
| High volume | gemini-2.5-flash-image | Lower cost, faster throughput |
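The table above can be encoded as a small lookup so the choice is made in one place. This is a sketch only: the `UseCase` labels and the `selectModel` function are hypothetical names chosen to mirror the table rows.

```typescript
// Labels mirror the rows of the table; they are illustrative, not an SDK enum.
type UseCase = 'draft' | 'final' | 'text-heavy' | 'multi-turn' | 'high-volume'
type ModelId = 'gemini-2.5-flash-image' | 'gemini-3-pro-image-preview'

function selectModel(useCase: UseCase, preferQuality = false): ModelId {
  switch (useCase) {
    case 'draft':
    case 'high-volume':
      return 'gemini-2.5-flash-image'
    case 'final':
    case 'text-heavy':
      return 'gemini-3-pro-image-preview'
    case 'multi-turn':
      // Both models support conversational editing, so the caller breaks the tie.
      return preferQuality ? 'gemini-3-pro-image-preview' : 'gemini-2.5-flash-image'
  }
}
```

A common pattern is to iterate with `selectModel('draft')` and switch to `selectModel('final')` only when the user asks to export.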
Quick Start
Basic Server Action
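'use server'

import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google('gemini-2.5-flash-image'),
    prompt,
    providerOptions: {
      google: {
        responseModalities: ['IMAGE'],
        imageConfig: { aspectRatio: '16:9' }
      }
    }
  })

  // The response may contain no image (for example, on a refusal), so check before using it.
  const image = result.files.find(file => file.mediaType.startsWith('image/'))
  if (!image) throw new Error('The model returned no image')

  // Return a plain object: server actions can only send serializable values to the client.
  return { base64: image.base64, mediaType: image.mediaType }
}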
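To display the result of the server action, the client needs something an `<img>` tag can render. The sketch below assumes the action returns an object with `base64` and `mediaType` fields; the `GeneratedImage` type and the `toDataUrl` helper are hypothetical, so adjust the field names to what your SDK version actually returns.

```typescript
// Assumed shape of the server action's return value.
type GeneratedImage = { base64: string; mediaType: string }

// Builds a data URL suitable for <img src>. Fine for previews; store the image for anything long-lived.
function toDataUrl(image: GeneratedImage): string {
  if (!image.mediaType.startsWith('image/')) {
    throw new Error(`Expected an image, got "${image.mediaType}"`)
  }
  return `data:${image.mediaType};base64,${image.base64}`
}
```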
Client Component with useChat
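'use client'

import { useChat } from '@ai-sdk/react'
import { DefaultChatTransport } from 'ai'

// Uses the AI SDK 5 useChat API (sendMessage, status, transport).
export function ImageGenerator() {
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: '/api/generate' })
  })
  const isLoading = status === 'submitted' || status === 'streaming'

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {/* Generated images arrive as file parts with an image media type */}
          {m.parts.map((part, i) =>
            part.type === 'file' && part.mediaType.startsWith('image/') ? (
              <img key={i} src={part.url} alt="Generated" />
            ) : null
          )}
        </div>
      ))}
      <button
        disabled={isLoading}
        onClick={() => sendMessage({ text: 'A futuristic cityscape at dusk' })}
      >
        Generate
      </button>
    </div>
  )
}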
Advanced Implementation
For complete implementations including:
- Server Actions with model selection, storage, and error handling
- API Routes with streaming responses
- Client Components with iterative editing and galleries
- Advanced Patterns like multi-image composition and batch generation
See references/advanced-patterns.md
Configuration & Operations
For detailed configuration and operational concerns:
- Provider Options (responseModalities, imageConfig, thinkingConfig)
- Storage Strategy (Vercel Blob, S3/R2 implementations)
- Rate Limiting (Upstash Redis patterns, quota management)
- Cost Optimization strategies
See references/configuration.md
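As a rough illustration of the storage strategy listed above, the flow is: upload the bytes to object storage, then save only the returned URL alongside the metadata. The `ObjectStore` interface, `persistImage`, `MemoryStore`, and the `storage.example.com` URL are all hypothetical stand-ins; a Vercel Blob or S3 client would sit behind the same interface.

```typescript
// Hypothetical storage interface; put() resolves to a URL for the uploaded object.
interface ObjectStore {
  put(key: string, bytes: Uint8Array, contentType: string): Promise<string>
}

// What the database row holds: metadata plus a URL, never the image bytes.
type ImageRecord = { id: string; prompt: string; model: string; url: string; createdAt: string }

async function persistImage(
  store: ObjectStore,
  input: { id: string; prompt: string; model: string; bytes: Uint8Array; mediaType: string }
): Promise<ImageRecord> {
  // Derive the file extension from the media type, e.g. "image/png" gives "png".
  const extension = input.mediaType.split('/')[1] ?? 'bin'
  const url = await store.put(`generations/${input.id}.${extension}`, input.bytes, input.mediaType)
  return { id: input.id, prompt: input.prompt, model: input.model, url, createdAt: new Date().toISOString() }
}

// In-memory stand-in so the sketch runs without credentials.
class MemoryStore implements ObjectStore {
  readonly objects = new Map<string, Uint8Array>()

  async put(key: string, bytes: Uint8Array): Promise<string> {
    this.objects.set(key, bytes)
    return `https://storage.example.com/${key}`
  }
}
```

Keeping the store behind an interface also makes it easy to swap providers or to test the action without network access.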
Anti-Patterns to Avoid
❌ Inventing model names or adding date suffixes:
Why wrong: Image generation models have specific names; date suffixes like -preview-05-20 are for text models only
Better: Use exactly gemini-2.5-flash-image or gemini-3-pro-image-preview, with no variations
❌ Using Gemini 2.5 Pro for images:
Why wrong: Gemini 2.5 Pro doesn't generate images directly
Better: Use gemini-2.5-flash-image or gemini-3-pro-image-preview
❌ Storing only base64 in database:
Why wrong: Bloats the database, makes storage expensive, and slows retrieval
Better: Store in object storage (Vercel Blob/S3), save URL only
❌ No rate limit handling:
Why wrong: Will hit 429 errors in production, poor UX
Better: Implement rate limiting with user-friendly error messages
❌ Ignoring multi-turn context:
Why wrong: Wastes Nano Banana's conversational editing strength
Better: Track chat history for iterative refinement
❌ Hardcoding API keys client-side:
Why wrong: Exposes credentials, security risk
Better: Use server actions / API routes with environment variables
❌ Using wrong aspect ratio:
Why wrong: Requesting 21:9 for a 1:1 slot wastes tokens and forces an unexpected crop
Better: Match aspect ratio to intended use case
❌ No loading states:
Why wrong: Image generation takes 5-30s, users think it's broken
Better: Show progress indicators and estimated wait time
❌ Generating on every keystroke:
Why wrong: Wastes quota, slow response
Better: Debounce prompts, require explicit action
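For the rate-limit anti-pattern above, a per-user limiter with a readable message might look like the following. This is a minimal in-memory sketch, assuming a single server instance; `FixedWindowLimiter`, `LimitResult`, and `limitMessage` are hypothetical names, and a shared store such as Redis is needed once several instances run.

```typescript
type LimitResult =
  | { allowed: true; remaining: number }
  | { allowed: false; retryAfterSeconds: number }

// Fixed-window counter: each user gets `limit` generations per `windowMs`.
class FixedWindowLimiter {
  private readonly limit: number
  private readonly windowMs: number
  private readonly windows = new Map<string, { startedAt: number; count: number }>()

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // `now` is injectable so the behaviour can be tested without waiting.
  check(userId: string, now: number = Date.now()): LimitResult {
    let current = this.windows.get(userId)
    if (!current || now - current.startedAt >= this.windowMs) {
      current = { startedAt: now, count: 0 } // start a fresh window
      this.windows.set(userId, current)
    }
    if (current.count >= this.limit) {
      const retryAfterSeconds = Math.ceil((current.startedAt + this.windowMs - now) / 1000)
      return { allowed: false, retryAfterSeconds }
    }
    current.count += 1
    return { allowed: true, remaining: this.limit - current.count }
  }
}

// Turns the result into text that can be shown to the user instead of a raw 429.
function limitMessage(result: LimitResult): string {
  return result.allowed
    ? `${result.remaining} generations left in this window`
    : `You have hit the generation limit. Try again in ${result.retryAfterSeconds}s.`
}
```

Checking the limiter before calling the model means a blocked request costs no quota, and `retryAfterSeconds` gives the UI a concrete wait time to display.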
Variation Guidance
IMPORTANT: Every app should feel uniquely designed for its specific purpose.
Vary across dimensions:
- UI Style: Minimal, brutalist, playful, professional, dark, light
- Color Scheme: Warm, cool, monochrome, vibrant, muted
- Layout: Single page, multi-step wizard, sidebar, grid, list
- Interaction: Click-to-generate, drag-and-drop, real-time typing, batch
Avoid overused patterns:
- ❌ Default Tailwind purple gradients
- ❌ Generic "AI startup" aesthetic
- ❌ Same component libraries for every project
- ❌ Inter/Roboto fonts without thought
Context should drive design:
- Meme generator → Bold, fun, casual
- Product mockup tool → Clean, professional, grid-based
- Art exploration → Gallery-first, visual-heavy
- Brand asset creator → Polished, template-guided
Environment Setup
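# Gemini API key: @ai-sdk/google reads GOOGLE_GENERATIVE_AI_API_KEY by default; the @google/genai SDK reads GEMINI_API_KEY
GOOGLE_GENERATIVE_AI_API_KEY=your_api_key_here
GEMINI_API_KEY=your_api_key_here
# Storage option: Vercel Blob
BLOB_READ_WRITE_TOKEN=your_vercel_token
# Storage option: S3/R2
S3_BUCKET=your-bucket
S3_ENDPOINT=https://your-endpoint.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=your_key
S3_SECRET_ACCESS_KEY=your_secret
# Rate limiting: Upstash Redis
UPSTASH_REDIS_REST_URL=your_url
UPSTASH_REDIS_REST_TOKEN=your_token

npm install @ai-sdk/google ai @ai-sdk/react @vercel/blob
# Optional: Google's own JavaScript SDK, for calling the API without the AI SDK
npm install @google/genai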
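A missing variable from the list above tends to surface as an obscure runtime error, so it can help to check at startup. The `missingEnv` function below is a hypothetical helper, shown with a literal object so it runs anywhere.

```typescript
// Returns the names that are missing or blank, so startup can fail with one clear message.
function missingEnv(env: Record<string, string | undefined>, required: readonly string[]): string[] {
  return required.filter(name => (env[name] ?? '').trim() === '')
}

const missing = missingEnv(
  { GEMINI_API_KEY: 'abc', BLOB_READ_WRITE_TOKEN: '' },
  ['GEMINI_API_KEY', 'BLOB_READ_WRITE_TOKEN']
)
// missing is ['BLOB_READ_WRITE_TOKEN']
```

In a Next.js app you would pass `process.env` as the first argument and throw if the returned list is not empty.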
Remember
Nano Banana enables conversational image generation that feels like working with a creative partner, not a tool.
The best apps:
- Leverage multi-turn editing for refinement
- Choose models intentionally (speed vs quality)
- Handle rate limits gracefully
- Store images efficiently
- Provide great loading states
- Feel uniquely designed for their purpose
You're building more than an image generator: you're creating a creative experience. Design it thoughtfully.