NanoStudio

Project Overview
- Name: NanoStudio
- Type: AI-powered e-commerce product imagery tool
- Recognition: Google DeepMind "Nano Banana" Hackathon - one of the winners out of 800+ worldwide submissions
- Architecture: Full-stack Next.js 15 App Router application with server-side AI orchestration
- Purpose: Turn a single product photo into campaign-ready mockups and AI-edited imagery using natural language
Overview
NanoStudio is a Next.js 15 application built in 17 hours for the Nano Banana Hackathon (Google DeepMind and Kaggle), one of Kaggle's largest hackathons. It was one of the winners out of 800+ worldwide submissions. The app wraps Google's Gemini 2.5 Flash Image Preview model (the "Nano Banana" model) in a focused, product-photography-first workflow: a merchant uploads one reference image and generates professional studio shots, lifestyle scenes, and edited variations without a photo studio.
Key Features
Scene Weaver
- Reference-grounded generation - every prompt is sent to Gemini alongside the product's reference image so identity, materials, proportions, and branding are preserved across scenes
- Curated scenes - studio, outdoor, indoor, pedestal, flat-lay, in-use, and surreal presets, plus free-form custom scene prompts
- Aspect ratio control - 1:1, 16:9, 9:16, 4:5, or original, injected as explicit model instructions
- Batch generation - multiple scene variations produced in a single run, persisted as a generation "run"
- Bring your own key - an optional per-session Gemini API key override isolates quotas without exposing team credentials
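The bring-your-own-key behavior can be sketched as a small per-key client cache. This is a minimal illustration rather than the project's actual code: `ModelClient`, `createClientCache`, and the key strings are hypothetical stand-ins, and the real app would construct a `@google/genai` client where the fake factory is used here.

```typescript
// Hypothetical stand-in for an SDK client; the real app uses @google/genai.
interface ModelClient {
  readonly apiKey: string;
}

type ClientFactory = (apiKey: string) => ModelClient;

// Keep one client per distinct API key, so a session override never shares
// a client (or its quota) with the server's default key.
function createClientCache(factory: ClientFactory, defaultKey: string) {
  const clients = new Map<string, ModelClient>();

  return function getClient(overrideKey?: string): ModelClient {
    // Assumption: a blank override means "not provided", so fall back to the default key.
    const key = overrideKey?.trim() || defaultKey;
    let client = clients.get(key);
    if (!client) {
      client = factory(key);
      clients.set(key, client);
    }
    return client;
  };
}

// Usage with a fake factory that counts how many clients were constructed.
let constructed = 0;
const getClient = createClientCache((apiKey) => {
  constructed += 1;
  return { apiKey };
}, "server-default-key");

const shared = getClient();
const personal = getClient("user-session-key");
console.log(shared === getClient("  "), personal === getClient("user-session-key"), constructed);
// prints: true true 2
```

Keying the cache on the API key itself means the override needs no separate session bookkeeping: two sessions that supply the same key simply reuse one client.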
Magic Edit
- Natural-language editing - describe a change in plain text and the model re-renders the image
- Style and consistency controls - adjustable style strength and product-consistency settings
- Before/after comparison - tab and slider comparison modes with a zoom/pan canvas
- Edit history - every edit is tracked so prior versions can be revisited
Templates, Products, and Gallery
- 15 seeded templates - 360-degree views, studio essentials pack, colorway showcase, macro detail, lifestyle context, ad creative kit, seasonal themes, and more, each carrying its own base prompt, variant list, and per-template options schema
- Product library - reusable products with reference images and AI-friendly descriptions
- Gallery - browse all generated mockups with metadata and high-quality PNG export
Architecture / How It Works
Generation pipeline
- The client posts a product id plus the chosen scenes, template, options, variant indices, and aspect ratio to app/api/runs.
- The server loads the product and its reference image, then builds the final prompts in lib/prompt.ts.
- Prompt construction layers an identity-preservation clause, the scene or template body, the aspect-ratio instruction, and a professional-lighting directive. Templates interpolate {{product_description}} and option values, and expand selected variants (capped at four) into one prompt per image.
- lib/gemini.ts calls gemini-2.5-flash-image-preview via @google/genai, sending the prompt together with the base64 reference image as inline data. Calls are wrapped in an exponential-backoff retry helper, and Gemini clients are cached per API key.
- Returned image parts are decoded from base64 into buffers and stored.
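The prompt layering and the retry wrapper described above can be sketched as follows. Treat this as an illustration under assumptions, not the contents of lib/prompt.ts or lib/gemini.ts: the clause wording, the function names (`interpolate`, `buildTemplatePrompts`, `backoffDelayMs`, `withRetry`), the `{{surface}}` option, and the delay values are all hypothetical. Only the layer order, the `{{product_description}}` placeholder, and the cap of four variants come from the description.

```typescript
type AspectRatio = "1:1" | "16:9" | "9:16" | "4:5" | "original";

interface TemplateSpec {
  basePrompt: string; // may contain {{product_description}} and option placeholders
  variants: string[]; // one entry per possible output image
}

const MAX_VARIANTS = 4; // selected variants are capped at four

// Replace {{key}} placeholders; unknown keys are left untouched (an assumption).
function interpolate(text: string, values: Record<string, string>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match,
  );
}

function aspectInstruction(ratio: AspectRatio): string {
  return ratio === "original"
    ? "Keep the aspect ratio of the reference image."
    : `Render the image at a ${ratio} aspect ratio.`;
}

// Build one layered prompt per selected variant: identity clause, template body,
// aspect-ratio instruction, lighting directive.
function buildTemplatePrompts(
  template: TemplateSpec,
  productDescription: string,
  options: Record<string, string>,
  variantIndices: number[],
  ratio: AspectRatio,
): string[] {
  const values = { ...options, product_description: productDescription };
  return variantIndices
    .filter((i) => Math.floor(i) === i && i >= 0 && i < template.variants.length)
    .slice(0, MAX_VARIANTS)
    .map((i) =>
      [
        "Preserve the product's identity, materials, proportions, and branding exactly as in the reference image.",
        interpolate(`${template.basePrompt} ${template.variants[i]}`, values),
        aspectInstruction(ratio),
        "Use professional, well-balanced lighting.",
      ].join("\n"),
    );
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retry an async call, sleeping with exponential backoff between failed attempts.
async function withRetry<T>(call: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}

const prompts = buildTemplatePrompts(
  {
    basePrompt: "Studio shot of {{product_description}} on a {{surface}} surface.",
    variants: ["Front view.", "Side view.", "Back view.", "Top view.", "Detail view."],
  },
  "a matte black ceramic mug",
  { surface: "marble" },
  [0, 1, 2, 3, 4],
  "4:5",
);
console.log(prompts.length); // prints: 4 (the fifth selected variant is dropped by the cap)
```

In this sketch each model call would be passed to `withRetry` as a closure, so the prompt builder stays a pure function that can be tested without network access.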
Data and storage
- Turso (LibSQL) is the database. The schema (db/schema.sql) defines products, runs, images, templates, and events tables.
- Images are stored as BLOBs directly in the database rather than on a filesystem, and served back through app/api/images/[id]. This keeps the app fully stateless and deployable to Vercel edge/serverless without external object storage.
- Runs are first-class records - each run persists its prompts, status (processing/completed/failed), and a JSON array of results, giving the gallery and history views durable state.
- Self-initializing schema - on first run the server reads schema.sql, seeds the 15 templates idempotently, and applies lightweight column migrations (PRAGMA table_info checks) so existing databases upgrade in place.
- Usage events are logged to drive quota reporting against Gemini's free-tier limits.
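The in-place column migrations can be pictured as a planning step that compares the columns reported by PRAGMA table_info with the columns the current code expects. The sketch below is an assumption about how such a check might look, not the project's implementation: `planColumnMigrations` and the column names (`aspect_ratio`, `mime_type`, and the rest) are hypothetical, and the database call that would fetch the PRAGMA rows is left out.

```typescript
// The part of a `PRAGMA table_info(<table>)` row that matters here: the column name.
interface ColumnInfo {
  name: string;
}

interface ColumnMigration {
  table: string;
  column: string;
  definition: string; // e.g. "TEXT" or "INTEGER DEFAULT 0"
}

// Return ALTER TABLE statements only for columns that do not exist yet,
// so running the migration step again on an upgraded database is a no-op.
function planColumnMigrations(
  existing: Record<string, ColumnInfo[]>,
  wanted: ColumnMigration[],
): string[] {
  return wanted
    .filter((m) => !(existing[m.table] ?? []).some((c) => c.name === m.column))
    .map((m) => `ALTER TABLE ${m.table} ADD COLUMN ${m.column} ${m.definition}`);
}

// Hypothetical state of an older database: images already has mime_type,
// runs does not yet have aspect_ratio.
const existingColumns = {
  runs: [{ name: "id" }, { name: "status" }, { name: "results" }],
  images: [{ name: "id" }, { name: "data" }, { name: "mime_type" }],
};

const statements = planColumnMigrations(existingColumns, [
  { table: "runs", column: "aspect_ratio", definition: "TEXT" },
  { table: "images", column: "mime_type", definition: "TEXT" },
]);
console.log(statements);
// prints: [ 'ALTER TABLE runs ADD COLUMN aspect_ratio TEXT' ]
```

Separating the planning from the execution keeps the check easy to test; the server would run the returned statements after schema.sql has created any missing tables.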
API surface (Next.js route handlers)
- runs - create and read generation runs, including recent runs
- edit - the Magic Edit endpoint
- products - CRUD for the product library
- templates - template listing and banner imagery
- images - BLOB serving by id
- usage - quota/usage reporting
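As an illustration of what the usage endpoint could compute from the logged events, the sketch below counts events in a sliding time window and compares the count with a limit. Everything here is assumed for the example: the `UsageEvent` shape, the `summarizeUsage` function, the 24-hour window, and the limit of 100 are placeholders, not Gemini's actual free-tier numbers or the project's schema.

```typescript
// Hypothetical shape of a logged usage event.
interface UsageEvent {
  kind: string; // e.g. "generate" or "edit"
  createdAt: number; // Unix epoch milliseconds
}

interface UsageReport {
  used: number;
  limit: number;
  remaining: number;
}

// Count events inside the window (nowMs - windowMs, nowMs] and compare the
// count against a caller-supplied limit.
function summarizeUsage(
  events: UsageEvent[],
  nowMs: number,
  windowMs: number,
  limit: number,
): UsageReport {
  const used = events.filter(
    (e) => e.createdAt > nowMs - windowMs && e.createdAt <= nowMs,
  ).length;
  return { used, limit, remaining: Math.max(0, limit - used) };
}

const DAY_MS = 24 * 60 * 60 * 1000;
const nowMs = 1700000000000; // fixed timestamp so the example is deterministic

const report = summarizeUsage(
  [
    { kind: "generate", createdAt: nowMs - 1000 },
    { kind: "edit", createdAt: nowMs - 2 * DAY_MS }, // outside the 24-hour window
    { kind: "generate", createdAt: nowMs - 60000 },
  ],
  nowMs,
  DAY_MS,
  100, // placeholder limit, not a real quota value
);
console.log(report);
// prints: { used: 2, limit: 100, remaining: 98 }
```

Passing the current time and the limit as arguments keeps the function pure, so the route handler only has to load recent events and supply whatever quota applies.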
Technical Stack
- Framework: Next.js 15 (App Router), React 18, TypeScript, Node.js runtime route handlers
- AI Model: Google Gemini 2.5 Flash Image Preview via the @google/genai SDK
- Database: Turso / LibSQL (@libsql/client) with image BLOB storage
- UI: Tailwind CSS v4, shadcn/ui on Radix primitives, lucide-react icons, next-themes for dark mode
- Forms and validation: React Hook Form with Zod
- Notifications: Sonner toasts
- Deployment: Vercel (serverless functions, @vercel/analytics)
- License: AGPL-3.0