EchoHunt - Gemini 3 AR Quest

Project Description

EchoHunt is a mobile-first web application that uses Gemini to generate personalized scavenger hunts. Unlike traditional scavenger hunt apps that rely on GPS or pre-defined lists, EchoHunt builds each quest from what the camera actually sees, an approach we call Dynamic Visual Intelligence.

How it works:
1. Point your camera around your space (home, office, classroom, etc.)
2. Gemini analyzes a snapshot of your environment and identifies the objects it can see
3. It generates 3-10 personalized riddles tailored to YOUR specific location
4. Hunt for objects as Gemini provides real-time feedback ("Getting warmer!", "Too dark!")

Every quest is generated on the fly from the player's own surroundings, so no two hunts are alike. The app features animated AR-style overlays and voice feedback, and it runs in the browser with no app store download required.

How we used Gemini 3

Dynamic Quest Generation: We use Gemini 3 Flash to analyze a snapshot of the user's environment and generate 3-10 personalized riddles. The model identifies the available objects, colors, textures, and materials, then writes clues ordered from easiest to hardest.
Real-time Visual Validation: During gameplay, Gemini 3 analyzes camera frames to verify whether the user has found the correct object. The model identifies objects, assesses lighting conditions, and provides contextual hints ("Get closer", "Too dark").
Structured JSON Output: We rely on Gemini's JSON schema-constrained output for both quest creation and validation. The API returns typed responses with matched, confidence, hints, and voice_lines, so the game logic can depend on a predictable response shape.
Multi-modal Intelligence: The same AI that generates the quest also validates player progress, creating a cohesive and intelligent game experience.
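
As an illustration of the structured output described above, here is a minimal sketch of how a validation response could be typed and checked before the game logic uses it. The four field names come from the description; their exact types, the 0..1 confidence range, and the function names are assumptions, not the project's actual code.

```typescript
// Hypothetical shape of the validation response. Field names follow the
// description above; the types and the 0..1 confidence range are assumptions.
interface ScanResult {
  matched: boolean;
  confidence: number;
  hints: string[];
  voice_lines: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses raw model text and returns a ScanResult, or null if the text is not
// valid JSON or does not match the expected shape.
function parseScanResult(raw: string): ScanResult | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { matched, confidence, hints, voice_lines } = data as Record<string, unknown>;
  if (typeof matched !== "boolean") return null;
  if (typeof confidence !== "number" || !Number.isFinite(confidence)) return null;
  if (!isStringArray(hints) || !isStringArray(voice_lines)) return null;

  // Clamp confidence into the assumed 0..1 range.
  return {
    matched,
    confidence: Math.min(1, Math.max(0, confidence)),
    hints,
    voice_lines,
  };
}
```

Even with schema-constrained output, a check like this gives the UI one place to fall back (for example, "Scan failed, try again") if a response is truncated or malformed.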

Demo Script (90s)

[0:00] INTRO
"This is EchoHunt. The world's first AI-generated AR scavenger hunt, powered by Gemini."

[0:10] SETUP
"I tap Start Quest. The app asks me to point my camera around my room."
"I select 5 riddles and tap Generate Quest."
"Gemini AI analyzes my environment... and creates personalized riddles just for my space!"

[0:30] GAMEPLAY
"First clue: 'Find something with visible text.' I point at a blank wall."
"I tap Scan. Gemini speaks: 'I don't see any text here.'"
"I move to a book. I enable Auto Scan for continuous checking."

[0:50] SUCCESS
"Gemini recognizes the text! Confetti explodes. Voice: 'Perfect! Next clue unlocked.'"
"The next AI-generated riddle: 'Find something reflective.' Gemini saw my mirror during the initial scan!"

[1:10] TECH
"Two Gemini calls: the first analyzes the room and generates custom riddles. The second validates each scan in real time."
"Strict JSON schemas ensure reliable game logic. Works on any mobile browser."

[1:25] CLOSE
"EchoHunt: Every hunt is unique. Your space. Your quest."

Architecture

Phase 1: Quest Generation

User Camera → Capture Environment Snapshot
           → /api/generate-quest
           → Gemini AI (analyzes scene)
           → Returns 3-10 personalized riddles
           → Game starts with custom quest
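
The client side of Phase 1 could prepare its request along these lines. This is a sketch under stated assumptions: the request field names (imageBase64, mimeType, riddleCount) and both function names are hypothetical, and the snapshot is assumed to arrive as a base64 data URL such as the one produced by canvas.toDataURL("image/jpeg").

```typescript
// Hypothetical request body for /api/generate-quest; field names are assumptions.
interface GenerateQuestRequest {
  imageBase64: string;
  mimeType: string;
  riddleCount: number;
}

// The 3-10 range comes from the project description.
const MIN_RIDDLES = 3;
const MAX_RIDDLES = 10;

// Rounds the requested count and keeps it inside the supported range.
function clampRiddleCount(requested: number): number {
  if (!Number.isFinite(requested)) return MIN_RIDDLES;
  return Math.min(MAX_RIDDLES, Math.max(MIN_RIDDLES, Math.round(requested)));
}

// Splits a base64 data URL into its MIME type and payload, and attaches
// the clamped riddle count.
function buildGenerateQuestRequest(dataUrl: string, requestedCount: number): GenerateQuestRequest {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (match === null) {
    throw new Error("Expected a base64 data URL");
  }
  return {
    mimeType: match[1],
    imageBase64: match[2],
    riddleCount: clampRiddleCount(requestedCount),
  };
}
```

The resulting object could then be sent as the JSON body of a POST to /api/generate-quest. Clamping on the client is a convenience only; a server route would likely want to repeat the same check.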

Phase 2: Real-time Gameplay

User Camera → Capture Frame
           → /api/scan + Current Riddle
           → Gemini AI (validates object)
           → Returns: matched, confidence, hints, voice
           → UI updates + TTS speaks feedback
           → Confetti on success → Next riddle
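
The last three steps of Phase 2 can be modeled as a pure state transition, which keeps the game logic testable apart from the camera and speech APIs. The sketch below is illustrative only: the GameState and StepOutcome types, the applyScan name, and the 0.7 confidence threshold are all assumptions rather than the project's actual implementation.

```typescript
// Fields mirror the values returned by /api/scan in the flow above;
// the exact types are assumptions.
interface ScanFeedback {
  matched: boolean;
  confidence: number;
  hints: string[];
  voice_lines: string[];
}

// Hypothetical game state and per-scan outcome.
interface GameState {
  riddles: string[];
  currentIndex: number;
  finished: boolean;
}

interface StepOutcome {
  state: GameState;
  speak: string | null; // text to hand to TTS, if any
  confetti: boolean;
}

// Assumed threshold: a match below this confidence is treated as a miss.
const MATCH_THRESHOLD = 0.7;

function firstOrNull(items: string[]): string | null {
  return items.length > 0 ? items[0] : null;
}

// Decides what the UI should do after one scan: what to speak, whether to
// fire confetti, and whether to advance to the next riddle.
function applyScan(state: GameState, scan: ScanFeedback): StepOutcome {
  const speak = firstOrNull(scan.voice_lines) ?? firstOrNull(scan.hints);
  const success = scan.matched && scan.confidence >= MATCH_THRESHOLD;
  if (!success || state.finished) {
    return { state, speak, confetti: false };
  }
  const nextIndex = state.currentIndex + 1;
  return {
    state: { ...state, currentIndex: nextIndex, finished: nextIndex >= state.riddles.length },
    speak,
    confetti: true,
  };
}
```

In the browser, the speak value could be passed to the Web Speech API via speechSynthesis.speak(new SpeechSynthesisUtterance(text)), and the confetti flag could trigger the success animation.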

Tech Stack: Next.js 16 (App Router), Gemini 3 Flash/Pro, Browser MediaDevices API, Canvas API, Web Speech API (TTS), Tailwind CSS