After finishing the requirement definition for my new app, “Velocity English,” I was ready to dive into coding. Or so I thought.
Suddenly, a 429 error struck. Just three days ago, everything was running perfectly.
A 429 error usually indicates resource exhaustion. But I hadn’t even started using it yet—this happened on the very first test run after launch.
Why did this happen? Let’s dive into the investigation.
- The Problem: Encountered a sudden 429 Resource Exhausted error, followed by a mysterious 404 Not Found even with working code.
- The Fix: Identified that “dynamic model discovery” was wasting API quota, but discovered the real cause was API deprecation.
- Decision: Stopped “fixing” an unstable API and switched to Groq to maintain development momentum.
1. The Incident: Yesterday’s Working Code Now Returns “429 Resource Exhausted”

In the middle of developing Velocity English, I encountered this error on my first boot of the day:
Bash
google.api_core.exceptions.ResourceExhausted: 429 You exceeded your current quota...
Identifying the Root Cause
First, I suspected a coding issue—specifically, the possibility that multiple API requests were being fired in a short burst.
Upon analyzing the code, I realized that a function I had “helpfully” implemented to dynamically search for available Flash models was the culprit.
Python
import google.generativeai as genai

def get_best_model_name():
    """Search for available Flash model names in the environment."""
    try:
        # list_models() is itself an API request, made on every call
        model_list = [m.name for m in genai.list_models() if 'generateContent' in m.supported_generation_methods]
        # Prioritize 1.5 flash
        for name in model_list:
            if "gemini-1.5-flash" in name:
                return name
        return model_list[0] if model_list else "models/gemini-1.5-flash"
    except Exception:
        # Any failure (including a quota error) silently falls back to the default name
        return "models/gemini-1.5-flash"
The sad reality was that every button click fired two requests: one for “model discovery” and one for “content generation.”
Since I am now explicitly targeting Gemini 1.5, I decided to scrap this discovery logic entirely.
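For anyone who would rather keep a discovery step than delete it, one alternative is to cache the result so the lookup runs once per process instead of once per click. The following is only a minimal sketch under assumptions: `fetch_model_names` is a hypothetical stand-in for the real `genai.list_models()` request, the model list it returns is made up, and the counter exists purely to make the number of lookups visible.

```python
from functools import lru_cache

discovery_calls = 0  # only here to make the number of lookups visible


def fetch_model_names():
    """Hypothetical stand-in for the real model-listing request."""
    global discovery_calls
    discovery_calls += 1
    return ["models/gemini-1.0-pro", "models/gemini-1.5-flash"]


@lru_cache(maxsize=1)
def get_best_model_name_cached():
    """Run discovery once per process and reuse the answer afterwards."""
    for name in fetch_model_names():
        if "gemini-1.5-flash" in name:
            return name
    return "models/gemini-1.5-flash"


def on_button_click():
    """Simulates one click: only the first click pays for discovery."""
    return get_best_model_name_cached()


for _ in range(5):
    on_button_click()

print(discovery_calls)  # 1: five clicks, one discovery lookup
```

The trade-off is that a cached name goes stale if the provider retires the model while the process is running, so this reduces the request count without removing the dependency on discovery.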
2. The Fix: Hardcoding the Model Name
To save API requests, I removed the dynamic search and changed the operation to specify the model name directly.
Python
# Before
# target_model = get_best_model_name()
# After
TARGET_MODEL = "gemini-1.5-flash"
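A pinned name does not have to be buried in the source. As a hedged sketch, the constant can keep its default while still being overridable from the environment, so that a renamed model does not force a code change. The variable name `VELOCITY_MODEL` and the helper `resolve_target_model` are assumptions made up for this illustration.

```python
import os

DEFAULT_MODEL = "gemini-1.5-flash"  # the pinned name used in this article


def resolve_target_model(env=os.environ):
    """Return the pinned model name unless an override is set."""
    # VELOCITY_MODEL is a hypothetical variable name chosen for this sketch.
    override = env.get("VELOCITY_MODEL", "").strip()
    return override or DEFAULT_MODEL


TARGET_MODEL = resolve_target_model()
```

This stays at zero extra API requests: the name is resolved from local configuration, not from a listing call.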
3. The Next Wall: “404 models/gemini-1.5-flash is not found”

Thinking, “Will it work? I have a bad feeling about this,” I ran the app again. My intuition was right—a 404 error appeared.
Bash
google.api_core.exceptions.NotFound: 404 models/gemini-1.5-flash is not found for API version v1beta...
As an engineer, I’ve learned that errors often have multiple layers. Sometimes, the error on the surface is just a mask for deeper issues. That was exactly the case here.
I tried several patterns, but all failed:
- gemini-1.5-flash
- models/gemini-1.5-flash-latest
- gemini-flash-latest
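Trying names by hand is easy to lose track of, so the same experiment can be written as a small one-off diagnostic that records why each candidate failed. This is a sketch, not the code I actually ran: `ModelNotFound` stands in for the library's 404 exception, and `fake_call` is a hypothetical backend that rejects every name, mirroring what happened here.

```python
class ModelNotFound(Exception):
    """Stand-in for the library's 404 NotFound exception."""


def first_working_model(candidates, call_model):
    """Try each name once; return (name, failures), with name=None if all fail."""
    failures = {}
    for name in candidates:
        try:
            call_model(name)
            return name, failures
        except ModelNotFound as exc:
            failures[name] = str(exc)
    return None, failures


def fake_call(name):
    """Hypothetical backend that rejects every name, as in this article."""
    raise ModelNotFound(f"404 {name} is not found")


candidates = [
    "gemini-1.5-flash",
    "models/gemini-1.5-flash-latest",
    "gemini-flash-latest",
]
winner, failures = first_working_model(candidates, fake_call)
print(winner, len(failures))  # None 3
```

Each probe spends a real request against a live API, so a script like this belongs outside the request path, run once while debugging.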
Conclusion: The Risk of API Dependency and the Importance of “Cutting Your Losses”
Is it a “not supported” issue? I checked the official Gemini API Changelog.
It turns out that the 1.5-flash series, which worked until a few days ago, can suddenly become unavailable in certain API versions or regions due to rapid version cycling or deprecation.
To prioritize development speed, I decided to stop trying to please the “moody” Gemini API. Instead, I made the call to switch to Groq (Llama 3, etc.), which offers faster responses and fewer restrictions.
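Switching providers is much cheaper when the app code never touches a provider SDK directly. The sketch below shows one way to arrange that, under stated assumptions: `TextGenerator`, `EchoBackend`, and `build_feedback` are hypothetical names invented for this example, and the backend only echoes its input where a real one would wrap the Gemini or Groq client.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """The only interface the app code is allowed to depend on."""

    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Hypothetical placeholder; a real backend would call a provider SDK here."""

    def __init__(self, label: str):
        self.label = label

    def generate(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


def build_feedback(generator: TextGenerator, sentence: str) -> str:
    """App-level code: knows about generate(), not about any provider."""
    return generator.generate(f"Correct this sentence: {sentence}")


backend = EchoBackend("groq")  # swapping providers means changing this one line
print(build_feedback(backend, "I goes to school."))
```

With this shape, the decision to cut losses touches one constructor call and leaves every call site alone.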

5 Key Lessons Every Developer Should Remember
Reflecting on this struggle, I have taken five lessons to heart, and I share them with every engineer who depends on external APIs:
- Suspect Multi-layered Error Structures: Solving a surface-level 429 (Resource Exhausted) might reveal a hidden 404 (Resource Not Found). The error message is just the tip of the iceberg; always look deeper.
- API Aliases are Not Forever: Convenient names like gemini-1.5-flash-latest can vanish instantly due to version updates or regional shifts. Consider specifying exact versions for stability.
- Convenience vs. Request Quota: “Helpful” functions that search for models at runtime can eat your quota before you even start. Keep it simple and static in early development.
- The Courage to Cut Your Losses: Don’t waste hours trying to fix an unstable API. Switching to a more stable platform (like Groq) is often the best move for project momentum.
- Changelogs are the “Crime Scene” Evidence: When “it worked yesterday” no longer applies, the official Changelog is the only objective source of truth. Check the primary source before hunting for secondary info.
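The first lesson can be made concrete by treating the two errors differently in code: a 429 may clear up after a wait, while a 404 for a model name will not. The following is a minimal sketch of that policy, not a drop-in for any SDK. The two exception classes are local stand-ins for the library's 429 and 404 exceptions, and `call_with_policy` and its parameters are assumptions made for this illustration.

```python
import time


class ResourceExhausted(Exception):
    """Stand-in for a 429 quota error."""


class NotFound(Exception):
    """Stand-in for a 404 model-not-found error."""


def call_with_policy(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry 429s with exponential backoff; let 404s propagate immediately."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ResourceExhausted:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the quota error
            sleep(base_delay * 2 ** attempt)
        # NotFound is deliberately not caught: retrying a missing model
        # only burns more quota.


attempts = []


def flaky():
    """Fails with a 429 twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ResourceExhausted("429 quota")
    return "ok"


delays = []
result = call_with_policy(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)  # ok [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff schedule easy to inspect.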

