Every developer who's ever built on an LLM has had the same cringe moment. You're debugging an agent. You run the test. It fails. You change something. You run it again. It fails. You change something else. You run it again. You repeat this for an hour. Then you check your billing dashboard and realize that hour of debugging just cost you twelve dollars.
The cost per call is small. The cost of iteration is not. If you iterate like a real engineer — dozens of runs to get something right — you are paying real money to the provider every time, and most of those calls are hitting APIs that are not even delivering value. They're just returning responses for your debug loop.
Mock mode is IBYOK's answer to this. Flip a switch, and your key starts returning realistic responses without actually calling the provider. Your code runs. Your tests pass or fail based on the shape of the response. Your budget doesn't move.
What mock mode actually does
When mock mode is enabled on a key, IBYOK intercepts requests to that provider and returns a plausible response in the same schema the provider would return. For OpenAI, that's a chat completion object with a generated-looking message. For Anthropic, it's a proper Claude response shape. For HeyGen, it's a job ID and eventual video URL. The response is structurally valid — your app can parse it, route it, log it — but no tokens were actually consumed and no credits were actually burned.
The content of the response is a sensible placeholder: "This is a mock response from IBYOK" with enough length and structure to exercise your downstream code. It's not going to give you a real answer to your actual prompt, because that requires the actual model. But it's going to give you a real response shape, which is 90% of what you need when you're debugging plumbing.
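To make "structurally valid" concrete, here's a sketch of what a mock chat completion might look like and how downstream code consumes it. The field values below are illustrative placeholders, not IBYOK's exact payload:

```python
# Illustrative shape of a mock chat completion in the OpenAI schema.
# Exact field values are an assumption; the point is that the structure
# matches what a live call would return.
mock_response = {
    "id": "chatcmpl-mock",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "This is a mock response from IBYOK",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
}

# Your plumbing parses a mock response exactly like a live one.
text = mock_response["choices"][0]["message"]["content"]
assert text  # non-empty, so routing, logging, and DB code all get exercised
```

Your parser doesn't know or care that no model ran; it sees the same keys in the same places.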
The iteration loop problem
Here's the specific workflow mock mode is built for. You're writing a wrapper around ChatGPT that takes user input, transforms it, makes an API call, parses the response, and writes to your database. You have maybe fifteen lines of code.
Testing those fifteen lines requires calling the API, which costs real money and returns a real response — different every time, because the model is non-deterministic. So you're debugging against a moving target and paying for each attempt. Both of those facts slow you down.
In mock mode, you call the IBYOK-wrapped endpoint the exact same way your code would in production. You get back a deterministic-enough response that you can assert against. You iterate on your wrapper — fix a parsing bug, handle an edge case, refactor a helper — and each iteration is instant and free. When you're done, you flip the key back to live and run your integration tests against the real API.
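A minimal sketch of that loop, assuming a hypothetical fifteen-line wrapper (`summarize` and the injected `call_llm` are illustrative names, not IBYOK APIs):

```python
# Sketch of the wrapper under test. `call_llm` stands in for whatever
# client hits the IBYOK-wrapped endpoint; in mock mode it returns a
# provider-shaped payload without spending credits.
def summarize(user_input: str, call_llm) -> dict:
    prompt = f"Summarize in one sentence: {user_input.strip()}"
    response = call_llm(prompt)
    # This parsing line is exactly the plumbing mock mode lets you
    # debug for free, run after run.
    message = response["choices"][0]["message"]["content"]
    return {"input": user_input, "summary": message}

# Mock mode returns the same shape every time, so you assert against
# structure instead of chasing non-deterministic model output.
def fake_mock_call(prompt):
    return {"choices": [{"message": {
        "role": "assistant",
        "content": "This is a mock response from IBYOK",
    }}]}

result = summarize("  long article text  ", fake_mock_call)
assert "summary" in result and result["summary"]
```

Fix a parsing bug, rerun, repeat; none of those runs touch your billing dashboard.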
This is how every competent engineer works with mocked HTTP clients in traditional apps. Mock mode brings the same pattern to LLM development without requiring you to write your own mock infrastructure.
Mock mode on the free tier. Iterate infinitely, pay zero.
Why this matters more for non-technical builders
If you're a seasoned engineer, you probably already write mocks manually. You inject a fake OpenAI client in tests, return canned responses, run your test suite for free. It's a day-one habit and not a big deal.
If you're building with n8n, Zapier, or low-code tools — or you're a competent builder who just hasn't set up a formal test harness yet — you don't have that infrastructure. Every time you run your n8n flow to see if it works, you're hitting the real API. The per-run cost is small, but the cumulative cost of an afternoon of "click Run, see what happens, tweak, click Run again" is genuinely painful.
Mock mode gives those builders the same iteration economy that engineers take for granted. You flip the switch on your key during a building session, iterate freely, flip it back when you're ready to test against real data. No code changes. No special test environment. Just a toggle.
The integration test angle
If you do have a proper test suite, mock mode is useful there too. Your unit tests should run in mock mode by default — they shouldn't burn real credits just to verify that your code compiles and runs. Your integration tests can selectively hit the real API for the assertions that genuinely need model behavior.
This split is standard practice in traditional web dev: unit tests mock external services, integration tests don't. Mock mode makes the same pattern clean for LLM-heavy apps without requiring you to rewrite your wrapper code in two modes. The wrapper is the same; the key decides whether the call hits the real provider.
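One hypothetical way to wire that split is a small config helper that defaults to the mock-enabled key and only uses the live key when you explicitly opt in. The env var names here are assumptions, not IBYOK's actual configuration surface:

```python
import os

# Unit tests default to the mock-enabled key; set RUN_INTEGRATION=1 to
# opt in to the live key. Variable names are illustrative assumptions.
def pick_key(env=os.environ):
    if env.get("RUN_INTEGRATION") == "1":
        live = env.get("IBYOK_LIVE_KEY")
        if not live:
            raise RuntimeError("RUN_INTEGRATION=1 but no live key configured")
        return live
    # Default path: free, deterministic-shaped responses.
    return env.get("IBYOK_MOCK_KEY", "ibyok-mock-key")

assert pick_key({}) == "ibyok-mock-key"
assert pick_key({"RUN_INTEGRATION": "1", "IBYOK_LIVE_KEY": "k"}) == "k"
```

The wrapper code never changes; only the key it picks up decides whether a call is free or real.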
The demo-day scenario
Here's a non-obvious use case I've seen play out. You're demoing your agent to a prospect. They want to see it work. You don't want them to see you sweat through model latency, non-deterministic responses, or an unexpected API error mid-demo. But you also don't want to fake the demo with hardcoded data, because that reads as fake.
Mock mode is the middle path. The agent runs the real code path. The responses are deterministic, fast, and look like LLM output. The prospect sees a working product. You don't lose the demo to a flaky API or an accidental 500.
Is this "cheating"? I'd argue no — you're showing how the system works, not claiming the responses are from a live model. And mock mode reverts to real calls the moment you're done. For the five-minute demo where first impressions matter, it's a very reasonable tool.
What mock mode doesn't solve
The obvious limitation: mock responses don't reflect actual model behavior. If you're testing whether the model's output is good — does it follow your prompt correctly, is the tone right, does it handle edge cases — mock mode is useless. You have to hit the real API for that.
What mock mode is for is testing everything around the model: your request construction, your response parsing, your error handling, your downstream logic. That's the layer where iteration is highest and model quality is irrelevant. Save the real API calls for the layer where they actually matter.
What to try
Add a key to IBYOK. Flip mock mode on. Point your dev environment at IBYOK's endpoint instead of the provider's. Iterate on your code for an afternoon without checking your billing dashboard once. When you're ready, flip mock mode off and run your real integration tests.
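"Point your dev environment at IBYOK's endpoint" can be as small as a base-URL switch. The IBYOK URL below is a placeholder assumption; use whatever endpoint your dashboard gives you:

```python
import os

PROVIDER_URL = "https://api.openai.com/v1"
# Placeholder; substitute the actual endpoint from your IBYOK dashboard.
IBYOK_URL = os.environ.get("IBYOK_BASE_URL", "https://api.ibyok.example/v1")

def base_url(dev: bool) -> str:
    # The only difference between dev and prod is where the request goes;
    # request and response shapes stay identical.
    return IBYOK_URL if dev else PROVIDER_URL

assert base_url(dev=True).endswith("/v1")
assert base_url(dev=False) == "https://api.openai.com/v1"
```

No forked code paths, no second client implementation; one URL swap and the toggle on the key does the rest.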
The first afternoon, it'll feel weirdly freeing. That freedom is the point: iterating without cost is what software development feels like when the APIs aren't metered, and most developers have forgotten it.
— Jeff
Mock mode, free tier, infinite iteration.