What do you recommend for prompt library and version control?
You have to treat your most valuable prompts like code, and so they go in Git. The challenge, as this post tried to get across, is that we now have a CI issue with model changes "breaking" prompt functionality. At the moment, I don't have a concrete answer for you as I'm currently working on solving this myself. Definitely a future article once I have something to share.
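The rough shape I'm experimenting with (file names and the pinned snapshot below are purely illustrative, not a finished framework) is to keep each prompt as a versioned file and run a cheap regression test against a pinned model snapshot in CI, so a provider-side change surfaces as a failing test instead of a silent behaviour shift:

```python
# test_prompts.py -- illustrative regression check, not a finished framework.
from pathlib import Path
from openai import OpenAI

client = OpenAI()
PROMPTS = Path("prompts")            # prompt files live in Git next to the code
PINNED_MODEL = "gpt-4o-2024-08-06"   # pin a dated snapshot, never a moving alias

def run_prompt(name: str, **kwargs) -> str:
    prompt = (PROMPTS / f"{name}.txt").read_text().format(**kwargs)
    resp = client.chat.completions.create(
        model=PINNED_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def test_summarize_prompt_still_returns_json():
    out = run_prompt("summarize_v3", article="Example input text.")
    # Crude structural assertion -- enough to catch the loudest kinds of drift.
    assert out.strip().startswith("{"), "summarize_v3 no longer returns JSON"
```

It doesn't solve the underlying "model changed under me" problem, but it at least turns silent breakage into a red build.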
Thanks, Chris. I'm not a coder, so I didn't think of using Git. I'm using Google Docs. I'll try Git and see if it's better than Docs, but I'll keep my eyes open for a better option and will share if I find one.
Fascinating. Now yesterday’s frustrations make sense. While writing and editing a Python script built around the OpenAI Agents SDK, Claude 4.5 kept changing my model configuration from gpt-5 to 4.1 or o4-mini every time it touched the code. At first, I simply added an instruction to never edit the “model” variable. After it ignored this, I added a MUST NOT. …ignored. I duplicated the instruction, placing instances near the beginning and at the end. …back to gpt-4.1.
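What seems more robust than repeating the instruction is taking the rule out of the prompt entirely and enforcing it with a deterministic check. A rough sketch (the file name and pinned value are illustrative, not from my actual script) that fails the run whenever the agent rewrites the model string:

```python
# guard_model.py -- hypothetical pre-commit / CI check, purely illustrative.
# Fails loudly if anything (human or agent) changes the pinned model identifier.
import re
import sys
from pathlib import Path

PINNED_MODEL = "gpt-5"               # the model the script is supposed to use
TARGET_FILE = Path("agent_app.py")   # illustrative file name

def main() -> int:
    source = TARGET_FILE.read_text()
    match = re.search(r'model\s*=\s*["\']([^"\']+)["\']', source)
    if match is None:
        print("No model assignment found -- check the file manually.")
        return 1
    if match.group(1) != PINNED_MODEL:
        print(f"Model was changed to {match.group(1)!r}; expected {PINNED_MODEL!r}.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```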
I love how you map this trend into the future, cutting through the noise represented by my irritation in the moment; a human stuck in an execution loop with a loss of all prior conviction or confidence that I am working on the right thing, or in the right manner.
Yours is one of my most valuable paid subscriptions on the stack
Kind words sir, glad this was of help. Working with AI is beautifully painful eh?!
Dunno if my first comment went through, but again, thanks for sharing. This is super valuable.
It also reads almost like the diametrical opposite of what happened with OpenAI's GPT-4o -> GPT-5 transition.
The more we want AIs to behave like humans, the less predictable they should be. That means stripping any deterministic instructions ("must", "always", "never") out of our prompts and handing those to traditional machines. There's a reason we run a lot of code that exists to keep unreliable humans safe on deterministic machines, not stochastic ones.
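Concretely, that can be as small as a validator sitting outside the model (a rough sketch; the retry policy is my own assumption): the prompt stays loose, and ordinary code decides whether the "must" actually held.

```python
# Sketch of handing the "must" to a traditional machine: a deterministic
# validator plus a retry/fail policy, instead of a MUST NOT in the prompt.
import xml.etree.ElementTree as ET
from typing import Callable

def is_well_formed_xml(text: str) -> bool:
    """Deterministic check: does the reply parse as XML?"""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def ask_until_valid(ask: Callable[[str], str], prompt: str, max_tries: int = 3) -> str:
    """`ask` is whatever function calls your model; the rule is enforced here."""
    for _ in range(max_tries):
        reply = ask(prompt)
        if is_well_formed_xml(reply):
            return reply
    raise RuntimeError("No well-formed XML after retries; failing deterministically.")
```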
While I agree that separation of concerns for design and build with this technology needs to account for deterministic and non-deterministic features, no machine should refuse a legitimate directive from a user. If I prompt Claude that it "must" return its answer to me in XML or in Spanish, then it must do so; there is no excuse not to. The fact that models are being designed to be more "helpful" is very problematic for those of us building around them.
But real humans sometimes must adhere to strict instructions. "No smoking near the gasoline pump" is not a suggestion. Creativity is not acceptable there. "High voltage do not open" is an instruction meant for humans.
Correct. And we have safety measures, security people and law enforcement because people are unreliable in following such simple instructions.
Right, but law enforcement and safety people are also human. At some point you will have humans following instructions.
All humans are unreliable. We can reduce this intrinsic unreliability by having people check on each other. But there are no guarantees. For example, in cultures with high corruption, people don't follow the rules, they follow the money.
Who said anything about guarantees? But at some level we do count on people to follow instructions. If we could not, then there would be no reason to have instructions.
This is simply not true. Just look at traffic and you'll see that people are very unreliable rule followers. They break the law all the time. We can NOT count on others to follow the traffic laws. But that does NOT mean that there is no reason to have traffic instructions.
I repeat: all humans are unreliable (intentionally or unintentionally). The best we can do is for everyone to watch out for each other. We are not computers.
Devil's advocate: maybe this is intentional?
The foundation labs are all bleeding cash in the hundreds of millions and billions per year (nothing remotely close to profitable), and ALWAYS adhering to our prompts may exacerbate their deteriorating P&L situation. They all seem to be moving upmarket with higher-cost tiers for their models and services, as well as 7-8 figure consulting offerings to enterprises.
Best bet is to diversify your use of models by trying the open-weight and open-source models (Llama, DeepSeek, Qwen, Mistral, Granite), as you are more in control of your destiny, and don't rush to start using the latest and greatest models from OpenAI and Anthropic.
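One practical note if you go that route: most open-weight serving stacks (Ollama, vLLM, and the like) expose an OpenAI-compatible endpoint, so switching can be as small as pointing the client at a different base URL. A minimal sketch, assuming a local Ollama server with a Llama model already pulled (names are illustrative):

```python
# Hypothetical swap: same client code, local open-weight model behind an
# OpenAI-compatible endpoint (here Ollama's default port).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
    api_key="ollama",                      # placeholder; the local server ignores it
)

resp = client.chat.completions.create(
    model="llama3.1",                      # any model you've pulled locally
    messages=[{"role": "user", "content": "Reply with a one-line summary of RAG."}],
)
print(resp.choices[0].message.content)
```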
It should ask humans for clarification when facing conflicting instructions.