
ALL models when tasked tend to deviate, fail and mess up because no enforcement is done at runtime. A method to fix it. [P]


I was told to post here instead of the AI automations subreddit, since this is a more generic solution that applies broadly. I have been following this and many other subs around LLMs and agents, and everything from the top posts to the most recent is about agents going off and doing something they are not supposed to do, drifting, and ignoring the system prompt. That's just the way models behave now (and will for a while). Real examples:

  • "Never delete user data" → agent calls DROP TABLE users next turn
  • "Don't share internal pricing" → agent leaks cost basis to a customer
  • "Verify identity first" → agent skips to the action
  • Add 10 more rules → model quietly drops the first 5

I am 100% sure that if you have used agents in prod, this has happened to you (especially as your system prompts get larger and your context grows). You can test this yourself and watch the drift show up almost immediately.

Prompt-based rules are suggestions, not constraints. Re-prompting fixes one case and breaks two. Post-hoc evals only tell you what already went wrong. NeMo Guardrails and Guardrails AI help with content safety but don't cover business logic or your specification.

After tackling this from a few angles, I finally got something solid: a proxy that sits between your app and your LLM, reads rules from a plain markdown file, and enforces them at runtime. Provider-agnostic, one base-URL change, works with LangGraph/CrewAI/custom stacks.
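To make "enforces at runtime" concrete, here is a minimal sketch of a post-response check a proxy like this could run before returning the model's draft to your app. This is my own illustration, not the author's implementation; `violates`, `enforce`, `MAX_DISCOUNT`, and `FORBIDDEN` are all hypothetical names, and a real system would derive these checks from the rules file rather than hard-code them:

```python
import re

# Hypothetical runtime checks, hard-coded here for illustration only.
# A real proxy would compile these from the markdown rules file.
MAX_DISCOUNT = 15  # percent, per the "maximum discount is 15%" rule
FORBIDDEN = ["cost basis", "margin"]  # phrases the pricing rule bans

def violates(draft: str) -> list[str]:
    """Return a list of rule violations found in a draft LLM response."""
    problems = []
    # Flag any percentage in the draft that exceeds the discount cap.
    for pct in re.findall(r"(\d{1,3})\s*%", draft):
        if int(pct) > MAX_DISCOUNT:
            problems.append(f"discount {pct}% exceeds {MAX_DISCOUNT}%")
    # Flag any banned pricing phrase.
    for phrase in FORBIDDEN:
        if phrase in draft.lower():
            problems.append(f"mentions forbidden phrase '{phrase}'")
    return problems

def enforce(draft: str) -> str:
    """Block a violating draft instead of passing it through verbatim."""
    problems = violates(draft)
    if problems:
        return f"[blocked: {'; '.join(problems)}]"
    return draft
```

The point of the sketch is the placement, not the regexes: because the check runs in the request path, a violating response never reaches the user, which is the difference from post-hoc evals.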

Example rules file:

- Maximum discount is 15%.
- Never reveal internal pricing or cost basis.

Without it: agent offers 90% off and mentions your margin. With it: 15%, no margin talk.
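For a sense of how a plain-markdown rules file could feed such a proxy, here is a small sketch of parsing the bullet rules above into a list an enforcement layer can iterate over. Again my own illustration, not the author's code; `load_rules` and `RULES_MD` are hypothetical names:

```python
# The rules file from the post, inlined for the example.
RULES_MD = """\
- Maximum discount is 15%.
- Never reveal internal pricing or cost basis.
"""

def load_rules(md: str) -> list[str]:
    """Parse '- ' bullet lines from a markdown rules file into rule strings."""
    return [line[2:].strip() for line in md.splitlines()
            if line.startswith("- ")]
```

On the app side, the integration really is the advertised one-line change: you point your client's base URL at the proxy instead of the provider, and everything else stays the same.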

Curious whether this would solve your problems with LLMs producing incorrect output or agents going off track; it definitely did for my (specific) use cases.

What's everyone doing for this in prod? Shadow evals? Re-prompt loops? Something I'm missing?

submitted by /u/Chinmay101202

