LLM (might) be flawed from the beginning
From MCP to LLM (should it be “from LLM to MCP”?)
I read a blog post dissecting MCP (Raz Blog: A Critical Look at MCP). and it’s good.
Then I happened to stumble upon various “LLM system prompt leak” GitHub repos claiming that they have “reverse engineered” system prompts. Some of them are even leveraging this fact and trying to sell their own businesses, which I hope is good.
It’s either a precautionary call to action to raise awareness for LLM downstream startups to treat their business models more seriously, or to let everyone know that modern LLM apps are essentially “400 lines of text + a frontend wrapper.”
It’s because system prompts, assistant prompts, user prompts (or whatever) are prompts regardless. LLMs see no difference, and there’s no special “channel” for LLMs to digest or treat those streams differently at the fundamental level.
During SFT, all text from training data is “glued” together as a single stream of strings with some special tokens like <|system|>, <|assistant|>, <|example|>.
So it is by design that prompts are data, data is prompts. Unlike traditional relational databases where SQL code is a meta-level instruction and not data itself (i.e. part of the query nor the returned data).
Researchers in NLP spend a lot of their efforts on the metrics and try to beat the current best model on benchmark leaderboards, clearly not scrutinising carefully enough for the long-term consequences of such design—which is fair, as that was not what they aimed for.
Surprised me that, perhaps due to the high cost of “re-designing something that works without breaking it,” after all these years, downstream developers are still inherently wrapping their applications on top of this flawed design.
Like the Raz MCP Blog mentioned at the beginning, the current LLM app tech landscape is driven by a “move fast” approach by design, neglecting many long-term system design aspects just to make things work.
Of course, this will continue unless companies accept the fact that there is no golden moat in LLM downstream, and there is virtually no way to guardrail your precious system prompt from prompters.