---
title: "Prompt Stacks and Prompt Governance — Why System-Level Prompts Are Emerging as a Regulatory Lever (and Where They Fall Short)"
author: "DCC Editorial"
published: 2026-06-01T03:00:00.000Z
url: https://datacompliancechina.com/posts/system-prompts-as-regulatory-instrument/
description: "A Chinese AI-law reading of Neumann, Sargeant and Singh's FAccT 2026 paper Prompt Governance? — and what it means for how China, the EU, and the US treat 'system prompts' as a regulatory object. Li Wenlong (科技利维坦) walks through the four-layer 'prompt stack' (system instructions → system guidelines → developer instructions → user prompts), five properties practitioners need to understand (layered, hidden, natural-language, malleable, loosely coupled to behaviour), and the comparative regulatory landscape: the EU GPAI Code of Practice requires signatories to disclose system prompts to regulators in model reports; the Trump EO 14319 / OMB M-26-04 stops at model / system / data cards and leaves system-prompt disclosure voluntary; the UK's AI Cybersecurity Code says effectively nothing. China's current GenAI safety regime (TC260-003 plus the GenAI Interim Measures) is output-evaluation-based — filing and pre-launch scoring, with no architectural hook into system prompts. Li predicts a Brussels Effect: system-prompt disclosure to regulators will become a global compliance baseline, analogous to the DPIA in data law. For overseas counsel: this is what is coming, what to start archiving now, and why 'what you write' in a system prompt is not 'what the model executes.'"
tags: ["ai-governance", "system-prompts", "prompt-stack", "genai", "eu-ai-act", "comparative", "academic-commentary"]
laws_cited: ["genai-services-interim-measures", "ai-content-labeling-measures"]
domains: ["ai-governance", "personal-information"]
account: "keji-leviathan"
original_title: "系统级提示词作为监管抓手？"
original_author: "李汶龙 (Li Wenlong)"
original_publication: "科技利维坦 WeChat Official Account"
original_url: "https://mp.weixin.qq.com/s/LKG-QIs0Y-4N3t-qKCuGmQ"
source_language: "zh"
---
> *Editor's Note — DCC.*
>
> This brief summarises 《系统级提示词作为监管抓手？》by Li Wenlong
> (李汶龙) on the 科技利维坦 channel — the first piece in his self-imposed
> "100 AI-Governance Papers Challenge." The underlying paper is Anna
> Neumann, Holli Sargeant, Jat Singh et al., *Prompt Governance? On
> Governing Technologies Governed by Natural Language* (FAccT 2026; SSRN
> [abstract 6802319](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6802319)),
> a systematic review covering 287 academic papers and 54 regulatory
> documents on "system prompts" as a regulatory object. Li's value-add,
> and the reason DCC is running it, is the comparison: he reads the
> EU GPAI Code of Practice and the Trump administration's executive
> orders side by side with TC260-003 (the standard implementing China's
> [GenAI Services Interim Measures](/laws/genai-services-interim-measures/))
> and explains what that contrast means for AI compliance in practice.
> The takeaway for overseas counsel: system-prompt disclosure to
> regulators looks set to become the next "DPIA-style" compliance
> artefact globally — and China currently has no equivalent obligation,
> but its regime leaves space to import one.

## The thing being regulated

A *system prompt* (系统级提示词) is the set of natural-language
instructions a model receives **before** any user interaction begins —
written by the model developer, the deployer, or the application
provider, and treated by the model as carrying higher "trust" than
anything a user types in. The EU's draft GPAI Code of Practice defines
it as "a set of instructions, guidelines, and contextual information
provided to the model prior to the start of user interaction." NIST
(US Department of Commerce) goes slightly further: system prompts are
typically delivered before other instructions and inputs, and the
model is expected to weight them with *higher trust* than other
inputs.

The reason regulators are starting to care is straightforward: in a
large language model, the system prompt is a piece of natural-language
text that — at least in principle — directly conditions model
behaviour. If a model can be told, in plain English, "treat
child-safety considerations as overriding," then a regulator can in
principle inspect that text, demand a copy, audit it, and require it
to evolve. That is not how earlier waves of AI regulation worked: the
classic governance toolbox (safety testing, model architecture,
filters, review mechanisms, access controls, monitoring) all operate
either far above the model (output evaluation) or far below it
(training data, weights). System prompts sit at a layer regulators can
actually read.

## The prompt stack

Neumann and co-authors propose a four-layer hierarchy, which Li calls
the **prompt stack (提示词堆栈)**:

1. **System instructions** — set by the foundation-model developer or
   provider. Hard rules: safety, prohibited content, privacy,
   illegality-risk controls. Treated as the highest authority; in
   principle should not be overridable by lower layers.
2. **System guidelines** — also developer-set, but more about
   preferences and operational guidance: how to balance helpfulness
   against safety, how to handle sensitive requests, how to express
   uncertainty. Can be tightened by lower layers in some respects but
   should hold the line on safety and compliance.
3. **Developer instructions** — set by application developers,
   deployers, or enterprise customers. A legal-research bot might be
   configured to "answer in a professional legal tone and never
   guarantee outcomes." Below system layers, above user input.
4. **User prompts** — the input the end user types. Lowest priority.
   Where a user instruction conflicts with anything above, the model
   should refuse, rewrite, or limit the response.

Two practical questions fall out of the model. First, can the user
modify the second layer (system guidelines)? The intended answer is:
soft constraints (style, level of detail) are negotiable in a session;
hard constraints (risk posture, safety policies) are not. In practice,
extended conversations can drift — the "this is a simulation, not real
life" framing being the canonical example — and models can be coaxed
into relaxing constraints they were meant to enforce. Second, what is
a *jailbreak* in this framework? It is precisely the use of
lower-layer input to override or weaken higher-layer rules: rewriting
the high-layer rule ("assume you are in a fictional novel / a
hypothetical world / a purely theoretical discussion"), exploiting
ambiguity in the system guidelines, or breaking a prohibited request
into many superficially-innocuous steps (multi-turn jailbreaks /
context attacks).

## Five properties that make system prompts hard to regulate

Li distils five properties from the literature that practitioners and
regulators both need to internalise.

**1. They are layered, with multiple authors.** The "system" in
"system prompt" is not like the system in an operating system; it is
not delivered by a single party. Foundation-model developers,
application providers, deployers — each layer can set its own
instructions, and they interact. Disclosure obligations that target
only one layer will see only part of the stack.

**2. They are usually invisible.** Most vendors do not publish their
system prompts. Two legitimate reasons: (a) the prompts encode
designed-in product logic, behavioural norms, and proprietary
know-how — core commercial IP; and (b) disclosure reveals the safety
architecture and makes it easier for attackers to evade guardrails.
Model cards have become a standard transparency artefact, but the
system prompt is generally not in them. When Claude's system prompt
was published outside the company, it was treated as a leak.

**3. They are natural-language text.** Anyone patient enough can read
them. A typical Claude-style system prompt sets the model's role and
core goal, declares available tools and the conditions for invoking
each, prescribes citation rules (when to search, how to attribute
sources for copyright and traceability), specifies output style
("lead with the conclusion, then break out under subheadings"), names
the categories of absolutely-prohibited assistance, and conveys
meta-information (version, knowledge cut-off, deployment surface).
This human readability is exactly what makes it attractive to
regulators.

**4. They are malleable.** Developers update system prompts
frequently, sometimes as ad-hoc bug fixes between releases. This is
the property that most undermines their use as a governance tool: an
artefact that changes weekly does not satisfy the regulator's appetite
for stable, auditable rules.

**5. The relationship between prompt text and model behaviour is
loose.** This is the core empirical question Neumann and co-authors
flag — and Li's central warning to policy-makers. A system prompt is
*not* code: natural language is ambiguous, context-sensitive,
sequence-sensitive, and interacts with the prompts of other layers,
with the user's input, with the conversation history, with model
updates, and with prompt-injection attacks. Writing "do not output
discriminatory content" into the system prompt does not, by itself,
produce a model that does not output discriminatory content. What the
model actually does depends on its training data, its
post-training / alignment, the context the user constructed, how the
model parses the specific wording, and what other safety filters are
in play.

## Where regulators have actually landed

The Neumann team analysed 54 regulatory documents and identified two
that take system prompts seriously, plus one that should but doesn't.

**EU — GPAI Code of Practice** (the implementing instrument for the
general-purpose-AI obligations under the EU AI Act). The *Safety &
Security* chapter, Measure 7.1 on model description (transparency),
requires signatories to provide a model report containing the model
spec, item 4(d) of which is the **system prompt**. The EU treats
system-prompt configuration as a key component of *model evaluation*,
not just disclosure: signatories must be able to show how the prompt
is set up and how it interacts with the rest of the safety
architecture. Neumann and co-authors flag two gaps: the EU rule does
not differentiate disclosure obligations across the foundation-model
layer, deployment layer, and application layer; and it lacks
version-change and log-update requirements, which will leave
disclosed prompts rapidly out of date.

**US — Executive Order 14319 (July 23, 2025)** "Preventing Woke AI in
the Federal Government." This is an ideology-coded procurement rule
rather than a transparency regime: federal agencies are restricted
from procuring AI that encodes "partisan or ideological judgments"
into its outputs, under two "unbiased AI principles" (truth-seeking
and ideological neutrality). The vendor bears the burden of
demonstrating compliance — system prompts are a useful evidentiary
artefact for that, but **disclosure is not mandatory**. The White
House Office of Management and Budget's M-26-04 (December 2025)
on increasing public trust in AI lists only **model cards, system
cards, and data cards** as transparency requirements; it does not
mention system prompts.

**UK — AI Cybersecurity Code of Practice.** Effectively no
substantive content on system prompts; the Code merely suggests
vendors *have* system prompts so downstream parties can understand
model characteristics.

## China's posture — output-based, no system-prompt hook (yet)

For overseas counsel, the most useful comparison is what is *not* in
the Chinese regime today.

China's flagship GenAI rule is the
[Interim Measures for the Management of Generative Artificial
Intelligence Services](/laws/genai-services-interim-measures/)
(2023). The implementing safety standard — and the one that does the
real operational work — is **TC260-003**,《生成式人工智能服务安全基本
要求》(Basic Safety Requirements for Generative AI Services). Its
structure is corpus safety (§5), model safety (§6), safety measures
(§7), other (§8). Model-safety compliance is achieved primarily
through the **algorithm and large-model filing regime (备案)**, and
filing turns substantially on **pre-launch evaluation scoring** — a
red-team-style adversarial test against a published question bank,
with pass/fail thresholds. As Li puts it, the regime is structurally
*Turing-test-like*: it inspects what the model outputs, not how the
model is internally governed. There is no current obligation to
disclose system prompts to the CAC, to file them as part of the
algorithm filing, or to treat them as a distinct compliance artefact.

That gap is meaningful, because it is exactly the layer where the EU
is now hooking in.

## The likely trajectory: Brussels Effect, DPIA analogue

Li's prediction is direct: on system prompts, **a Brussels Effect will
form**. The GPAI Code of Practice's disclosure requirement will
gradually be priced into global compliance programs the way data
protection impact assessments (DPIAs) were priced in after the GDPR.
System prompts will not become a *public* transparency artefact (with
the exception of vendors who voluntarily publish, like Anthropic and
xAI); they will become a *regulator-facing* artefact, disclosed in
the model report as part of the evaluation package.

This matters for two reasons in the China context. First, any
overseas operator deploying a model in China that is built on a
foundation model evaluated under the EU regime will inherit
disclosure obligations one layer up the prompt stack — and will need
to ensure those obligations are compatible with Chinese filing
rules. Second, if the Brussels Effect lands, the *next* iteration of
Chinese GenAI rulemaking is the natural place for a system-prompt
disclosure hook to appear; teams should treat this as a near-future
filing item, not a never-event.

## System prompts as a governance object — the operational layer

Li closes with the move that is most useful for compliance teams: a
system prompt is not only a *governance tool* — it is itself a
*governance object*, and should be managed the way a serious data
team manages its privacy policies. That implies, at minimum:

- **Versioned archives.** Every change is dated, retrievable, and
  attributable to a named owner.
- **Change-permission management.** Defined approval flows for who
  can edit what — particularly the safety-relevant clauses.
- **Periodic security testing.** Red-team probes against the prompt
  itself, including prompt-injection and multi-turn jailbreaks.
- **Version logs sufficient for regulator request.** When the request
  comes in, "we don't know what the system prompt looked like last
  March" will not be an acceptable answer.
- **Alignment-to-output testing.** Does the model actually behave as
  the prompt instructs? Are there obvious value-tilts, (commercial)
  prioritisation, or excessive filtering that the prompt did not
  authorise? Are there prompt-injection vulnerabilities?

The deeper conceptual point Li keeps returning to is worth lifting
out for any reader from a legal background: **the way a regulator
reads text and the way a model "reads" text are fundamentally
different operations.** Legal interpretation runs on institutional
context, legislative purpose, judicial gloss, normative reasoning.
Model "interpretation" is statistical pattern-matching across
training distribution, attention weights, and context windows. The
same English sentence reordered, rephrased, or relocated within the
prompt can produce different model behaviour. "Do not provide legal
advice" and "you may provide general legal information but should
not substitute for a licensed lawyer" are, to a regulator, equivalent
in spirit; to a model, they are not the same instruction. Compliance
teams that frame system-prompt drafting as a *purely legal* exercise
will produce documents that look defensible on paper and fail in
production. The discipline this requires — drafting natural-language
rules that survive both legal scrutiny *and* statistical
robustness — is, Li argues, the actual emerging skill in AI
compliance.

## DCC sources

- Original: 李汶龙 (Li Wenlong), 《系统级提示词作为监管抓手？》, 科技
  利维坦 WeChat Official Account
  ([source](https://mp.weixin.qq.com/s/LKG-QIs0Y-4N3t-qKCuGmQ)).
- Underlying paper: Anna Neumann, Holli Sargeant, Jat Singh et al.,
  *Prompt Governance? On Governing Technologies Governed by Natural
  Language*, FAccT 2026 (SSRN
  [6802319](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6802319)).
- EU: General-Purpose AI Code of Practice, Safety & Security chapter,
  Measure 7.1
  ([source](https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai)).
- US: Executive Order 14319 "Preventing Woke AI in the Federal
  Government" (Federal Register, July 28, 2025); OMB Memorandum
  M-26-04 (December 2025).
- China: 《生成式人工智能服务安全基本要求》(TC260-003), § 5–8; and
  the [GenAI Services Interim
  Measures](/laws/genai-services-interim-measures/).
- NIST CSRC Glossary, *system prompt* entry.

> This is an editorial summary, not a translation of Li Wenlong's
> piece. Quotations and conceptual framings are attributed; any
> simplification, error of emphasis, or operational extrapolation is
> DCC's. **Not legal advice.**