Editor’s Note — DCC.
A surprising number of overseas data-compliance discussions skip the foundational question — what is data? — and jump straight into classification regimes, lawful bases, and cross-border paths. Wang Qinglan’s primer fills the gap with a toy-storage-room metaphor that overseas readers will find unusually accessible. The piece is a sequel to her disambiguation of data governance, data management, and data compliance, and it also reads cleanly as a stand-alone primer. DCC’s framing emphasizes where the conceptual building blocks anchor to the formal Chinese regime.
Data isn’t “records” — it’s records made under rules
Wang opens with an exercise. Imagine you’re cataloguing the toy cars in your home storage room and someone hands you this string:
“3+, mom, cherry red, 3-6, square, red, 2023, ages 3 to 6, plastic, ef555, 250, Shenzhen, 239,85,82, pre-school…”
That’s raw recording — observations captured in arbitrary form. If you tried to put this into Excel, you’d be unable to count anything. “Red,” “cherry red,” “ef555,” “239,85,82” — all describing color, in incompatible formats. “3+,” “3-6,” “pre-school” — all describing age, in incompatible formats.
So Wang’s first move: a working definition. Data is the objective recording — under rules — of phenomena relevant to the business. The rules are what separate data garbage from data that can be turned into a data resource, and ultimately a data asset.
The Chinese regulatory regime’s three-tier vocabulary (per the NDA Common Data Terms (First Batch)) maps onto this:
- Raw data (原始数据) — first-collected recordings, unprocessed.
- Data resources (数据资源) — raw data that has been preliminarily processed and has potential for value creation.
- Data assets (数据资产) — data resources that are lawfully held or controlled, can be measured in monetary terms, and can produce economic or social benefit.
The progression raw → resource → asset requires rules at every step.
What rules look like, concretely
To turn the cluttered toy-car notebook into something useful, Wang prescribes four kinds of rule. Each maps onto a formal compliance vocabulary overseas readers will recognize.
Rule 1 — “Required dropdowns”: master data and metadata
You don’t let people type “big car” or “excavator-thing” in the type field. You constrain the field to a fixed enumeration: engineering vehicle / car / racecar / motorcycle / other. Same for color, age range, weight, etc.
This is master data management + metadata management. The fields are typed; the values are constrained; the recording is consistent across users. Wang’s example is Taobao’s typed inputs (quantity, color, size are dropdowns, not free text) — the architecture is identical.
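A minimal sketch of the "required dropdowns" idea, assuming Python enums stand in for the dropdown lists. The type values come from the example above; the `Color` list, the `ToyRecord` fields, and `parse_record` are hypothetical names chosen for illustration, not anything from Wang's article.

```python
from dataclasses import dataclass
from enum import Enum


class ToyType(Enum):
    # Enumeration taken from the article's example type field.
    ENGINEERING_VEHICLE = "engineering vehicle"
    CAR = "car"
    RACECAR = "racecar"
    MOTORCYCLE = "motorcycle"
    OTHER = "other"


class Color(Enum):
    # Hypothetical color list; the article does not enumerate colors.
    RED = "red"
    BLUE = "blue"
    YELLOW = "yellow"


@dataclass(frozen=True)
class ToyRecord:
    toy_type: ToyType
    color: Color
    weight_grams: int  # one unit, stated in the field name


def parse_record(toy_type: str, color: str, weight_grams: int) -> ToyRecord:
    """Build a record, rejecting any value outside the allowed dropdowns."""
    if weight_grams <= 0:
        raise ValueError("weight_grams must be positive")
    # Enum lookup by value raises ValueError for free-text entries.
    return ToyRecord(ToyType(toy_type), Color(color), weight_grams)
```

With this in place, `parse_record("racecar", "red", 250)` yields a typed record, while `parse_record("excavator-thing", "red", 250)` is rejected at entry time rather than discovered later during analysis.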
Rule 2 — Unified standards: ontology
“Battery capacity 6000mAh” / “2 hours charging gives 1 hour of play” / “excellent battery life” — three ways to describe the same thing. None of them comparable. None of them queryable.
The rule fix: define an ontology of measurable attributes. Battery capacity is measured in mAh. Playtime is measured in hours. Now the data is comparable and the records support analysis.
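One way to picture a unified standard is a small table that fixes one canonical unit per attribute and converts or rejects everything else. The sketch below assumes battery capacity is standardized in mAh and playtime in hours; the `STANDARD` table, the accepted input units, and the `normalize` function are illustrative assumptions.

```python
import re

# Hypothetical attribute standard: each attribute has one canonical unit
# and a table of accepted input units with conversion factors.
STANDARD = {
    "battery_capacity": ("mAh", {"mah": 1.0, "ah": 1000.0}),
    "playtime": ("h", {"h": 1.0, "min": 1.0 / 60.0}),
}

# A number followed by a unit, e.g. "6000mAh" or "90 min".
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def normalize(attribute: str, raw: str) -> tuple[float, str]:
    """Convert a raw entry to (value, canonical_unit) or raise ValueError."""
    canonical_unit, factors = STANDARD[attribute]
    match = _PATTERN.match(raw)
    if match is None:
        # Entries such as "excellent battery life" carry no measurement.
        raise ValueError(f"not a measurable value: {raw!r}")
    number, unit = float(match.group(1)), match.group(2).lower()
    if unit not in factors:
        raise ValueError(f"unit {unit!r} not allowed for {attribute}")
    return number * factors[unit], canonical_unit
```

Under these assumptions, "6000mAh" and "6 Ah" both normalize to 6000 mAh, and a descriptive entry with no number is refused instead of being stored.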
Rule 3 — Automated capture: digital business process
Install a simple sensor in the storage cabinet. Take a toy out — clock starts. Put it back — clock stops. The “playtime” attribute is captured automatically, with no manual error.
In enterprise data-compliance vocabulary: digitalize the business process. Don’t capture data from human attestation; capture it from instrumented systems. This is what the NDR’s risk assessment and security incident response obligations assume — that the underlying business processes are digitalized and observable.
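The cabinet sensor can be sketched as a stream of timestamped events from which playtime is derived rather than typed in. The event format (a timestamp paired with `"out"` or `"in"`) and the `total_playtime` function are assumptions made for this illustration.

```python
from datetime import datetime, timedelta


def total_playtime(events: list[tuple[datetime, str]]) -> timedelta:
    """Sum the time between each "out" event and the following "in" event.

    `events` is a list of (timestamp, kind) pairs, where kind is "out"
    (toy taken from the cabinet) or "in" (toy put back). A toy that is
    still out at the end of the list contributes nothing to the total.
    """
    total = timedelta()
    taken_out_at = None
    for timestamp, kind in sorted(events):
        if kind == "out" and taken_out_at is None:
            taken_out_at = timestamp  # clock starts
        elif kind == "in" and taken_out_at is not None:
            total += timestamp - taken_out_at  # clock stops
            taken_out_at = None
        else:
            # Two "out" events in a row, or an "in" with no "out":
            # surface the sensor fault instead of guessing.
            raise ValueError(f"unexpected {kind!r} event at {timestamp}")
    return total
```

The design choice worth noting is the last branch: an inconsistent event sequence raises an error, so a faulty sensor becomes visible instead of silently producing a plausible number.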
Rule 4 — Hard requirements: the law
“This data must be stored within China.” This is not a design choice — it’s a hard requirement that overrides everything else. It must be in the rulebook.
For the storage room, this might be: “Receipts and bills tied to toys must be retained as records for tax purposes.”
For an enterprise: “Important data must be stored in the PRC.” “Sensitive personal information requires separate consent.” “Cross-border transfer of PI above the threshold requires CAC security assessment.” These are the legal floor rules — they bound everything the rulebook can authorize.
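The idea that legal floor rules bound everything else can be sketched as a check that runs before any business decision is honored. The `Dataset` fields, the `"CN"` region code, and both function names are hypothetical, and the sketch covers only two of the example rules quoted above; it is a simplification, not a statement of what the law requires in a given case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    is_important_data: bool
    has_sensitive_pi: bool
    separate_consent_obtained: bool
    storage_region: str  # hypothetical code, e.g. "CN" for storage in the PRC


def legal_floor_violations(ds: Dataset) -> list[str]:
    """Return the hard-requirement violations for one dataset."""
    violations = []
    if ds.is_important_data and ds.storage_region != "CN":
        violations.append("important data must be stored in the PRC")
    if ds.has_sensitive_pi and not ds.separate_consent_obtained:
        violations.append(
            "sensitive personal information requires separate consent"
        )
    return violations


def may_proceed(ds: Dataset, business_approved: bool) -> bool:
    """Business approval cannot override the legal floor."""
    return business_approved and not legal_floor_violations(ds)
```

Here `may_proceed` returns `False` whenever a violation exists, regardless of business approval, which mirrors the point that a hard requirement is not a design choice.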
When all four rule types are combined, the storage room has a Family Toy Car Data Pact — a written record-keeping standard that turns raw observations into a usable data resource. Wang’s metaphor: an enterprise’s data governance framework is the same pact, scaled up.
What compliance actually means
With the Pact in place, the question shifts: am I following it? This is compliance. Wang’s three-tier taxonomy (introduced in her previous primer) reappears:
- Legal rules (法规) — what the law mandates. “Important data must stay in country.”
- Ethical rules (德规) — what the enterprise voluntarily commits to. “Don’t sloppily fill in records to make our reports look good.”
- Promised rules (诺规) — what the enterprise publicly promised. “Toy usage times accurate to the minute.”
All three end up in the Pact. All three must be followed.
The compliance workflow Wang describes — “three steps, in plain language” — is the operational discipline:
Step 1 — Select the rules
Decide which rules apply. Two inputs:
- What is the storage room’s situation? — i.e., the enterprise’s internal and external compliance environment.
- Who interacts with the storage room and what do they want? — i.e., stakeholder requirements.
But you cannot select every rule that might apply. Wang cites Professor Chen Ruihua (陈瑞华)’s risk-oriented compliance model — focus first on the highest-risk mandatory rules. PIPL Article 29’s separate-consent requirement for sensitive PI is the storage-room equivalent of “don’t leave sharp toys in reach of toddlers.” Miss it once and the consequence is a regulatory or reputational injury.
Beyond the legal floor, there are optional rules — annual data security assessments, industry ethical standards, public commitments to customers. These aren’t mandatory, but they earn trust from regulators, partners, and customers.
Critically, rule-selection is not a once-and-done exercise. New business lines, new jurisdictions, new regulations all trigger re-selection. The discipline is “accurate and dynamic.”
Step 2 — Allocate the responsibility
The selected rules become a compliance obligation register. Each obligation gets:
- An owner — whose job is it?
- A process — what concrete workflow embodies the obligation? (“PI processing requires 3-tier approval.”)
- A control — how does the owner verify the process worked?
Wang’s storage-room version: “Daddy collects engineering vehicles; Mommy collects regular cars; child collects blocks.” The rule has names attached.
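A compliance obligation register with an owner, a process, and a control per entry can be sketched as a simple typed record plus a completeness check. The class names, the `incomplete_entries` helper, and the sample owner and control text are hypothetical; only the three fields and the three rule tiers come from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class RuleTier(Enum):
    LEGAL = "legal"        # what the law mandates
    ETHICAL = "ethical"    # what the enterprise voluntarily commits to
    PROMISED = "promised"  # what the enterprise publicly promised


@dataclass(frozen=True)
class Obligation:
    rule: str
    tier: RuleTier
    owner: str    # whose job is it?
    process: str  # what workflow embodies the obligation?
    control: str  # how does the owner verify the process worked?


def incomplete_entries(register: list[Obligation]) -> list[str]:
    """Return rules missing an owner, a process, or a control."""
    return [
        ob.rule
        for ob in register
        if not (ob.owner.strip() and ob.process.strip() and ob.control.strip())
    ]


# Sample register; the owner and control values are made up.
register = [
    Obligation(
        "sensitive PI requires separate consent",
        RuleTier.LEGAL,
        "privacy lead",
        "PI processing requires 3-tier approval",
        "quarterly sample audit of approvals",
    ),
    Obligation(
        "toy usage times accurate to the minute",
        RuleTier.PROMISED,
        "",
        "sensor-based capture",
        "",
    ),
]
```

Calling `incomplete_entries(register)` flags the second entry, which has a process but no named owner and no control: a rule without a name attached.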
This is also the moment where external rules become internalized institutional culture. Without internalization, the rule lives only in the obligation register — a paper compliance program. With internalization, it becomes how the organization actually behaves.
Step 3 — Execute
This is the simplest step in concept and the hardest in practice. Do the things on the obligation register. If you don’t do them, you have a compliance failure — possibly a compliance risk event.
Wang’s risk taxonomy:
- Inherent risk — the risk before any controls. Storage room with no lock and no rules: theft is just a matter of time.
- Residual risk — the risk after controls are in place. Lock installed, rules written, but someone occasionally forgets the lock. Risk reduced but not zero.
Wang’s blunt observation: “It’s impossible to be 100% compliant — humans are uncertain, business is dynamic, there’s always something to adjust.” What matters is the framework — risk-allocated obligations, written process, executable controls.
Two organizational shapes for the compliance system
Wang’s practical advice on building the compliance system:
- By position (job role). “Customer-facing staff protect user info; operations record data sources.” Each role has a defined set of obligations.
- By business process. “From data collection → storage → use, each step has its own controls.” Each step has a defined set of obligations.
Both work. Pick whichever organizational shape fits the enterprise. Either way, clear logic matters more than an absolute zero-error target.
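The two shapes can be read as two views over the same register. In the sketch below, the rows, the third obligation ("encrypt stored records"), and the `group_by` helper are hypothetical; the first two obligations and the collection, storage, use steps follow the examples above.

```python
from collections import defaultdict

# Hypothetical register rows: (obligation, role, process_step).
REGISTER = [
    ("protect user info", "customer-facing staff", "use"),
    ("record data sources", "operations", "collection"),
    ("encrypt stored records", "operations", "storage"),
]


def group_by(rows, key_index):
    """Group obligation names by the column at key_index (1=role, 2=step)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_index]].append(row[0])
    return dict(grouped)


by_position = group_by(REGISTER, 1)  # one obligation list per job role
by_process = group_by(REGISTER, 2)   # one obligation list per process step
```

Both views contain every obligation exactly once, so the choice between them is about who reads the list, not about coverage.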
Why this matters for overseas compliance teams
Three operational takeaways from Wang’s primer:
- Don’t skip the “what is data” question. Many overseas counsel jump from PIPL provisions straight to lawful-basis analysis, missing that the enterprise has not yet operationalized what counts as data, what attributes it carries, and where the records are. The PIPL framework only works once the underlying data is well-formed. Build the master data + metadata layer first.
- The three-tier compliance taxonomy is not just academic. A compliance team that conflates legal floor with ethical commitment either over-burdens itself (treating optional commitments with mandatory rigor) or under-protects (treating mandatory rules with optional flexibility). Wang’s three-tier model is the practical sorting mechanism.
- Inherent and residual risk are the diagnostic axes. When something goes wrong, the first question is which one applies: was the inherent risk uncontrolled (no rule for this scenario), or was a control bypassed (rule existed but not followed)? Different diagnoses, different fixes.
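The diagnostic question in the last takeaway can be sketched as a two-input triage. The `Diagnosis` names and the `diagnose` function are hypothetical, and the third outcome (the rule was followed but the incident still happened) is an assumption added to make the function total; the source names only the first two.

```python
from enum import Enum


class Diagnosis(Enum):
    UNCONTROLLED_INHERENT_RISK = "no rule covered this scenario"
    CONTROL_BYPASSED = "a rule existed but was not followed"
    # Assumption, not from the source: the control ran and was not enough.
    CONTROL_INSUFFICIENT = "rule followed but did not prevent the incident"


def diagnose(rule_existed: bool, rule_followed: bool) -> Diagnosis:
    """Classify an incident by whether a rule existed and was followed."""
    if not rule_existed:
        return Diagnosis.UNCONTROLLED_INHERENT_RISK  # fix: write the rule
    if not rule_followed:
        return Diagnosis.CONTROL_BYPASSED  # fix: enforce or redesign the control
    return Diagnosis.CONTROL_INSUFFICIENT  # fix: strengthen the control
```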
The deeper point in Wang’s piece is that data compliance starts before the law. The law constrains what an enterprise can do with data; but the enterprise’s data-handling discipline — what counts as data, what rules govern it, who owns each rule — determines whether compliance is achievable at all. Without the discipline, no amount of legal review will produce a compliant operation.
— Wang Qinglan (王青兰), 数据的奇妙真相:从生活实例看它的真面目 (The Magical Truth About Data — Seeing Its Real Face Through Everyday Examples), 青兰数据观察 WeChat Official Account, August 28, 2025. Original article (Chinese).
Not legal advice. The above is DCC’s structured summary of Wang’s commentary; not a verbatim translation. The author’s views are her own and do not represent her employer.