Editor’s Note — DCC.
Hong Yanqing is one of the most influential voices on Chinese data-protection law — a scholar with policy proximity to the regulators who write the rules he comments on, and an unusually careful writer. When he picks a fight inside the Chinese data-compliance discourse, the fight is almost always conceptual rather than tactical.
This essay is a fight. Hong argues that the mainstream Chinese reading of Article 21 of the Data Security Law (DSL), the article that establishes the “data classification and grading” regime, has been confused from the start. Important data, he says, is not a high rung on a ladder running from general data to important data to core data. It is a separate category, identified by the legal interest at stake, and the difference affects cross-border transfer, enforcement, and how the Personal Information Protection Law (PIPL) and the DSL stack on each other.
We rewrote his essay rather than translating it literally, because the conceptual move he makes is exactly the kind of thing that gets lost in a plain rendering, yet it reshapes how overseas compliance readers should understand a regime they thought they knew.
China’s Data Security Law turned five last year. By now, anyone working on cross-border data has met the phrase “data classification and grading” (数据分类分级). It is the foundational concept of Article 21 of the DSL — the article on which the security assessment, the important data identification catalogues, and the localization rules all rest.
There is a standard way Chinese practitioners describe this regime. Data sits on a ladder. At the bottom is general data (一般数据), governed only by ordinary cybersecurity hygiene. Above it is important data (重要数据), which triggers heavier obligations, including the cross-border transfer security assessment. At the top is core data (核心数据), reserved for things that touch national security. The higher the rung, the stricter the rules.
Hong Yanqing thinks this picture is wrong — and he has been writing about it for years.
In a May 4, 2026 essay on his WeChat channel 网安寻路人, Hong returns to a fight he has flagged before: important data is a category, not a tier. It is identified by what legal interest it implicates, not by where it sits on a severity scale. The mainstream account, Hong argues, makes a conceptual error at the very start of the data-classification regime — and that error then propagates into every downstream rule.
This sounds like word-play, until you trace it through.
Why category-vs-tier isn’t word-play
Start with the most operational consequence. If important data is a tier — the level you get when data is “more sensitive than the ordinary kind” — then its status depends on the comparison set. A dataset can be a high tier inside enterprise A and a low tier inside enterprise B. It can be top-tier in the financial sector and unremarkable in retail. As the data flows across owners or industries, its grade shifts with the surrounding population.
For a state-level regulatory regime, Hong argues, that is a disaster. The whole point of identifying important data is to attach a stable regulatory identity to it — one that travels with the data across owners, across industries, across borders. If the identity floats with each new holder’s internal sensitivity grading, the regulator loses the unified handle it needs.
If important data is a category, identified by the legal interest the data touches rather than by where it ranks in someone’s filing cabinet, the identity sticks. A dataset that materially affects public health and safety is important data whether it sits at a hospital, a research institute, or a third-party cloud provider. The only thing that can change its status is a material change in its risk profile, such as anonymization, de-identification, or splitting that lowers the risk, or aggregation that raises it. A transfer from one holder’s hands to another’s does not.
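The contrast can be made concrete with a small sketch. Everything in it is hypothetical: the sensitivity scores, the `relative_tier` function, and the `touches_public_interest` flag are illustrative assumptions, not anything drawn from the DSL or from Hong’s essay. It only shows why a grade computed against a holder’s own inventory moves when the inventory changes, while an attribute of the dataset itself does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    sensitivity: int               # hypothetical internal score, 1 to 10
    touches_public_interest: bool  # hypothetical stand-in for the legal-interest test


def relative_tier(target: Dataset, inventory: list[Dataset]) -> str:
    """Tier reading: grade a dataset by its rank within one holder's inventory."""
    higher = sum(1 for d in inventory if d.sensitivity > target.sensitivity)
    return "top tier" if higher == 0 else "lower tier"


def is_important_data(target: Dataset) -> bool:
    """Category reading: status depends only on the dataset's own attributes."""
    return target.touches_public_interest


# A made-up dataset, assumed for illustration to implicate public health.
epidemiology = Dataset("regional epidemiology records", 6, True)

# Two hypothetical holders with different surrounding inventories.
hospital = [epidemiology, Dataset("cafeteria menus", 1, False)]
cloud_provider = [epidemiology, Dataset("root signing keys", 10, False)]

tier_at_hospital = relative_tier(epidemiology, hospital)
tier_at_cloud = relative_tier(epidemiology, cloud_provider)
category_status = is_important_data(epidemiology)
```

In this toy model the same records rank highest at the hospital and lower at the cloud provider, while `category_status` never consults the inventory at all. That is the stability Hong says a state-level regime needs.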
The practical consequences for overseas compliance teams are at least these two:
First, enterprise self-grading is not a way out. Under tier-thinking, an overseas company facing an outbound-data-transfer obligation might be tempted to argue: “We classify this dataset as internal-use-only, so it’s not in our top tier, and therefore it is not important data.” On Hong’s view, this argument is structurally wrong. The data is important data if it touches the relevant public legal interest. Your internal grading does not dispose of that question; at best it reflects how you have chosen to protect what the state has already identified.
Second, important-data status doesn’t dissolve at the border. Once a dataset has been identified as important data inside China, that identity follows it through downstream transfers, including overseas ones. This is the conceptual reason the cross-border transfer regime keeps applying after the data changes hands.
The three-segment conceptual order
The technical core of the essay is a three-segment ordering of the concepts Chinese practitioners have been conflating:
- Interest-based category (法益型类别). This is the true regulatory classification. It answers the question: what legal interest does this data implicate, and does that interest require state-level protection? Personal information implicates personal dignity and informational rights. Important data implicates public interest, economic operation, public health, and social stability. Core data implicates national core interests: state security, the lifelines of the national economy, and major public welfare. These are different interests, not three rungs on the same ladder.
- Business-process classification (业务流程分类). This is the operational tool enterprises use to organize their data: R&D data, production data, customer data, transaction data, log data. It is essential for asset inventory, access control, and lifecycle management. But it is not a regulatory classification. The label “R&D data” does not tell you whether the data is a trade secret, a state secret, personal information, or important data. It only tells you which department generates it.
- Tiering (分级). This is the protection-strength configuration. Given that data has been identified as belonging to a regulatory category, how heavily should it be protected? Inside an enterprise this shows up as access levels, encryption requirements, and audit frequency. In state regulation it shows up as security-assessment requirements, security review, and localization mandates. Tiering comes after category identification, and it does not retroactively define what the category means.
The mistake Hong attributes to the mainstream is the collapse of these three layers into one. The phrase 分类分级 (“classification and grading”) gets used as a single compound operation. Enterprise asset-inventory thinking gets imported into state-level legal-interest identification. Operational language migrates into the legal-interest layer where it does not belong.
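For teams that keep a data inventory, the three layers can be kept apart in the data model itself. The following is a minimal sketch under stated assumptions: the type names (`LegalInterest`, `ProtectionTier`, `DataAsset`), the three tier levels, and the sample asset are all hypothetical and are not taken from the DSL, any national standard, or Hong’s essay. The one design choice worth noting is that the category is an unordered flag, while only the tier is an ordered scale.

```python
from dataclasses import dataclass, replace
from enum import Flag, IntEnum


class LegalInterest(Flag):
    """Layer 1: which protected interest the data implicates. Not ordered."""
    PERSONAL_INFORMATION = 1
    IMPORTANT_DATA = 2
    CORE_DATA = 4


class ProtectionTier(IntEnum):
    """Layer 3: how strongly to protect. Ordered; the levels are hypothetical."""
    BASELINE = 1
    ENHANCED = 2
    STRICT = 3


@dataclass(frozen=True)
class DataAsset:
    name: str
    interest: LegalInterest   # layer 1: regulatory category
    business_class: str       # layer 2: operational label such as "R&D data"
    tier: ProtectionTier      # layer 3: protection strength


def categories_can_be_ranked() -> bool:
    """Return True only if the category type supports '<' comparison."""
    try:
        LegalInterest.IMPORTANT_DATA < LegalInterest.CORE_DATA
        return True
    except TypeError:
        return False


# A made-up asset; its classification here is assumed purely for illustration.
asset = DataAsset(
    name="grid load telemetry",
    interest=LegalInterest.IMPORTANT_DATA,
    business_class="production data",
    tier=ProtectionTier.ENHANCED,
)

# Relabelling the business class or changing the tier leaves layer 1 untouched.
relabelled = replace(asset, business_class="log data", tier=ProtectionTier.BASELINE)
```

Because `Flag` members do not support ordering, code that tries to rank one category above another fails with a `TypeError`, whereas `ProtectionTier` values compare as integers. The type system then carries the point that tiering is a separate axis applied after the category is fixed.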
What the reframing fixes
Several familiar confusions become tractable once the layers are separated.
The “upgrading” fallacy. Practitioners often say that “mass personal information can upgrade to important data” (海量个人信息升格为重要数据). On a tier reading, upgrade suggests the data leaves its old identity behind. Under PIPL, that would be alarming — does the personal-information regime no longer apply?
On Hong’s reading, the dataset does not upgrade — it gains a second identity. The personal-information regime continues to apply (because the data still identifies natural persons). The important-data regime also applies (because the dataset, at scale and granularity, now implicates public interest). Both regimes stack. Conflicts get resolved by familiar principles — specialty, the stricter rule, purpose limitation, minimum necessary — not by one identity displacing the other.
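The difference between “upgrading” and “gaining a second identity” can be sketched in a few lines. The names below (`Category`, `upgrade`, `gain_identity`, `applicable_regimes`) and the regime labels are hypothetical illustrations. They do not state what either law requires, and the sketch deliberately leaves out the conflict-resolution principles mentioned above.

```python
from enum import Flag


class Category(Flag):
    PERSONAL_INFORMATION = 1
    IMPORTANT_DATA = 2


# Hypothetical regime labels, for illustration only; not a statement of
# what either law actually requires.
REGIMES = {
    Category.PERSONAL_INFORMATION: {"personal-information regime (PIPL)"},
    Category.IMPORTANT_DATA: {"important-data regime (DSL)"},
}


def upgrade(identity: Category, new: Category) -> Category:
    """Tier reading: the new identity displaces the old one."""
    return new


def gain_identity(identity: Category, extra: Category) -> Category:
    """Hong's reading: the dataset keeps its identity and adds another."""
    return identity | extra


def applicable_regimes(identity: Category) -> set[str]:
    """Union of the regimes attached to every category the dataset carries."""
    regimes: set[str] = set()
    for category, labels in REGIMES.items():
        if category in identity:
            regimes |= labels
    return regimes


records = Category.PERSONAL_INFORMATION
upgraded = upgrade(records, Category.IMPORTANT_DATA)
stacked = gain_identity(records, Category.IMPORTANT_DATA)
```

With `upgrade`, the personal-information regime silently drops out of `applicable_regimes(upgraded)`, which is the alarming result the tier reading implies. With `gain_identity`, `applicable_regimes(stacked)` returns both regimes.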
Sensitive personal information is not a parallel category. It is a subtype within the personal-information category, subject to intensified handling rules: same legal interest, stricter protection. Calling it a separate category at the same level as personal information is grammatical drift, not conceptual structure.
“CII-related data” is not a freestanding category. Hong is firm on this point. Treating “data related to critical information infrastructure” as a regulatory category in its own right confuses a context label with a legal interest. The relationship to CII is a flag — useful for identifying which data within a CIIO’s holdings might rise to important data or core data. It is not itself the category.
“General data” is the residual, not a parallel category. It covers whatever data no specific regulatory category has captured. Such data can still be protected (by contract, by tort, by unfair-competition law, by ordinary cybersecurity duties), but not by the Article 21 data-classification regime.
How to read the standards
Hong anticipates the obvious objection. China’s national standards, and a great deal of industry guidance, already talk about core / important / general data as “levels.” Doesn’t the current standard text sink his argument?
His answer is patient. Standards are engineering documents. Their job is to make a legal regime operable for enterprises — to give them something to put into a spreadsheet, a control matrix, an audit checklist. Using the language of “levels” is convenient because it maps to existing internal-control vocabulary. But engineering convenience is not legal definition. The legal definition has to do the work of identifying a legal interest, not just signalling severity. Standards can keep their level-language as a shorthand; the underlying concept is still a set of categories.
The implication for overseas readers: when a Chinese standard or sector catalogue renders important data as a tier, treat it as serving an operational purpose — not as the last word on the concept’s legal content.
Why an overseas compliance reader should care
For overseas counsel and compliance teams the practical takeaways are roughly these:
- Don’t expect enterprise-level grading to control regulatory status. Whether a dataset is important data is a question of legal interest, not of internal sensitivity ranking. You cannot grade your way out of an obligation that attaches by law.
- Expect overlap, not replacement. When a personal-information dataset reaches scale, expect PIPL and DSL regimes to apply together. Neither one swallows the other.
- Read sector catalogues as inventories, not as definitions. The important-data catalogues that industry regulators publish mediate between the legal category and sector practice: they help identify which data in a given sector belongs to the important-data category, but they do not independently constitute it.
- Expect cross-border persistence. Once data is identified as important data, the identity follows it. The point of the regime is precisely not to let identity drift across borders or across owners.
The deeper point in Hong’s essay, and the reason it is worth a careful read, is methodological. Overseas observers sometimes treat the Chinese data-protection regime as a translation of GDPR with Chinese characteristics. It is not. The conceptual primitives are different. Where GDPR centers on the data subject and the rights they hold, Hong’s reconstruction centers on the legal interest the state is protecting. Personal information, important data, and core data are categories carved out by different legal interests. They are not points on a single severity scale, and they are not analogues of GDPR’s distinction between ordinary and special-category personal data. The category-vs-tier distinction is just the most concrete example of why importing GDPR’s conceptual furniture into the Chinese regime is not a safe shortcut.
Source: Hong Yanqing, Reconsidering the Nature of Important Data: Category vs. Tier (重要数据性质的再认识:级别概念 vs. 类别概念), 网安寻路人 WeChat Official Account, May 4, 2026.
Not legal advice.