---
title: "Why Upstream Won't Operate Its Data — Control Degradation, Derivative Data, and Irreducible Uncertainty"
author: "DCC Editorial"
published: 2026-06-08T02:30:00.000Z
url: https://datacompliancechina.com/posts/data-operation-right-why-upstream-wont-share/
description: "Part three of Hong Yanqing's (洪延青, 网安寻路人) study notes on China's 'separation of three rights' framework turns to the Right to Operate Data (数据经营权) — the right to provide data externally by transfer, licence, capital contribution, or pledge — and asks a question prior to 'what does operation transfer?': in real conditions, *will* an upstream party operate its data at all? His answer: yes, but narrowly. Control-dependent upstreams (platforms, holders of core user or irreplaceable industrial/training data) tend not to provide open, raw, autonomous access, and shift to controlled use or simply decline. The reason is structural. Once a downstream party is licensed to use data, the derivative data it produces is a *new object*: the upstream's *erga omnes* (对世) control over the raw data does not reach it, leaving the upstream — at most — a contractual claim against one counterparty. Hong then catalogues the uncertainties an upstream faces *ex ante*: some that attribution rules could touch but can't eliminate (qualification of the output, default ownership, good-faith of the processor, measurement of remedy), and some no rule can reach (combinatorial/unforeseeable value, undetectable misuse, the privity-and-insolvency chain, fusion and co-ownership, abstraction leakage into model parameters and learned skills, personal-information exposure, and counterparty hold-up). DCC's read for overseas counsel: this is the rigorous explanation of why Chinese data 'supply' is thin and why sandbox / privacy-computing structures dominate — defining a right does not supply the conditions to exercise it."
tags: ["data-property-rights", "data-operation-right", "data-economy", "three-rights-separation", "data-twenty-articles", "data-trading", "derivative-data", "privacy-computing", "academic-commentary"]
laws_cited: ["data-foundation-system-opinions", "common-data-terms-batch-2", "pipl", "dsl", "network-data-security-regulations", "data-property-rights-registration-guide-draft"]
domains: ["data-economy", "data-security", "personal-information"]
account: "wangan-xunluren"
original_title: "上游为何不愿对外经营数据？控制降级、衍生数据与不确定性下的经营决策"
original_author: "洪延青"
original_publication: "网安寻路人"
original_url: "https://mp.weixin.qq.com/s/1cOaLNxF6VO83Le-apCRdQ"
source_language: "zh"
---
> *Editor's Note — DCC.*
>
> This is DCC's summary and analysis — not a translation — of
> 《上游为何不愿对外经营数据？控制降级、衍生数据与不确定性下的经营决策》, the
> **third** study note by **Hong Yanqing (洪延青)** on his **网安寻路人** channel in
> his series on China's "separation of three rights" (三权分置) data-property
> framework. It follows
> [Two Paths for the "Right to Hold Data"](/posts/data-holding-right-two-paths/)
> (part one) and
> [When the "Right to Use Data" Goes External](/posts/data-use-right-externalization/)
> (part two). Where part two asked what externalising a use right *transfers*, this
> note asks the prior, more practical question: *will the upstream provide the data
> at all?* The original is linked at the foot; the framing for overseas counsel is
> ours.

## From "what operation transfers" to "whether it happens"

In the **"Data Twenty Articles" (数据二十条)** structure, the **Right to Operate
Data (数据经营权)** is the right to provide data externally — by transfer, licence,
capital contribution, or pledge — the analogue of *disposing* of tangible
property, meant to push data property out into the market. Part two showed that
what operation usefully hands over is **licensed use**, and that once the
downstream produces **derivative data (衍生数据)** a new object forms in its hands
and the upstream's control changes.

Hong's question here is one step earlier: under real conditions, **will an upstream
exercise its operation right and provide data outward at all?** His judgment: it
will, *but quite narrowly.* The upstreams that **rely on data to sustain a
continuing relationship** — the "control-dependent" type from part two (platforms,
holders of core user data, owners of high-value industrial or irreplaceable
training data) — tend **not** to provide open, raw, autonomous access. They turn to
controlled use, or decline. Not because they undervalue the data, but because
external operation forces them to face a cluster of **irreducible, mostly
structural uncertainties.**

## From licensed use to control degradation

The upstream's control over **raw data** has *erga omnes* (对世) effect: the data
sits with the upstream, a downstream must be authorised to use it lawfully, and
that control binds the world without needing a contract with any particular
person. The **derivative data** the downstream then produces, however, is a **new
object** on which the *downstream* — not the upstream — stands as creator. On the
prevailing view, derivative data's chain of succession from the source is severed,
and the downstream independently holds, uses, and operates it. So the upstream's
*erga omnes* control over the raw data **does not automatically extend** to
derivative data; against derivative data, the upstream has at most a **contractual
claim** that binds one counterparty.

This degradation holds **even if the contract is perfectly drafted and fully
enforced**, because it concerns the *nature* of the upstream's claim, not its
enforceability: *erga omnes* control reaches the raw data, not the derivative; over
the derivative, the upstream is at most a claimant against a specific party, not a
right-holder against the world. The upstream's position drops from an automatic,
world-binding right to a **per-item, counterparty-only contractual claim**.

And the degradation is **uneven**. For *abstract* derivatives — models, scores,
indices — it is most complete: the raw data no longer exists inside them, the value
has been extracted, and there is neither *erga omnes* control nor a way to recover
the extracted value. For derivatives that still *contain* the raw data, or are
*fused* from several parties' data, the upstream may keep a **co-holding** position
good against others (the official line on fusion is that each party may co-hold and
that external circulation in principle needs the other participants' consent).
Degradation is therefore most complete at the abstract end and least complete at the fusion end.

## Two scholarly fixes — both tilt downstream

The hard case is not where the parties agreed, but where the contract is **silent,
unclear, or breached**: who owns the derivative data, and can the upstream get it
back? Two representative approaches both target this gap, and Hong notes they
**converge** where it matters.

- A **law-and-economics** approach treats it as a conflict-allocation problem, using
  the **Calabresi–Melamed** property-rule / liability-rule framework to switch
  between rules as transaction costs and courts' valuation error change. Its baseline
  favours the **processor**: where transaction costs are low, a good-faith processor
  takes the derivative-data interest outright without compensation; it resists
  co-ownership (to avoid an anticommons), counts the processor's own input — and even
  the value of *other* parties' fused-in data — as processing value-add, and pushes
  the upstream's protection onto the *liability* side (an IP-style **compulsory
  licence fee** in place of unjust enrichment).
- A **doctrinal** approach reasons by analogy to the Civil Code's **accession (添附)**
  rules, treating derivative data as a new object independent of the raw data, with
  identification stacked on *substantial change + marked value increase +
  irreversibility*. Ownership follows agreement; absent agreement, it vests in the
  processor by contribution and "putting data to fullest use" (数尽其用) — and the
  processor takes the derivative right **even if it does not hold a use right in the
  raw data**, so that even illegally-scraped source data affects only *liability*,
  not the **attribution** of the new product. The upstream's protection splits into
  personality interests (always retained by the individual) and *property* remedies
  via unjust enrichment or tort.

Methodologically far apart, the two **agree on two points**: each reduces the
upstream's protection over derivative data from *erga omnes* control to a
**counterparty-only claim** (unjust enrichment, tort, or a compulsory fee); and each
defaults the residual interest, absent agreement, to the **downstream processor**,
justified by data's non-rivalry, the survival of the source data, and the incentive
to innovate. That is exactly the **control degradation** above, accepted as a
premise.

## Ex-post allocation vs. ex-ante participation

Both schemes solve the same thing — *the data has been provided, a dispute has
arisen, who gets the derivative and how much compensation* — and they do it finely.
But both **presuppose the data was shared.** The prior question is: under such a
regime, will the upstream provide the data **in the first place?**

Solving ex-post allocation does **not** solve ex-ante participation — and a
downstream-favouring default can make participation *worse*. The more the default
tilts to the processor, the larger the upstream's expected loss from operating; the
larger that loss, the more it declines to share; the less it shares, the less source
data the processor has. The rule **incentivises downstream utilisation while
suppressing upstream supply.** (The law-and-economics camp half-sees this — it warns
that over-discounting source-data interests causes under-investment in data — but the
more decisive margin is the upstream simply **refusing to share data it already
holds.**)
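
The feedback loop can be made concrete with a toy model. The sketch below is ours, not Hong's. It assumes three hypothetical upstreams that are offered the same licence fee but stand to lose different amounts if derivatives escape their control, and a single parameter `tilt` (0 to 1) for how far the default rule favours the processor. Every name and number in it is an illustrative assumption, not data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upstream:
    """A hypothetical data holder deciding whether to license its data."""
    name: str
    control_loss: float  # value lost if derivatives escape its control


def expected_payoff(fee: float, control_loss: float, tilt: float) -> float:
    """Licence fee minus the part of the control loss the default leaves unrecovered.

    tilt = 0.0: the default fully compensates the upstream after the fact.
    tilt = 1.0: the default vests everything in the processor.
    """
    if not 0.0 <= tilt <= 1.0:
        raise ValueError("tilt must be between 0 and 1")
    return fee - tilt * control_loss


def willing_suppliers(upstreams, fee, tilt):
    """Names of upstreams whose expected payoff from sharing is non-negative."""
    return [u.name for u in upstreams
            if expected_payoff(fee, u.control_loss, tilt) >= 0]


# Illustrative population: same fee on offer, different exposure to control loss.
UPSTREAMS = [
    Upstream("one-off dataset seller", control_loss=0.0),
    Upstream("industrial data holder", control_loss=150.0),
    Upstream("platform with core user data", control_loss=400.0),
]
FEE = 100.0

if __name__ == "__main__":
    for tilt in (0.0, 0.5, 1.0):
        print(f"tilt={tilt:.1f}: {willing_suppliers(UPSTREAMS, FEE, tilt)}")
```

With these assumed numbers, all three upstreams share at `tilt=0.0`, the platform drops out at `tilt=0.5`, and only the one-off seller remains at `tilt=1.0`. The same rule that makes the processor's position more secure removes the control-dependent suppliers first.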

## The uncertainties an upstream faces ex ante

Hong sorts them into two classes. **Class one** — attribution rules can engage, but
cannot eliminate them before the fact:

1. **Qualification.** Will the processed output count as *derivative data*
   (independent, vesting in the processor, leaving the upstream a compensation claim
   at most) or as *still the original data* (upstream interest intact)? It is binary
   and decisive — yet the test for derivative data is unsettled (one view requires
   substantial change + marked value increase + irreversibility; another makes marked
   value increase the core and demotes irreversibility to evidence), so *ex ante* no
   one can predict which side an output lands on.
2. **Default ownership.** Contracts are never complete; the gaps fall to default
   rules that are doctrinally divided and, in their firmer parts, tilt to the
   processor. Predictability does not cure unfavourable content.
3. **Subjective state.** Whether the processor acquired the source data in good or bad
   faith may or may not affect its ownership of the derivative, depending on the
   approach. And the *broader* the licence scope, the harder it is to find that the
   processor exceeded authorisation, so the more *easily* it qualifies as good-faith
   and takes the full derivative right.
4. **Remedy measurement.** Even winning yields a claim of **uncertain amount** —
   floating between a licence fee, profit share, and full disgorgement, benchmarked
   against IP licensing ratios. The upstream trades a definite, world-good position
   for an indefinite, counterparty-only claim.

**Class two** — no attribution rule can reach these, and they are the **main
deterrent**:

1. **Foreseeability and drafting.** Data value is *combinatorially emergent* — the
   most valuable use is often the downstream recombining the source with other data
   and models, **unforeseeable at signing**. You cannot pre-limit what you cannot
   foresee; and derivatives stack (second- and third-order derivatives sit beyond a
   clause that bound the first). The contract is **necessarily incomplete**, and its
   gap falls exactly where value and risk are highest.
2. **Discovery and tracing.** Whether the downstream trained a model, exceeded scope,
   or re-licensed is often unknowable to the upstream — derivative data is intangible,
   internal to the downstream, fusible, and can pass de-identification off as
   anonymisation. Hard to detect; hard to prove or trace after fusion and abstraction.
3. **Privity, chain, and payment.** A contract binds only the counterparty. If the
   counterparty passes the data on to a third party in breach, or a third party simply
   **scrapes** the downstream's data product, the upstream has no hold over that third
   party (and, on the doctrinal view, the scraper acquires a full derivative right).
   The counterparty may also go bankrupt or be acquired, leaving the upstream's claim
   worthless.
4. **Fusion and co-ownership.** Once the source is fused with others' data, whether the
   upstream keeps a position good against the world is **unsettled** — one view rejects
   co-ownership and vests in the processor; another excludes such products from
   "derivative data" via the irreversibility test.
5. **Abstraction leakage.** Even a contractual duty to **delete** the derivative
   dataset cannot recover the parameters a model has already learned or the skills the
   downstream's people have absorbed — that value has changed form, beyond any
   attribution rule or damages.
6. **Compliance and personal information.** If personal information is involved,
   "provision" extends compliance duties and joint exposure back to the upstream; and
   the upstream often cannot tell whether the downstream's derivative is truly
   anonymised — most "anonymisation" is **de-identification**, still personal
   information — so its exposure does not necessarily end on delivery.
7. **Counterparty strategy.** Once data is delivered, incentives shift: post-possession
   delay and renegotiation (hold-up); information asymmetry hides intent and capability
   ex ante; worst of all, the counterparty may use the capability built on the source
   data to **compete with the upstream**.

Across both classes: the deterrent uncertainties cluster in **class two**, which no
attribution-or-compensation scheme can touch; and where class-one rules *could*
engage, their tests are contested and tilt against the upstream. So **no allocation
scheme can eliminate, ex ante, the uncertainty that actually drives whether the
upstream provides the data.**

## The operation right contracts — within limits

Hence the operation right tends, in practice, to **contract**: control-dependent
upstreams avoid open, raw, autonomous provision. Hong adds three boundaries:

- **Contraction is not cessation.** Upstreams respond without needing omniscience —
  data sandboxes, privacy computing, "data does not leave the domain," federated
  modelling, strict purpose limits, and grant-back audits all **bound** the
  uncertainty with technology and contract, substituting **controlled use** for raw
  delivery. They do not stop providing *use*; they stop providing **use detached from
  control.**
- **Monetisation upstreams are excluded.** A one-off seller (a data broker, or the
  vendor in a dataset sale) bears no consequence from losing control over the buyer's
  derivatives; it has already realised the value in the price. The thesis targets only
  upstreams that mean to keep a **continuing relationship and control.**
- **Not sharing has a cost too.** Data depreciates; competitors may move first. So this
  is a **marginal, directional** claim — uncertainty raises the upstream's reservation
  price and shrinks the deals it will do, pushing it toward controlled forms, not a
  blanket refusal.
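
The marginal claim can be illustrated with a toy break-even calculation. This is our sketch under stated assumptions, not a formula from Hong's note: it treats the upstream's reservation price as the expected loss it cannot recover, and it assumes controlled-use structures lower the chance of misuse and raise the chance of detection. The function, parameter names, and probabilities are all hypothetical.

```python
def reservation_price(control_loss: float, p_misuse: float,
                      p_detect: float, recovery_rate: float) -> float:
    """Minimum fee at which sharing breaks even for the upstream.

    Expected unrecovered loss =
        P(misuse) * loss * (1 - P(detection) * share recovered once detected).
    """
    for p in (p_misuse, p_detect, recovery_rate):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities and rates must be between 0 and 1")
    return control_loss * p_misuse * (1.0 - p_detect * recovery_rate)


# Assumed terms for the same data under two delivery structures.
CONTROL_LOSS = 400.0
RAW_DELIVERY = dict(p_misuse=0.5, p_detect=0.2, recovery_rate=0.5)
CONTROLLED_USE = dict(p_misuse=0.1, p_detect=0.8, recovery_rate=0.5)

if __name__ == "__main__":
    for label, terms in (("raw delivery", RAW_DELIVERY),
                         ("controlled use", CONTROLLED_USE)):
        price = reservation_price(CONTROL_LOSS, **terms)
        print(f"{label}: minimum acceptable fee is about {price:.0f}")
```

Under these assumptions the break-even fee is about 180 for raw delivery and about 24 for controlled use. The upstream does not refuse outright; it prices raw delivery beyond what most buyers will pay, and the deals that still clear are the controlled ones.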

This is also why "but isn't the default rule there *to reduce* uncertainty?" doesn't
rebut the point. A default rule at most trims some *ex-post* allocation uncertainty;
its content is contested (so still unpredictable ex ante), and its predictable part
tilts downstream (so foreseeable loss of control does not raise the upstream's
willingness to provide). The real deterrents — foresight, detection, privity, fusion,
abstraction leakage, compliance, strategy — sit **outside the rules' range**. Default
rules govern *how to allocate after the fact*, not *whether to act beforehand.*

## Establishing a right is not guaranteeing its exercise

Hong's close ties the series together. The operation right is **conceptually clear** —
the right to provide data externally and move data property into the market. But a
clearly-defined right and an *exercised* one are two different things. Licensing use
forms a new object the upstream cannot reach, dropping its control from *erga omnes*
to a personal claim; placed in real conditions, the upstream then faces a layer of
irreducible, mostly structural uncertainty that the two scholarly fixes — however fine
on ex-post allocation — neither reach nor relieve, and that their downstream-tilting
defaults can worsen. So the operation right **contracts**: relationship-keeping
upstreams move to controlled operation, or decline.

It is one thread with the first two notes. **Holding** is thin, its boundary supplied
by behavioural norms; the **use** right is real but not self-sufficient, its boundary
supplied by contract and technology; the **operation** right's *exercise* depends on
the surrounding allocation of risk — which the three-rights modules cannot themselves
create or arrange. A framework can establish the **type** of a right; it cannot supply
the **conditions** to exercise it — and here those conditions, in contract, technology,
and public law, are not yet adequately supplied.

## Why overseas counsel should care

- **This is the rigorous answer to "why is Chinese data supply so thin?"** When a data
  exchange listing, a sourcing pitch, or an AI-training-data deal stalls on the
  *supplier* side, the cause is usually not price but **structural control loss**: the
  upstream cannot recover value once the data leaves its hands, and no contract fully
  fixes that.
- **Controlled access is the equilibrium, not a quirk.** Sandboxes, privacy computing,
  "data does not leave the domain," and federated modelling are the rational upstream
  response — design your China data projects to consume **outputs and model results**,
  not raw datasets (the same pattern across
  [part one](/posts/data-holding-right-two-paths/) and
  [part two](/posts/data-use-right-externalization/)).
- **If you are the upstream/licensor, price and bound the loss you cannot reverse.**
  Use grant-back, no-train/no-fusion, sub-licensing bans, output review, and audit —
  but assume detection and tracing will be costly, and put a price on the
  control you will lose rather than relying on recovery.
- **If you are the downstream/processor, your derivative work is comparatively
  well-positioned, but document it.** The default rules proposed so far tend to vest
  models, scores, and labels in the builder; still, record your value-add and lawful
  sourcing, because good-faith and scope-of-authorisation will decide marginal cases.
- **Don't read a clean three-rights label as a clean deal.** Defining holding, use, and
  operation does not, by itself, make data tradeable; the **risk-allocation plumbing**
  around the modules — contract, technology, PIPL/DSL compliance — is what determines
  whether a transaction actually happens.

## DCC sources

- **Original:** Hong Yanqing (洪延青),
  《上游为何不愿对外经营数据？控制降级、衍生数据与不确定性下的经营决策》, on the
  网安寻路人 channel —
  [mp.weixin.qq.com](https://mp.weixin.qq.com/s/1cOaLNxF6VO83Le-apCRdQ).
- **Series on DCC:** part one —
  [Two Paths for the "Right to Hold Data"](/posts/data-holding-right-two-paths/);
  part two —
  [When the "Right to Use Data" Goes External](/posts/data-use-right-externalization/);
  part four —
  [Data "Parallel Property Rights"](/posts/data-parallel-property-rights/).
- **Cross-references on DCC:** the [Data Twenty Articles](/laws/data-foundation-system-opinions/)
  (source of the three-rights structure) · the
  [Common Data Terms, Batch 2](/laws/common-data-terms-batch-2/) (official definitions of
  the operation right and derivative data) · [PIPL](/laws/pipl/) · the
  [Data Security Law](/laws/dsl/) · the
  [Network Data Security Regulation](/laws/network-data-security-regulations/) · the
  [draft Data Property Rights Registration Guidelines](/laws/data-property-rights-registration-guide-draft/).
- Part of the [data-economy](/domains/data-economy/) domain on DCC.

> This is an editorial summary and analysis of Hong Yanqing's commentary, written
> in DCC's own words for overseas readers — not a translation of his article, and
> not a reproduction of it. Quoted phrases are short and attributed; the full
> argument is his, at the link above. **Not legal advice.**