What Is a Private Data Collection?

A private data collection combines two elements: the private data itself and a hash of that data.

Elements of Private Data Collections

  • Private data

    Private data is sent peer-to-peer via a gossip protocol only to the organizations authorized to see it. This data is stored in a private state database (often referred to as a side database) on authorized peers. The side database is kept separate from the database holding the channel ledger. Chaincode invoked on authorized peers can access private data. The ordering service never sees private data.

  • Private data hash

    A hash of private data is endorsed, ordered, and written to the channel ledger. A hash serves as evidence of a transaction and is used for state validation. A private data hash can also be used for audit purposes.
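The split between the two elements can be illustrated with a small model. This is a minimal sketch, not Fabric code: the `Peer` class, `write_private`, and `matches_ledger` are hypothetical names invented for illustration, and SHA-256 is assumed as the hash function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """Hypothetical stand-in for a peer; not a Fabric API."""

    def __init__(self, org: str):
        self.org = org
        self.channel_ledger = {}  # hashed key -> hashed value, held by every peer
        self.side_db = {}         # key -> plain value, held by authorized peers only


def write_private(peers, authorized_orgs, key: str, value: bytes) -> None:
    """Record the hash on every peer and the plain value on authorized peers."""
    for peer in peers:
        peer.channel_ledger[sha256_hex(key.encode())] = sha256_hex(value)
        if peer.org in authorized_orgs:
            peer.side_db[key] = value


def matches_ledger(peer, key: str, claimed_value: bytes) -> bool:
    """Audit check: does a disclosed value match the hash on the ledger?"""
    return peer.channel_ledger.get(sha256_hex(key.encode())) == sha256_hex(claimed_value)


peers = [Peer("Org1"), Peer("Org2"), Peer("Org3")]
write_private(peers, {"Org1", "Org2"}, "asset1", b'{"price": 100}')
org3 = peers[2]
print(org3.side_db)                                       # Org3 holds no plain data
print(matches_ledger(org3, "asset1", b'{"price": 100}'))  # yet it can verify a disclosed value
```

In this model, an unauthorized organization never stores the plain value, but it can still confirm a value disclosed to it later, which is the audit use mentioned above.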

When a chaincode references private data collections, the transaction flow differs slightly from the regular one. The changes protect the confidentiality of private data at the proposal, endorsement, and commitment stages.

More information about a regular transaction flow can be found in the Hyperledger Fabric: Components and Concepts Review chapter.

A transaction flow that involves private data collections proceeds as follows:

  1. A client application submits a transaction proposal to invoke a chaincode function (interacting with private data) to endorsing peers that belong to organizations authorized for the collection. Private data, or data used to generate private data in a chaincode, is sent in the transient field of the proposal.

  2. Endorsing peers simulate the transaction and store private data in temporary storage, also known as a transient data store, which is local to each peer. They propagate private data to other authorized peers via a gossip protocol.

  3. Endorsing peers send a proposal response back to the client. The proposal response includes an endorsed read/write set with public data, as well as hashed private data. No plain private data is sent back to the client.

  4. The client application submits the transaction (which includes the proposal response with private data hashes) to an ordering service. Transactions with private data hashes get included in blocks as normal. The block with private data hashes is distributed to all the peers. In this way, all peers on the channel can validate transactions with hashes of private data in a consistent way without knowing the actual private data.

  5. At block commit time, peers use the collection policy to determine whether they are authorized to access the private data. If they are, they first check their local transient data store to see whether they already received the private data at chaincode endorsement time. If the data was not received, peers attempt to pull it from other authorized peers. They then validate the private data against the hashes in the public block and commit the transaction and the block to their local ledger copy. Once validation and commitment are complete, private data is moved from the transient data store to the peer’s private state database.
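The steps above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Fabric's implementation: the `Peer` class, `lookup`, and `commit` are hypothetical, SHA-256 is assumed for hashing, and the "private data missing" outcome is a stand-in for whatever the real peer does when data cannot be obtained. To exercise the pull path, Org2 is assumed not to have received the data via gossip.

```python
import hashlib


def digest(value: bytes) -> str:
    """Return the SHA-256 digest of value as a hex string."""
    return hashlib.sha256(value).hexdigest()


class Peer:
    """Hypothetical peer model; the attribute names are illustrative only."""

    def __init__(self, org: str):
        self.org = org
        self.transient_store = {}  # tx_id -> private value received at endorsement time
        self.private_db = {}       # tx_id -> private value after commit
        self.ledger = {}           # tx_id -> hash of the private value


def lookup(peer, tx_id):
    """Return the private value a peer holds for tx_id, or None."""
    if tx_id in peer.transient_store:
        return peer.transient_store[tx_id]
    return peer.private_db.get(tx_id)


def commit(peer, tx_id, value_hash, collection_orgs, other_peers) -> str:
    """Model step 5: commit the hash, then resolve the private data if authorized."""
    peer.ledger[tx_id] = value_hash              # every peer records the hash
    if peer.org not in collection_orgs:          # collection policy check
        return "hash only"
    value = peer.transient_store.get(tx_id)      # received at endorsement time?
    if value is None:                            # otherwise pull from authorized peers
        for other in other_peers:
            if other.org in collection_orgs and lookup(other, tx_id) is not None:
                value = lookup(other, tx_id)
                break
    if value is None or digest(value) != value_hash:
        return "private data missing"            # simplified outcome for this sketch
    peer.private_db[tx_id] = value               # move to the private state database
    peer.transient_store.pop(tx_id, None)
    return "private data committed"


private_value = b'{"owner": "alice"}'
org1, org2, org3 = Peer("Org1"), Peer("Org2"), Peer("Org3")
org1.transient_store["tx1"] = private_value      # steps 1-2: the endorser stores the data
block_hash = digest(private_value)               # steps 3-4: only the hash is ordered
collection = {"Org1", "Org2"}
everyone = [org1, org2, org3]
results = {
    p.org: commit(p, "tx1", block_hash, collection, [o for o in everyone if o is not p])
    for p in (org2, org1, org3)
}
print(results)
```

In this run, Org2 pulls the data from Org1 and validates it against the hash, Org1 finds it in its own transient store, and Org3 commits only the hash. All three ledgers end up identical.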

When should you prefer private data collections over channels? A channel keeps the entire ledger confidential within a set of channel members. However, if you want to share transactions among a set of organizations while giving only a subset of them access to particular data within a transaction, private data collections are a better fit. Additionally, since private data is disseminated in a peer-to-peer manner that bypasses the ordering service, private data collections can be used to “hide” data from ordering service nodes.
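The difference in visibility can be expressed as a short comparison. This is an illustrative sketch only: `plain_data_viewers` and the `"OrderingService"` label are hypothetical names, and the model ignores everything except who sees plain data.

```python
def plain_data_viewers(channel_orgs, collection_orgs=None, orderer="OrderingService"):
    """Return the parties that see plain data under each approach.

    collection_orgs=None models ordinary channel data; otherwise the data is
    assumed to live in a private data collection.
    """
    if collection_orgs is None:
        # Ordinary channel data travels through ordering and reaches all members.
        return set(channel_orgs) | {orderer}
    # Collection data reaches only authorized members and bypasses ordering.
    return set(channel_orgs) & set(collection_orgs)


channel = {"Org1", "Org2", "Org3"}
print(sorted(plain_data_viewers(channel)))
print(sorted(plain_data_viewers(channel, {"Org1", "Org2"})))
```

The first call models data written to the channel as usual. The second models the same channel with a collection limited to Org1 and Org2, where neither Org3 nor the ordering service sees the plain data.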
