Decentralized storage: a solution for hybrid data management

Blockchain technology, smart contracts, data security, cost reduction, resilience: decentralized storage has many qualities and is a real magnet and engine for innovation. Used alongside traditional storage, it can be a relevant solution for individuals and businesses.

Towards decentralization of data storage

Attractive in its simplicity, centralized online storage is used extensively by businesses, organizations and individuals. The hundreds of petabytes of data passing through it (1 petabyte = 1,000 terabytes, or 1,024 in the binary convention) are rarely encrypted.

Access to this data depends on the proper functioning and the goodwill of the operator that stores it. The data is gathered in data centers, veritable data factories that are exposed to various risks that could affect its security and durability.

To reduce these risks, one solution is to increase the number of data centers. However, this only decentralizes data storage from a geographical point of view. Storage itself remains centralized, because it stays in the hands of a limited number of actors.

Although French providers such as Orange Business Services, Scaleway, 3DS Outscale and OVH offer efficient solutions, online data storage is for the most part managed by a predominantly American oligopoly (Alphabet, Amazon, Apple, Dropbox, Facebook, Microsoft, Oracle, …).

This concentration carries risks: compromise of data, and threats to freedom of expression (censorship), to legal and economic security, and even to sovereignty. Beyond resilient access to data, one of the challenges of decentralized storage is to move storage out of the hands of a small group of actors and into those of a very large number of actors, none of whom can hold a monopoly over, or inspect, the data they store.

Many definitions of decentralized storage exist, some more precise than others. Put simply, decentralized storage is a system that shares the keeping of computer data among many independent operators.

Together, these independent operators form a sort of peer-to-peer storage network where data exchanges (storage and retrieval) can take place directly between the different computers connected to the system, without passing through a central server. Operating as a network and encrypting the data makes attacks harder and drastically reduces the risk. Attacking a single operator is possible; attacking a network of thousands is also possible, but technically and economically much harder to sustain.

The reliability and security of this type of storage are directly correlated with the number of operators using the network. By multiplying the number of operators, a decentralized storage network achieves economies of scale: anyone (companies or individuals) can participate as a storage provider (host) and monetize the free space on their hard disk.

How does it work?

A form of cloud-computing infrastructure as a service (IaaS), decentralized storage networks rely heavily on blockchain technology, an essential condition for their security, reliability and autonomy. Concretely, here is one way a decentralized network can be used.

When a user (tenant) wishes to store files online in a decentralized manner, they must use the services of a decentralized storage organization (company, foundation, association). The organization provides them with software (generally open source) to access a decentralized network. This software “anonymizes” the tenant (although in certain cases the term “pseudonymization” would be more appropriate), then splits their files into segments (or fragments) before they are uploaded to the network.
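The splitting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment size, the function names and the idea of identifying each segment by its SHA-256 digest are hypothetical choices made for the example, not the behavior of any particular storage client.

```python
import hashlib

SEGMENT_SIZE = 4  # bytes; deliberately tiny so the example stays readable (assumed value)

def split_into_segments(data: bytes, size: int = SEGMENT_SIZE) -> list[bytes]:
    """Cut data into consecutive segments of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]

def segment_id(segment: bytes) -> str:
    """Identify a segment by the SHA-256 digest of its content (assumed scheme)."""
    return hashlib.sha256(segment).hexdigest()

original = b"hello decentralized world"
segments = split_into_segments(original)
# Ordered list of identifiers the tenant keeps in order to rebuild the file later.
manifest = [segment_id(s) for s in segments]
```

Joining the segments in manifest order gives back the original bytes, which is what makes later reassembly possible.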

These segments are encrypted and copied many times (redundancy), then distributed randomly to different hosts. This distribution over the host network is governed by smart contracts which set prices, availability commitments, and other aspects of the relationship between tenants and hosts. These smart contracts are then executed on the blockchain of the network supported by the decentralized storage organization. This allows the segments of tenant files to be traced and validated.
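As a rough sketch of the redundancy and random distribution described above, the Python snippet below places a fixed number of copies of each segment on distinct, randomly chosen hosts. Encryption is left out on purpose: the segments are treated as opaque bytes that have already been encrypted. The host names, the `distribute` function and the number of copies are assumptions made for the example.

```python
import random

def distribute(segments: list[bytes], hosts: list[str], copies: int,
               rng: random.Random) -> dict[str, list[tuple[int, bytes]]]:
    """Place `copies` replicas of every segment on distinct, randomly chosen hosts."""
    if copies > len(hosts):
        raise ValueError("not enough hosts for the requested redundancy")
    placement: dict[str, list[tuple[int, bytes]]] = {h: [] for h in hosts}
    for index, segment in enumerate(segments):
        # rng.sample never returns the same host twice for one segment.
        for host in rng.sample(hosts, copies):
            placement[host].append((index, segment))
    return placement

hosts = [f"host-{n}" for n in range(6)]            # hypothetical host names
encrypted_segments = [b"seg0", b"seg1", b"seg2"]   # stand-ins for ciphertext
placement = distribute(encrypted_segments, hosts, copies=3, rng=random.Random(42))
```

In a real network the placement would also be constrained by the terms of the smart contracts (price, availability commitments), which this sketch ignores.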

Based on blockchain technology, smart contracts are generated and applied automatically on the network. To execute, smart contracts need proof validated by the blockchain. For example, smart contracts may require proof of storage, proof of space, proof of conservation (PoH), proof of service (PoSe), proof of spacetime, proof of replication, etc. These methods do not require any intermediary to function. If the proof appears on the blockchain within a certain period of time, the host is automatically paid. Otherwise, the host is penalized.
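The pay-or-penalize logic can be illustrated with a deliberately simplified challenge-response check in Python. It is a sketch, not any network's actual proof scheme: here the verifier computes the expected answer from its own copy of the segment, whereas real proof-of-storage protocols are designed so that the verifier does not need the data. The function names, the integer "time" values and the price and penalty amounts are all assumptions.

```python
import hashlib
import os
from typing import Optional

def prove(segment: bytes, nonce: bytes) -> str:
    """Host side: answer a challenge by hashing the nonce with the stored segment."""
    return hashlib.sha256(nonce + segment).hexdigest()

def settle(expected: str, proof: Optional[str], submitted_at: int, deadline: int,
           price: int, penalty: int) -> int:
    """Contract side: return the host's balance change for one challenge."""
    on_time = proof is not None and submitted_at <= deadline
    if on_time and proof == expected:
        return price       # valid proof inside the window: the host is paid
    return -penalty        # missing, late or wrong proof: the host is penalized

segment = b"encrypted segment bytes"
nonce = os.urandom(16)                 # fresh random challenge for this round
expected = prove(segment, nonce)       # reference answer (simplification, see above)

honest = settle(expected, prove(segment, nonce), submitted_at=5, deadline=10,
                price=3, penalty=7)
late = settle(expected, prove(segment, nonce), submitted_at=11, deadline=10,
              price=3, penalty=7)
cheater = settle(expected, prove(b"something else", nonce), submitted_at=5,
                 deadline=10, price=3, penalty=7)
```

The fresh nonce in each round is what prevents a host from caching an old answer and discarding the data.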

When a tenant wants to access their files, they send a request to the network, which can use distributed hash tables to locate all the segments and put them back together. The tenant who downloads them uses their own private key to validate ownership. Redundancy (multiplication of segments) makes it possible to completely recover a user’s original file with only a few segments (in a peer-to-peer fashion, without going through a central server). A guarantee of resilience, this means that even if one or more network operators are offline, data access is still possible.
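A minimal Python sketch of this retrieval step follows. An ordinary dictionary stands in for the distributed hash table, which in a real network would itself be spread across the peers. Decryption with the tenant's private key is omitted. The host names, the storage layout and the `retrieve` function are hypothetical.

```python
import hashlib

def sid(segment: bytes) -> str:
    """Segment identifier: SHA-256 digest of the content (assumed scheme)."""
    return hashlib.sha256(segment).hexdigest()

segments = [b"dece", b"ntra", b"lize", b"d"]
manifest = [sid(s) for s in segments]    # ordered ids kept by the tenant

# Each host stores two replicas, keyed by segment id (hypothetical layout).
stores = {
    "host-a": {manifest[0]: segments[0], manifest[1]: segments[1]},
    "host-b": {manifest[1]: segments[1], manifest[2]: segments[2]},
    "host-c": {manifest[2]: segments[2], manifest[3]: segments[3]},
    "host-d": {manifest[3]: segments[3], manifest[0]: segments[0]},
}

# Stand-in for the distributed hash table: segment id -> hosts holding a copy.
locator: dict[str, list[str]] = {}
for host, store in stores.items():
    for key in store:
        locator.setdefault(key, []).append(host)

def retrieve(manifest, locator, stores, online):
    """Rebuild the file from whichever online hosts hold each segment."""
    parts = []
    for key in manifest:
        for host in locator.get(key, []):
            if host in online:
                segment = stores[host][key]
                if sid(segment) == key:    # integrity check against the id
                    parts.append(segment)
                    break
        else:
            # No online host returned a valid copy of this segment.
            raise LookupError(f"no online copy of segment {key[:8]}")
    return b"".join(parts)

# Two of the four hosts are offline, yet every segment still has a live copy.
data = retrieve(manifest, locator, stores, online={"host-b", "host-d"})
```

With only `host-a` online, the same call would raise `LookupError`, because no remaining copy of the third segment could be reached. Redundancy tolerates some offline hosts, not all of them.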