Hive: Democratizing Data

On the technology path we are on, the importance of data is evident. This was likely always the case, but the battles over it are now taking place in public. Big Tech is hunting for more data because its neural networks must be fed in ever-growing amounts. There is also debate over whether synthetic data degrades, making it less valuable over time.

A number of lawsuits have been filed against OpenAI. The accusation is that the company trained its models on data that was under copyright protection. In other words, OpenAI did not have the rights to utilize it.

How the courts will rule remains to be seen. This is uncharted ground with little to no case law.

What we do know is that Big Tech can largely do what it wants. OpenAI is backed by Microsoft, hence legal fees are not an issue. The same is true for Google, Elon Musk, and Meta. If they decide to venture into legally uncertain territory, they can endure years of litigation.

Smaller entities cannot do that.

For this reason, we have to seriously consider where our data is going.


Image generated by Ideogram

The Democratization of Data

We know how much of our digital world is controlled by the silos. There are a number of companies that are watching everything we do. Our entire existence, in this realm, operates with their permission.

Whatever we are doing requires us to log on to something. Each application or network is giving us approval. If they say no, access is denied.

Of course, while we are operating, tracking is taking place. This feeds an enormous amount of data into the system. Each time we click on another page on Facebook, Zuckerberg's system is watching. The same is true for your viewing on Netflix, movements with your phone, and, now, the speed of your automobile.

The key here, for this conversation, is that the data created is proprietary. It does not belong to you. Technology companies decided long ago that, if you are on anything relating to their domain, what you generate is theirs. It is something we all agree to via the 200-page terms of service we approve.

While a lot of this operates in the background, much data generation is voluntary. We cannot really counter the data that our phone carrier collects. Switching companies only alters the server system where the data goes.

That said, when it comes to actions such as social media, we volunteer. Nobody is forced to post on X or Instagram. The same is true for watching Netflix. These are choices we consciously make.

One could state there isn't much in the way of alternatives and, to a degree, that is correct.

However, we do see some options opening up.

Permissionless Blockchains

Blockchain does provide a solution that is worthy of consideration.

When dealing with a permissionless network, a couple of things stand out. To start, anyone can write to the network. There is no approval required other than having the key to authorize the transaction.

The second component is that the data is stored in a decentralized manner. For many of these systems, the data is replicated on many servers, operated by unrelated entities.

With something like a bank, as an example, all the transactions are recorded on different servers that are under the control of that entity. Hence the data is totally centralized in terms of its control.

Contrast this with Bitcoin, where the transactions are recorded in a database that is run by each block producer. Hence, anyone with the proper equipment can run the software, i.e. maintain and update the database.
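To make the replication idea concrete, here is a minimal sketch in Python. It is a toy model and not how Bitcoin or Hive is actually implemented: the `Node` class, the `block_hash` helper, and the sample transactions are all made up for illustration. It only shows that when unrelated operators each hold a full, hash-linked copy, they end up with the same record, and a copy that has been altered no longer validates.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash, transactions):
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Node:
    """One independent operator holding a full copy of the ledger (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.chain = []

    def append_block(self, transactions):
        prev = self.chain[-1]["hash"] if self.chain else GENESIS
        txs = list(transactions)  # each node keeps its own copy
        self.chain.append({"prev": prev, "txs": txs, "hash": block_hash(prev, txs)})

    def is_valid(self):
        """Recompute every hash; any edited block breaks the chain."""
        prev = GENESIS
        for block in self.chain:
            if block["prev"] != prev or block["hash"] != block_hash(prev, block["txs"]):
                return False
            prev = block["hash"]
        return True


# Three unrelated operators record the same two blocks.
nodes = [Node("alice"), Node("bob"), Node("carol")]
for txs in (["a pays b 1"], ["b pays c 2"]):
    for node in nodes:
        node.append_block(txs)

# Every honest copy ends on the same hash.
all_agree = len({node.chain[-1]["hash"] for node in nodes}) == 1

# One operator rewrites history; that copy no longer validates.
nodes[1].chain[0]["txs"] = ["a pays b 1000"]
tamper_detected = not nodes[1].is_valid()
```

The point of the sketch is the contrast with the bank example: no single entity holds the only copy, so no single entity can quietly change or withhold the record.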

The Bitcoin network mostly records financial transactions. This is data that doesn't have a great deal of value for training purposes. Fortunately, this is only one type of database that we see with blockchain.

Hive offers a more powerful opportunity in this regard. It is a blockchain that can actually democratize data due to the type of data which can be stored.

Unlike those with just financial transactions, Hive can natively store text. This means much of the Internet data that is used in LLM training can be written to this type of network.

In other words, it provides decentralized storage of text data that anyone who sets up an API can access. The entire database is available to anyone.
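As a rough sketch of what that access looks like, the Python below builds a JSON-RPC request for a single post and, if you choose to call it, sends the request to a public API node. Some assumptions to flag: `https://api.hive.blog` is just one public node picked for the example, `condenser_api.get_content` is one of several read methods Hive nodes expose, and the author and permlink values are placeholders rather than a real post. The helper names are my own.

```python
import json
import urllib.request

# Assumption: one public Hive API node; any node exposing the same API would do.
HIVE_API_NODE = "https://api.hive.blog"


def build_get_content_request(author, permlink, request_id=1):
    """Build the JSON-RPC body asking a node for one post."""
    return {
        "jsonrpc": "2.0",
        "method": "condenser_api.get_content",
        "params": [author, permlink],
        "id": request_id,
    }


def fetch_post_body(author, permlink, node_url=HIVE_API_NODE, timeout=10):
    """Send the request and return the post text (requires network access)."""
    data = json.dumps(build_get_content_request(author, permlink)).encode("utf-8")
    request = urllib.request.Request(
        node_url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["result"]["body"]


# Placeholder identifiers; swap in a real author and permlink before fetching.
payload = build_get_content_request("example-author", "example-permlink")
```

Nothing here needs an account, an API key, or anyone's approval. Reading is open to whoever can reach a node, or to whoever runs their own.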

No Direct Transaction Fees

A number of networks offer the same potential. There is one major difference: transaction fees.

There is a cost of writing to a server. Regardless of the structure, compute requires money. When we are dealing with the major technology companies, this is covered through the use of the data, mostly processing it and selling advertising.

In the blockchain world, most simply charge a transaction fee. This covers, among other things, the cost of writing to the database.

When conducting a financial transaction, paying a fee is simply part of the deal. Most are accustomed to this. However, when we leave a comment on YouTube or Facebook, that is not the case.

Hive offers a similar structure to those platforms by having no direct transaction fees. Individuals are free to write as much as they desire to the chain, as long as they have enough stake. This eliminates the direct costs and enables posting much as it works on other networks.
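The "enough stake" condition can be pictured as a rechargeable allowance rather than a fee. The Python below is a simplified toy model, not Hive's actual accounting: the class, the function names, and every number (credits per unit of stake, the recharge window, the cost per byte) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass

# All constants are hypothetical, chosen for illustration only.
CREDITS_PER_STAKE = 1000          # allowance granted per unit of stake
REGEN_SECONDS = 5 * 24 * 3600     # time for an empty allowance to refill
COST_PER_BYTE = 1                 # credits consumed per byte written


@dataclass
class Account:
    stake: int
    credits: int


def max_credits(account):
    return account.stake * CREDITS_PER_STAKE


def regenerate(account, elapsed_seconds):
    """Refill the allowance linearly over time, capped at the maximum."""
    gained = max_credits(account) * elapsed_seconds // REGEN_SECONDS
    account.credits = min(max_credits(account), account.credits + gained)


def try_write(account, text):
    """Spend allowance instead of money; refuse if there is not enough left."""
    cost = len(text.encode("utf-8")) * COST_PER_BYTE
    if cost > account.credits:
        return False
    account.credits -= cost
    return True


account = Account(stake=1, credits=1000)
post = "x" * 600                               # a 600-byte post

first_ok = try_write(account, post)            # 1000 -> 400 credits
second_ok = try_write(account, post)           # refused: 600 > 400
regenerate(account, REGEN_SECONDS // 2)        # half the window: +500 credits
third_ok = try_write(account, post)            # 900 -> 300 credits
```

No money changes hands at any point. The stake is not spent; it only sets how fast the allowance refills, which is what lets posting feel like commenting on a regular social network.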

This provides us with a powerful decentralized storage system.

Necessary For Humanity

Does a future with mega-tech controlling everything sound appealing to you?

We already see how the entire Internet is a siloed system. A few major platforms control most of the activity that takes place.

As technology gets more powerful, we are seeing this power move even deeper in their favor. LLMs are an example of how large amounts of compute coupled with massive datasets are necessary. Over time, compute costs come down, at least on a per-operation basis. Data, however, is still a problem.

The story of data is that those who have it are much better served than those without. Eventually, all the existing public data will be scraped and companies will need to access the private stuff. This will require bringing money to the table.

For Google and Amazon, no problem. Smaller entities are not so lucky.

A blockchain such as Hive can aid in this endeavor. The value of a network of this nature going forward is the democratization of the data. We can see an obvious need as the technology companies lock down (restrict) access to their network data.

Decentralized blockchains that are permissionless can help to offset this somewhat. The key is to focus the activity in a manner where storage of public information is not under the traditional client-server system. Even something such as Wikipedia is still controlled to a large degree under that model. Who controls the server access?

Hive offers something different. All the text written to the database is accessible. It cannot be locked away or kept from anyone. Companies, large and small, could pull the data for their own purposes.

This is democratization in practice.



Posted Using InLeo Alpha


No doubt big bro, Hive's approach to decentralized data storage seems promising. It could potentially shift power dynamics by allowing broader access to information, countering the dominance of tech giants like Facebook and YouTube. The data belongs to us and it should be that way.

It is something we all agree to via the 200 page terms of service we approve.

Which I did not read and just checked the I agree.😁

I thought Hive was all about blogging and earning, but when I explored more of the platform, it offered more than that. Just like how you shared a great knowledge about technology and crypto. Through this, I learned beyond what I see in the environment. My interest in blockchain has been awakened. Truly, Hive offers something different.😊

I see web2 as more of a 'tenancy data' game. It's not your house although you are living in it. And as you said, big companies can always pay for their damages as long as they have their hands on what matters most.
That's where Hive and other web3 platforms will step in. The statement will be, 'you can't pay fines because you can't even break rules'. Hoping Hive continues to make these small impacts that will add up over time. Thanks for the article, friend.

The closing remark was indeed a clear summary of your point. Democratization of data is a very interesting concept, and one that seems to stand against big data watching each and every one of our actions. I do hope more people get enlightened by this.

Thank you for this post which makes us understand how we are controlled today and how important personal data is. Hive is certainly better than Facebook

I agree. Information that is available to all is good for the public. I can understand why some are behind paywalls, since they required a lot of research and effort, as well as to pay for their operating costs. But that is why Hive is good. They can switch to Hive and earn from their posts. No need for paywalls.