Web 3.0 And Data: What The World Needs

in LeoFinance, 24 days ago

A few months back, Reddit agreed to "sell" its data to an AI company for $60 million.

This gave us some insight into the value of data. This is something that is only increasing in scope. Of course, lately, much of the attention seems to be on compute. However, they truly go hand-in-hand.

It is impossible to have one without the other.

For this reason, society is going to have to expand its data output for the proliferation of AI to take place. While synthetic data does help, the quality degrades over time if that is all the models use.

The problem with the Reddit deal is the fact that very few companies can afford $60 million. Certainly, this is a rounding error for Big Tech. That said, it shows how limiting things can be.

It is a situation that is only getting worse.


Image created by Ideogram

More AI Companies Cutting Data Deals

Over the last couple days, we got evidence this is a growing trend.

Reddit announced another deal to "sell" its data. This time it is giving OpenAI access to train its models. No figure was given regarding the deal.

Sam Altman is a major shareholder in Reddit. It is important to note the incestuous relationships that are arising.

OpenAI was not done.

They also announced a deal with News Corp to use its articles for training.

This is a multi-year deal with, again, no terms reported.

Of course, we have to keep in mind that OpenAI is far from open. Here we have a company that is intent on taking control in the AI race while actively fighting for legislation that outlaws open AI models.

Does this sound like a good way to approach the future?

Altman's view, at least based upon what he claims, is that models must be closed to protect them from being duplicated by nefarious players. We simply cannot trust the general public with this. Overlooked is the question of whether someone like Altman can be trusted.

The history of Big Tech seems to conclude otherwise.

Feeding The Right Animal?

If we take a look at where most people focus their online attention, we obviously come up with names such as X, Facebook or Instagram, and YouTube. These are still, by far, the top social media entities.

Going one step further, the ones behind these platforms are Elon Musk, Meta, and Google. Over the past year, we can associate these with the names Grok, Llama, and Gemini.

Do you see a pattern?

There is a reason why a company like OpenAI is having to cut deals like this. It does not have a social media platform that is continually feeding it more data. The others do.

Hence, it is not only that data is valuable. A more important component is the accessibility. The reason the cost is so high is that the amount of data accessible to smaller entities is relatively small.

This is, perhaps, the greatest responsibility of Web 3.0.

Open Source Data

What makes a public blockchain different is that the data is available to all. Anyone can set up an API and "mine" the data. This is not the case with something like Reddit.
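To give a rough idea of what "mining" public blockchain data can look like, here is a minimal sketch that asks a public Hive node for the newest posts under a tag. The node URL, the choice of the `condenser_api.get_discussions_by_created` call, and the helper names (`build_request`, `extract_posts`, `fetch_posts`) are my assumptions for illustration, not something prescribed above; any public node or other read method could be swapped in.

```python
import json
import urllib.request

# Assumption: a public Hive JSON-RPC node. Any other public node could be used.
NODE_URL = "https://api.hive.blog"


def build_request(tag, limit=10):
    """Build a JSON-RPC payload asking for the newest posts under a tag."""
    return {
        "jsonrpc": "2.0",
        "method": "condenser_api.get_discussions_by_created",
        "params": [{"tag": tag, "limit": limit}],
        "id": 1,
    }


def extract_posts(response):
    """Keep only the fields that are useful as text data."""
    return [
        {"author": post["author"], "title": post["title"], "body": post["body"]}
        for post in response.get("result", [])
    ]


def fetch_posts(tag, limit=10, node_url=NODE_URL):
    """Send the request to the node and return the simplified posts."""
    data = json.dumps(build_request(tag, limit)).encode("utf-8")
    request = urllib.request.Request(
        node_url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_posts(json.load(reply))
```

Calling something like `fetch_posts("leofinance", limit=5)` would return a short list of posts with no key, login, or licensing deal involved, which is the contrast with a closed platform.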

That company understands the value of what it is holding. The deals being cut are all in the millions of dollars.

Basically, what we are looking at is the centralization of AI. While it is common for a technology to be centralized before moving towards a more decentralized state, what is going to foster that shift?

The answer is Web 3.0.

In fact, it is the only way for this to come about.

Consider the idea of the cost of training a model declining. This follows from the laws of IT. If training a model such as Llama 2 costs roughly $10K in a couple years, the field of companies that can generate something like that grows a great deal.

However, where are they going to get the data?

This is a part of the process that gets skipped. Even if the training is inexpensive, the access to data is not. In fact, with more entrants into the market, there is a good chance the prices increase even more.

We have an old saying about finding a problem and solving it. Here is a challenge the entire world is facing, as evidenced by the deals struck with these companies. Yet, if we looked at the application usage on people's phones, where is the focus going? Is it Web 2.0 or Web 3.0?

We all know the answer to this.

The issue there is we keep feeding the same animals. Elon and Zuckerberg will keep taking the data, buying more processors, resulting in AI as they see it. While I give them both credit for going the [open source](https://inleo.io/@leoglossary/leoglossary-open-source) route, we can see by the Reddit deals that it could be feeding those who seek to close everything off.

In my view, this is not what the world needs.



Posted Using InLeo Alpha


I really need to give credit to the Reddit team for getting all these deals done, and striking while the iron is hot. While I don't like it, I think it is a good business move. With how important data is, I can see companies and websites protecting theirs even more, and striking similar deals in the future.

They have something valuable and they know it. Became a true business as they moved closer to the IPO.

Can't fault what they did the last year.

Your post was absolutely eye-opening. I found your discussion on decentralization and user control over data particularly enlightening, man. It's really exciting if you think about the huge potential for greater privacy and empowerment. Thanks for sharing such valuable insights, bruv.

Altman's view, at least based upon what he claims, is that models must be closed to protect them from being duplicated by nefarious players.

Does this really even make sense? Say some bad guys get ahold of the model and the data used to train it. What then? They train the model to be a bit better, faster, whatever?

Or are they worried that somehow, someone else will take their model and develop AGI?

Meanwhile, there are some bad guys USING OpenAI's already trained model to do bad stuff. But that's OK because business.

This is obviously just to stifle competition, nothing more.

Yeah.

And even taking him at face value, we have to presume that he and OpenAI are trustworthy.

History shows Big Tech is anything but that.

Some call it Intelligence, for what it does is more like Archiving Indexes. A renamed spyware program, at least what Copilot is designed for.

Simulating a random processor like the human brain is technologically very far in the future, if ever. For sure LLMs are versatile, but it would be crazy to leave decision making to algorithms coded by people only motivated by money and educated to force everyone to think the same.

...educated to force everyone to think the same.

New age diversity of thought...as long as it is all the same.

AI in centralized hands is just a replica of the old business. Just as you said about trust to a man running the web2 pattern, it's always all about keeping all the goodies.

Let's forget about their fancy talk. Only wishing the web3 industry doesn't miss out while AI is still hot.

25 years of data breaches shows that Big Tech can't protect crap.

I guess that's not their main concern; 'eyes on the money'. Hahaha.

Yet data security happens to be what users are panting for.

It seems like most of us have our data on the dark web being sold repeatedly.

Everyone has been hacked.

Centralization of AI data impedes innovation. Open-source data and Web 3.0 can democratise access, paving the way for a more decentralised AI future.

That is my view.