It's now increasingly easy to auto generate convincing content using various large language models. Earlier tools were pretty simple and too limited in their knowledge to generate convincing content.

Some examples are here:



[This GPT-3 based predictive text generator, known as ChatGPT, also has some data indexed about Hive.]

Problem statement

How do we address auto generated text that is neither plagiarized nor easily detectable when it is posted to the Hive blockchain for monetization?

In the past we had expensive bots like @cheetah and some others looking for plagiarized content. Now, with content generated by newer predictive text generation methods based on large language models, identifying it is not easy.

Here is a rather cute example of a subject about which content is generated:

Prompt: Please write an article which I can post to Hive blockchain. The article must be about the tip of a safety pin


Is there a way to identify auto generated content?

There is evolving work to identify auto generated content. OpenAI themselves have a tool published here. Other auto generators, including the older ones, will need different technologies and APIs.

Auto generated images

The same applies to auto generated images from AI generators such as Stable Diffusion, DALL·E and many others.


  • Do we, as Hive, have a tool to identify auto generated content?
  • Are we planning to allow such content, i.e. auto generated content?

I think it would be unwise to fight it with something like a zero-tolerance policy, but a way to detect AI generated posts would be great, as then we could see and maybe sort out or block AI generated posts if we do not want to see them. Just my 21 cents, without too much insight into the matter as of now. Thanks for raising an important subject! :)

Yeah, it's a complex topic and we need to formulate plans around it.

I think in the end it comes down to reputation and pseudo-identity building. If users are willing to risk it, at some point they will eventually be discovered. ... The thing about Hive is that it's a long-term play...

If someone is writing content that is entirely auto generated, then yes. But what if only excerpts of the content are AI generated? If so, how much is allowed? How much is too much?

I use AI often to help generate photos and help with the writing process, and it's common practice with many more advanced publishing outlets. I think it's something to be embraced because the genie is out of the bottle. I think it's important to let users know that you use AI to help generate some of your content and for assistance with the writing process.

Thanks for your response.

I use AI often to help generate photos and help with the writing process, and it's common practice with many more advanced publishing outlets.

This is where the above is going to get tricky. The images that you have created are essentially a mashup of other artwork or photographs by individuals. When those individuals published or made such work available online, what conditions did they set for re-use or remix of the work?

To take my personal experience, I shared this particular photograph under a Creative Commons Attribution license in 2007, and it is available here on Wikipedia.

Now the same image is used by multiple AI image generation tools, and only Stable Diffusion has provided a means to report and remove the image from their database. Basically, I don't even have an easy way to find out who is violating the license under which the art or photograph is shared.

Informing users is definitely needed, but beyond that the rights and livelihoods of artists and content creators must also be protected. Platforms like ours, for example, are established to help content creators, and it's high time we had more clarity on next steps.

I have contributed quite a few of my photos to AI without any strings attached, but ideally it would be cool to train a local AI with all the original photos I have taken. I know you can run some AI tools locally, utilizing multiple GPUs to render content. It's something I am looking into, but the hardware requirements for fast results could be prohibitive.

Training on local images is a cool idea - can help to bypass the hardware requirements, though it's paid.

