HiveWiki is a specialized Hive interface. It shows an archive of informational articles with interconnected links. It is different from blog-centric interfaces like Hive.blog/PeakD. All of the content lives in the Hive blockchain, in posts visible on other interfaces, too.
It's nice to see a project going on where I actually have some expertise in the field being discussed. I've been on the web since before there was a web, I've been working with wikis and the predecessors to wikis for most of my professional life, and beyond that I spend a lot of time thinking about and working with information management systems. So this is absolutely in my space, and I have a lot of questions lined up which all go to "how to make it go."
The first question, one that's absolutely required, is whether the wiki is intended to be specifically about Hive and the underlying blockchain technology, or a general-purpose wiki that everybody on the blockchain can use freely. Either one is an acceptable answer (though the latter is far more interesting).
We start getting into the weeds pretty quickly. Consider this:
To enable multiple users to edit the same article, it is a system where the newest version wins. An editor re-posts the entire article with edits, and annotates the post to indicate what they changed, wrapping the section in {{ }} double brackets. Brand new articles are exempt.
I don't know if you have ever been involved in any of the edit wars on Wikipedia, but they can get pretty ugly. Given our experience in the Whale Wars on the Steem blockchain, there's no reason to think that they would be any less nasty here. With a mechanism that is as simple as "newest version wins," I can guarantee you in under 10 minutes someone is going to have put together a bot that monitors a given page/tag and simply reposts their preferred version of the page anytime it pops up. It's trivial and there are a lot of people who would have both the capability and interest in doing so.
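To make the concern concrete, here is a minimal sketch of what a newest-wins rule amounts to. The `Revision` record and `live_version` function are hypothetical names for illustration, not anything from HiveWiki. The point is only that the rule looks at nothing but the timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    author: str
    timestamp: int  # block time in seconds (assumed representation)
    body: str


def live_version(revisions):
    """Return the revision a newest-wins interface would display."""
    return max(revisions, key=lambda r: r.timestamp)


history = [
    Revision("alice", 100, "Careful, sourced article."),
    Revision("bob", 200, "Bob's preferred version."),
    Revision("alice", 300, "Careful, sourced article."),
    # A reposting bot only has to answer every edit with a later timestamp.
    Revision("bob", 301, "Bob's preferred version."),
]
winner = live_version(history)
```

Whoever posts last is shown, regardless of content quality or how many people preferred the other version.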
(To be evenhanded, imagine how much fun you're talking about with pages for both [[@berniesanders]] and [[@themarkymark]]. Or even a page for [[Whale Wars]]. Or [[Justin Sun]].)
This is a real problem, and it's one that non-blockchain wikis deal with a lot, generally by simply restricting who is capable of writing to the wiki. That's not really an option here, given the setup. A DPoS solution would imply some sort of voting, where the most recent, highest-voted branch edit would become the live page, but I think we all know what the issue with that is. It doesn't really solve the problem.
On top of that, there is the issue of continually saving information that you don't need. With a wiki page edit, all you really need to save is the diff between the original page and the edited one. If there is a 32 KB original page and I change one word, it's really silly for me to repost the entire payload when all I need to post is a header that says this is a diff for a wiki page, followed by the diff itself, and that's it. For any wiki that gets a lot of use, this is going to be a big deal.
It might be an even more serious problem when taken together with the branch priority issue I mentioned before.
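A rough way to see the size difference, using Python's standard `difflib`. The page content here is synthetic (about 31 KB of numbered lines) and the `make_diff` helper is a hypothetical name. It is a sketch of the idea, not a proposed storage format.

```python
import difflib


def make_diff(old: str, new: str) -> str:
    """Return a unified diff between two revisions of a page."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile="previous",
            tofile="edited",
        )
    )


# Synthetic page of about 31 KB, then a one-word edit on a single line.
original = "".join(
    f"Line {i}: some article text about the topic.\n" for i in range(700)
)
edited = original.replace("Line 350: some", "Line 350: revised", 1)

patch = make_diff(original, edited)
full_size = len(edited.encode("utf-8"))  # bytes if the whole page is reposted
diff_size = len(patch.encode("utf-8"))   # bytes if only the diff is posted
```

The diff carries the changed line plus a few lines of context, so it stays a few hundred bytes no matter how large the page grows.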
Not good.
Initially edit annotations would be loosely enforced and later on enforcement checks the diffs between revisions. Single word edits are possible but ideally less incentivized compared to meaningful edits. HiveWiki built-in editor annotates automatically & lists prior contributors.
That's an easy thing to drop in philosophically – but it's something that needs to be built-in mechanically from the beginning, if you ever intend to have it. Also, as described, this creates a perverse incentive to absolutely and completely deface a page rather than making edits for grammar and punctuation, for example. As described, this would reward people who aggressively change the content of the wiki as opposed to those who refine it. (This is a subset of the problem of trying to reward people for contributing to the site. It's trying to reward the wrong behavior. Editing a page shouldn't be rewarded. Editing a page that no one else has edited subsequently or only you have edited subsequently, meaning the content is stable, should probably be rewarded because what you want is a stable repository of information that is only updated when there's something new to say. The entire idea of being rewarded for interaction with the system may simply be wrongheaded and misdirected for this purpose.)
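One way to express "reward stability, not activity" is sketched below. The one-week window, the `Edit` record, and the `stable_edits` function are all assumptions made for illustration. The rule is only a guess at how the idea could be made mechanical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    author: str
    timestamp: int  # seconds


STABILITY_WINDOW = 7 * 24 * 3600  # hypothetical: one week


def stable_edits(edits, now):
    """Return edits that no *other* author changed within the window."""
    ordered = sorted(edits, key=lambda e: e.timestamp)
    rewarded = []
    for i, edit in enumerate(ordered):
        if now - edit.timestamp < STABILITY_WINDOW:
            continue  # too recent to judge
        overwritten = any(
            later.author != edit.author
            and later.timestamp - edit.timestamp < STABILITY_WINDOW
            for later in ordered[i + 1:]
        )
        if not overwritten:
            rewarded.append(edit)
    return rewarded


DAY = 24 * 3600
edits = [
    Edit("alice", 0),
    Edit("bob", 1 * DAY),    # replaces alice's edit after one day
    Edit("bob", 2 * DAY),    # bob refining his own edit
    Edit("carol", 20 * DAY),
]
result = stable_edits(edits, now=30 * DAY)
```

Under this rule an edit that gets overwritten quickly earns nothing, which removes the payoff for defacing a page or fighting an edit war.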
Plagiarism and abuse are problems for this system. Editing existing pages is encouraged while copying from other sources without attribution is discouraged. Wiki submissions will be excluded for not meeting certain criteria.
This comes down to, "that's nice, but how can you possibly enforce it?" It is, after all, a blockchain specifically predicated on the idea (if not fact) that ideas can't/shouldn't be censored or controlled by a central authority. Or, more succinctly, "who makes this decision and why should I accept it?"
Either everyone can contribute and plagiarism and abuse are inevitable, in which case the tools for controlling them need to be devolved to the individual experiencing them, with really the only/best power being to stop seeing them for that person and not for everyone, or you accept centralized, top-down control with a group or individual making the decision of what gets seen.
There are ways to implement distributed webs of trust which allow individuals to say who they trust and then lens pages through that web so two different individuals might get very different perceived contents, but that seems to be the only reasonable outcome if you want to stick to decentralized/non-authoritarian methodologies.
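A minimal sketch of that kind of lensing follows. It assumes trust is a simple "who I trust" adjacency map and that trust extends two hops. The function names, the two-hop limit, and the dictionary layout of a revision are hypothetical choices, not an existing Hive feature.

```python
from collections import deque


def trusted_set(trust, viewer, max_hops=2):
    """Collect accounts reachable from `viewer` within `max_hops` trust links."""
    seen = {viewer}
    queue = deque([(viewer, 0)])
    while queue:
        account, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in trust.get(account, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return seen


def page_as_seen_by(revisions, trust, viewer):
    """Return the newest revision written by someone in the viewer's web of trust."""
    trusted = trusted_set(trust, viewer)
    visible = [r for r in revisions if r["author"] in trusted]
    return max(visible, key=lambda r: r["timestamp"], default=None)


trust = {"alice": ["carol"], "bob": ["dave"], "carol": ["erin"]}
revisions = [
    {"author": "erin", "timestamp": 1, "body": "Erin's draft"},
    {"author": "dave", "timestamp": 2, "body": "Dave's rewrite"},
]
```

With this data, `alice` sees Erin's draft (trusted through `carol`) while `bob` sees Dave's rewrite, and a viewer who trusts nobody sees no page at all.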
HiveWiki exclusion criteria include:
- global blacklist API for spam networks
- post is missing the tags mentioned above
- post fails to correctly annotate the section of the document changed
- other forms of abuse I haven’t thought of
We've already dealt with this question in the context of the Steem/Hive blockchain, but who decides what goes into the global blacklist API? That is a point of centralization. How do you determine whether a post correctly annotates the section of the document changed? Why is that up to the user and not the job of the interface? (Which means the editor needs to be one of the first things implemented, if not the first, to make this work.)
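To show why annotation can be the interface's job, here is a sketch that derives the {{ }} markers from the two revisions instead of trusting the editor to add them. It uses the standard `difflib.SequenceMatcher` on whitespace-separated words. The `annotate` function is a hypothetical helper, it collapses whitespace, and it does not mark pure deletions.

```python
import difflib


def annotate(old: str, new: str) -> str:
    """Wrap the words of `new` that differ from `old` in {{ }} markers."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = " ".join(new_words[j1:j2])
        if tag == "equal":
            out.append(chunk)
        elif chunk:  # "replace" or "insert"; a pure deletion has no new text
            out.append("{{" + chunk + "}}")
    return " ".join(out)


annotated = annotate(
    "Hive is a blockchain for blogs.",
    "Hive is a decentralized blockchain for social apps.",
)
# annotated == "Hive is a {{decentralized}} blockchain for {{social apps.}}"
```

The same comparison can be run by any reader of the chain, so a post whose markers do not match the computed ones could be flagged without anyone taking the editor's word for it.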
Note that these are not solved problems even for the underlying technology of the blockchain which you are using as a backend database for the wiki.
Personally, I'm not even sure that a distributed ledger is the right methodology for saving this kind of information. From a data-centric point of view, all it does or can do is log transition changes committed from outside. That means that any page that actually has significant evolution is going to have to be rebuilt from the entry point by having the blockchain spit out every single change back to the beginning of the page, temporally, and then putting it together. Take it from someone who has worked with underlying databases which have linear access constraints: that's a bad situation. Your most interesting and active pages get slower and slower and put a greater and greater load on just the database section, because the individual blocks which underlie the representation don't actually have anything to do with each other. To solve the problem, you have to build a meta-index, but at that point you might as well have started with that as the underlying database in the first place, probably using something from the graph DB family.
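The linear-access cost can be sketched like this. Changes are modeled as simple `(start, end, replacement)` splices, which is an assumption made to keep the example small. `SnapshotIndex` is a hypothetical stand-in for the kind of meta-index you end up needing.

```python
def apply_change(text, change):
    """Apply one (start, end, replacement) splice to the page text."""
    start, end, replacement = change
    return text[:start] + replacement + text[end:]


def rebuild(changes, upto):
    """Replay the first `upto` changes from an empty page: work grows with `upto`."""
    text = ""
    for change in changes[:upto]:
        text = apply_change(text, change)
    return text


class SnapshotIndex:
    """Hypothetical meta-index: cache the full text every `every` changes."""

    def __init__(self, changes, every=100):
        self.changes = changes
        self.every = every
        self.snapshots = {0: ""}
        text = ""
        for n, change in enumerate(changes, start=1):
            text = apply_change(text, change)
            if n % every == 0:
                self.snapshots[n] = text

    def rebuild(self, upto):
        """Start from the nearest earlier snapshot; replay at most `every - 1` changes."""
        base = (upto // self.every) * self.every
        text = self.snapshots[base]
        for change in self.changes[base:upto]:
            text = apply_change(text, change)
        return text


changes = [(i, i, "x") for i in range(1000)]  # append one character per edit
index = SnapshotIndex(changes)
```

Reading revision 950 without the index replays 950 changes. With it, only the 50 changes after the snapshot at 900 are replayed, but the snapshots now live outside the chain.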
Do you think a system like this would work? What would you do differently?
I think it would "work," for "it functions" definitions of work. I don't think it would be effective, efficient, or make much good use of the underlying blockchain for anything particularly helpful. You could get almost all of the benefits of tying it to the Hive blockchain by setting up an independent wiki using a proper underlying database, sharing account names with the Hive blockchain, doing nominal validation, and then simply posting the updates a user made to the proper wiki to Hive as posts from them, telling the public what they've done and linking to the page that got edited. It doesn't do anything to solve the "what branch should take precedence" problem, but that's a bigger issue.
If you haven't, you should definitely check out Roam – a new personal information manager/personal wiki/group accessible wiki that has been kicking some serious ass on Twitter lately and that I've been using for a couple of months now. They are talking about how to develop things like shared wiki spaces for more global access, and I think it would be excellent for you to catch up with what's going on over there. They still have to deal with some issues like "what about colliding name spaces?" but they're doing a good job. Absolutely worth checking into.
While this project is interesting – I don't think it is a profitable line of research until some of the more elemental questions have good answers.
Wow! Much more valuable feedback than I expected to receive. You have my upvote, sir.
I envisioned this to be a general purpose tool for information storage, beyond Hive-related topics. In fact, the idea was sparked from this article talking about other crypto projects being banned from Wikipedia. https://cointelegraph.com/news/just-like-bitcoin-before-it-cardano-is-banned-from-wikipedia
The issue with posting only a diff is you pollute other interfaces with junk posts. You could solve this by posting the diff in a custom-JSON operation, but you lose the visibility and the upvoting/downvoting/commenting social mechanics.
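For reference, a diff carried in a custom-JSON operation might be shaped roughly like this. The four operation fields follow Hive's `custom_json` operation as I understand it. The `hivewiki_edit` id, the payload keys, and the `wiki_edit_op` helper are hypothetical, and nothing here signs or broadcasts a transaction.

```python
import json


def wiki_edit_op(account, article, parent_permlink, diff_text):
    """Build an unsigned custom_json operation carrying a wiki diff."""
    payload = {
        "article": article,         # hypothetical payload key
        "parent": parent_permlink,  # revision the diff applies to (assumed)
        "diff": diff_text,
    }
    return [
        "custom_json",
        {
            "required_auths": [],
            "required_posting_auths": [account],
            "id": "hivewiki_edit",  # hypothetical protocol id
            "json": json.dumps(payload, separators=(",", ":")),
        },
    ]


op = wiki_edit_op("alice", "Whale Wars", "whale-wars-v1", "-old line\n+new line\n")
```

An operation like this never shows up as a post on other interfaces, which is exactly the trade-off described above: no junk posts, but also no votes or comments attached to the edit.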
Valid points about edit wars .. we need to think about that a bit more.
I considered the option of keeping the content in a separate database for performance reasons, but decided against it. The content has to live on-chain, IMO, either in posts or under the surface in custom-JSON ops. The off-chain data would be a meta-index of the articles, which would be easy to rebuild by replaying previous blocks of the chain.
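A sketch of what "rebuild by replaying" could look like. Blocks are modeled as plain lists of post dictionaries, and the `hivewiki` tag is a placeholder. Both are assumptions for illustration, not the real block format or the tags HiveWiki uses.

```python
def rebuild_index(blocks, wiki_tag="hivewiki"):
    """Replay blocks oldest-first; map each article title to its newest post."""
    index = {}
    for block_num, block in enumerate(blocks, start=1):
        for post in block:
            if wiki_tag not in post.get("tags", []):
                continue  # not a wiki post, ignore it
            # Later blocks overwrite earlier ones, so the newest revision wins.
            index[post["title"]] = {
                "author": post["author"],
                "permlink": post["permlink"],
                "block": block_num,
            }
    return index


blocks = [
    [{"title": "Whale Wars", "author": "alice", "permlink": "whale-wars", "tags": ["hivewiki"]}],
    [{"title": "Lunch", "author": "bob", "permlink": "my-lunch", "tags": ["food"]}],
    [{"title": "Whale Wars", "author": "carol", "permlink": "whale-wars-2", "tags": ["hivewiki"]}],
]
index = rebuild_index(blocks)
```

Because the index is a pure function of the chain, anyone can throw it away and regenerate the same result, so the off-chain part holds no authority of its own.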
There is already an off-chain MediaWiki instance running for Hive-specific information: https://www.beewiki.dev
Checking out Roam...
Just stumbled upon this from browsing Propolis Wiki (@propolis.wiki) - how I would love to get some extended feedback from you too!