A Document to Reference

in LeoFinance8 months ago

As negative as I am about the impacts that some of the generative AI is going to have on society as it competes against and beats the skillsets of the majority of people, across nearly every skill that we are capable of, there is an inevitability to it. And, it has been heading that way for a very long time already, even though the majority of people don't really pay attention to the way information is handled, and how we as humans interact with it.

image.png

For instance, for those of us who work in typical company, how are you handling your documentation? Typically, it looks something like a Sharepoint, Box or Drive repository, using a naming convention and file structures to ensure that a particular document is stored in the right place. Then, there are likely duplicates of that document, for instance, a spreadsheet that gets stored in several locations depending on which stakeholders need it and then, it has also been distributed by email, getting stored locally. On top of this there are edits, so new versions are created, making it near impossible to know what is the latest version. And then on top of this, there are access models for who is able to see what and when. Then there are other repositories, like Teams file storage to contend with, further splitting information.

But, what is interesting is, all of these are stored in a hierarchical structure like a physical filing cabinet, which is logical based on what we have done previously, but doesn't scale when the cost of creating documents has gone from high, to free. The amount of documents have exploded in the digital era, even if we aren't printing them into a physical form.

And, it is because of this explosion of documentation that we have developed increasingly clever ways to remove the need to dig through folders. Sure, we probably still have them on our desktops, but considering that the entire internet is essentially stored in folders, how do you search for the football scores? Instead of diving through the news folders using a convention unique to that site, then moving to dog through folders of another convention to see the weekly weather on another site, we have put interfaces over the top so that we get the information we need, in a consumable format, that doesn't require us knowing the source from where it is arriving. We don't know where it is stored, nor do we care, we just want to know if it will rain or not, and if our team won.

What the Large Language Models (LLMs) are doing, is essentially scraping through all of those folders to pull bits of information out that it deems relevant and then sticking it together in a comprehensible way, so we can understand it. It is pretty clever, but this isn't actually the solution to the problem of information integrity, as the source matters.

Firstly, a far better way to store documents is in a timeline, which is what a blockchain does with bits of information. The reason this is so valuable is that it provides time accuracy for the content that can be referenced when needed. But, by itself, this is not very useful for an organization, because knowing when something was created to find it is harder than a filing system. So, to be useful, it requires cross-referencing with contextual meaning. For instance, if it is a contract that was written for Customer X. However, there might be multiple versions of that document, as it has moved through drafts and revisions before the final. If every revision is also a "new" document on the time line, but referencing the one prior, it creates a single lifecycle line for the document, meaning there is only one copy of the document, multiple versions, but like a blockchain transaction, it is always trackable.

And then, for this to be useful, we need to be able to visualize it as some kind of final document, or visualize a version of that document in time, like a previous draft. What this is doing is essentially creating a snapshot view of the document in the same way a block explorer can show a particular block.

Corporations need blockchains just to manage their information flows.

But, corporations are also going to start changing what they mean by "document" because what the LLMs are able to do is take small slices of a larger document and put it together to create something else. For instance, if it was asked to create a powerpoint presentation on the financial numbers for the quarter, plus whatever marketing has added into the mix, it would be able to go through SalesForce, scrape reports in Sharepoint and take highlights from marketing materials to create a composite document. Once it is tweaked a bit and learns what it needs to do more precisely, it can repeat that report with updated information automatically, rather than having paid person manually locate, search through the documents and cut and paste into the powerpoint.

If your job is creating powerpoints, start retraining.

The AI tools for finding, extracting and creating views of data are going to get better and better, but that differentiator is going to be the integrity of the information it is using. Right now, the LLMs are largely seen to be used scraping the internet, but where they are likely going to be the most valuable is in closed corporate environments, making sense of all the information that is coming into the various repositories around the company and across organizations. Because the AI doesn't care about the location, a timeline with context is the most sensible way for document and information creation. So, an enduser will create something like a contract and save it, without knowing where it is saved, just that it is saved. From there, the next person in the chain doesn't need to know where it is either, they just need to know what they are looking for. And in between, the AI is creating handshakes.

Because information is timelined and contextualized across multiple reference points like who created it, as well as being able to be compared with other similar content, the quality of information goes up, and gets handed to the right person at the right time they need it.

Ever struggled to find a document at work?

Now, for a corporation, immutability isn't something they need (or want), but they do want traceability for as long as they are legally obligated to have a piece of information. So, a pseudo-blockchain suits their purpose for the trackability of the references, not the documents themselves. For instance, it isn't required for the chain to hold all the image data, or videos, it just needs to hold the references, whilst other storage holds the content. This gives them the ability to track all documentation, as well as filter granularly based on an infinite number of search filters to find just what they want, or see it in the way that suits them.

For instance, https://hiveisbeautiful.com/ is a great site that visualizes Hive transactions that look like this.

image.png

Embedded into a single transaction are multiple reference points, so depending on what is required, only slices are used. In that bunch there, you can see some upvotes, some claims, some Splinterlands, some Hive-Engine etcetera. Each relevant interface will use a bit of that information for its usecase, whether it is a transfer or a submission of a team into a battle. It is just a document with random bits of information on it.

This is the future of business documentation.

A future where information floats about in a type of data soup with tags attached to it, and AI filter that information based on its programmed needs. Because everything is on a timeline, it will be able to increase the relevancy, know which is first and last, and ensure that the most correct information goes into the view given to the user. And, relevancy is far higher when there is some control over what kinds of information is in the system already. Since it is all coming from the same corporation, information trust is higher than if it is coming from random internet sources where there may be no known track record of who created it.

For years, we have already been moving in this direction, which is why the people who have grown up on mobile phones only don't have good folder structure methods and can struggle in corporate environments using them - which is most. And, there is a massive amount of human error in systems that rely on SOPs and naming conventions to ensure document integrity, because people just aren't consistent enough. Automation is the only way, and because it is also the one that will bring the most profits, it is the way it will go.

As simple as ledger logic is, it is going to fundamentally change the way businesses handle their information, because it aligns so well with the automation processes they want to employ. It is designed to be logical and have integrity, which is what is missing on the internet of information at the moment. However, given some time, the internet will start to reorder "itself" into a more logical structure that the AIs are able to better manage and rely on. This means that it will start to weed out low quality information in a process of "no confidence" voting mechanisms, where content that doesn't have strong enough references, are omitted from search requests.

This is a form of web of trust.

@blocktrades has been dabbling with webs of trust for a while now and at scale, it is going to need AI support to really make it useful, because the amount of relationships between individual pieces of information is very high, and then being able to consolidate it into something useful in a timely fashion takes a lot of processing. Humans can't do it, which is why we use heuristics to judge our world in order to think fast, and the AIs will do the same except at a much greater amount of information input, and subsequent usecases and view outputs we will demand of it. But, for an individual company, it doesn't take that much effort, because there is already narrow context and known rules that can be applied to the usecase, laws, industry etc.

Document management isn't something most people think about as they search the internet, or even when they have lost that report they need for their meeting in the morning. It just isn't sexy. But, all the information we consume digitally, is stored in a document of some kind somewhere, whether it be a news story, or a dataset. But, it is fundamental to the way we live our lives and the industry is evolving to run parallel to blockchains, even though they are still far behind and are yet to really understand what they are looking to do. And even as they chase the tech, they still don't see the application for blockchains.

Taraz
[ Gen1: Hive ]

Posted Using LeoFinance Alpha

Sort:  

Easy storage and retrieval of information or documents, I agree, is very ineffective when managed by a file / folder type of means. But, before shooting headlong into blockchain or AI, most use cases can be effectievly handled with less exotic tooling, like a document management system (database in front of the information / documents) or a NoSQL front.

Blockchains strengths lie in being trustless. For information or document retrieval, it may work at small volumes, but not when it reaches any significant level, even for a small business - compared to a Doc management system or key-value store (NoSQL). And it boils down to being ACID compliant within milliseconds and plus quick recovery, which blockchains are not. A whole host of problems arise without data, or documents, being ACID.

AI... better for creation in areas like marketing, while not so good for discrete, repeatable, auditable, provable and legally compliant proof of record.

So, what makes documents, information easy to find or discover? Part of that answer is it has to be fast. So far, I haven't seen an indexing solution that out performs a database or key-value store. So why not use them? And AI... no. I'm not bashing AI, in fact it has a ton of uses, but for discrete document or info storing and retrieval, especially for financial applications, there's long proven, effective ways that serves us well.

So, why can't people find "stuff"? They simply don't use a tool and insist they should just be able to throw information any dam where they please. Then, some magical, all access bot, should scoop it up and add it to a neural network, train it at $1,000 an hour so you can hire a "prompt artist" to concoct magical phrases to extract a semblance of the data, dripping in a trendy designer layout... uh, yea.. right.

This is the part where you chuff, bow out your chest, cross your arms and screech, "OK, smarty pants... how would YOU do it?" TO which one simply replies "Settle down... Francis..." and show examples:

Any piece of info, receipt, user manual, article I store into document management. It could be an image, hyperlink, pdf, audio, video, etc and tag it by group, type and description. One place to store and look for any and all types of info, like this showing some of my crypto notes

But that's not the trick - the trick is... you have to be able to handle EVERYTHING, which is not a problem as I also store calendar appointments, stock tickers, receipts, legal docs, tax filings, maintenance schedules, programming notes, health info, home and garden, employment records, retirement rules, computer specs, ...

Here's just a few of those categories that store computer info (helps when I build rigs out)

Just thought to shine light on some old skool software that's hellish performant and strong like bull. Me.. luddite? Nah, I'm adding AI interface to it currently... after all, it's go to be everything...

most use cases can be effectively handled with less exotic tooling, like a document management system (database in front of the information / documents) or a NoSQL front.

There are plenty of DMSes, none of them do the job well enough at this point.

Blockchains strengths lie in being trustless. For information or document retrieval, it may work at small volumes, but not when it reaches any significant level, even for a small business - compared to a Doc management system or key-value store (NoSQL).

This is why I mentioned "pseudo" blockchain. It doesn't have to be a blockchain, just resemble components of it. No company wants their data immutable.

So, why can't people find "stuff"? They simply don't use a tool and insist they should just be able to throw information any dam where they please.

I don't think you understand the complexity of the problem at scale. We are talking about globally distributed organizations, handling hundreds of millions of documents yearly, localized legal models, external collaboration, multiple repositories, complex user access models and then, automation based on the changing information within the documentation. It isn't simple storage and retrieval.

However, what does work is using a database or key value store similar to what you are using (based on your screenshots) as the "blockchain" of references. Then it doesn't matter where the information is actually stored, as they are all linked to the timeline backbone, including the versions, which at an interface level can appear as a single document.

This allows people to throw information wherever they want and as long as it is appropriately labelled at that point (done automatically or user-defined), it will be allowed to join into the stream.

One place to store and look for any and all types of info, like this showing some of my crypto notes

"One place" just doesn't work at enterprise level for so many reasons. One of them is of course that human data hygiene is terrible. One of the others is just the practicality across quite different tech stack needs based on department usecases.

Enjoying the solution ideas. And if you don't mind me with additional foods for thought...

References are fine within an entity, as there's control and knowledge over it's availability and existence. But linking in any other case, I'm gunna balk.

It's all good and fine when all linked systems are up and behaving, but one broken or unavailable reference can degrade validity quick. If every document or info isn't available 100% perfectly all the time 24x7x365, else there's that inevitable finger pointing shootout scene, when there's data that needs to be produced and it's either incomplete or late. Other people's / department's / company's systems ALWAYS go down when you need them most - that I learned from trading... nah, it's always been true.

However, federate it into "one place" and you have reproducable results and control over the data, even if transient. 100 million documents.. decent sized, doable.

I prolly sound jaded, but having been called into meetings and seeing Bob blaming remote Jane blaming remote Sally, ad infinitum, gets old fast. I've personally witnessed it in airlines, electrical grid systems, all the way down to daycare, schooling, local govts...

None of the DMs do the job well enough? At 100 million documents I wouldn't suggest a third party solution - coder up.. way cheaper, faster and functionality out the wazoo.

However, federate it into "one place" and you have reproducable results and control over the data, even if transient. 100 million documents.. decent sized, doable.

One place just doesn't work for a global corporation, as there are localized laws that prevent it.

It's all good and fine when all linked systems are up and behaving, but one broken or unavailable reference can degrade validity quick.

I prolly sound jaded, but having been called into meetings and seeing Bob blaming remote Jane blaming remote Sally, ad infinitum, gets old fast.

I am sure it does. I have gone into these companies too, and trained them on solutions that do work at scale :)

I may have reservations about the rapid growth of AI, but as you pointed out, it's an undeniable inevitability. In the current professional landscape, there is a heightened focus on big data and automation. I'm genuinely concerned about certain administrative and repetitive job roles, as they seem susceptible to being replaced by AI.

In my current job, I'm responsible for familiarizing myself with these automation tools to facilitate process improvements. However, I can't help but worry that my efforts to bring about these "enhancements" might ultimately result in someone losing their job 😖

I'm genuinely concerned about certain administrative and repetitive job roles, as they seem susceptible to being replaced by AI.

People are going to have to be very good at their non-AI enabled jobs. A lot of the admin work will be gone in a decade.

However, I can't help but worry that my efforts to bring about these "enhancements" might ultimately result in someone losing their job

I know your fears.

In my current job, I'm responsible for familiarizing myself with these automation tools to facilitate process improvements.

What kinds of tools are you looking into?

Some of the software applications that I am using are UiPath, Automation Anywhere, Alteryx, etc. These are powerful tools capable of seamlessly taking over most routine tasks. Consequently, it's really important for us to constantly upgrade ourselves and remain relevant and valuable in the role that we are in.

Having not worked at that level of corporate, only some of this made sense (an "understood all the words individually but not how they're put together" kinda problem XD). Though some of it did sound like how I have my notes in Obsidian (but probably a lot messier seeing as I'm the only one who has to deal with it XD).

Amusingly I've always thought that this kind of application would make the most sense and be the first to be done but no, art (and art forgery) XD

When it comes to simple document management, there have been some okay solutions for a while, but they fall down at large scale and complexity. However, there is no thing as "simple documentation" at enterprise level, as the needs vary so much within a single organization and they also have to collaborate externally, securely. Much harder than people think it is :)

The most I have to do with documentation is filing incident reports (and I haven't had to do any recently).

Do you have a lot to do with documentation for your job?

I worked in IT, and automation has always been the ideal scenario for a lot of things. Extending tablespaces, freeing up space in the system, creating new tablespaces, upper management always want these to be automated and streamlined. Now generative AI is the next step of that. From what I've read, there are some companies firing their developers. They only leave 1 low level developer that can work with the AI and generate code, and a high level developer than can overlook the resulting code and check.

Outside of IT, generative art are getting so good that artists are having difficulty with the competition. If you look at Golem Overlord [a hive game, and maybe terracore], they use AI generated art for a lot of their game art. I've seen some posts about movies looking into AI for writing media, and might have been one of the causes of the writers strike.

AI is the future, and the earlier we learn to adapt to it, the better.

They only leave 1 low level developer that can work with the AI and generate code, and a high level developer than can overlook the resulting code and check.

AI doesn't have to be perfect, it just has to be slightly better than the average. At some point, there will be 100 brilliant coders in the world, adding slivers of code that the AI hasn't worked out yet - until there is 10, then 1 then zero coders.

If you think about how the studios are using AI, it goes to show what they care about - profit. It isn't about making the best experience for humans. This is the same with all business. What the companies don't realize yet is eventually, AI will do all they need cheaply, but there will be no one with a job who can afford to pay for their product.

Yes exactly. Once everything is automated, and people don't have a job, what happens then. Even low level jobs like fastfood restaurants have tried self service kiosks for ordering. Some are using bot delivery. Self driving cars are getting better. Airplanes are mostly on autopilot with minimal pilot influence. In a few years, almost everything will be automated or ai generated.

I have also read a few suggestions of introducing a universal basic income/handout in the US in preparation for this. It really feels like we're heading into a dystopian future.

Besides the art side of things, I haven't really played around with AI too much. It just hasn't appeared to me that much and I haven't seen enough real world examples to really appreciate what it can do. I might need to take the time to play around with it more.

It is going to be invaluable at scale. Most of us only think about the usecases we have as individuals, but if you have 100K employees speaking 50 different languages, distributed around the world, then dealing with a million externals and need everything to be secure, there are more considerations.

For instance, most countries have laws that say "this information can't leave the country borders", which includes on cloud services. This means a document repository has to be able to keep a particular piece of information stored within borders, whilst allowing other bits of information outside, without a person having to handle that. There are solutions, but a single repository doesn't cut it.

Yeah, that makes sense. I think for me one of the biggest things I struggle with is documentation. Most specifically with processes. Being able to leverage AI to help with that might be a good thing.

The search for a document has often been a headache for me. It is like looking for a needle in a haystack. Sometimes I try to be tidy, but having so much information, sometimes my brain does not remember where and under what nomenclature I have stored it. I agree that A.I., if we know how to use it, more than an enemy, will be our ally in many areas. Greetings

Sometimes I try to be tidy, but having so much information, sometimes my brain does not remember where and under what nomenclature I have stored it.

This is common, especially at scale. Naming conventions don't really work.

I agree that A.I., if we know how to use it, more than an enemy, will be our ally in many areas.

It is going to take quite a lot of the basic work out of admin, but it is also going to take a lot of jobs with it.

I know that the shining usecase for this stuff is definitely information management. Being able to sift through a shit load of things to find information we need is crucial. Right now having to search my company’s intranet or documentation repository for what I’m looking for is horrendous. I look forward to the day where I can give an AI a few prompts and have it go digging through that shit hole to find me what I need lol

Right now having to search my company’s intranet or documentation repository for what I’m looking for is horrendous.

Looking for a better way? ;D

My company used to do physical storage and the documents to another city. It was real donkey work. For more than five years, we have switched to digital storaging. We keep documents in the network of the company, every department has their own folder, thus searching and finding a document is not a problem if you keep them correctly and regularly.

thus searching and finding a document is not a problem if you keep them correctly and regularly.

Just wait.

Humans are humans. People are in a rush. There are typing areas, differences in spelling of locations. A million fail points. :) There are solutions that tie these things together though.

I have been a playboy in Notion lol. I spend a lot of time trying to document my finding and knowledge and that has been pretty well journey so far. I feel that document management and also handling and processing knowledge as we acquire it and then filter the gist of it is kind of human life. And yes we do need AI to handle this in near future. Just not sure how all of this would connect dots like each person has different way to manage the data, interpret it's meaning etc. So we have plenty of things left in to cover over.

Just not sure how all of this would connect dots like each person has different way to manage the data, interpret it's meaning etc. So we have plenty of things left in to cover over.

The trick is to handle the labeling of the information at a structured base level (semi-manual model or automated) so that it doesn't matter where it is, it can be found. The interpretation comes down to human (or AI) however, as always.


I think the bottom line is that AI
(artificial intelligence)
It is going to replace many jobs and it would be good to start adapting to these new changes or else the change will eat you alive.
AI is not the enemy, but the future that many cannot and will not accept.
pd: I do not accept it because it will only be a matter of time before it becomes a weapon of military destruction.


It already is a weapon of military destruction.

So many things has been made easy with the help of Al, and so many people have lost their jobs because of this same AI which is sad however AI is here to help, it only serves as an assistant to man but let the truth be also spill out, it is also a threat to man

The biggest threat to man, is man himself.

I never really considered document management but it does make sense. A lot of data is out there and even AI is ingesting that data to improve. I have seen how AI is pretty amazing and I have seen a video of Geoguesser pro versus a Stanford AI. They can take so many different values such as images and then they also add in text. Imagine all the parameters they can consider and it will be very hard for humans to compete later.

I'm pained about the fact that AI is snatching our jobs and doing what humans should do but look at us. Do you realize that so many people are really adopting the use of artificial intelligence? They are seriously embracing it so what do we do?

As an editor, I am constantly overhauling these systems for better efficiency.

Timeline model is the best, I always want to be able to work with the most recent copy, while also being able to go back and reference any prior drafts.

But I felt it a year ago at Cointelegraph. Unless you are the one writing the prompts, there will be limited human input int he copy/edit process as people shift to AI/steaming video. I'm getting ready to throw in the towel on writing content. I'm moving to 3SPEAK!