When is a Ruler Not a Ruler?


Author: Mike Manfredi, ƒractally team member

Personal Mission: Build Beautiful Things

Note: These are my thoughts and views, not necessarily those of the ƒractally team.

Genesis Fractal Stats: Some thoughts on what we are measuring and how bad the ruler is.

Introduction

ƒractal democracy, or ƒractal governance, is a brand new means of organizing people to act together without corruption and loss of their mission. We meet weekly to innovate the process, and we have the privilege of having some extraordinary members who are bringing mathematical and model-based rigor to our experimental method for better human coordination without centralized authority. While my thoughts below are specifically a result of reading Matt Langston’s work, I also want to call out Jordan Masters and Jorge Ramos. They hold us at ++ƒractally++ to a high standard of scientific and innovative rigor, for which we are grateful. A huge thank you goes out to them for paving the way, collecting and analyzing the data, and taking the first steps to bring scientific and meaningful analysis to what we’re doing. None of the thoughts that follow would have happened without their pioneering work.

In his most recent blog post, ++A Model-Independent Method to Measure Uncertainties in Fractal Governance Consensus Algorithms++, Matt Langston characterizes the precision and accuracy of the subjective measurements the members of the Genesis Fractal make each week they meet.

Another source of rigor and high-quality thinking in our midst is James Mart. For a technical and specific response to Matt’s post, you can read ++James’ response++. I, on the other hand, will offer some higher-level / conceptual ways to think about the data and what I believe is actually happening during the Genesis Fractal’s weekly meetings.

Matt discusses the accuracy and precision of a single member’s evaluation of the value of members’ contributions to the Genesis Fractal community, as compared to the evaluations made by other members. James offers a thermocouple as a way to relate to the accuracy and precision of a person as a measuring device. See his article for more on that idea.

We Rank in Meetings; We Don’t Measure

To expand on James’ idea of the thermocouple, consider a restaurant in hot weather with a thermostat near the door. The thermostat will register wildly different temperatures as people come and go from the restaurant, sometimes measuring the outdoor temperature as the hot air rushes in and sometimes measuring the cool, air-conditioned air inside when the door remains closed for a few minutes at a time.

The equivalent idea in the ƒractally meetings is a member in an individual meeting. A member who finds himself in a group of historically high-ranking people is forced to rank one of those top achievers’ contributions that week as Level 1, simply because everyone’s contributions must be ranked Level 1 to 6 in group meetings. It’s obviously no different if a group is composed of members whose average contributions are Level 1: one of them must be ranked as a Level 6 that week.

Even if each member is an accurate “measuring device”, they are forced to rank the 5 or 6 contributions rather than measuring the contributions by assigning a level independent of the others in the room. Ranking 6 thermometers’ outputs doesn’t tell you anything about the actual temperatures they’re measuring. So where we’ve talked about members as “measuring devices” in the past, that now sounds like an inaccurate description. All members measuring repeatedly over time will yield a measurement of actual member contributions, but any one person’s “measurement” in one weekly group meeting can only be a relative ranking.
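To make that concrete, here’s a minimal sketch (with made-up contribution values, not real meeting data) showing that two wildly different groups can produce the exact same weekly rank order, so the rank order alone can’t recover the absolute “temperatures”:

```python
# Hypothetical contribution values -- not real Genesis Fractal data.
def ranks(values):
    """Return each member's rank (1 = lowest ... 6 = highest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

high_achievers = [9.1, 8.7, 9.5, 8.9, 9.3, 9.8]  # six strong contributions
mixed_group    = [1.2, 0.8, 3.1, 1.0, 2.0, 5.5]  # mostly weak contributions

print(ranks(high_achievers))  # [3, 1, 5, 2, 4, 6]
print(ranks(mixed_group))     # [3, 1, 5, 2, 4, 6] -- identical rank order
```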

It probably needs to be said: this isn’t a problem, nor is it wrong. It’s simply the nature of the measurement system. The smaller groups rank their 5 or 6 people, and over time the averaging tells us who is consistently ranked highly (consistently providing value to the fractal). The critical point is simply that one person’s weekly “measurement” doesn’t behave like the output of a measuring device when, by its nature, it is a rank order.

Interesting side note on what we can and can’t measure with this system: Any one-week blip representing someone knocking it out of the park will never show up in the averaging, other than a small bump in that person’s average around the time of the larger-than-normal contribution. What the device is measuring is consistent, high-value contribution. If a person makes a ridiculously valuable contribution in a single week, they would need to campaign for numerous weeks to keep being ranked highly until that contribution had been fully compensated. Further, we would be unable to distinguish whether a group’s ranking of a contribution as Level 6 was due to high value or done on a whim.
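As a quick illustrative sketch (with invented numbers), a single Level-6 week from an otherwise steady Level-3 contributor barely moves a trailing 12-week average:

```python
# Invented weekly ranks: eleven steady Level-3 weeks, then one Level-6 blip.
weekly_levels = [3] * 11 + [6]

window = weekly_levels[-12:]      # trailing 12-week average
print(sum(window) / len(window))  # 3.25 -- a small bump, not a 6
```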

I would love to see another iteration of Matt’s analysis where the rank order of contributions established in that meeting is compared to the ordering of those same members in the recent historical stats.
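If anyone wants to try it, a sketch of that comparison might look like the following; the data is hypothetical, and Spearman’s rank correlation is just one reasonable choice of statistic:

```python
# Hypothetical ranks for the same six members from two sources.
def spearman(ranks_a, ranks_b):
    """Spearman's rank correlation (assumes no tied ranks)."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

meeting_ranks    = [1, 2, 3, 4, 5, 6]  # rank order from one weekly meeting
historical_ranks = [2, 1, 3, 4, 6, 5]  # same members ordered by recent averages

print(spearman(meeting_ranks, historical_ranks))  # ~0.89: largely consistent
```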

The Thing We’re “Measuring” is Changing

Each week, we’re ranking the contributions that each member made. When a new, quiet member has a particularly insightful conversation in Week 3 and kicks off a wildly productive project the community deems highly valuable, the measured value of their contributions may jump from a sustained 1 to a sustained 6. The first derivative of their historical average will suddenly become positive and stay positive for the weeks it takes the multi-week averaging to “catch up”.

While the trend of their contributions’ value is climbing, the “measurements” taken in each group meeting will be divergent from the historical averages. To get a better view of the “accuracy” of a member’s measurement in a particular week, it would likely be best to compare that member’s measurement that week to an average centered on that week and extending over the previous few weeks as well as the subsequent few weeks. This would begin to align the single “measurement” with the change in trend.
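A small sketch of the difference, using a made-up series with a sustained jump from Level 1 to Level 6: the centered window is already near the new level the week after the jump, while the trailing window is still dragged down by old weeks:

```python
# Made-up weekly levels with a sustained jump at week 5 (0-indexed).
levels = [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6]

def trailing_avg(xs, i, w=5):
    window = xs[max(0, i - w + 1): i + 1]  # this week and the 4 before it
    return sum(window) / len(window)

def centered_avg(xs, i, half=2):
    window = xs[max(0, i - half): i + half + 1]  # 2 weeks either side
    return sum(window) / len(window)

week = 6  # one week after the jump
print(trailing_avg(levels, week))  # 3.0 -- lags behind the new trend
print(centered_avg(levels, week))  # 5.0 -- already close to the new level
```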

Humans Are Noisy

But it’s worse than just changes in trends. Humans are, well, humans. We’re fickle. We’re emotional. We’re unpredictable. We’re ignorant. And most importantly: we learn.

If members are “measurement devices”, what are they measuring? The ++Genesis Fractal++ has a mission statement:

- Mission Statement - Genesis Fractal seeks to build the technology, raise awareness and grow adoption of ƒractal governance, as outlined in ++ƒractally’s white paper++.

The mission statement theoretically governs how members value contributions, given we’re all in this boat trying to row somewhere, and the mission statement is supposed to be the destination we agreed we’re interested in rowing towards. However…

I’ve been in groups that voted members Level 5 or 6 one week because they were new, and the group thought a high ranking might give them a great first experience with the fractal. I’ve been in a group that took someone who had been achieving an average ranking around Level 3 and elevated them to Level 6 that week, because what he did had enough value and he had never been awarded a Level 6. I’ve also seen some people bring contributions so exciting, inspiring, and potential-holding, never before seen in the fractal, that they were ranked Level 6 despite those contributions being questionable fulfillments of the fractal’s mission statement, or despite the “contributions” being promises for the future rather than results already produced. On the flip side, I’ve seen extremely humble people downplay their own contributions and argue their way lower in the rankings.

So what kind of measurement device do you build out of humans? Imagine a thermometer that reported a number with a degrees symbol next to it, but sometimes it was reporting the temperature; sometimes it was reporting the barometric pressure; sometimes it was reporting the combined score from the recent Lakers game. Humans can make some pretty poor measurement devices. However… given time, and averaged over time, we can average out the blips, have great, clarifying discussion, and… most of all… learn.

A Measuring Device that Can Learn

Learning is a crazy thing for a measurement device to do! A thermometer that changes its readings over time as it “learns” to read temperature?! This led me to the idea of AIs and training sets. In AI, it’s common to give a system a “training set”. For instance, training an AI to recognize pictures of cats would involve giving it millions of pictures, labeling each one “cat” or “not cat” (so the AI knows the right answer), and letting the model figure out how to tell the difference. After a few million examples, the AI might be able to tell the difference with a high success rate.
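For the flavor of it, here’s a toy sketch of the training-set idea (a single perceptron on two made-up features, nothing like a real cat detector): it answers almost randomly at first and only becomes accurate because every example arrives with the right answer attached:

```python
import random

random.seed(0)
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0            # 1 = "cat", 0 = "not cat"

def true_label(x):
    return 1 if x[0] + x[1] > 1 else 0  # stand-in for the human labeler

# Train on labeled examples; each one comes with the right answer.
for _ in range(2000):
    x = [random.random(), random.random()]
    error = true_label(x) - predict(x)
    weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    bias += lr * error

# Test on fresh examples the model has never seen.
tests = [[random.random(), random.random()] for _ in range(500)]
accuracy = sum(predict(x) == true_label(x) for x in tests) / len(tests)
print(f"accuracy after training: {accuracy:.0%}")  # high, but only after many examples
```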

If you were measuring “catness” using the output of this AI, you might find it super accurate. The training set was large enough, it learned well, and it can automate the tiresome task of identifying cats in pictures.

What wouldn’t make sense is to start trusting the AI’s answers about catness from the very first picture of the training set (the first picture it ever looked at). When the AI starts out and has no idea what a cat looks like, it will answer poorly nearly every time. It’s only after millions of iterations that the answers it gives start being accurate.

Communities will go through a similar process as they define and refine their mission statement. They will clarify the how, the strategy they will use to fulfill their goals. When a community sets out on a mission, perhaps no one in the group has any idea how to produce the result, e.g., end world hunger or cause world peace.

The members may value what they think fulfills their mission only to find, months or years in, that their approach is an abysmal failure and needs to be rethought; or perhaps a new member offers an insight that leads to a completely different approach that’s far more effective. Even more interesting: what if that brilliant new approach is initially recognized as brilliant by only a few people, who start ranking it super high and talking about it? Maybe the rest of the community is stuck in its ways or doesn’t see the value in the new approach and resists it for months. During that time, the “believers” in the idea will be ranking those contributions very highly whenever enough of them are in a group together, skewing the averages for the member who contributed the idea.

Wrapup

We at ƒractally are delving into what we’ve been calling “subjective consensus” (to distinguish from the “objective” consensus algorithms commonly used in the blockchain industry). Subjective consensus is very messy compared to any kind of scientific measurement. Who’s to say what will best fulfill a mission statement? We employ the wisdom of the crowds to approach an answer to that question, but that relies on aggregated numbers, averaging the rankings from weekly group meetings into something we can call a measurement.

We are likely still in the very early days of understanding what’s actually happening, how best to measure it, and the best/fastest way to achieve subjective consensus. Come join us in this grand experiment. It’s not just fascinating; it’s quite a bit of fun as well!

Comments

Nice write-up.

I'm still convinced that we can use objective measurements of value rather than relying on rank orders.
The easiest objective measurement to use is the US dollar. (Note: our current hegemonic US dollar once represented 0.822 troy oz of fine silver, which allowed people to compare the dollar, i.e. a specific amount of silver, to other goods & services. Nowadays the dollar is the global hegemonic standard.) These objective measurements will allow contributions to be compared across groupings.

Other objective reference points are time-wage rates and piece rates. In 'A Measure of Sacrifice', Nick Szabo wrote about the history of the clock, industrialization, and the advent of the time-rate wage (i.e. hours spent), and compared it to piece rates (i.e. amount produced). There are deficiencies in using either time-rate wages or piece rates, and these tend to be better for evaluating commodified goods & services, but they are nonetheless well-understood measures. For example, when measuring my fractally blog post contributions, I keep track of the number of words (piece rate) and the number of hours (time-wage rate).

These measurements focus on cost, and cost is a 'proxy' for value. The dollar is more useful than cost for measuring subjective value, because people are used to the dollar as a measurement of both value & cost. For example, people can come up with 'million dollar ideas', and people can relate to the value of innovation. We can even put dollar values on priceless art, for example when a Van Gogh painting sells at Sotheby's. So the dollar is the best current reference of value, and it's easily understood by most people around the world.

In the future we can probably use Bitcoin as the unit of account, because it is the modern form of money and has energy as its basis.

Measuring the value of this comment:
Number of words: 292
Time spent: ~30 minutes
Value: ??? (How many views did this get? Was this educational? Did it help the fractally team? Did it lead to improvements in the fractally measurement system?)
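A tiny sketch applying those cost proxies to the numbers above (the dollar rates are purely hypothetical illustrations, not figures from the comment):

```python
# Inputs from the comment above; the rates are hypothetical illustrations.
words = 292                    # piece-rate input
hours = 0.5                    # time-wage input (~30 minutes)

assumed_rate_per_word = 0.10   # $/word -- made-up for illustration
assumed_rate_per_hour = 50.0   # $/hour -- made-up for illustration

print(f"piece-rate cost proxy: ${words * assumed_rate_per_word:.2f}")  # $29.20
print(f"time-wage cost proxy:  ${hours * assumed_rate_per_hour:.2f}")  # $25.00
# Value remains the open question: cost is only a proxy for it.
```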