Bounded vs Centered Sets


In recent years I've come across an interesting distinction that I keep having to explain to people. To make it easier I am writing it up in a single place for reference.

Specifically, there are two different ways that humans can conceptualize a definition or category: bounded sets and centered sets. It seems to be a discussion that's all the rage in religious circles, but I first ran across it in an entirely secular context in the IT field.

Bounded set

A bounded set is a set defined by its boundary. That is, there's a (reasonably) clear edge to the set, and something is either in it or not. There is some definition of objective attributes that something must have in order to be part of that set, and it either matches those attributes or it does not. That leads to a very binary sense of membership.

  • One is a graduate of a given school, or not, depending on the criterion "has a legally issued diploma."
  • A piece of fruit is either an apple, or it is not, depending on the criterion "is roundish, fleshy, and grows on a tree of the genus Malus."
  • A person is a citizen of a given country, or not, depending on the criterion "has a passport issued by that government or is eligible to obtain one."
  • A person is an employee of a given company, or not, depending on the criterion "has an active and legally valid employment contract."

In all these cases there is, ultimately, a reasonably objective yes/no answer. A round, fleshy fruit that comes from an orange tree is... not an apple, as it fails at least one required criterion. I am a citizen of the US, and not a citizen of Canada. Additionally, every member is definitionally equal within the set. One person is not "more" of a graduate of a given school than another. One fruit is not "more" an apple than another. A graduate is a graduate, an apple is an apple, etc. (at least as far as membership in the set is concerned).
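In code terms, a bounded set behaves like a function that returns true or false. Here is a minimal sketch in Python of the graduate example; the `Person` class and its `has_diploma` field are hypothetical names I made up for illustration, standing in for "has a legally issued diploma."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    has_diploma: bool  # hypothetical stand-in for "has a legally issued diploma"


def is_graduate(person: Person) -> bool:
    """Bounded-set membership: an objective criterion gives a yes/no answer."""
    return person.has_diploma


alice = Person("Alice", has_diploma=True)
bob = Person("Bob", has_diploma=False)
carol = Person("Carol", has_diploma=True)

# Membership is binary, so the set is just "everyone the predicate accepts."
graduates = {p.name for p in (alice, bob, carol) if is_graduate(p)}
print(graduates)
```

Note that the function has no way to say Alice is "more" of a graduate than Carol. Both simply get `True`.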

Centered sets

A centered set, by contrast, is defined by an archetypal "center," or platonic ideal definition. An object is then defined by how close it is to that idealized center, and by whether it is moving closer to that idealized definition or further away from it at any given point in time. There is no real boundary where we can clearly say that something is not that thing, but conceptually something can be far enough away from it that most would define it as not close enough, rather amorphously. That also means that it's possible to say one object is "more" or "less" that category.

Consider "tall people." We can generally agree that certain people are "tall," without a specific cutoff number of inches. We can also rate certain people as "taller" than others, and a person's height may vary over time, making them more or less tall. Or consider "healthy," where we have a mental conception of perfect health that no living human being achieves (and neither do the dead ones, obviously), but we can still say that one person is more or less healthy than another, and someone with a severe debilitating disease is rather far from that center.

Interestingly, research on categorization suggests that while English tends to favor bounded-set words, our brains tend to categorize using centered sets. Consider the set "bird." When I say "bird," you likely have some image flash into your head of what an archetypal bird is. It probably is something like a hawk, or a sparrow, or some classically "bird-like" thing: wings, feathers, flying, small enough to land on you without crushing you.

Of note, though, your center-point for that definition and mine may be somewhat different. I may have thought of a hawk or eagle, while you thought of a sparrow or finch. Yet we'd both agree that all of the above are "birds." It's likely that neither of us thought of an emu, however, or a penguin. Yet those are birds, just farther from your mental point of reference. (There is a scientific bounded set for "birds," but I'm referring here to the colloquial and psychological usage of the word.) We'd probably agree, though, that a crocodile is not a bird, although it is, in fact, closer to a bird than humans are. (That is, a croc is closer to the archetypal bird than a human is, yet both are "far enough" from that archetype to not be included in that set.)
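One way to picture a centered set in code is as a distance from an archetype rather than a yes/no test. The sketch below is only an illustration: the five features and all of the numbers are assumptions I invented, not measurements of anything, and the archetype is just one person's mental "bird."

```python
import math

# Hypothetical feature order: feathers, wings, flies, lays eggs, small.
# This is one person's archetypal "bird"; someone else's center may differ.
ARCHETYPE = (1.0, 1.0, 1.0, 1.0, 1.0)

# Made-up feature values, purely for illustration.
ANIMALS = {
    "sparrow": (1.0, 1.0, 1.0, 1.0, 1.0),
    "penguin": (1.0, 1.0, 0.0, 1.0, 0.0),
    "crocodile": (0.0, 0.0, 0.0, 1.0, 0.0),
    "human": (0.0, 0.0, 0.0, 0.0, 0.0),
}


def distance_from_center(features, center=ARCHETYPE):
    """Smaller distance means "more" of a bird; there is no built-in cutoff."""
    return math.dist(features, center)


# Rank the animals from most bird-like to least bird-like.
ranked = sorted(ANIMALS, key=lambda name: distance_from_center(ANIMALS[name]))
print(ranked)
```

With these made-up numbers the crocodile lands closer to the center than the human does, yet nothing in the code says where "bird" stops. Any cutoff would have to be chosen separately, which is exactly the fuzzy part.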

Complicated cases

Centered sets get very interesting and often contentious, as their edges are by definition highly fuzzy and dynamic. The edge moves over time and depends on which direction something is moving (toward or away from the archetypal case), and the archetypal case itself may not be universally agreed upon.

For example, "American citizen" is a clear bounded set. "American," however, is a centered set with an extremely contentious political history, on which there is a great deal of violent disagreement (often literally). I don't want to get into exactly what that category's center point is, or what its drop-off is, or the complex history of it. I just want to note the distinction as a comparison of bounded vs. centered sets, and how complicated things get when one can't decide which is which.

The same is true of virtually any term with cultural implications: it is a centered set, not a bounded set, so its edges are squishy and its platonic ideal is usually not universally agreed upon.

Legal challenges

The law is often criticized for being abstruse, complicated, and excessively verbose. That criticism is not entirely invalid. However, like most complexity, the law's complexity exists for a reason: the law is concerned with bounded sets, while most human behavior is centered-set. The complexity of law comes, in large part, from trying to convert a centered set into a bounded set.

Law can't really work with centered sets. If an action is going to result in some punishment, then there will be cases that are very obviously that action, cases that are very obviously not, and... lots of squishy gray area. Lawyers are trained to nitpick, because nitpicking is how you figure out which side of the bounded-set line something is on when it's unclear (or can be argued to be unclear).

As a classic example, there were for a long time different import taxes for "dolls" vs. "toys." (Whether that's wise or not is another matter; there used to be.) But... what's the difference? A doll is a toy in human shape, but many action figures are also in human shape. A statue is also in human shape, but you're not supposed to play with it. Except plenty of kids do. So is a given human-shaped object a doll, a toy, or a statue? If the figure is an animal, does that make it not-a-doll? What is the bounded-set line between "doll" and "toy?"

And that is how you end up with Marvel arguing in court that the X-Men are not humans. Mixing bounded and centered sets gets messy at times.

In software

The bounded vs. centered question is also a large part of what makes software development, especially custom software development, very hard. Human processes have millions of edge cases baked into them, which our centered-set brains can handle. Is this task "done" or does it still need work? "Done" is a somewhat fuzzy term, and something may be "good enough" to be done according to one person but not to another. Software, however, cannot really deal with "kinda done." The software needs a fixed, bounded-set, binary way to determine if a given condition is met or not. That means writing software consists primarily of arguing with other humans about what those conditions are, and all the edge cases that are implied in the human centered-set definition but beyond a computer to derive on the fly.
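To make that concrete, here is a minimal sketch of what pinning down "done" might look like. The `Task` fields and the three conditions are hypothetical; in a real project they are precisely the things the humans would have to argue about and agree on first.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical conditions; a real team would negotiate its own list.
    tests_passing: bool
    reviewed: bool
    open_blockers: int


def is_done(task: Task) -> bool:
    """A bounded-set version of "done": every agreed condition must hold."""
    return task.tests_passing and task.reviewed and task.open_blockers == 0


almost = Task(tests_passing=True, reviewed=True, open_blockers=1)
finished = Task(tests_passing=True, reviewed=True, open_blockers=0)

print(is_done(almost))    # "kinda done" is still just not done
print(is_done(finished))
```

Everything a person might mean by "good enough" either becomes an explicit condition in `is_done` or is invisible to the software.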

That's what is then interesting about machine learning. (Calling it "AI" is frankly too generous, and a misnomer.) Machine learning is primarily based on pattern matching, that is, using a dataset to turn a large number of tiny bounded sets into an aggregate centered set, and then determining how closely some input matches that centered set. That's why, for instance, an image classifier will give a percentage chance that a given picture is of a cat; it's saying that an image is 75% of the way, or 64% of the way, or whatever toward the centered-set definition of "cat" that it has built up, loosely as if 7,500 of the 10,000 tiny binary-set questions it is asking came back true.
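That "many tiny yes/no questions add up to a percentage" idea can be sketched in a few lines. To be clear, this is a toy that mirrors my simplified description, not how real image classifiers actually compute their scores, and the four checks and the animal attributes are invented for the example.

```python
def match_score(item, checks):
    """Fraction of the yes/no checks that the item passes (0.0 to 1.0)."""
    results = [bool(check(item)) for check in checks]
    return sum(results) / len(results)


# Hypothetical tiny bounded-set questions that together approximate "cat."
CAT_CHECKS = [
    lambda a: a["legs"] == 4,
    lambda a: a["has_fur"],
    lambda a: a["has_whiskers"],
    lambda a: a["meows"],
]

tabby = {"legs": 4, "has_fur": True, "has_whiskers": True, "meows": True}
dog = {"legs": 4, "has_fur": True, "has_whiskers": True, "meows": False}

print(match_score(tabby, CAT_CHECKS))  # 1.0
print(match_score(dog, CAT_CHECKS))    # 0.75
```

Each individual check is bounded, but the aggregate score is graded: the dog here is "75% of the way" to this particular definition of cat.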

That's actually an important observation for humans, too: your centered set is not the same as someone else's centered set, because theirs is built from a different dataset than yours, just like two machine learning systems fed different input data will disagree about what is or isn't a cat. That is especially true with the definition of words that are not firmly concrete; "socialism," "communism," "capitalism," etc. are all centered-set words, with vastly different center-points depending on who you talk to. That's part of what makes any conversation using those terms fraught, because the odds that you're referring to the same thing are surprisingly low.
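A small sketch of that effect: if a center is simply the average of the examples someone has seen, then two people with different examples end up with different centers for the same word. The two-number "features" below are arbitrary placeholders I chose for illustration, and averaging is only one assumed way a center could form.

```python
import math
from statistics import mean


def center_of(examples):
    """Average each feature across the examples a person has encountered."""
    return tuple(mean(column) for column in zip(*examples))


# Two people learn the "same" word from different (made-up) examples.
examples_a = [(1.0, 0.0), (1.0, 1.0)]
examples_b = [(0.0, 1.0), (0.0, 0.0)]

center_a = center_of(examples_a)
center_b = center_of(examples_b)

# The same new thing sits at a different distance from each person's center.
new_thing = (1.0, 0.5)
print(math.dist(new_thing, center_a))
print(math.dist(new_thing, center_b))
```

Person A would call `new_thing` a textbook case, while person B would see it as a stretch, even though both are using the same word.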

In psychology

While I am not even slightly an expert on the subject, a number of my autistic friends have expressed to me their difficulty dealing with fuzzy, implicit situations. What they are describing, often (and they've confirmed this to me when I discussed this concept with them), is that they really want bounded sets. They want to be able to rely on a yes/no bounded answer, but they're stuck with human situations that are almost entirely centered-set squishiness.

I am not a doctor nor do I play one online, but when interacting with autistic people this bounded/centered dichotomy may be one to keep in mind. It can help to avoid confusing people. (It helps even with non-autistic people.)

Conclusion

There are no doubt a hundred other examples of bounded vs. centered set distinctions that we could come up with, but the basic model is, I've found, surprisingly universal.