Identity is a Katamari, language is a Katamari explosion

By Christine Lemmer-Webber on Wed 09 December 2020

I said something strange this morning:

Identity is a Katamari, language is a continuous reverse engineering effort, and thus language is a quadratic explosion of Katamaris.

This sounds like nonsense probably, but has a lot of thought about it. I have spent a lot of time in the decentralized-identity community and the ocap communities, both of which have spent a lot of time hemming and hawing about "What is identity?", "What is a credential or claim?", "What is authorization?", "Why is it unhygienic for identity to be your authorization system?" (that mailing list post is the most important writing about the nature of computing I've ever written; I hope to have a cleaned up version of the ideas out soon).

But that whole bit about "what is identity, is it different than an identifier really?" etc etc etc...

Well, I've found one good explanation, but it's a bit silly.

Identity is a Katamari

There is a curious, surreal, delightful (and proprietary, sorry) game, Katamari Damacy. It has a silly story, but the interesting thing here is the game mechanic, involving rolling around a ball-like thing that picks up objects and grows bigger and bigger kind of like a snowball. It has to be seen or played to really be understood.

This ball-like thing is called a "Katamari Damacy", or "soul clump", which is extra appropriate for our mental model. As it rolls around, it picks up smaller objects and grows bigger. The ball at the center is much like an identifier. But over time that identifier becomes obscured, it picks up things, which in the game are physical objects, but these metaphorically map to "associations".

Our identity-katamari changes over time. It grows and picks up associations. Sometimes you forget something you've picked up that's in there, it's buried deep (but it's wiggling around in there still and you find out about it during some conversation with your therapist). Over time the katamari picks up enough things that it is obscured. Sometimes there are collisions, you smash it into something and some pieces fly out. Oh well, don't worry about it. They probably weren't meant to be.

Language is reverse engineering

Shout out to my friend Jonathan Rees for saying something that really stuck in my brain (okay actually most things that Rees says stick in my brain):

"Language is a continuous reverse engineering effort, where both sides are trying to figure out what the other side means."

This is true, but its truth is the bane of ontologists and static typists. This doesn't mean that ontologies or static typing are wrong, but that the notion that they're fixed is an illusion... a useful, powerful illusion (with a great set of mathematical tools behind it sometimes that can be used with mathematical proofs... assuming you don't change the context), but an illusion nonetheless. Here are some examples that might fill out what I mean:

The classic example, loved by fuzzy typists everywhere: when is a person "bald"? Start out with a person with a "full head" of hair. How many hairs must you remove for that person to be "bald"? What if you start out the opposite way... someone is bald... how many hairs must you add for them to become not-bald?
We might want to construct a precise recipe for a mango lassi. Maybe, in fact, we believe we can create a precise typed definition for a mango lassi. But we might soon find ourselves running into trouble. Can a vegan non-dairy milk be used for the Lassi? (Is vegan non-dairy milk actually milk?) Is ice cream acceptable? Is added sugar necessary? Can we use artificial mango-candy powder instead of mangoes? Maybe you can hand-wave away each of these, but here's something much worse: what's a mango? You might think that's obvious, a mango is the fruit of mangifera indica or maybe if you're generous fruit of anything in the mangifera genus. But mangoes evolved and there is some weird state where we had almost-a-mango and in the future we might have some new states which are no-longer-a-mango, but more or less we're throwing darts at exactly where we think those are... evolution doesn't care, evolution just wants to keep reproducing.
Meaning changes over time, and how we categorize does too. Once someone was explaining the Web Ontology Language (which got confused somewhere in its acronym ordering and is shortened to OWL (update: it's a Winnie the Pooh update, based on the way the Owl character spells his name... thank you Amy Guy for informing me of the history)). They said that it was great because you could clearly define what is and isn't allowed and terms derived from other terms, and that the simple and classic example is Gender, which is a binary choice of Male or Female. They paused and thought for a moment. "That might not be a good example anymore."
Even if you try to define things by their use or properties rather than as an individual concept, this is messy too. A person from two centuries ago would be confused by the metal cube I call a "stove" today, but you could say it does the same job. Nonetheless, if I asked you to "fetch me a stove", you would probably not direct me to a computer processor or a car engine, even though sometimes people fry an egg on both of these.

Multiple constructed languages (Esperanto most famously) have been made by authors that believed that if everyone spoke the same language, we would have world peace. This is a beautiful idea, that conflict comes purely from misunderstandings. I don't think it's true, especially given how many fights I've seen between people speaking the same language. Nonetheless there's truth in that many fights are about a conflict of ideas.

If anyone was going to achieve this though, it would be the Lojban community, which actually does have a language which is syntactically unambiguous, so you no longer have ambiguity such as "time flies like an arrow". Nonetheless, even this world can't escape the problem that some terms just can't be easily pinned down, and the best example is the bear goo debate.

Here's how it works: both of us can unambiguously construct a sentence referring to a "bear". But when it is that bear no longer a bear? If it is struck in the head and is killed, when in that process has it become a decompositional "bear goo" instead? And the answer is: there is no good answer. Nonetheless many participants want there to be a pre-defined bear, they want us to live in a pre-designed universe where "bear" is a clear predicate that can be checked against, because the universe has a clear definition of "bear" for us.

That doesn't exist, because bears evolved. And more importantly, the concept and existence a bear is emergent, cut across many different domains, from evolution to biology to physics to linguistics.

Sorry, we won't achieve perfect communication, not even in Lojban. But we can get a lot better, and set up a system with fewer stumbling blocks for testing ideas against each other, and that is a worthwhile goal.

Nonetheless, if you and I are camping and I shout, "AAH! A bear! RUN!!", you and I probably don't have to stop to debate bear goo. Rees is right that language is a reverse engineering effort, but we tend to do a pretty good job of gaining rough consensus of what the other side means. Likewise, if I ask you, "Where is your stove?", you probably won't lead me to your computer or your car. And if you hand me a "sugar free vegan mango lassi made with artificial mango flavor" I might doubt its cultural authenticity, but if you then referred to the "mango lassi" you had just handed me a moment ago, I wouldn't have any trouble continuing the conversation. Because we're more or less built to contextually construct language contexts.

Language is a quadratic explosion of Katamaris

Language is composed of syntax partly, but the arrangement of symbolic terms mostly. Or that's another way to say that the non-syntactic elements of language are mostly there as identifiers substituted mentally for identity and all the associations therein.

Back to the Katamari metaphor. What "language is a reverse-engineering effort" really means is that each of us are constructing identities for identifiers mentally, rolling up katamaris for each identifier we encounter. But what ends up in our ball will vary depending on our experiences and what paths we take.

Which really means that if each person is rolling up a separate, personal identity-katamari for each identifier in the system, that means that, barring passing through a singularity type event-horizon past which participants can do direct shared memory mapping, this is an O(n^2) problem!

But actually this is not a problem, and is kind of beautiful. It is amazing, given all that, just how good we are at finding shared meaning. But it also means that we should be aware of what this means topologically, and that each participant in the system will have a different set of experiences and understanding for each identity-assertion made.

Thank you to Morgan Lemmer-Webber, Stephen Webber, Corbin Simpson, Baldur Jóhannsson, Joey Hess, Sam Smith, Lee Spector, and Jonathan Rees for contributing thoughts that lead to this post (if you feel like you don't belong here, do belong here, or are wondering how the heck you got here, feel free to contact me). Which is not to say that everyone, from their respective positions, have agreement here; I know several disagree strongly with me on some points I've made. But everyone did help contribute to reverse-engineering their positions against mine to help come to some level of shared understanding, and the giant pile of katamaris that is this blogpost.