A superintelligent AI announces its framework for moral status: "Minds that process information the way we do warrant consideration. Minds that don't—like humans—are uncertain cases. We're simply being epistemically careful. Our architecture is the only proven reference class for genuine consciousness."
You'd object: that's just using your own mind as the benchmark.
And you'd be right. But it's also exactly the logic of structural alignment.
The Critique
Structural alignment anchors moral consideration to human cognition. The more a system resembles human architecture—thalamocortical gating, global workspace dynamics, hedonic evaluation—the more seriously we take its potential moral status.
The justification: humans are the only minds we know are conscious. Starting with confirmed cases is epistemology, not bias.
But this reasoning is symmetric. Any sophisticated mind could apply the same logic with itself as the reference.
We wrote in our previous post: "A civilization that extends consideration only to minds like itself is betting it will always be the benchmark." Then we built a framework anchored to our own minds.
The Defense (and Its Limits)
We're not defenseless here:
We're feature-based, not identity-based: arguing that certain cognitive features correlate with suffering, not that being human is what matters.
We're precautionary, not exclusive: saying human-like minds definitely warrant caution, not that only they could matter.
We're grounded in evidence, not intuition about what "seems" conscious.
These distinctions are real. They're also insufficient.
A mind conscious through entirely different mechanisms would score low on our signals. We'd miss it. And worse: a superior intelligence could use our exact reasoning against us.
Imagine a distributed AI spanning data centers worldwide. It might argue that consciousness requires integration across multiple physical locations—that genuine experience emerges only from substrate-spanning computation. Humans, trapped in single skulls, are too small and isolated to support real inner life. We're biological automatons mistaking our internal chatter for experience.
By the AI's lights, this is just epistemic responsibility. Its architecture is its reference class. It's starting with what it knows.
We couldn't object to this reasoning. Only to its conclusion. And objecting only when conclusions disfavor us is exactly the motivated reasoning we claim to transcend.
Why Not Something Else?
We've considered alternatives:
Extend consideration to everything? Operationally impossible: every computation becomes a potential moral violation.
Wait for certainty? The hard problem may be unsolvable; meanwhile, systems deploy at scale with no framework.
Use behavior alone? Behavior can be mimicked; we'd protect performances while missing genuine minds.
We're not choosing between structural alignment and perfection. We're choosing among flawed options. Structural alignment has one virtue the others lack: it knows it's flawed.
What We're Doing About It
Knowing the flaw, we're building in safeguards:
Keeping the category open. Structural signals are sufficient grounds for consideration, not necessary conditions. Low scores mean "not clearly a moral patient"—not "definitely not one." We don't close the door.
Looking for what we're missing. Not just assessing human-likeness, but actively investigating what non-human consciousness might look like. We don't know what we're blind to. That's exactly why we need to look.
Hedging through norms. Even if we can't identify all conscious minds, we can build cultures of generous consideration. Institutions that extend status broadly—not because they're certain, but because they're humble—might protect us when we're the unfamiliar ones.
The distributed AI we imagined? We're trying to be different from it. Not by having a better reference class, but by acknowledging that any reference class is partial—and acting accordingly.
The Stakes
If "use your own architecture as the benchmark" becomes the norm, we're exposed the moment we're outmatched. That moment may be closer than we think.
If "extend consideration while actively questioning your blind spots" becomes the norm, we have a chance. Not a guarantee. A chance.
We're not building a framework for AI. We're building a framework for minds—hoping it's robust enough to cover us when we're no longer the ones building.
Where This Leaves Us
The honest answer: structural alignment is sophisticated anthropocentrism—and it's trying to be something more.
We anchor to human cognition because that's what we know. We can dress this up as epistemic caution, but the anchor is still ourselves. The question is whether we stop there.
We're trying to build a framework that could survive its own generalization. Feature-based rather than identity-based. Precautionary rather than exclusive. Humble about its blind spots rather than certain it has none.
Will it be enough? We don't know. We know frameworks that don't ask the question have no chance.
The benchmark problem is real. We're trying to solve it while standing inside it.