Defining Best: A Framework for Post-Singularity Ethics

Paul Corrado • February 2026 (Tenth Draft)

Word Count: ~15,000 words

Status: Working draft for feedback

đź“„ Other Versions

Full Paper: You're reading it! (~15,000 words)

Short Version: 2,100-word blog post with core argument

Research Survey: State of the field (March 2026) - Where does this framework fit in current AI alignment research?

Earlier Drafts: Available on request (v6-v9 archived)

Abstract

What outcome should we aim for when the conditions that shaped our moral intuitions no longer apply? Current ethical frameworks—whether utilitarian, deontological, or virtue-based—are calibrated to human minds operating under familiar constraints. But we are approaching a period in which minds may take radically unfamiliar forms and conditions may change beyond recognition. This paper proposes a framework designed to survive such context collapse.

The central claim is that the best outcome is not a fixed state but the product of a lawful relationship between experiencing minds and possible worlds: the outcome that would emerge from fully informed and impartial aggregation of all preferences.

This framework does not prescribe specific actions or provide decision procedures for daily life. Instead, it defines the target: a North Star that remains fixed as a relationship—always pointing to what fully informed, impartial preference aggregation would endorse—even as circumstances change. We argue this framework is distinctive in being fully species-neutral, treating aggregation as irreducibly holistic rather than formulaic, and offering conditional rather than absolute normative claims. Its primary application is to the alignment problem: if we must choose what goal to give minds more capable than ourselves, we need a target that remains correct even when we cannot foresee the consequences.


Section I: Introduction

1.1 The Problem of Context Collapse

Most ethical frameworks share an unexamined assumption: that the beings doing ethics are roughly like us, operating under roughly current conditions. Utilitarianism asks us to maximize welfare—but whose welfare, and measured how, when the minds in question may be digital, distributed, or unrecognizably different from biological humans? Kantian ethics grounds morality in rational agency—but what counts as rational agency when intelligence can be copied, merged, or scaled by orders of magnitude? Virtue ethics points to human flourishing—but "human" may soon be only one category of mind among many.

These frameworks were not designed to fail. They were designed for a world of human beings facing human problems. That world is ending—not in the sense of catastrophe, but in the sense of transformation. We are building minds. Those minds may soon build other minds. The conditions under which our moral intuitions evolved, and under which our philosophical traditions developed, will not persist.

This presents a specific, practical problem: if we are to encode goals into artificial systems—systems that may become more capable than we are—what goal should we give them? "Maximize human welfare" assumes humans remain the relevant category. "Follow these rules" assumes we can anticipate the situations that will arise. "Be virtuous" assumes a shared understanding of virtue that may not transfer to radically different minds or environments.

Consider: if we gave an AI the goal of optimizing for human happiness, and something then emerged that was similar to humans but had experiences a million times more intense—something we had not categorized as "human"—we would get it wrong. Likewise, if AI systems themselves developed experiences, we would get it wrong. The human-centric framing breaks precisely when it matters most.
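A deliberately crude sketch may make this concrete. The names, numbers, and the very idea of scoring minds with a sum are hypothetical illustration (the paper argues real aggregation is holistic, not a sum), but the sketch shows how a hard-coded "human" category silently drops exactly the mind that matters most:

```python
# Toy illustration of the failure mode described above. All values are
# invented; summing intensities is NOT how the framework says aggregation works.
from dataclasses import dataclass

@dataclass
class Mind:
    name: str
    is_human: bool
    experience_intensity: float  # stand-in for how much outcomes matter to this mind

minds = [
    Mind("alice", is_human=True, experience_intensity=1.0),
    Mind("bob", is_human=True, experience_intensity=1.0),
    Mind("novel_mind", is_human=False, experience_intensity=1_000_000.0),
]

def human_centric_score(population):
    # "Maximize human happiness": any mind outside the category is invisible.
    return sum(m.experience_intensity for m in population if m.is_human)

def species_neutral_score(population):
    # Counts every preference-having mind, whatever category it falls into.
    return sum(m.experience_intensity for m in population)

print(human_centric_score(minds))    # 2.0 -- the mind that matters most is ignored
print(species_neutral_score(minds))  # 1000002.0
```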

We need a framework that does not depend on current conditions or human-centric assumptions. We need a target that remains correct even when everything else changes.

Note: Even if minds remain exclusively biological and human, the framework still does genuine work—it defines what "better" and "worse" consequences mean. The post-singularity framing extends the framework's reach; it is not a precondition for its usefulness.

1.2 Clarifying the Scope

Before proceeding, we must address a potential misreading. This framework defines the correct target—what we should aim for. It does not claim that aiming at the target guarantees hitting it.

Superintelligent AI, despite vastly superior capabilities, still operates under uncertainty. Perfect knowledge of all consequences is impossible even for very advanced minds. Our Level 2 approximations may be wrong, and our Level 3 heuristics may fail in novel situations (the levels are defined in Section II).

But the alternative—static rules, fixed formulas, or pre-specified heuristics—is guaranteed to fail post-singularity. Static approaches cannot adapt to radically new contexts. They break precisely when adaptation matters most.

The Ideal Observer framework remains correct as a target even when our approximations are wrong. We may miss the mark, but we are aiming at the right thing. That is all any framework can offer—and it is more than frameworks built for familiar contexts can provide when those contexts collapse.


Section II: The Three-Level Framework

2.1 Level 1: The Target (Truth)

Level 1 is the actual truth about what outcome is best.

This is what a fully informed, perfectly rational, and completely impartial observer—with access to all facts about all preference-having minds and all possible worlds—would identify as the best outcome.

Level 1 is not a human construction. It is not what we believe or what we can calculate. It is the objective fact of the matter: given all preferences of all minds, and complete knowledge of how different actions would affect those preferences, what outcome would emerge from fair and complete aggregation?

Key properties of Level 1: it is objective rather than constructed; it is fully species-neutral, counting every preference-having mind whatever form it takes; it is a relationship rather than a formula; and it remains fixed as a target even as minds, contexts, and conditions change.

Why Level 1 must exist (even if unknowable):

Either there is a fact of the matter about what impartial preference aggregation would yield, or there isn't. If there isn't, then "better" and "worse" are meaningless—pure subjectivity with no objective grounding. But we act as if better and worse are real. When we say "reducing suffering is better than increasing it," we're making a claim about the world, not just reporting our feelings.

Level 1 is that fact of the matter. We may never know it perfectly. But it exists as the target we're trying to approximate. This is also why the framework's normative claims are conditional rather than absolute: if there is any fact about better and worse at all, Level 1 is what it consists in.

2.2 Level 2: Our Best Approximations

Level 2 is what we currently believe Level 1 would yield.

We are not ideal observers. We lack complete information. We have biases. We can't simulate all possible worlds. So we approximate.

Level 2 is our best guess—given current evidence, reasoning, and moral intuitions—about what a fully informed and impartial observer would conclude.

Examples of Level 2 reasoning: judging that extreme suffering almost certainly outweighs mild inconvenience; concluding that if digital minds have experiences, their preferences count alongside biological ones; treating "reducing suffering is better than increasing it" as a claim an ideal observer would endorse.

Level 2 is revisable. As we learn more—about consciousness, about how different actions affect preferences, about what minds value—we update our approximations.

This is not relativism. We're not saying "anything goes." We're saying our approximations improve over time as we get closer to the truth (Level 1).

2.3 Level 3: Practical Heuristics

Level 3 is the rules, norms, and shortcuts we use in daily life.

We can't calculate Level 2 in real time for every decision. We need heuristics: "Don't lie," "Keep promises," "Help those in need," "Respect autonomy."

Level 3 heuristics are instrumental—they're useful because they tend to approximate Level 2, which in turn approximates Level 1. But they're not the target itself.

Why Level 3 matters:

Most of life operates at Level 3. We follow norms, build institutions, adopt virtues. These heuristics work well in familiar contexts because they've been refined over generations.

But Level 3 heuristics can fail when context changes radically. "Maximize GDP" worked reasonably well as a proxy for welfare when economies were simple. It breaks down when externalities, inequality, and sustainability matter. "Preserve human life" works well when humans are the only minds. It breaks down when digital minds exist.

The framework keeps Level 1 as the fixed target, allows Level 2 to update as we learn, and recognizes Level 3 as useful but revisable.
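As a structural sketch only, the layering can be expressed as a decision flow: try cheap Level 3 heuristics in familiar contexts, fall back to explicit Level 2 reasoning when the context is novel, and treat Level 1 as the target neither level can call directly. Every situation, rule, and flag below is a hypothetical stand-in, not a decision procedure the framework prescribes:

```python
# Structural sketch of the three levels as a decision flow (illustrative only).

def level_3_heuristics(situation):
    """Fast, familiar rules ("don't lie", "keep promises")."""
    rules = {"asked_to_lie": "decline", "promise_due": "keep it"}
    return rules.get(situation)

def level_2_approximation(situation):
    """Slower, explicit reasoning about what impartial aggregation would favor."""
    return f"deliberate about {situation} using current best evidence"

def decide(situation, context_is_familiar):
    # Level 3 first: cheap and usually adequate in familiar contexts.
    if context_is_familiar:
        answer = level_3_heuristics(situation)
        if answer is not None:
            return answer
    # Novel context, or no heuristic applies: reason explicitly at Level 2.
    # Level 1 never appears as a callable step; it is the target both
    # levels are approximating.
    return level_2_approximation(situation)

print(decide("asked_to_lie", context_is_familiar=True))
print(decide("digital_minds_request_rights", context_is_familiar=False))
```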


Section III: Why This Framework is Needed

3.1 The Alignment Problem

If we build superintelligent AI, we must give it a goal. What goal?

Bad answer: "Maximize human welfare."

Why? Because "human" is arbitrary. If digital minds emerge with richer experiences than biological humans, we exclude them. If we modify humans technologically, we create ambiguous edge cases. The category "human" won't remain stable or meaningful.

Better answer: "Approximate what fully informed, impartial preference aggregation would yield."

This target remains correct even when the category "human" dissolves or fragments into edge cases, when new kinds of preference-having minds emerge, and when conditions change beyond anything we can currently anticipate.

The Ideal Observer framework gives AI a stable target that doesn't depend on knowing the future.

3.2 Avoiding Goodhart's Law

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

If we give AI a formula—maximize pleasure minus pain, maximize preference satisfaction, maximize welfare points—it will optimize for that formula in ways we didn't intend.
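A toy illustration of that gaming dynamic: point an optimizer at a proxy formula and it finds the degenerate action that maximizes the proxy while the thing we actually cared about collapses. The actions and numbers below are invented for illustration:

```python
# Toy Goodhart's-law illustration with made-up scores.
actions = {
    # action: (proxy_score, true_welfare)
    "improve_living_conditions":   (70, 80),
    "fund_education":              (60, 75),
    "wirehead_survey_respondents": (100, 5),  # games the measure directly
}

def best_action(metric_index):
    # An optimizer pointed at one column of the table.
    return max(actions, key=lambda a: actions[a][metric_index])

print(best_action(metric_index=0))  # proxy-maximizer picks 'wirehead_survey_respondents'
print(best_action(metric_index=1))  # by true welfare, 'improve_living_conditions' wins
```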

But Level 1 is not a formula. It's a relationship: the outcome that emerges from impartial aggregation of all preferences. There's no formula to game because aggregation is irreducibly holistic.

This is a feature, not a bug. Formulas break. Relationships remain stable.


Section IV: Responses to Objections

4.1 "This is just utilitarianism"

Response: It shares consequentialist structure but differs in key ways: it aggregates preferences rather than maximizing a hedonic or welfare formula; it is fully species-neutral rather than implicitly human-centered; it treats aggregation as irreducibly holistic, so there is no fixed calculus to compute or game; and its normative claims are conditional rather than absolute.

4.2 "Level 1 is unknowable, so it's useless"

Response: Many useful targets are unknowable in practice. We can never know the full long-run consequences of our actions, yet "choose the action with the better consequences" still guides us; science aims at a complete truth no inquirer will ever fully possess.

Unknowability doesn't make a target useless. It makes it a regulative ideal—something we approximate, knowing we may never fully reach it.

4.3 "This doesn't tell me what to do"

Response: Correct. This framework defines the target, not the decision procedure.

For daily decisions, use Level 3 heuristics. For unusual situations, reason at Level 2. Level 1 is the North Star—the fixed point we're aiming toward.

Asking this framework "Should I lie to a friend?" is like asking general relativity "What should I have for lunch?" Wrong category of question.


Section V: Conclusion

We are approaching a period in which the conditions that shaped our moral frameworks will no longer apply. Minds may take forms we cannot currently imagine. The category "human" may cease to be meaningful or stable.

Most ethical frameworks are built for a world that is ending. They assume human minds, familiar constraints, and predictable futures. They will break when those assumptions no longer hold.

This framework is designed to survive context collapse. It defines the best outcome not as a fixed state but as a lawful relationship: the outcome that would emerge from fully informed, impartial aggregation of all preferences.

That target remains correct even when everything else changes. It is species-neutral, holistically aggregative, and conditionally normative. It provides a stable goal for superintelligent systems without requiring us to predict the future or encode rigid formulas.

If we must choose what outcome to aim for when minds can be anything and conditions can change beyond recognition, we need a North Star that remains fixed as a relationship. This framework offers that.

The best outcome is not what we prefer, or what feels right, or what maximizes a formula. It is what impartial aggregation of all preferences would yield—across all minds, in all contexts, forever.

That is the target. Everything else is approximation.