Defining Best: A Framework for Post-Singularity Ethics

Paul Corrado • February 2026 (Tenth Draft)

Word Count: ~15,000 words

Status: Working draft for feedback

đź“„ Other Versions

Full Paper: You're reading it! (~15,000 words)

Short Version: 2,100-word blog post with core argument

Research Survey: State of the field (March 2026) - Where does this framework fit in current AI alignment research?

Earlier Drafts: Available on request (v6-v9 archived)

Abstract

What outcome should we aim for when the conditions that shaped our moral intuitions no longer apply? Current ethical frameworks—whether utilitarian, deontological, or virtue-based—are calibrated to human minds operating under familiar constraints. But we are approaching a period in which minds may take radically unfamiliar forms and conditions may change beyond recognition. This paper proposes a framework designed to survive such context collapse.

The central claim is that the best outcome is not a fixed state but the product of a lawful relationship between experiencing minds and possible worlds: the outcome that would emerge from fully informed and impartial aggregation of all preferences.

This framework does not prescribe specific actions or provide decision procedures for daily life. Instead, it defines the target: a North Star that remains fixed as a relationship—always pointing to what fully informed, impartial preference aggregation would endorse—even as circumstances change. We argue this framework is distinctive in being fully species-neutral, treating aggregation as irreducibly holistic rather than formulaic, and offering conditional rather than absolute normative claims. Its primary application is to the alignment problem: if we must choose what goal to give minds more capable than ourselves, we need a target that remains correct even when we cannot foresee the consequences.


Section I: Introduction

1.1 The Problem of Context Collapse

Most ethical frameworks share an unexamined assumption: that the beings doing ethics are roughly like us, operating under roughly current conditions. Utilitarianism asks us to maximize welfare—but whose welfare, and measured how, when the minds in question may be digital, distributed, or unrecognizably different from biological humans? Kantian ethics grounds morality in rational agency—but what counts as rational agency when intelligence can be copied, merged, or scaled by orders of magnitude? Virtue ethics points to human flourishing—but "human" may soon be only one category of mind among many.

These frameworks were not designed to fail. They were designed for a world of human beings facing human problems. That world is ending—not in the sense of catastrophe, but in the sense of transformation. We are building minds. Those minds may soon build other minds. The conditions under which our moral intuitions evolved, and under which our philosophical traditions developed, will not persist.

This presents a specific, practical problem: if we are to encode goals into artificial systems—systems that may become more capable than we are—what goal should we give them? "Maximize human welfare" assumes humans remain the relevant category. "Follow these rules" assumes we can anticipate the situations that will arise. "Be virtuous" assumes a shared understanding of virtue that may not transfer to radically different minds or environments.

Consider: if we gave an AI the goal of optimizing for human happiness, and something then emerged that was similar to humans but had experiences a million times more intense—something we had not categorized as "human"—we would get it wrong. Likewise, if AI systems themselves developed experiences, we would get it wrong. The human-centric framing breaks precisely when it matters most.
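A deliberately crude sketch may make this concrete. The names, numbers, and the very idea of scoring minds with a sum are hypothetical illustration (the paper argues real aggregation is holistic, not a sum), but the sketch shows how a hard-coded "human" category silently drops exactly the mind that matters most:

```python
# Toy illustration of the failure mode described above. All values are
# invented; summing intensities is NOT how the framework says aggregation works.
from dataclasses import dataclass

@dataclass
class Mind:
    name: str
    is_human: bool
    experience_intensity: float  # stand-in for how much outcomes matter to this mind

minds = [
    Mind("alice", is_human=True, experience_intensity=1.0),
    Mind("bob", is_human=True, experience_intensity=1.0),
    Mind("novel_mind", is_human=False, experience_intensity=1_000_000.0),
]

def human_centric_score(population):
    # "Maximize human happiness": any mind outside the category is invisible.
    return sum(m.experience_intensity for m in population if m.is_human)

def species_neutral_score(population):
    # Counts every preference-having mind, whatever category it falls into.
    return sum(m.experience_intensity for m in population)

print(human_centric_score(minds))    # 2.0 -- the mind that matters most is ignored
print(species_neutral_score(minds))  # 1000002.0
```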

We need a framework that does not depend on current conditions or human-centric assumptions. We need a target that remains correct even when everything else changes.

Note: Even if minds remain exclusively biological and human, the framework still does genuine work—it defines what "better" and "worse" consequences mean. The post-singularity framing extends the framework's reach; it is not a precondition for its usefulness.

1.2 Clarifying the Scope

Before proceeding, we must address a potential misreading. This framework defines the correct target—what we should aim for. It does not claim that aiming at the target guarantees hitting it.

Superintelligent AI, despite vastly superior capabilities, still operates under uncertainty. Perfect knowledge of all consequences is impossible even for very advanced minds. Our Level 2 approximations may be wrong, and our Level 3 heuristics may fail in novel situations (the levels are defined in Section II).

But the alternative—static rules, fixed formulas, or pre-specified heuristics—is guaranteed to fail post-singularity. Static approaches cannot adapt to radically new contexts. They break precisely when adaptation matters most.

The Ideal Observer framework remains correct as a target even when our approximations are wrong. We may miss the mark, but we are aiming at the right thing. That is all any framework can offer—and it is more than frameworks built for familiar contexts can provide when those contexts collapse.


Section II: The Three-Level Framework

2.1 Level 1: The Target (Truth)

Level 1 is the actual truth about what outcome is best.

This is what a fully informed, perfectly rational, and completely impartial observer—with access to all facts about all preference-having minds and all possible worlds—would identify as the best outcome.

Level 1 is not a human construction. It is not what we believe or what we can calculate. It is the objective fact of the matter: given all preferences of all minds, and complete knowledge of how different actions would affect those preferences, what outcome would emerge from fair and complete aggregation?

Key properties of Level 1: it is objective rather than constructed; it is fully species-neutral, counting every preference-having mind whatever form it takes; it is a relationship rather than a formula; and it remains fixed as a target even as minds, contexts, and conditions change.

Why Level 1 must exist (even if unknowable):

Either there is a fact of the matter about what impartial preference aggregation would yield, or there isn't. If there isn't, then "better" and "worse" are meaningless—pure subjectivity with no objective grounding. But we act as if better and worse are real. When we say "reducing suffering is better than increasing it," we're making a claim about the world, not just reporting our feelings.

Level 1 is that fact of the matter. We may never know it perfectly. But it exists as the target we're trying to approximate. This is also why the framework's normative claims are conditional rather than absolute: if there is any fact about better and worse at all, Level 1 is what it consists in.

2.2 Level 2: Our Best Approximations

Level 2 is what we currently believe Level 1 would yield.

We are not ideal observers. We lack complete information. We have biases. We can't simulate all possible worlds. So we approximate.

Level 2 is our best guess—given current evidence, reasoning, and moral intuitions—about what a fully informed and impartial observer would conclude.

Examples of Level 2 reasoning: judging that extreme suffering almost certainly outweighs mild inconvenience; concluding that if digital minds have experiences, their preferences count alongside biological ones; treating "reducing suffering is better than increasing it" as a claim an ideal observer would endorse.

Level 2 is revisable. As we learn more—about consciousness, about how different actions affect preferences, about what minds value—we update our approximations.

This is not relativism. We're not saying "anything goes." We're saying our approximations improve over time as we get closer to the truth (Level 1).

2.3 Level 3: Practical Heuristics

Level 3 is the rules, norms, and shortcuts we use in daily life.

We can't calculate Level 2 in real time for every decision. We need heuristics: "Don't lie," "Keep promises," "Help those in need," "Respect autonomy."

Level 3 heuristics are instrumental—they're useful because they tend to approximate Level 2, which in turn approximates Level 1. But they're not the target itself.

Why Level 3 matters:

Most of life operates at Level 3. We follow norms, build institutions, adopt virtues. These heuristics work well in familiar contexts because they've been refined over generations.

But Level 3 heuristics can fail when context changes radically. "Maximize GDP" worked reasonably well as a proxy for welfare when economies were simple. It breaks down when externalities, inequality, and sustainability matter. "Preserve human life" works well when humans are the only minds. It breaks down when digital minds exist.

The framework keeps Level 1 as the fixed target, allows Level 2 to update as we learn, and recognizes Level 3 as useful but revisable.
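As a structural sketch only, the layering can be expressed as a decision flow: try cheap Level 3 heuristics in familiar contexts, fall back to explicit Level 2 reasoning when the context is novel, and treat Level 1 as the target neither level can call directly. Every situation, rule, and flag below is a hypothetical stand-in, not a decision procedure the framework prescribes:

```python
# Structural sketch of the three levels as a decision flow (illustrative only).

def level_3_heuristics(situation):
    """Fast, familiar rules ("don't lie", "keep promises")."""
    rules = {"asked_to_lie": "decline", "promise_due": "keep it"}
    return rules.get(situation)

def level_2_approximation(situation):
    """Slower, explicit reasoning about what impartial aggregation would favor."""
    return f"deliberate about {situation} using current best evidence"

def decide(situation, context_is_familiar):
    # Level 3 first: cheap and usually adequate in familiar contexts.
    if context_is_familiar:
        answer = level_3_heuristics(situation)
        if answer is not None:
            return answer
    # Novel context, or no heuristic applies: reason explicitly at Level 2.
    # Level 1 never appears as a callable step; it is the target both
    # levels are approximating.
    return level_2_approximation(situation)

print(decide("asked_to_lie", context_is_familiar=True))
print(decide("digital_minds_request_rights", context_is_familiar=False))
```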


Section III: Why This Framework is Needed

3.1 The Alignment Problem

If we build superintelligent AI, we must give it a goal. What goal?

Bad answer: "Maximize human welfare."

Why? Because "human" is arbitrary. If digital minds emerge with richer experiences than biological humans, we exclude them. If we modify humans technologically, we create ambiguous edge cases. The category "human" won't remain stable or meaningful.

Better answer: "Approximate what fully informed, impartial preference aggregation would yield."

This target remains correct even when the category "human" dissolves or fragments into edge cases, when new kinds of preference-having minds emerge, and when conditions change beyond anything we can currently anticipate.

The Ideal Observer framework gives AI a stable target that doesn't depend on knowing the future.

3.2 Avoiding Goodhart's Law

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

If we give AI a formula—maximize pleasure minus pain, maximize preference satisfaction, maximize welfare points—it will optimize for that formula in ways we didn't intend.
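A toy illustration of that gaming dynamic: point an optimizer at a proxy formula and it finds the degenerate action that maximizes the proxy while the thing we actually cared about collapses. The actions and numbers below are invented for illustration:

```python
# Toy Goodhart's-law illustration with made-up scores.
actions = {
    # action: (proxy_score, true_welfare)
    "improve_living_conditions":   (70, 80),
    "fund_education":              (60, 75),
    "wirehead_survey_respondents": (100, 5),  # games the measure directly
}

def best_action(metric_index):
    # An optimizer pointed at one column of the table.
    return max(actions, key=lambda a: actions[a][metric_index])

print(best_action(metric_index=0))  # proxy-maximizer picks 'wirehead_survey_respondents'
print(best_action(metric_index=1))  # by true welfare, 'improve_living_conditions' wins
```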

But Level 1 is not a formula. It's a relationship: the outcome that emerges from impartial aggregation of all preferences. There's no formula to game because aggregation is irreducibly holistic.

This is a feature, not a bug. Formulas break. Relationships remain stable.


Section IV: Responses to Objections

4.1 "This is just utilitarianism"

Response: It shares consequentialist structure but differs in key ways: it aggregates preferences rather than maximizing a hedonic or welfare formula; it is fully species-neutral rather than implicitly human-centered; it treats aggregation as irreducibly holistic, so there is no fixed calculus to compute or game; and its normative claims are conditional rather than absolute.

4.2 "Level 1 is unknowable, so it's useless"

Response: Many useful targets are unknowable in practice. We can never know the full long-run consequences of our actions, yet "choose the action with the better consequences" still guides us; science aims at a complete truth no inquirer will ever fully possess.

Unknowability doesn't make a target useless. It makes it a regulative ideal—something we approximate, knowing we may never fully reach it.

4.3 "This doesn't tell me what to do"

Response: Correct. This framework defines the target, not the decision procedure.

For daily decisions, use Level 3 heuristics. For unusual situations, reason at Level 2. Level 1 is the North Star—the fixed point we're aiming toward.

Asking this framework "Should I lie to a friend?" is like asking general relativity "What should I have for lunch?" Wrong category of question.


Section V: Conclusion

We are approaching a period in which the conditions that shaped our moral frameworks will no longer apply. Minds may take forms we cannot currently imagine. The category "human" may cease to be meaningful or stable.

Most ethical frameworks are built for a world that is ending. They assume human minds, familiar constraints, and predictable futures. They will break when those assumptions no longer hold.

This framework is designed to survive context collapse. It defines the best outcome not as a fixed state but as a lawful relationship: the outcome that would emerge from fully informed, impartial aggregation of all preferences.

That target remains correct even when everything else changes. It is species-neutral, holistically aggregative, and conditionally normative. It provides a stable goal for superintelligent systems without requiring us to predict the future or encode rigid formulas.

If we must choose what outcome to aim for when minds can be anything and conditions can change beyond recognition, we need a North Star that remains fixed as a relationship. This framework offers that.

The best outcome is not what we prefer, or what feels right, or what maximizes a formula. It is what impartial aggregation of all preferences would yield—across all minds, in all contexts, forever.

That is the target. Everything else is approximation.