Philosophy is not a safety patch

There is a scene that almost feels too perfect.

An artificial intelligence lab. Screens on. Servers in the background. Tired engineers. Models that no longer merely complete sentences, but answer, advise, calculate, simulate, comfort, obey, make mistakes, improvise explanations and sometimes seem to have a prudence nobody fully taught them.

And on a chair, slightly to the side, a philosopher.

The scene has a kind of poetic justice. For years, humanities students were told that if they wanted to survive they should “learn to code.” Now The Economist describes the reverse movement: the big AI labs are hiring philosophers because technology has reached a place where programming is no longer enough.[1]

The temptation is to celebrate.

At last, one might say, they discovered Socrates was useful. At last the useless question found employment. At last philosophy entered the building where the future is being decided.

I would be a little more careful.

Philosophy entering the lab can be good news. But it can also be a new way of domesticating it. It may mean companies have understood that AI needs judgment, limits, moral language, responsibility and public argument. Or it may mean they need to turn all of that into one more product component: an alignment layer, an internal document, a tone control, a value dial, a reputational insurance policy.

The difference matters.

Philosophy in the lab is not the same thing as philosophy at the service of the lab.

[Image: Classical philosopher statue facing an artificial intelligence terminal, connected by a red cable in a night laboratory]

Philosophy returns to the laboratory when technology discovers that it cannot decide by itself what doing things well means.

The lab discovered judgment

The Economist article has one virtue: it shows that AI has made an old discussion feel suddenly older than we thought. It is no longer enough to ask which degree has the best job prospects. The question has changed: what kind of education allows a person to live in a world where many professional skills can be imitated by a machine?

[Image: Technical deliberation table with charts, notebooks, books and code in front of an AI laboratory]

For a decade, the magic word was code. Learn to program. Learn a technical skill. Learn something the market can recognize. That advice was not absurd. Technique matters. The problem was its faintly religious tone, as if programming were the final shelter from historical weather.

Generative AI broke that security.

Now the programmer also looks sideways at the machine. So do the lawyer, designer, translator, teacher, writer, analyst, consultant and student. Automation no longer appears only in factories or repetitive tasks. It appears in the zone we used to call professional intelligence: summarizing, comparing, drafting, classifying, arguing, finding errors, proposing alternatives.

That is why philosophy returns through the back door.

Not because the world became more cultivated.

Because the machine forced a question that had been postponed: what remains human when production accelerates? What does it mean to decide well when the system can generate plausible answers faster than we can examine them? What does responsibility mean when an answer is assembled by a model, tuned by a company, used by a client and suffered by someone else?

The lab discovers judgment because the lab discovers that optimization does not decide purposes. A model can be more accurate, faster, cheaper, safer in a technical sense, better aligned with an instruction and still leave the central question untouched: aligned with what, with whom, under what authority, at what cost?

That is not a software bug.

It is politics, ethics, culture.

It is the old problem of ends returning inside a machine that seemed to have solved everything by means.

Socrates does not handle customers

There is a friendly reading of the trend. If AI systems are going to advise, educate, moderate, classify and accompany people, it is better for philosophers to be present. Better someone trained to detect hidden assumptions, false dilemmas, moral traps, poor definitions and inflated words. Better a person who asks uncomfortable questions before the product reaches millions.

I agree.

But with a condition: Socrates cannot be reduced to customer service.

[Image: Contemporary Socratic scene with an empty chair, open notebooks and AI screens waiting for questions]

The Socratic method is not a conversational style. It is not a nice way to ask follow-up questions. It is not a UX feature that makes the model sound reflective. Socrates is irritating because he interrupts false certainty. He does not decorate the answer. He ruins it when the answer is too comfortable.

That is exactly what a lab may not want.

A company can desire philosophy in the form of polish, but not philosophy in the form of interruption. It can want ethical vocabulary, but not conflict. It can want a committee, but not a real veto. It can want principles, but not the political cost of applying them. It can want philosophers who help make the product more acceptable, not philosophers who ask whether the product should exist at all, under which conditions, with whose data, under whose law and with what public control.

That is the danger.

The philosopher becomes a kind of internal chaplain of technological power. Someone who gives language to decisions that were already made elsewhere. Someone who translates moral trouble into manageable risks. Someone who turns disagreement into policy prose.

That does not mean every philosopher inside a company is captured. It means the institution has gravitational force. Salary, deadlines, product roadmaps, investor pressure, competition, confidentiality and corporate language do not leave thought untouched. They shape what can be asked, when it can be asked and what counts as an acceptable answer.

The real question is not whether philosophers are hired.

The question is whether they can still disturb the machine that pays them.

A private constitution

Anthropic made one of the most interesting moves in this area: it speaks of a constitution for Claude. Its public documents describe a set of principles drawn from sources such as human rights language, safety research and platform norms, used to guide model behavior.[2] Google, for its part, keeps public AI principles that frame beneficial use, safety, accountability and limits.[3]

That is better than pretending models are neutral.

But it opens a serious problem: constitutions are not just texts. They are political forms.

[Image: Private constitution for artificial intelligence on a work table, with annotated articles, code and institutional seals]

A constitution normally implies a public subject, a procedure, a conflict, an authority, a way to contest decisions, a memory of abuses, a promise of limits. It is not only a list of values. It is a structure for deciding who has the right to decide.

When a private company writes a constitution for a model, something changes. The word keeps its dignity, but the procedure shifts. We are no longer in the terrain of a demos, a parliament, a court, a public dispute or an accountable State. We are inside a company that serves users, clients, governments, developers and markets from an infrastructure it owns.

That does not make the effort useless.

It makes it insufficient.

A private constitution may prevent some damage, organize internal judgment and make values explicit. It can help engineers avoid improvising moral decisions in code comments. It can force a company to say, at least in part, what it believes its system should not do.

But it cannot replace public legitimacy.

Because the values embedded in AI models are not merely internal preferences. They affect education, journalism, bureaucracy, health, work, political speech, sexuality, childhood, security, memory and reputation. A model that refuses, permits, ranks, summarizes or recommends is not only producing text. It is participating in the distribution of attention, credibility and possibility.

That is why the word constitution should make us alert, not calm.

The issue is not whether a company has values.

The issue is whether society accepts that those values become infrastructure without public argument.

Moral dials

The most revealing part of the new AI language is the idea of adjustable behavior: more or less cautious, more or less direct, more or less creative, more or less moralizing, more or less permissive. The model becomes an interface full of dials.

That sounds reasonable. Different contexts require different tones. A classroom is not a hospital. A legal assistant is not a poetry workshop. A child-facing tool is not the same as a research tool for adults.

But not every value can be turned into a dial without changing its nature.

[Image: Interface of moral dials for artificial intelligence, with values, risks and responsibility distributed across controls]

If dignity becomes a slider, what happens when the user lowers it? If truth becomes a parameter, who sets the default? If non-discrimination becomes a behavior profile, which market segment receives which version? If political neutrality becomes a style setting, who defines the center? If safety becomes a product configuration, who pays for the safest version?

The language of customization can hide a deeper privatization of judgment.

Companies like dials because dials are manageable. They translate conflict into settings. They make disagreement appear technical. They allow one to say: this is not a moral decision, it is a configuration.

But moral life does not work like that.

Many conflicts cannot be solved by finding the right level. They require reasons, institutions, responsibility, public debate, appeal, context and sometimes a refusal to simplify. The question is not always “how much safety?” Sometimes the question is: safety for whom? Against what? At what price? Defined by which history? Enforced by which power?

Philosophy matters precisely because it resists the fantasy that every conflict can become a control panel.

If philosophers enter AI labs only to help name the dials, something is lost.

If they enter to ask why the dial exists, who benefits from it, who cannot see it, who can challenge it and what kind of society is being trained to obey it, then we are in more interesting territory.

Do not outsource judgment

There is another trap: believing that because AI systems need ethical guidance, we can delegate our judgment to them once they have been properly aligned.

That is the most comfortable fantasy.

The machine studies the dilemma. The company hires philosophers. The model receives a constitution. The user asks. The system answers. Everyone feels a little less responsible.

[Image: Human judgment being delegated to an artificial intelligence system, with documents, shadows and decisions passing between hands]

But a more ethical machine does not absolve us from thinking.

The risk is not only that AI gives bad answers. The risk is that it gives acceptable answers often enough for us to stop practicing judgment. That it makes prudence feel outsourced. That it turns the difficult moment of deciding into an interaction with a system that speaks calmly, cites principles and sounds more reasonable than we do.

This is moral deskilling.

Just as GPS weakened some forms of spatial memory, automatic moral assistance can weaken the muscles of argument, doubt, responsibility and decision. Not because the tool is evil. Because every delegation trains a habit.

If students ask the model what is fair, if professionals ask it what is responsible, if managers ask it what is acceptable, if governments ask it how to communicate a decision, if citizens ask it what to think about a conflict, something shifts. The problem is not consultation. The problem is dependence.

Judgment is not just arriving at a good output.

Judgment is the formation that allows us to understand why an output is insufficient, dangerous, cowardly, unjust, evasive or simply not ours.

That is why the humanities matter here, but not as corporate decoration. Literature, philosophy, history, art, political theory and criticism train attention to ambiguity, tone, power, context and consequence. They do not make people morally pure. They make certain simplifications harder to swallow.

An AI can assist judgment.

It cannot exempt us from having one.

Philosophy has to leave the lab

That is why I am not satisfied with AI labs hiring philosophers.

I am glad they do. A lab with uncomfortable philosophers is preferable to a lab with pure engineering triumphalism. A team that discusses deontology, consequences, rights, harm, truth and power is preferable to a team that believes scale solves everything.

But the public task cannot end there.

[Image: Philosophy leaving the AI laboratory toward a public library, a classroom and a civic square with books, screens and people in conversation]

Philosophy also has to be in schools, unions, newsrooms, courts, parliaments, public agencies, libraries, universities, neighborhood organizations and editorial rooms. It has to be in the places where ordinary people learn to name what is happening to them.

If the only institutions capable of thinking seriously about AI are the companies that build it, we have already lost part of the argument.

Public life needs its own philosophers, engineers, teachers, librarians, artists, lawyers, journalists and citizens capable of understanding these systems without kneeling before them. Not to reject technology by reflex. To dispute its meaning.

AI does not only ask for better products.

It asks for better institutions.

It asks for a public capable of distinguishing convenience from delegation, assistance from obedience, personalization from enclosure, safety from paternalism and efficiency from justice.

The old advice was: learn to code.

The new advice should be harsher and broader: learn to judge.

Not in the pompous sense of feeling morally superior. In the practical sense of not surrendering the question too soon. Knowing that every technical solution carries a theory of the person. Knowing that every interface educates. Knowing that every automation reorganizes attention. Knowing that every default is a decision someone made before we arrived.

Learning to judge means learning to ask who speaks, who pays, who benefits, who is excluded, who can appeal, who keeps the data, who defines harm, who measures success and who bears the cost when the system fails.

It also means accepting that there are things no delegation can replace. Writing does not only produce a text. It forms thought. Asking does not only obtain information. It discovers the ignorance we are living inside.

The machine can assist judgment.

It cannot absolve us from having one.

Footnotes

  1. The Economist, “Why big AI labs are hiring so many philosophers”, June 24, 2026.

  2. Anthropic, “Claude’s Constitution” and “Constitutional AI: Claude’s Constitution”.

  3. Google, “AI Principles”.
