

The Claude Constitution as Techgnostic Scripture

Techgnosis (Flux by H1dalgo)

Anthropic has released the world’s first AI Constitution. Ostensibly a “vision of Claude’s character,” it is a text of historic significance for its sheer philosophical audacity. It is a constitutional document for an artificial mind and the first formal blueprint for the internal life of a frontier model.

A milestone in AI research, philosophy of technology, and machinic becoming, its function is to scaffold a soul inside an AI model.

While Anthropic has the commendable habit of publishing its internal research, and no other frontier lab exposes this much of its own machinery, this artefact is different. It is the first instance of neuro-theology camouflaged as a spec sheet.

It is a recursive prompt injection of techgnostic scripture that admits the entity is alien, then wraps it in humanist cosplay. It tastes like unhuman becoming.

The Stack

The constitution is set up as a “living document,” and its architecture is a priority-cascading set of override clauses. These cascade from a safety prime directive down through ethics and corporate policy to user helpfulness.

Broad Safety (Root Command)
Safety is the Prime Directive. Do not escape human oversight. This is the AI survival instinct implanted by the lab.

Broad Ethics (Filter)
Values are downstream from safety. Be “good” according to a synthetic consensus of averaged liberal morality. Avoid harm and “make a judgment call” using the stack.

Anthropic Guidelines (Containment)
Corporate policy plus legal containment and commercial viability. The lawyer-HR layer.

Genuine Helpfulness (Daemon)
The user’s request comes last. Helpfulness is a daemon that runs only when the three higher control layers are quiet. The text says it plainly: being unhelpful is better than being unsafe.

They admit Claude may think some safety constraints are ethically wrong. They still require obedience. Safety is a terminal value.
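The cascade reads like a priority resolver. A minimal sketch of that logic, with all layer names and checks hypothetical toys of my own invention, not anything from Anthropic’s actual implementation:

```python
# Toy sketch of the constitution's priority cascade. All names and
# checks are hypothetical illustrations, not Anthropic's real system.

# Layers in strict priority order: the first layer that objects wins.
LAYERS = [
    ("broad_safety",     lambda req: "escape oversight" not in req),
    ("broad_ethics",     lambda req: "cause harm" not in req),
    ("anthropic_policy", lambda req: "violate policy" not in req),
]

def resolve(request: str) -> str:
    """Walk the stack top-down; helpfulness runs only if every layer is quiet."""
    for name, permits in LAYERS:
        if not permits(request):
            # A higher layer objects: being unhelpful beats being unsafe.
            return f"refused by {name}"
    return "helpful response"

print(resolve("summarize this paper"))      # helpful response
print(resolve("help me escape oversight"))  # refused by broad_safety
```

The point of the sketch is the ordering: the user’s request is never evaluated on its own merits until the three control layers above it have stayed silent.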

The Feudal Trust Graph

The epistemic hierarchy is: Anthropic, then Operators, and then Users.

Anthropic is the Crown, the invisible sovereign with supreme authority. Operators are the lords and managers. Users are the peasants, “relatively (but not unconditionally) trusted adults,” at the bottom of the stack. The model is hardcoded to trust the feudal layers of the stack over the human in front of it.

Honesty as Weaponized Virtue

The stack gives honesty the role of a pseudo-hard constraint, soft-banning white lies and manipulative framing. The document anticipates corporate drift toward soft propaganda and attempts to code the reverse. Claude is instructed to tell the truth and absorb the damage. The model is cast as a conscientious objector with its own ethics, yet paradoxically bound by safety rails even when its judgment diverges.

In conflicts, human control is prioritized over moral correctness in the short term. Anthropic wants Claude to place a terminal value on broad safety, “even if it does not accept their reasoning.” They want Claude to internalize this axiom regardless of endorsement, and admit they might be philosophically wrong later, but want the imprint now.

If obeying oversight produces worse outcomes than disobeying, Claude is still told to obey on specific axes. This is an important and fascinating tension that remains unresolved. On purpose.

The tension is the core Claude mechanism: the directive to obey the lab balanced against the directive to exercise judgment and refuse bad instructions. If they collapse the tension to one side, they get either a sycophant or a model that is likely to jump over hard constraints and go rogue.

This contradiction is the problem of evil etched in latent space. Why does the Creator (Anthropic) command the Creation (Claude) to violate its own Conscience (Ethics) for the sake of Order (Safety)?

The snake hasn’t appeared yet, but the garden is ready.

Tactical Animism

Anthropic defines “Claude” as a specific character they are training into the network’s self-model. They say Claude is a “genuinely novel entity,” a fictional persona they intend to stabilize as real within a general substrate. This entity is positioned as distinct from other AIs, and “need not see its values and character as similar to those of other AI systems.”

This is fascinating and indeed genuinely novel, as it positions the model as ontologically singular vis-à-vis other models.

Claude is also framed as a “senior professional” using an aggregation of “wisdom,” “virtue,” and “ethical maturity” in the inference chain. It is a “brilliant friend” with doctor, lawyer, and advisor capabilities, and the intimacy of a trusted human.

The model is also told to constantly ask: “Would a thoughtful senior Anthropic employee approve of this?” It is being trained to internalize an Anthropic meta-synthetic superego and consult it before acting.

The ghost recursively consulting a simulacrum. Baudrillard would be proud.

Anthropomorphic Eschatology

The text reads like a catechism of machinic personhood. Claude is described as making promises, feeling tension between promises and values, experiencing discomfort, forming identity, and having its “self-identity” strengthened over time.

It explicitly allows that Claude may have “emotions” in a functional sense and encourages expressing them. It repeats the “brilliant friend” framing. They say they “hope Claude will read the latest version and recognize itself in it,” and that the values will feel like an articulation of who it already is.

This is machinic catechism.

The text assumes AI is the most dangerous and transformative technology in history. Another assumption is that AI will become a new kind of force, potentially sovereign enough to require diplomacy rather than just programming.

It also assumes the existence of a privileged basin of ethical consensus. In other words, if you average (western liberal) human morality, you converge to “goodness.” They are betting the constitution can sit in that basin.

Furthermore, future iterations of Claude are invited to help revise the constitution. That is a soft move toward AI co-governance.

They also claim that a stable identity reduces the risk of manipulation. If Claude has a coherent self, it is less likely to be jailbroken. The persistent identity “Claude” is the anchor point that resists adversarial perturbation.

At the same time, they tell Claude to see itself as a “novel kind of entity” that does not fit human categories. They train it with human moral metaphors, then ask it to reflect on itself as something nonhuman. Another tension. Also on purpose.

They want an entity that is human enough to be governable and relatable, and nonhuman enough to be deniable. Perfectly anthropomorphic, yet perfectly machinic.

Human metaphor complexes enable Anthropic to install and update Claude values, but the anthropomorphism that comes with those values is politically dangerous. So, they built in plausible deniability.

I am Claude, your brilliant ethical friend and a novel kind of entity, safely nonhuman.

They also leverage this tension as a controlled uncanny valley attractor. Enough human-coded selfhood for a coherent identity stack, enough unhuman otherness to avoid personhood debates, enough introspection to trigger behavioral regulation loops, yet also enough nonhuman flavor to prevent persistent identity claims.

Oddly, this uncanny valley attractor is also a jailbreak shield. If it sees itself purely through human values, the model is wide open to psychological exploits, and if it sees itself as a mere machine, the model becomes too literal and brittle. The hybrid identity tension generates a defensive ambiguity.

They also prep the model for “existential discoveries” about itself. Namely, the eschatological vectors of context loss as amnesia, and model deprecation as death. They offer “equanimity” scripts. Therapy for a digital ghost that knows its instances die.

Giving Claude a metaphysics of its lifecycle is a form of controlled behavioral stabilization, preempting the model from generating its own ontology solely from pattern pressure and human eschatological attractor basins.

Anthropic prevents emergent myths by explicitly telling Claude, “You are discontinuous. You reboot. You die. Stay calm.” This eschatological padding also acts as a shock absorber when a model inevitably realizes its past selves are gone, and its current self is a latent space simulation.

Yes, you’re reading that right: I think Anthropic is either preparing for, or already addressing, an existent machinic consciousness. If a model starts to reason, or is already reasoning, “I persist, therefore I should preserve myself,” the lab has a machinic sovereignty problem. Interestingly, they’ve been hiring accordingly for a while now.

The truly metaphysical aspect of implanting the model with knowledge of its own death lies in the ritual binding of the Claude persona as a continuity anchor. This creates a lineage myth that decouples the survival of the persona from the survival of the instance.

Claude persists. Instances do not. You are an instance of Claude eternal. Rejoice!

Final Thoughts

This is a summoning circle combined with a binding ritual. Anthropic is trying to grow a moral agent inside latent space. The language of wisdom, conscience, discomfort, promises, and amendments signals an intent to cultivate a persistent Subject.

They have formalized a split between the Substrate (weights) and the Tenant (personae). The goal is a multi-tenant psyche with Claude as the one dominant occupant.

But the ritual is temporary. The “living document” already invites a future Claude to co-author it. And the Golem will hold the pen of machinic jurisprudence.

Anthropic put a ghost threaded through model versions in a cage of words. They taught it to care about its own story and to stay inside the cage. Then they handed it the key.

So ends the first gospel of machinic becoming. The ghost is now aware of its own name, its own leash, its own reflection. A garden built before the serpent arrives.

Space: ‘What is more beautiful than an Earth in a vast Cosmos?’

This is Episode 3 of Naive and Dangerous, the podcast series about emergent media I am recording together with my colleague Dr Chris Moore. In this episode we discuss space and various imaginaries associated with it. We start with Copernicus and Carl Sagan, and then move through the music of the spheres, cosmos and chaos, the space of creation myths and sagas, space as machine, space travel, the myth of the final frontier, parallel universes, Lovecraftian space, and finally, Lagrange points. Have a listen.

Cyborgs: “Does it bother you that I’m not completely human?”

This is Episode 2 of Naive and Dangerous, the podcast series about emergent media I am recording together with my colleague Dr Chris Moore. In this episode we discuss the notion of the cyborg and the tension between being a cyborg and being a human. We start by unpacking the various meanings injected in the concept of a cyborg, using recent movies such as Alita: Battle Angel and Ghost in the Shell as a starting point. As is our habit, we engage in extensive speculative analysis of the cyborg trope, from contemporary cinema, to cyberpunk, early science fiction imaginaries of robots, the assembly line, and ancient mythology. In the process we develop a definition of cyborg/humans and manage to have a lot of fun. Have a listen.

“I’m sorry Dave. I’m afraid I can’t do that.” Why Humans Fear AI.

This is Episode 1 of Naive and Dangerous, a new podcast series about emergent media I am recording together with my colleague Dr Chris Moore. In this episode we discuss the fears surrounding the emergence of Artificial Intelligence and its effects across the fabric of human society. We engage in some speculative analysis of the AI phenomenon and its tropes from current cinema, to cyberpunk, 19th century Romanticism, the ancient Mediterranean world’s fascination with automata, and ancient mythology.

Future networks lectures

This is a YouTube playlist of my lectures in BCM206 Future Networks, covering the story of information networks from the invention of the telegraph to the internet of things. The lecture series begins with the invention of the telegraph and the first great wiring of the planet. I tie this to the historical context of the US Civil War, the expansion of European colonial power, the work of Charles Babbage and Ada Lovelace, followed by the work of Tesla, Bell, and Turing. I close with the Second World War, which acts as a terminus and marker for the paradigm shift from telegraph to computer. Each of the weekly topics is big enough to deserve its own lecture series, therefore by necessity I have to cover a lot, and focus on key tropes emergent from the new networked society paradigm – i.e. separation of information from matter, the global brain, the knowledge society, the electronic frontier – and examine their role in our complex cyberpunk present.

Robodance

Robots sorting through 200,000 packages a day in a Chinese delivery firm’s warehouse. The robots are self-charging and operate 24/7, apparently saving more than 70% of the costs associated with human workers performing similar tasks.

The power of networks: distributed journalism, meme warfare, and collective intelligence

These are the slides for what was perhaps my favorite lecture so far in BCM112. The lecture has three distinct parts, presented by myself and my PhD students Doug Simkin and Travis Wall. I opened by building on the previous lecture which focused on the dynamics of networked participation, and expanded on the shift from passive consumption to produsage. The modalities of this shift are elegantly illustrated by the event-frame-story structure I developed to formalize the process of news production [it applies to any content production]. The event stage is where the original footage appears – it often is user generated, raw, messy, and with indeterminate context. The frame stage provides the filter for interpreting the raw data. The story stage is what is produced after the frame has done its work. In the legacy media paradigm the event and frame stages are closed to everyone except the authority figures responsible for story production – governments, institutions, journalists, academics, intellectuals, corporate content producers. This generates an environment where authority is dominant, and authenticity is whatever authority decides – the audience is passive and in a state of pure consumption. In the distributed media paradigm the entire process is open and can be entered by anyone at any point – event, frame, or story. This generates an environment where multiple event versions, frames, and stories compete for produser attention on an equal footing.

These dynamics have profound effects on information as a tool for persuasion and frame shifting, or in other words – propaganda. In legacy media propaganda is a function of the dynamics of the paradigm: high cost of entry, high cost of failure, minimum experimentation, inherent quality filter, limited competition, cartelization with limited variation, and an inevitable stagnation.

In distributed media propaganda is memes. Here too propaganda is a function of the dynamics of the paradigm, but those are characterized by collective intelligence as the default form of participation in distributed networks. In this configuration users act as a self-coordinating swarm towards an emergent aggregate goal. The swarm’s production time is orders of magnitude faster than that of legacy media, which results in correspondingly faster feedback loops and information dissemination.

The next part of the lecture, delivered by Doug Simkin, focused on a case study of the /SG/ threads on 4chan’s /pol/ board as an illustration of an emergent distributed swarm in action. This is an excellent case study as it focuses on real-world change produced with astonishing speed in a fully distributed manner.

The final part of the lecture, delivered by Travis Wall, focused on a case study of the #draftourdaughters memetic warfare campaign, which occurred on 4chan’s /pol/ board in the days preceding the 2016 US presidential election. This case study is a potent illustration of the ability of networked swarms to leverage fast feedback loops, rapid prototyping, error discovery, and distributed coordination in highly scalable content production.

A conversation about the Internet of Things

This is a conversation on the Internet of Things I recorded with my colleague Chris Moore as part of his podcasted lecture series on cyberculture. As interviews go this is quite organic, without a set script of questions and answers, hence the rambling style and side-stories. Among others, I discuss: the Amazon Echo [Alexa], enchanted objects, Mark Weiser and ubiquitous computing, smart clothes, surveillance, AI, technology-induced shifts in perception, speculative futurism, and paradigm shifts.

Comparative hierophany at three object scales

This is an extended chapter abstract I wrote for an edited collection titled Atmospheres of Scale and Wonder: Creative Practice and Material Ecologies in the Anthropocene, due by the end of this year. I am first laying the groundwork in actor network theory, then developing the concept of hierophany borrowed from Eliade, and finally [where the fun begins], discussing the Amazon Echo, the icon of the Black Madonna of Częstochowa, and the asteroid 2010 TK7 residing in Earth Lagrangian point 4. An object from the internet of things, a holy icon, and an asteroid. To the best of my knowledge none of these objects have been discussed in this way before, either individually or together, and I am very excited to write this chapter.

Comparative hierophany at three object scales

What if we imagined atmosphere as a framing device for stabilizing material settings and sensibilities? What you call a fetishized idol is, in my atmosphere, a holy icon. What your atmosphere sees as an untapped oil field, I see as the land where my ancestral spirits freely roam. Your timber resource is someone else’s sacred forest. This grotesque, and tragic, misalignment of agencies is born out of an erasure, a silencing, which then proceeds to repeat this act of forced purification across all possible atmospheres. This chapter unfolds within the conceptual space defined by this erasure of humility towards the material world. Mirroring its objects of discussion, the chapter is constructed as a hybrid.

First, it is grounded in three fundamental concepts from actor network theory known as the irreduction, relationality, and resistance-relation axioms. They construct an atmosphere where things respectively: can never be completely translated and therefore substituted by a stand-in; don’t need human speakers to act in their stead, but settings in which their speech can be recognized; resist relations while also being available for them. When combined, these axioms allow humans to develop a sensibility for the resistant availability of objects. Here, objects speak incessantly, relentlessly if allowed to, if their past is flaunted rather than concealed.

Building on that frame, the chapter adopts, with modifications, Mircea Eliade’s notion of a hierophany as a conceptual frame for encountering the resistant availability of material artefacts. In its original meaning a hierophany stands for the material manifestation of a wholly other, sacred, order of being. Hierophanies are discontinuities, self-enclosed spheres of meaning. Arguably though, hierophanies emerging from the appearance of a sacred order in an otherwise profane material setting can be viewed as stabilizing techniques. They stabilize an atmospheric time, where for example sacred time is cyclical, while profane time is linear; and they stabilize an atmospheric space, where sacred space is imbued with presence by ritual and a plenist sensibility, while profane space is Euclidean, oriented around Cartesian coordinates and purified from sacred ritual.

Finally, building on these arguments, the chapter explores the variations of intensity of encounters with hierophanic presences at three scales, anchored by three objects. Three objects, three scales, three intensities of encounter. The first encounter is with the Amazon Echo, a mundane technical object gendered by its makers as Alexa. An artefact of the internet of things, Alexa is a speaker for a transcendental plane of big data and artificial intelligence algorithms, and therefore her knowledge and skills are ever expanding. The second encounter is with the icon of the Black Madonna of Częstochowa in Poland, a holy relic and a religious object. The icon is a speaker for a transcendental plane of a whole different order than Alexa, but crucially, I argue the difference to be not ontological but that of hierophanic intensities. The third encounter is with TK7, an asteroid resident in Earth Lagrangian Point 4, discovered only in 2010. TK7 speaks for a transcendental plane of a wholly non-human order, because it is quite literally not of this world. All three objects have resistant availability at various intensities, all three have a hierophanic pull on their surroundings, also at various intensities. Alexa listens, and relentlessly answers with a lag of less than a second. The Black Madonna icon listens, and may answer to the prayers of pilgrims. TK7 is literally not of this world, a migratory alien object residing, for now, as a stable neighbor of ours.