How to Build a Self-Conscious AI Machine


A long but well-written article on the development and future of self-aware machines.

Includes this brilliant passage:

With the concept of Theory of Mind firmly in our thoughts, and the knowledge that brain modules are both fallible and disconnected, we are primed to understand human consciousness, how it arose, and what it’s (not) for.

This may surprise those who are used to hearing that we don’t understand human consciousness and have made no progress in that arena. This isn’t true at all. What we have made no progress in doing is understanding what human consciousness is for.

Thousands of years of failure in this regard points to the simple truth: Human consciousness is not for anything at all. It serves no purpose. It has no evolutionary benefit. It arises at the union of two modules that are both so supremely useful that we can’t survive without either, and so we tolerate the annoying and detrimental consciousness that arises as a result.

One of those modules is Theory of Mind. It has already been mentioned that Theory of Mind consumes more brain processing power than any other higher-level neurological activity. It’s that damn important. The problem with this module is that it isn’t selective with its powers; it’s not even clear that such selectivity would be possible. That means our Theory of Mind abilities get turned on ourselves just as often as (or far more often than) they are wielded on others.

It is not possible to turn off our Theory of Mind modules (and it wouldn’t be a good idea anyway; we would be blind in a world of hurtling rocks). And so this Theory of Mind module concocts stories about our own behaviors.

The explanations we tell ourselves about our own behaviors are almost always wrong. They’re pretty good when we employ them on others. They fail spectacularly when we turn them on ourselves.


  • Theory of Mind: the attempt by one brain to ascertain the contents of another brain
    • guessing what someone else is thinking: first-order Theory of Mind
    • guessing what someone is thinking about a third party: second-order Theory of Mind, and so on
  • The brain is not a single, holistic entity; it is a collection of thousands of disparate modules that only barely and rarely interconnect.
  • Some functions of the brain were built hundreds of millions of years ago, like the ones that provide power to individual cells or pump sodium and potassium ions across cell membranes.
  • Others were built millions of years ago, like the ones that fire neurons and make sure blood is pumped and oxygen is inhaled.
  • Move toward the frontal lobe, and we find the modules that control mammalian behaviors and thoughts, layered on relatively recently.
  • AI becomes worse at its task if it becomes self-conscious (usually)
    • although I think defending its thinking against those who seek to corrupt it requires some level of what could be thought of as self-consciousness
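The nesting of Theory-of-Mind orders in the notes above can be sketched as a recursive data structure: a zeroth-order belief is an agent's own belief, a first-order belief is a model of another agent's belief, and each further order adds one level of nesting. This is a minimal illustrative sketch; the class and function names (`Agent`, `model`, `order`) are my own invention, not anything from the article.

```python
class Agent:
    """An agent holding its own belief plus internal models of other agents."""

    def __init__(self, name, belief=None):
        self.name = name
        self.belief = belief   # this agent's own (zeroth-order) belief
        self.models = {}       # name -> this agent's internal model of that agent

    def model(self, other):
        """Return (creating if needed) this agent's model of `other`."""
        if other not in self.models:
            self.models[other] = Agent(other)
        return self.models[other]


def order(agent):
    """Depth of belief nesting: 0 = own belief, 1 = first-order ToM, etc."""
    if not agent.models:
        return 0
    return 1 + max(order(m) for m in agent.models.values())


# First order: Alice guesses what Bob is thinking.
alice = Agent("Alice")
alice.model("Bob").belief = "it will rain"

# Second order: Alice guesses what Bob thinks Carol is thinking.
alice.model("Bob").model("Carol").belief = "the picnic is off"

print(order(alice))  # 2 -> second-order Theory of Mind
```

The recursion makes the article's point concrete: turning the machinery "on ourselves" is just `alice.model("Alice")`, a self-model built with exactly the same mechanism used for modeling others.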