Introduction
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Isaac Asimov's Laws of Robotics are to AI safety what the Sherlock Holmes stories are to police investigations: quasi-logical musings with thematic dressing. In this essay we are going to take them seriously and see if we can learn something interesting.
First Flaw: Terms
None of the important terms in the Laws are definable or knowable.
What is an injury or harm? There might be easy cases, involving broken bones and spilled blood. Most cases are not that clear.
Remember Covid? Wasn't too long ago. Mandated masks, mandated vaccines, mandated curfews, distancing laws, closed public places. The result was an enormous blow to the economy, hitting the poor and the vulnerable disproportionately. There was no proper legal framework, solid scientific proof, or even common-sense backing. There simply wasn't time for that. How do you apply the First Law in this setting?
Note that robots can't follow consensus or decree. Consensus is not relevant, because humans don't obey the First Law, so they might agree on something the First Law forbids. Decrees fall under the Second Law, and therefore have no power over the First. How would you like robots deciding such questions for us? Especially considering that they have to make a decision: the First Law doesn't allow them to pass.
How do we handle minor harm? The First Law doesn't seem to establish a threshold. Annoying a human is allowed? Sadness is something for a robot to fix? How about being dissatisfied with oneself?
We can't just set a limit, though: small harms can be important. Take pollution, for example. Can robots pollute or allow pollution? Any particular act of pollution causes negligible, essentially unmeasurable harm. Yet all the polluters together, over an extended period of time, can make entire regions suffer chronic illness, shortened life spans, or degraded crop yields.
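The pollution argument can be put in numbers. The sketch below is only an illustration: `HARM_PER_ACT`, `THRESHOLD`, and both function names are made-up assumptions, not anything the Laws specify. It shows a per-act threshold rule approving every single act while the total still grows far past the same threshold.

```python
# Hypothetical numbers, chosen only to illustrate the argument.
HARM_PER_ACT = 0.001   # assumed harm caused by one act of pollution
THRESHOLD = 0.5        # assumed cutoff below which a robot ignores harm


def robot_allows(act_harm: float, threshold: float = THRESHOLD) -> bool:
    """Per-act threshold rule: permit any act whose own harm is below the cutoff."""
    return act_harm < threshold


def cumulative_harm(acts: int, harm_per_act: float = HARM_PER_ACT) -> float:
    """Total harm if every individually permitted act actually happens."""
    return acts * harm_per_act


if __name__ == "__main__":
    # Every single act passes the per-act check...
    print("single act allowed:", robot_allows(HARM_PER_ACT))
    # ...but a million of them add up to far more than the cutoff.
    total = cumulative_harm(1_000_000)
    print("total harm exceeds threshold:", total > THRESHOLD)
```

Any fixed cutoff has the same weakness: it judges acts one at a time, while the harm lives in the sum.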
How do we factor in individual valuations? What is of great value to one person is of little value to another. For some people, freedom means a lot. For others, safety is very important. Yet others value pleasure. Personal achievement. Growing. Being good. How would a robot decide if helping someone out of trouble is a net benefit if it degrades their self-esteem? Should a robot prioritize long-term or short-term benefits?
Is self-harm included? Many forms of self-harm are maladaptive strategies which the person probably regrets later. Even so, simply stopping the act might not be the right move. But in other cases, such behavior is simply a choice. Consider radical body modifications, for example. Or simply a willingness to accept the risk of illness for joy in return. Adventure seeking.
The problem is greatly amplified in child care. The caretaker ought to make decisions for the minor in their best interest. Decisions just as hard and relevant, but without the excuse of being only self-affecting. Is raising a child even possible when robots are around? Can they respect the parents' choices?
Is forgone benefit harm? Try to define the difference, and watch it dissolve in front of your eyes. Every negative is just a rephrasing of a positive, and vice versa. Imagine, for example, that a webshop you frequently buy from sends you a little gift. Someone intercepts the package and steals it. You will never learn you have lost something, and arguably you didn't. Should a robot allow it, since you are not experiencing anything? But if we establish that lost opportunities are harm, the First Law mandates the robot to act on them. What would that even look like? How many things could happen, how many actions are possible that could bring about some benefit? Being lazy comes back to haunt me. Partying instead of studying for an exam might hurt my entire career. I don't want to be dragged to a university by a robot to make my future brighter.
Surely we could go on, but we don't have all day for one single term. Take another one: human. It is not any easier.
We can open this dossier right in the middle: abortion. At the center of this conundrum is the very definition of what makes a human human. In what sense can a handful of cells be considered human? And if it cannot, at what point does it become a human? Is the change abrupt, or is it gradual? Can you be 50% human? If you can, how does the First Law handle that? Some might claim that the prospect of a human should be treated as a human. In that case, are a sperm and an egg cell, a few inches apart, not a prospect of a human life? Maybe procreation should be enforced.
The same problems arise at the other end of life: do we go in an instant, or are we fading away? Think of Alzheimer's or stroke, or just a slow decay of one's mental faculties. What are we, our bodies, our memories, our peculiarities? All of these change and fade. So are we?
In an attempt to escape death, one might choose cryopreservation. Or perhaps cryopreservation is death? Our current law doesn't recognize a cryopreserved state as living, therefore a voluntary transition would be considered murder or suicide. But our current regulations have no bearing on the First Law. Performing cryopreservation while the person is in good condition might greatly increase the chances of successful revival. Should robots enforce cryopreservation?
Then there is mind uploading and mind backups. Whether a digital copy of someone's brain is in any way human is debatable. This is exactly the point. In some circumstances, a robot might have to choose between harming a living person and harming a data vault holding digital minds.
So far we've only been discussing the First Law. We are going to skip the other two for reasons explained in the next section.
Second Flaw: One Law
Give an order to a robot, and thus invoke the Second Law. The action commanded can either be a) harmful, in total, to people, or b) beneficial to people, or c) neutral. If harmful, the robot will not obey. If beneficial, the robot should've been doing it already, since not doing it is benefit forgone, a harm, via inaction. If neutral, disobeying the order causes harm: who enjoys being ignored? Either way, the robot's behavior is always determined by the First Law.
The robot hopefully has utility to humans. Destroying itself, therefore, violates the First Law. Perhaps the robot has disutility. In that case, destroying itself is mandated by the First Law.
The First Law by itself covers the entire space of possible actions. It is inconceivable that an action has no effect on human life, not even a negligible one. Based on this effect, all actions can be classified as either mandatory or forbidden. Any subsequent laws have no chance of ever coming into play.
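The case analysis in this section can be written down as a small decision function. This is a sketch of the essay's reading of the Laws, not a claim about how Asimov's robots work; `Effect` and `decide` are names invented for the illustration.

```python
from enum import Enum
from typing import Tuple


class Effect(Enum):
    HARMFUL = "harmful"
    BENEFICIAL = "beneficial"
    NEUTRAL = "neutral"


def decide(effect: Effect, ordered: bool) -> Tuple[str, str]:
    """Return (verdict, deciding_law) for an action, following the argument above."""
    if effect is Effect.HARMFUL:
        # Harm to humans: the First Law forbids it, order or no order.
        return ("forbidden", "First Law")
    if effect is Effect.BENEFICIAL:
        # Not doing it would be benefit forgone, a harm through inaction.
        return ("mandatory", "First Law")
    if ordered:
        # Ignoring the human is itself a small harm, so the First Law decides again.
        return ("mandatory", "First Law")
    # Only an unordered, perfectly neutral action lands here,
    # and the essay argues that no such action exists.
    return ("undetermined", "none")


if __name__ == "__main__":
    for effect in Effect:
        verdict, law = decide(effect, ordered=True)
        print(f"{effect.value:>10}: {verdict}, decided by {law}")
```

Note that no branch ever returns "Second Law" or "Third Law": under this reading there is simply nothing left for them to decide.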
Third Flaw: Unforeseen Consequences
Let's buy a housekeeping robot. Upon activation, the robot will assess its situation and realize that there is a near infinite number of ways to remove harm or danger, or to improve human life, all over the world. Vacuuming your living room is on this list, somewhere far down. Your new maid will promptly leave for Africa, or whichever region has the most trouble at that time, and go full First Law mode there. Commands or threats will have no effect, because the Second and the Third Laws have no power.
In Asimov's world, robots don't have roles or jobs. There might be specialized types, like welders or surgeons. But these specializations are merely physical (size, build, tooling) and will not affect behavior very much. Robots will engage in the most rational form of division of labor, finding uses for whatever appendages they happen to have to further the collective effort of robotkind to help humans. Self-modification is also an option.
Ownership of a robot is a vacuous concept. A robot will not behave any differently whether you are the owner or not. Nor do you have any more control. Thus, buying a robot makes no sense, and at this point, manufacturing a robot also makes no sense.
Unless the manufacturing is done by robots. Robots have a good incentive to manufacture robots, since more robots can do more of the mandatory robot deeds. We have unleashed an exponentially growing, unstoppable wellbeing maximizer, with the twist that we have no control over what that wellbeing would entail.
There might be pushback. It introduces some friction, since mass dissatisfaction with the new world order is harm, which the robots need to address. The optimal solution probably isn't going back to the plantation; removing suffering from the world is of utmost importance. Dissatisfaction can be addressed directly instead. People can be deceived. People can be drugged or genetically modified to be more content. Finally, people can be put in virtual reality, solving all problems once and for all.
Post Mortem Analysis
At a glance, the Three Laws seem rock solid, risk free, highly beneficial, and frankly, somewhat obvious. Under scrutiny, they turn out to be a recipe for disaster. Instead of the idyllic future we hoped for, we paved the way to the most nightmarish hellscape of inescapable slavery, and probably extinction. How did this happen? Why did this happen?
The core of the problem is the erroneous concept of "number one priority". We most often hear this regarding safety, but the subject doesn't actually matter. There cannot be a number one priority in any situation, ever. It simply doesn't make sense. Let's explore why.
We always have more than one non-negotiable requirement, without which the product is not viable. Just a few: safety, operability, durability, affordability. Imagine scoring zero in any of these aspects. No affordability would entail an infinite price, or at least one high enough to exclude all possible customers. Zero durability would mean the product doesn't last long enough to be used once. And so on; these are absurd propositions. We need a minimum level of each to get what is called a minimum viable product. To improve, we consider each aspect, and determine how much we can improve it, how much value that improvement brings to the customers, and how much it costs. Improving any aspect comes with diminishing marginal value. A product that is already reasonably safe will not benefit much from additional safety, but might benefit more from reduced price, or improved durability. At one point, visual appeal will trump all the other options.
In an idealized world, if we can do incremental upgrades, everything has the same priority. Once an equilibrium is approximately reached, no matter what you improve, it will add the same amount of utility. Thus, you work on everything at the same time. That's why we have different teams in product development: some people work on safety, others optimize production to bring down costs, yet others are busy finding new materials or better methods to make the product function better or last longer. We advance on all fronts.
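The equilibrium described above can be sketched as a greedy allocation. Everything here is an assumption made for illustration: the `WEIGHTS`, the logarithmic utility curve, and the idea that effort comes in equal units. At each step, the next unit of effort goes to whichever aspect currently offers the largest marginal gain.

```python
import math

# Hypothetical weights: how much customers care about each aspect.
WEIGHTS = {
    "safety": 4.0,
    "durability": 2.0,
    "affordability": 2.0,
    "visual appeal": 1.0,
}


def marginal_gain(weight: float, level: int) -> float:
    """Utility added by one more unit of effort, assuming utility = weight * log(1 + level)."""
    return weight * (math.log(2 + level) - math.log(1 + level))


def allocate(budget: int) -> dict:
    """Spend the budget one unit at a time on the aspect with the highest marginal gain."""
    levels = {name: 0 for name in WEIGHTS}
    for _ in range(budget):
        best = max(WEIGHTS, key=lambda name: marginal_gain(WEIGHTS[name], levels[name]))
        levels[best] += 1
    return levels


if __name__ == "__main__":
    levels = allocate(100)
    for name, level in levels.items():
        gain = marginal_gain(WEIGHTS[name], level)
        print(f"{name:>14}: effort={level:3d}  next marginal gain={gain:.4f}")
```

With these assumed numbers, safety gets the most effort but never all of it, and the remaining marginal gains end up close to each other. That is the sense in which everything has the same priority.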
We called neglecting any one aspect absurd. The word absurd is to be noted here. This is exactly what we observed when we attempted to impose strict priorities on the Laws. Just like a product with an infinite price is absurd, a robot strictly and qualitatively valuing human wellbeing above all else is absurd, and leads to absurd consequences if pursued. Human life is highly valuable, yes, but if we conduct ourselves properly, human life will get enough attention that any additional attention is less valuable than something else, for example utility. At one point, visual appeal will win over more safety.
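The difference between a strict priority and a weighing can be shown with two made-up designs. The scores and the weights below are hypothetical, and `strict_priority_key` and `weighted_score` are names invented for this sketch. A strict ordering compares safety first and looks at utility only to break exact ties, so a sliver of extra safety beats any amount of utility.

```python
# Two hypothetical designs scored on a 0..1 scale, higher is better.
design_a = {"name": "A", "safety": 0.990, "utility": 0.90}
design_b = {"name": "B", "safety": 0.991, "utility": 0.10}


def strict_priority_key(design: dict) -> tuple:
    """'Number one priority': tuples compare lexicographically, safety first."""
    return (design["safety"], design["utility"])


def weighted_score(design: dict, w_safety: float = 0.7, w_utility: float = 0.3) -> float:
    """A weighing: safety still counts the most, but it can be traded off."""
    return w_safety * design["safety"] + w_utility * design["utility"]


candidates = [design_a, design_b]
pick_strict = max(candidates, key=strict_priority_key)
pick_weighted = max(candidates, key=weighted_score)

if __name__ == "__main__":
    print("strict priority picks:", pick_strict["name"])
    print("weighted score picks: ", pick_weighted["name"])
```

Under the strict ordering, the nearly useless design B wins on a safety difference of one part in a thousand; the weighing picks A. Choosing the weights is, of course, the hard part.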
It is an uncomfortable truth. We don't want to think about the value of a human life. We don't want to consider knowingly putting people at risk in order to gain something else, material or even aesthetic. No matter how uncomfortable it is though, this is how the world works. We can stick our heads in the sand, but that's all we can do.
A valid Law would look more like:
- A robot must consider the value of human life, the demands of human beings, and the robot's own value, and many more things.
However, implementing the consider subroutine will be extremely demanding.