What self-driving cars tell us about AI risks (IEEE Spectrum)

Five conclusions from an automation expert fresh off a stint with the US highway safety agency

In 2016, just weeks before the Autopilot in his Tesla drove Joshua Brown to his death, I pleaded with the US Senate Committee on Commerce, Science, and Transportation to regulate the use of artificial intelligence in vehicles. Neither my pleading nor Brown’s death could stir the government to action.

Since then, automotive AI in the United States has been linked to at least 25 confirmed deaths and to hundreds of injuries and instances of property damage.

The lack of technical comprehension across industry and government is appalling. People do not understand that the AI that runs vehicles, both the cars that operate in actual self-driving modes and the much larger number of cars offering advanced driving assistance systems (ADAS), is based on the same principles as ChatGPT and other large language models (LLMs). These systems control a car’s lateral and longitudinal position (changing lanes, braking, and accelerating) without waiting for orders to come from the person sitting behind the wheel.

Both kinds of AI use statistical reasoning to guess what the next word or phrase or steering input should be, heavily weighting the calculation with recently used words or actions. Go to your Google search window and type in “now is the time” and you will get the result “now is the time for all good men.” And when your car detects an object on the road ahead, even if it’s just a shadow, watch the car’s self-driving module suddenly brake.
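
The idea of guessing the next word from what came just before can be illustrated with a toy example. The sketch below is a minimal bigram counter, not how ChatGPT or any vehicle system is actually built; the three-sentence corpus and the function names `train_bigrams` and `guess_next` are made up for illustration. It only shows that such a guess comes from counted frequencies, and that the model has nothing to offer for a word it never saw.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def guess_next(follows, prev_word):
    """Return the most frequent follower of prev_word, or None if unseen."""
    counts = follows.get(prev_word.lower())
    if not counts:
        return None  # no statistics at all for this word
    return counts.most_common(1)[0][0]

# Hypothetical training text, chosen only for illustration.
corpus = [
    "now is the time for all good men",
    "now is the time for action",
    "the time for talk is over",
]
model = train_bigrams(corpus)

print(guess_next(model, "time"))     # most common follower in this corpus
print(guess_next(model, "scooter"))  # a word that never appeared in training
```

The guess for "time" is simply whichever word followed it most often in the corpus; the counter has no notion of what any of the words mean.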

Neither the AI in LLMs nor the one in autonomous cars can “understand” the situation, the context, or any unobserved factors that a person would consider in a similar situation. The difference is that while a language model may give you nonsense, a self-driving car can kill you.

In late 2021, despite receiving threats to my physical safety for daring to speak truth about the dangers of AI in vehicles, I agreed to work with the US National Highway Traffic Safety Administration (NHTSA) as the senior safety advisor. What qualified me for the job was a doctorate focused on the design of joint human-automated systems and 20 years of designing and testing unmanned systems, including some that are now used in the military, mining, and medicine.

My time at NHTSA gave me a ringside view of how real-world applications of transportation AI are or are not working. It also showed me the intrinsic problems of regulation, especially in our current divisive political landscape. My deep dive has helped me to formulate five practical insights. I believe they can serve as a guide to industry and to the agencies that regulate them.

1. Human errors in operation get replaced by human errors in coding

Proponents of autonomous vehicles routinely assert that the sooner we get rid of drivers, the safer we will all be on roads. They cite the NHTSA statistic that 94 percent of accidents are caused by human drivers. But this statistic is taken out of context and inaccurate. As the NHTSA itself noted in that report, the driver’s error was “the last event in the crash causal chain…. It is not intended to be interpreted as the cause of the crash.” In other words, there were many other possible causes as well, such as poor lighting and bad road design.

Moreover, the claim that autonomous cars will be safer than those driven by humans ignores what anyone who has ever worked in software development knows all too well: that software code is incredibly error-prone, and the problem only grows as the systems become more complex.

Consider these recent crashes in which faulty software was to blame. There was the October 2021 crash of a Pony.ai driverless car into a sign, the April 2022 crash of a TuSimple tractor trailer into a concrete barrier, the June 2022 crash of a Cruise robotaxi that suddenly stopped while making a left turn, and the March 2023 crash of another Cruise car that rear-ended a bus.

These and many other episodes make clear that AI has not ended the role of human error in road accidents. That role has merely shifted from the end of a chain of events to the beginning: the coding of the AI itself. Because such errors are latent, they are far harder to mitigate. Testing, in simulation but predominantly in the real world, is the key to reducing the chance of such errors, especially in safety-critical systems. However, without sufficient government regulation and clear industry standards, autonomous-vehicle companies will cut corners in order to get their products to market quickly.

2. AI failure modes are hard to predict

A large language model guesses which words and phrases are coming next by consulting an archive assembled during training from preexisting data. A self-driving module interprets the scene and decides how to get around obstacles by making similar guesses, based on a database of labeled images—this is a car, this is a pedestrian, this is a tree—also provided during training. But not every possibility can be modeled, and so the myriad failure modes are extremely hard to predict. All things being equal, a self-driving car can behave very differently on the same stretch of road at different times of the day, possibly due to varying sun angles. And anyone who has experimented with an LLM and changed just the order of words in a prompt will immediately see a difference in the system’s replies.

One failure mode not previously anticipated is phantom braking. For no obvious reason, a self-driving car will suddenly brake hard, perhaps causing a rear-end collision with the vehicle just behind it and other vehicles further back. Phantom braking has been seen in the self-driving cars of many different manufacturers and in ADAS-equipped cars as well.

The cause of such events is still a mystery. Experts initially attributed them to human drivers following the self-driving car too closely (often citing the misleading 94 percent statistic about driver error to support their assessments). However, an increasing number of these crashes have been reported to NHTSA. In May 2022, for instance, NHTSA sent a letter to Tesla noting that the agency had received 758 complaints about phantom braking in Model 3 and Y cars. This past May, the German publication Handelsblatt reported on 1,500 complaints of braking issues with Tesla vehicles, as well as 2,400 complaints of sudden acceleration. It now appears that self-driving cars experience roughly twice the rate of rear-end collisions that cars driven by people do.

Clearly, AI is not performing as it should. Moreover, this is not just one company’s problem—all car companies that are leveraging computer vision and AI are susceptible to this problem.

As other kinds of AI begin to infiltrate society, it is imperative for standards bodies and regulators to understand that AI failure modes will not follow a predictable path. They should also be wary of the car companies’ propensity to excuse away bad tech behavior and to blame humans for abuse or misuse of the AI.

3. Probabilistic estimates do not approximate judgment under uncertainty

Ten years ago, there was significant hand-wringing over the rise of IBM’s AI-based Watson, a precursor to today’s LLMs. People feared AI would very soon cause massive job losses, especially in the medical field. Meanwhile, some AI experts said we should stop training radiologists.

These fears didn’t materialize. While Watson could be good at making guesses, it had no real knowledge, especially when it came to making judgments under uncertainty and deciding on an action based on imperfect information. Today’s LLMs are no different: The underlying models simply cannot cope with a lack of information and do not have the ability to assess whether their estimates are even good enough in this context.

These problems are routinely seen in the self-driving world. The June 2022 accident involving a Cruise robotaxi happened when the car decided to make an aggressive left turn between two cars. As the car safety expert Michael Woon detailed in a report on the accident, the car correctly chose a feasible path, but then halfway through the turn, it slammed on its brakes and stopped in the middle of the intersection. It had guessed that an oncoming car in the right lane was going to turn, even though a turn was not physically possible at the speed the car was traveling. The uncertainty confused the Cruise, and it made the worst possible decision. The oncoming car, a Prius, was not turning, and it plowed into the Cruise, injuring passengers in both cars.

Cruise vehicles have also had many problematic interactions with first responders, who by default operate in areas of significant uncertainty. These encounters have included Cruise cars traveling through active firefighting and rescue scenes and driving over downed power lines. In one incident, a firefighter had to knock the window out of the Cruise car to remove it from the scene. Waymo, Cruise’s main rival in the robotaxi business, has experienced similar problems.

These incidents show that even though neural networks may classify a lot of images and propose a set of actions that work in common settings, they nonetheless struggle to perform even basic operations when the world does not match their training data. The same will be true for LLMs and other forms of generative AI. What these systems lack is judgment in the face of uncertainty, a key precursor to real knowledge.

4. Maintaining AI is just as important as creating AI

Because neural networks can only be effective if they are trained on significant amounts of relevant data, the quality of the data is paramount. But such training is not a one-and-done scenario: Models cannot be trained and then sent off to perform well forever after. In dynamic settings like driving, models must be constantly updated to reflect new types of cars, bikes, and scooters, construction zones, traffic patterns, and so on.

In the March 2023 accident, in which a Cruise car hit the back of an articulated bus, experts were surprised, as many believed such accidents were nearly impossible for a system that carries lidar, radar, and computer vision. Cruise attributed the accident to a faulty model that had guessed where the back of the bus would be based on the dimensions of a normal bus; additionally, the model rejected the lidar data that correctly detected the bus.

This example highlights the importance of maintaining the currency of AI models. “Model drift” is a known problem in AI, and it occurs when relationships between input and output data change over time. For example, if a self-driving car fleet operates in one city with one kind of bus, and then the fleet moves to another city with different bus types, the underlying model of bus detection will likely drift, which could lead to serious consequences.
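
One simple way to watch for this kind of change is to compare what a model saw in training with what it encounters in the field. The sketch below is an illustrative check only, not Cruise’s method or any production technique: it measures how far the average of a single input feature has moved, which is a crude proxy for drift rather than a test of the input-output relationship itself. The bus lengths, the function names `drift_score` and `has_drifted`, and the threshold of 3.0 are all assumptions made up for the example.

```python
from statistics import mean, stdev

def drift_score(training_values, recent_values):
    """Shift of the recent mean, measured in training standard deviations."""
    spread = stdev(training_values)
    if spread == 0:
        raise ValueError("training values have no spread to compare against")
    return abs(mean(recent_values) - mean(training_values)) / spread

def has_drifted(training_values, recent_values, threshold=3.0):
    """Flag drift when the mean has moved more than `threshold` deviations."""
    return drift_score(training_values, recent_values) > threshold

# Hypothetical bus lengths in meters. City A has only standard buses;
# city B is dominated by longer articulated buses.
city_a = [12.0, 12.2, 11.8, 12.1, 11.9, 12.0]
city_b = [18.0, 18.3, 17.9, 18.1, 12.0, 18.2]

print(has_drifted(city_a, city_a))  # same data as training
print(has_drifted(city_a, city_b))  # fleet moved to the new city
```

A flag like this says nothing about why the data changed or how the model will misbehave; it only signals that the model is now operating outside what it was trained on and is due for review or retraining.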

Such drift affects AI working not only in transportation but in any field where new results continually change our understanding of the world. This means that large language models can’t learn a new phenomenon until it has lost the edge of its novelty and is appearing often enough to be incorporated into the dataset. Maintaining model currency is just one of many ways that AI requires periodic maintenance, and any discussion of AI regulation in the future must address this critical aspect.

5. AI has system-level implications that can’t be ignored

Self-driving cars have been designed to stop cold the moment they can no longer reason and no longer resolve uncertainty. This is an important safety feature. But as Cruise, Tesla, and Waymo have demonstrated, managing such stops poses an unexpected challenge.

A stopped car can block roads and intersections, sometimes for hours, throttling traffic and keeping out first-response vehicles. Companies have instituted remote-monitoring centers and rapid-action teams to mitigate such congestion and confusion, but at least in San Francisco, where hundreds of self-driving cars are on the road, city officials have questioned the quality of their responses.

Self-driving cars rely on wireless connectivity to maintain their road awareness, but what happens when that connectivity drops? One driver found out the hard way when his car became entrapped in a knot of 20 Cruise vehicles that had lost connection to the remote-operations center and caused a massive traffic jam.

Of course, any new technology may be expected to suffer from growing pains, but if these pains become serious enough, they will erode public trust and support. Sentiment towards self-driving cars used to be optimistic in tech-friendly San Francisco, but now it has taken a negative turn due to the sheer volume of problems the city is experiencing. Such sentiments may eventually lead to public rejection of the technology if a stopped autonomous vehicle causes the death of a person who was prevented from getting to the hospital in time.

So what does the experience of self-driving cars say about regulating AI more generally? Companies not only need to ensure they understand the broader systems-level implications of AI, they also need oversight—they should not be left to police themselves. Regulatory agencies must work to define reasonable operating boundaries for systems that use AI and issue permits and regulations accordingly. When the use of AI presents clear safety risks, agencies should not defer to industry for solutions and should be proactive in setting limits.

AI still has a long way to go in cars and trucks. I’m not calling for a ban on autonomous vehicles. There are clear advantages to using AI, and it is irresponsible for people to call for a ban, or even a pause, on AI. But we need more government oversight to prevent the taking of unnecessary risks.

Yet the regulation of AI in vehicles isn’t happening. That can be blamed in part on industry overclaims and pressure, but also on a lack of capability on the part of regulators. The European Union has been more proactive about regulating artificial intelligence in general and in self-driving cars particularly. In the United States, we simply do not have enough people in federal and state departments of transportation who understand the technology deeply enough to advocate effectively for balanced public policies and regulations. The same is true for other types of AI.

Author Mary (Missy) L. Cummings, a senior member of IEEE, is a professor in the Department of Electrical and Computer Engineering and the Department of Computer Science, Duke Institute for Brain Sciences (DIBS), Duke University. As a specialist in systems automation and the way that people use it, she recently served as a safety consultant for the National Highway Traffic Safety Administration.