What is safety? Who decides, and how do they do it? Turns out, safety isn’t just wearing a seatbelt or looking both ways before you cross a road. It’s a heavily regulated, carefully thought-out system for protecting everyday citizens from unnecessary danger. From U.S. Navy subs to Carnegie Mellon, the National Robotics Engineering Center and now Edge Case Research, Dr. Philip Koopman has long been the tip of the intellectual spear defining the bar for safety—and raising it. But what happens when Alex Roy asks the question no one in AI or robotics wants to answer?

Episode Transcript

Alex Roy

Welcome to the No Parking Podcast, as always I’m Alex Roy here with my friend and cohost Bryan Salesky.

Bryan Salesky
Hi there.

Alex Roy
People often ask…what is “safety”? What does “safe” mean while they drive themselves to work each day with no training? And so this week Bryan and I decided to have someone on that Bryan has known for a very long time from Carnegie Mellon AND academia…to discuss the topic of safety. What does it mean? What’s it all about? What is safe? His name is Dr. Phil Koopman. He’s worked on a variety of safety standards going back decades. He drives a Volvo, which he may not want to consider a defining character attribute, but I do, and he was also an officer on US Navy submarines, which I find very, very cool. Bryan, is there anything else you’d like to say about Dr. Koopman before we roll into it?

Bryan Salesky
Well, he’s also a co-founder of a startup called Edge Case Research, which is aiming to validate automated vehicles in various ways. I’ve known Phil for a long time, as you mentioned, and he’s one of the more thoughtful leaders in safety. He has a lot of really great things to say. We tried through the whole show to keep it at a level that’s hopefully understandable by a broad audience.

Alex Roy
Hopefully.

Bryan Salesky
Hopefully. If we failed, Phil has promised us we’ll do a do-over. So I guess we’re going to listen to audience feedback and see what happens.

Alex Roy
I hope he will come back, I think he will. Let’s roll right into it.

Phil Koopman
Safety? The normal definition is, people think it means not actively dangerous, that something’s not trying to actively kill you. And the catch is that it’s more complicated for a self-driving car…or something like that. I put it in three bins. I like putting things in three bins; maybe we’ll do another one further on.

Bryan Salesky
They say all good things come in threes, right?

Alex Roy
Are lawn darts one of those bins? Because those were illegal when I was a kid.

Bryan Salesky
We’re trying to keep this educational for the moment. So just hold that thought.

Phil Koopman
I escaped death by lawn dart, but that’s okay. So in this case for safety, there’s three bins. There’s the things that are actively dangerous. Everyone looks and says, “Oh, that’s scary.” Okay? And then over time, if you’re building something…vehicles in general have some danger. There’s a lot of weight, a lot of speed, a lot of energy there. But over time people figure out how to make them safe on an everyday basis. That’s the middle bin. Safe day-in, day-out, that’s fine. There’s a third bin, and for someone like me, safety doesn’t really hit until you get to the third bin. It isn’t ‘I have one vehicle and it hasn’t killed anyone today.’ That’s the middle bin. The final bin is, there are 100 million vehicles on the road, and the thing you weren’t worried about because, well, it’s one in a million? That’s a hundred times a day.

Bryan Salesky
Yeah. The statistics are not on your side when you start to scale things out.

Phil Koopman
That’s right.

Bryan Salesky
What used to be infrequent now becomes frequent.

Phil Koopman
Right. Just because of a matter of scale. And so the people who do safety, like on an aircraft, it’s one per billion, with a B, flight hours. Ten to the minus nine, for the engineers, and that’s a lot of hours.

Alex Roy
But it isn’t that for all planes and all airlines, right?

Phil Koopman
For commercial aircraft, the FAA, it’s one per billion. And the way they got that number is they said, “All right, take all the 747s that are ever going to be built; they’ll only rack up maybe one billion flight hours for the entire fleet. So there’s less than a 50% chance that any plane in the fleet, for the whole history of the 747, will ever fall out of the sky.” That’s how they get that number.
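
To make the arithmetic behind those two scale examples concrete, here is a minimal Python sketch. The fleet sizes and failure rates are illustrative round numbers taken from the conversation above, not measured data.

```python
# Illustrative arithmetic only; rates and fleet sizes are round-number assumptions.

# "Third bin" scaling: a failure that is one-in-a-million per vehicle per day.
fleet_size = 100_000_000            # vehicles on the road
per_vehicle_daily_rate = 1e-6       # one-in-a-million chance, per vehicle, per day
expected_events_per_day = fleet_size * per_vehicle_daily_rate
print(expected_events_per_day)      # 100.0 -> "that's a hundred times a day"

# Aviation target: roughly one catastrophic failure per billion flight hours.
# If an aircraft fleet only ever accumulates about a billion flight hours,
# the expected number of such failures over the fleet's entire history is about one.
target_rate_per_hour = 1e-9
fleet_lifetime_hours = 1e9
expected_fleet_failures = target_rate_per_hour * fleet_lifetime_hours
print(expected_fleet_failures)      # 1.0
```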

Bryan Salesky
So how did they get there? I read the Wright Brothers book recently that-

Alex Roy
Yeah, that one.

Bryan Salesky
David McCullough, I read the Wright Brothers book recently and it’s an insane biography, I highly recommend the book, but what it goes through is the trials and tribulations that these guys went through and the dedication they had over many years. And in fact they had to do a number of demonstrations just to prove to people this was even a thing…that it actually worked. But what was remarkable, the way the book tells it anyway, was it did not take a long time for it to catch on, and more and more people were trying it out and acceptance actually occurred, at least as I read the book, it seemed like fairly quickly…decades…but still quickly. How did we get to…what did you say? One failure per billion hours with airplanes?

Phil Koopman
It was not a straight path. At the beginning, aircraft were pretty dangerous, right? But then you had World War I driving the technology. When you’re getting shot at, the level of risk is different than in peacetime. But then they stalled out. They were actually pretty darn dangerous even into the 50s. And what they did was…over time there were some events, and I’m not versed in the detailed history, but there were some events that basically led to the creation of a federal agency to oversee this stuff, to improve safety. And so that happened…and their first strategy was to say, “Well, this airplane crashed, why did it crash? Let’s fix it.”

Then an interesting thing happened after a while…they stalled out, so to speak. They hit a plateau, and they realized they actually needed a cultural change. Instead of waiting for every single thing to go wrong and fixing it, they had to get in there and say, “Wait a minute, let’s be a little more proactive, let’s look at the engineering. Just because it hasn’t happened doesn’t mean it won’t happen. And because it hasn’t happened doesn’t mean you can’t figure out that it’s going to happen and get ahead of the curve.” And so they did that, and all of a sudden the fatality rate, the crash rate, went down…and that’s how we got to where we are today. Not just by being responsive, but by applying engineering rigor, and there’s a standard for how they apply it…to make sure they thought of everything they could possibly think of.

Bryan Salesky
And the space program fed into this as well, if you read a lot of early safety literature, there’s quite a bit written coming out of NASA and its constituents about different methods to be used to verify and understand safety.

Phil Koopman
That’s right. There was a thing called Fault Tree Analysis. Before that, the approach was, well, we can look at all these hundreds or thousands of parts and figure out how each one can fail, and that’s called Failure Mode Effect Analysis. That’s the classic thing people use-

Alex Roy
Sorry, Failure Mode Effect Analysis?

Phil Koopman
FMEA, that’s one of the things car companies use. But there’s another one for safety.

Alex Roy
Just so we are clear, Dr. Koopman, I am representing a lot of folks listening to this who want to know what safety is and are not familiar with all these terms, so take it slow.

Phil Koopman
Fair enough. So we can do a tutorial later. Okay. So FMEA is, you list all the components and you say, this resistor can fail this way, that resistor can fail another way. You say, here are all the pieces, here’s how they can break, and here’s what happens. And so the old-style safety did that from World War II. And what they found was that these new systems in cars are so complicated that no one can wrap their brains around it. And you worry about, well, what if these three things fail together, will that happen? Well, if they’re right next to each other, it turns out it will.

So they needed another technique, and the technique from the Minuteman Missile Project was called Fault Tree Analysis, FTA. So FMEA is bottom-up: here are all the things that can break, here’s what happens. And nobody’s brain can absorb all this, it’s too complicated. Fault Tree says, look, there’s a bunch of stuff, most of it doesn’t matter. What we really care about, in that case, was unintended launch of a nuclear missile against the Soviet Union. That’s a bad thing, right? So let’s start from there and say, what are all the ways that this could happen? So it’s a top-down analysis. That was in the early 1960s for Minuteman, but both of these techniques are the basic tools of the trade you see in safety these days. And it’s not just ‘is it going to work?’, it’s ‘here’s what we care about that’s bad, how do we make sure that never happens?’
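
For a sense of what a top-down fault tree looks like in practice, here is a minimal, hypothetical sketch in Python. The events, gate structure, and probabilities are invented for illustration and assume independent failures; real fault trees are far larger and far more carefully justified.

```python
# Hypothetical fault-tree sketch: work top-down from the bad outcome to the basic
# failures that could cause it. Probabilities are made up; events assumed independent.

def p_and(*probs):
    """All inputs must fail together (AND gate)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def p_or(*probs):
    """Any one input failing is enough (OR gate), assuming independence."""
    result = 1.0
    for p in probs:
        result *= (1.0 - p)
    return 1.0 - result

# Basic events (illustrative failure probabilities per mission)
sensor_fault   = 1e-4
software_fault = 1e-5
primary_brake  = 1e-5
backup_brake   = 1e-3

# Top event: "vehicle fails to stop when it must".
# It takes BOTH brake channels failing, OR a detection failure (sensor or software).
fails_to_detect = p_or(sensor_fault, software_fault)
fails_to_brake  = p_and(primary_brake, backup_brake)
top_event       = p_or(fails_to_detect, fails_to_brake)

print(f"P(fails to stop) = {top_event:.2e}")
```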

Bryan Salesky
Right. So, to be clear, there are tons of methods out there that can be used to analyze safety…and to understand and brainstorm all the ways in which things can go wrong. And a lot of it is a little bit of alphabet soup to anybody who doesn’t study the field, but it’s everything from-

Phil Koopman
I have to look them up sometimes myself.

Bryan Salesky
He has a little flip card in front of him. No, I’m just kidding. There’s Fault Tree Analysis, there’s the FMEA that he just talked about, there’s also hazard analysis, there are approaches to system safety, and there’s a whole gaggle of standards that apply these things and more. Really, all of this is leading up to helping you understand the safety of your system at the end of the day.

Phil Koopman
And so back to the FAA lesson. They learned that flying around waiting for bad stuff to happen only gets you so far, and then you have to go in and use what I call engineering rigor, lots of people do, just be a methodical engineer. Not just, “Yeah, it seems to work,” but, “No, we’re really sure what we’re doing here, we understand it, it’s going to work, and here are the reasons why.”

Bryan Salesky
Now let’s tie this back to where we started on ‘what is safety’, and this is why defining safety for whatever the system or application is, is so important.

Phil Koopman
That’s right. For fault trees to work you have to know what the bad thing is. You’d be astonished how often I’ll go into a company…because I’ve done hundreds of design reviews on everything from power supplies to chemical process stuff, all over the place, cars, all sorts of things. And you say, okay, your safety is great, but what does “safe” mean? And you get this blank stare back, because they intuitively know it, but it’s hard for them to actually articulate it.

Bryan Salesky
And this is the key thing that you taught many of us years ago when we were at Carnegie Mellon. It starts with ‘what is the definition of safety?’ You have to be as crisp as possible about that because everything leads from there.

Phil Koopman
And a general definition isn’t good enough. Usually it’s something like, safety means you don’t have loss events. What’s a loss event? Okay, now we’re getting into the terminology thing. “Bad things don’t happen,” okay? And most engineers aren’t taught that at all. Most engineers are taught how to make the good things happen.

Bryan Salesky
They’re so focused on the normal path of operations, what we used to call-

Phil Koopman
Just getting it to work.

Bryan Salesky
Yeah, we call it in the industry the happy path.

Phil Koopman
The happy path, right. And safety’s about the unhappy, or another way…if you have a system that 9,999 times in a row sees the kid in the crosswalk, okay? Yeah, that’s supposed to happen. That’s not what safety is about. Safety is about the one time you don’t.

Bryan Salesky
Right. Yep. So I don’t want to get too academic here, but one of the other things that you would talk about is, at least in the academic circles, there’s a difference between safety and dependability. And I think this is a helpful thing to explain, not because everyone cares about the academic nuances. But-

Alex Roy
I’d like to know.

Phil Koopman
Alex, you don’t.

Bryan Salesky
Let me set it up. I think there’s an important philosophical difference here, we’ll keep it high level, but can you explain a little bit of the difference between the two fields of study?

Phil Koopman
Sure. Well, they’re all interrelated. I go to the dependability conference, I go to the safety conference and I see some of the same people there. So they’re not that far apart.

Alex Roy
So they’re both dependable and go. I’m sorry…that was bad.

Phil Koopman
Well, dependability is just an umbrella term for does the right thing. So the dependability people will tell you safety’s underneath it, but it has a distinct flavor. So maybe I’ll go into that.

Bryan Salesky
Please.

Phil Koopman
So for dependability, some of their properties, and I will not give you the … I’ll leave that to our lecture slides on my laptop. I’m not pointing them out, I promise.

Bryan Salesky
I believe I’ve gotten that lecture.

Phil Koopman
So Bryan has suffered through this one. Yeah.

Alex Roy
If you can skip it today. Thanks.

Phil Koopman
Okay. But for example, availability is uptime: my computing center has such-and-such 99.999% availability. And that’s just uptime, and nobody dies when it goes down. But you really want your telephone to work when you make a call. Another one is reliability. Reliability is, out of so many missions, think of an airplane flight, how many times does the airplane land without crashing? So you want a lot of nines, we call it a lot of nines: 99.9999999, there are nine 9s on airplanes, per hour. On reliability it’s ‘did you complete the mission?’ Safety is different, it’s related but different…because for safety, if you have a car that sits in your garage and you never get in it and it never starts and never goes anywhere, it’s perfectly safe. Now, if the reason it never goes anywhere is because the engine’s missing, it wasn’t very reliable, it wasn’t even very available, but the safety is great. And so there’s this issue that safety is ‘bad things don’t happen’, whereas availability and reliability are about ‘do the good things happen?’

Bryan Salesky
And so dependability is really the intersection of all of these things, right? We do have a mission we need to run, and it needs to be safe and it needs to be available. We’re trying to find a way to engineer the system to accomplish all of these objectives, right?

Phil Koopman
Yeah. So if you have a robotaxi, fleet availability is the car is not in the shop. And reliability is, someone gets in, do you actually get to your destination? And safety is ‘no crashes.’
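
To make those three ideas concrete, here is a minimal sketch of how the three metrics might be computed for a hypothetical robotaxi fleet. All of the counts below are made up for illustration.

```python
# Availability, reliability, and safety for a hypothetical robotaxi fleet.
# Every number here is invented purely for illustration.

fleet_hours_scheduled = 10_000.0   # hours the fleet was supposed to be in service
fleet_hours_in_shop   = 250.0      # hours vehicles were down for maintenance
trips_started         = 8_000
trips_completed       = 7_960
crashes               = 0

availability = (fleet_hours_scheduled - fleet_hours_in_shop) / fleet_hours_scheduled
reliability  = trips_completed / trips_started
crash_rate   = crashes / (fleet_hours_scheduled - fleet_hours_in_shop)

print(f"availability = {availability:.3f}  (the car is not in the shop)")
print(f"reliability  = {reliability:.3f}  (you actually get to your destination)")
print(f"crash rate   = {crash_rate:.2e} per operating hour (safety: bad things don't happen)")
```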

Bryan Salesky
Perfect. Now where I’m going with all this is that-

Alex Roy
Humans are only good at one of those three.

Bryan Salesky
Well, I think it depends on the human, but our experience certainly shows that a typical human driver is not incredible at what they do just from our observations in the field. But that’s a whole other-

Alex Roy
That’s a whole other point, but I would say humans are really good, it’s just not as good as you want them to be.

Bryan Salesky
Exactly.

Alex Roy
Maybe that’s a different topic. We are just jumping in here.

Bryan Salesky
No, it’s good, but here’s where I’m going with all of this though. So at the end of the day, safety does not mean perfect because, as you said, perfect would mean that let’s just not turn the thing on and go anywhere.

Phil Koopman
Everything breaks. There is no such thing as perfect safety, that’s absolutely true.

Bryan Salesky
So when we talk about safety, especially in these critical fields, whether it be self-driving, or flight, or your water heater…Phil’s got great water heater stories. At the end of the day, what it’s about is being able to define what safe means, apply these methods to understand the system as best as you can, and to think about not just the happy path, but also the path where things might break, and to mitigate those things so that they don’t happen, or at the very least — if they do — make sure that we go to some safe condition to prevent the bad thing from happening at the end of the day. Can we talk a little bit about now that pathway?

Phil Koopman
Yeah, sure. There’s a reason that commercial aircraft have two jet engines. Sometimes one of them doesn’t work.

Alex Roy
They used to be four.

Phil Koopman
It used to be four, and unless you had four engines, you weren’t allowed to fly over the ocean. Well, you were, there’s a thing called extended twin engine operations, ETOPS. I used to work with Pratt and Whitney. Bryan’s heard so many of these stories. I think he’s scared to fly with me…or something.

Alex Roy
I remember people would freak out if one of the four went down. “Oh no, we’ll only have three.”

Bryan Salesky
Phil and I were getting on the same plane once. I saw him from a distance at the gate and I said, “Oh no, maybe I need to take another plane.” I don’t know.

Phil Koopman
No, it’s even better. He put his head in his hands, his life flashed before his eyes, and I sat down behind him and they closed the door. We’re going to tell the full story, Bryan.

Bryan Salesky
Sure, why not, no…that’s fine, go ahead.

Phil Koopman
He’s nervous, and he’s like, “I’m a grownup. I’m not superstitious. It’ll be fine.” Because he’s heard all these stories…like the time the airplane I was on caught fire. But that was the last time…

Alex Roy
Are you prone to flying around and…

Phil Koopman
Let me get through this story, then we can. Yeah. All right. So I’m sitting next to him and everything’s fine. They close the door…and nothing happens. The pilot gets on the horn and says, “Ladies and gentlemen, our airplane’s broken.” He just gives me this look like, “Freakin’ Koopman broke the airplane.”

Bryan Salesky
Yeah. Yes. Also one of the few people I know. In fact, I think the only person I know who has lost all four brakes on his vehicle. Let’s end with that story. Shall we?

Phil Koopman
We’ll come back to that one. That is a true story. We’ll come back to it.

Bryan Salesky
So back on track. We were talking about the methods that we apply to safety, which help us understand the ways in which things can go wrong. And then we start to think about, all right, what are the mitigation factors? How do we proactively prevent the bad thing from happening?

Phil Koopman
So let me go back to the airplane incident. You have two airplane engines…because one goes bad. It’s about two times ten to the minus five per hour, so about every 50,000 hours.

Alex Roy
Do you always talk like that?

Phil Koopman
Yeah, I’m an engineer. Sorry. You know, about every 50,000 hours with a Pratt and Whitney, it will just stop working. Usually they can be restarted in flight, but they don’t take credit for that. And then you have three hours to get to the nearest airport, because if you do the math, you’re still at one per billion hours. That second engine is very unlikely to fail in that window, and so when you go on a two-engine aircraft of that era, it’s ETOPS, extended twin engine operations, 180 minutes. And so you have to stay within three hours of the nearest diversion airport, and that gets you almost everywhere except parts of the Indian Ocean. But it used to be ETOPS 120, ETOPS 60, and you used to have to refuel in Anchorage and fly over land. It’s gotten better.
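
Here is a back-of-the-envelope version of that twin-engine math in Python. The failure rate and diversion window are the round numbers Phil mentions, used purely for illustration, not certified reliability figures.

```python
# Back-of-the-envelope ETOPS arithmetic with illustrative numbers.

engine_failure_rate = 2e-5      # per engine, per flight hour (~once per 50,000 hours)
diversion_time      = 3.0       # hours allowed to reach the nearest airport (ETOPS 180)

# After the first engine quits, the flight is only exposed for the diversion window.
p_second_engine_fails_during_diversion = engine_failure_rate * diversion_time

# Rate of losing both engines, per flight hour of twin-engine operation:
dual_failure_rate = engine_failure_rate * p_second_engine_fails_during_diversion
print(f"dual-engine failure rate = {dual_failure_rate:.1e} per flight hour")
# ~1.2e-09, i.e. in the neighborhood of the one-per-billion-hour target
```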

Bryan Salesky
The pilots, they would bake all of this into their flight plan. Like every step of the way…they’ve got a fallback?

Phil Koopman
And they’d fly further just to make sure they stayed near land. And I think if you go to the Galapagos, that’s one of those cases where it’s a four-engine aircraft, and there’s a reason: it’s more than three hours out. It’s one of those remote islands, right?

Bryan Salesky
It’s that far out.

Phil Koopman
So the aviation guys have got all this figured out, and now in a car you only have the one engine. Yeah, I had an old car where you could use the starter to move it a little if you wanted to. They let you do that back then. But you really only have the one engine. But you know, if a car goes bad, you can pull to the side of the road. If an airplane goes bad, there you are. What do they say? If the car goes bad, there you are, and if an airplane goes bad, there you aren’t.

Alex Roy
I mean, the thing about aviation is that the average person is never going to get a pilot’s license.

Phil Koopman
Yeah, fair enough.

Alex Roy
And in fact, even if they had one, they don’t have access to aircraft to fly themselves to Europe.

Phil Koopman
Unless you live in Alaska, where it’s more common than a driver’s license.

Alex Roy
You still don’t have access to a plane that’ll take you to Europe. You don’t own it, and so we put faith in these things. We trust them for a myriad of reasons, but among them is, we have to, if we want to go there.

Phil Koopman
Yes.

Alex Roy
So the average person, and this is the insanity of talking about safety…what people believe safe is. The average person trusts strangers all the time who drive them, and who drive around them, and thinks that they’re safe, which makes absolutely no sense…I think I’m a pretty safe driver, although that’s declining every time I come to the office and tell people what I did that morning. But I think I’m pretty safe, and yet I know that others around me who are less safe put me at risk despite my skills. The irony of AVs is that no matter how well we design them — and I have great faith, I wouldn’t be here if I didn’t think so — we are still subject to the vagaries of external error. Third parties, fourth parties. And that is something that I think machines can do better than people. Machines will mitigate that risk.

Phil Koopman
So here’s the catch. The catch is, by the same token, the world is a really bizarre place, and people are pretty good at dealing with it, more or less. But they’re only pretty good…and there are just so many bizarre things that happen that you want your AV, your self-driving car, to be able to deal with. If you’re using machine learning, there’s a problem I’ll get into in a second. But basically, if you haven’t designed it to deal with something, these machines don’t have what the AI people call general intelligence. It’s specialized, and so if it sees something it’s never seen before, you know that’s a problem. They’re bad at not knowing. They don’t know what they don’t know. But let me circle back. Machine learning is a technology used in a lot of these systems. And the idea of machine learning is, you show it a bunch of examples and it figures it out, because the designer doesn’t know how to say what a person looks like.

They worked on that for decades and didn’t get there. But machine learning is like, well, I’m not going to tell you what a person looks like. I’m going to show you a bunch of people in pictures, tell you which ones are people and you’ll figure it out. And that’s great. Okay. But it’s only stuff it has seen before. So when it hits something it hasn’t seen before, that can be an issue.

Bryan Salesky
Very good. So let’s pivot toward…was there anything more you wanted to add on that one?

Alex Roy
Well, the funny thing is, even despite what you say about machine learning maybe not being good at knowing what it doesn’t know…

Phil Koopman
I hope you’re working on it because this…

Alex Roy
Bryan’s working on it.

Phil Koopman
Bryan’s working on it.

Alex Roy
But this is traditionally the issue, right? So my perspective of that issue is that the average person, and I would put myself in this list…maybe I’m above average as a driver.

Phil Koopman
I’ll drive. You do know that more than half of drivers are above average, right?

Alex Roy
Of course. Even though I think I know what my judgment is, I don’t have faith in my own ability, my muscle memory, to execute on choices. But I have total faith in a machine to execute, and its choices are better than most people’s.

Bryan Salesky
Well, what Phil’s saying though is the bulk of the task that’s before us is to actually deal with these surprise events, right?

Phil Koopman
That’s getting to the third safety bin. That’s the worst stuff.

Bryan Salesky
That’s right. And that’s absolutely true. That’s where we spend the bulk of our effort at Argo…and I presume most other self-driving companies, this is what they’re dealing with. We sometimes call them long-tail issues, but that tail’s not that long. When you start to scale things out, like we said before, the infrequent all of a sudden becomes frequent. And when deploying machine learning or other similar types of technologies, at the end of the day we do best when we’ve thought about it, we’ve tested it, we’ve characterized the system to make sure it can handle these events, and so on. And then for everything that’s left, there needs to be some thought around the design of how you handle it.

Phil Koopman
And it may be that simple. I mean, people have a way of handling this. They say, what the heck is that? I’m going to slow down until I figure that out.

Alex Roy
My mother just starts screaming and says, “but I’m still better than you!” which makes me never want to drive with her ever.

Bryan Salesky
Well, that’s right. So this is where we start to get to the reactions when these infrequent things happen. Regardless of the frequency, if the autonomous vehicle is not sure how to handle a particular scenario, we need to build enough intelligence into it that it’s aware of the fact that it can’t handle it, and then have it take some action. So what are our options?

Alex Roy
One of them is to just be really clear about what you know and admit the possibility that you don’t know.

Bryan Salesky
Even acknowledging that though is really important.

Phil Koopman
I’m going to tell you about one of your cars. It’s a good story. Okay. I’m following this car with my dash cam. This is, oh yeah…

Bryan Salesky
So you have recorded evidence.

Phil Koopman
I was doing a drive with Nova. There’s an episode on next week, and this was during the filming of that. I’m driving around behind one of your cars and it jams on the brake lights about a hundred feet before a stop sign. Fortunately I wasn’t tailgating, so we were okay, and I’m like, ‘why did it do that?’ And then from around the corner that was not in my sight line comes the most bizarre piece of construction equipment I’ve ever seen. I’m like, “What the heck is that?” And the good news is the car said ‘what the heck is that?’ too. It had jumped out; it was some guy working an earth compactor. I’d never seen one before. I’ve never seen one again.

Bryan Salesky
I assume this was in the Strip District.

Phil Koopman
In the Strip District. Yeah.

Bryan Salesky
So for those who don’t know…pretty much the entire Strip District in the City of Pittsburgh is under an immense amount of construction, and there are a lot of surprise events that pop out from the sidewalk every day. We see it all the time. But so what are our options? Right? The options are: you stop, or maybe it’s able to pull over if some failure has occurred and it doesn’t know how to continue safe operation. But these are the things that we have to think about as we start to brainstorm all the things that can go wrong. When I talked about mitigations, what I’m saying is we have to think about, what is the safe action that it can take?

Phil Koopman
Well, the first thing is you have to realize you don’t know what’s going on, right? And it’s a lot of work to do that. And the next thing is, you may not stop. You may just slow down and say, as long as I’m really sure there’s nothing going on within twice my stopping distance, it’s okay to move. Right? And so there are measured responses. It doesn’t have to be ‘pull the emergency stop’ every single time.
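
Here is a toy sketch of that ‘measured response’ idea in Python. The reaction time, deceleration, and twice-stopping-distance margin are invented parameters for illustration, not anyone’s production logic.

```python
# Toy "measured response": keep moving only while you are sure the clear distance
# ahead is at least twice your stopping distance. Parameters are illustrative.

def stopping_distance_m(speed_mps: float, reaction_s: float = 0.5, decel_mps2: float = 4.0) -> float:
    """Distance covered while reacting plus distance needed to brake to a stop."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2.0 * decel_mps2)

def choose_response(speed_mps: float, clear_distance_m: float) -> str:
    margin = 2.0 * stopping_distance_m(speed_mps)
    if clear_distance_m >= margin:
        return "continue"      # plenty of room: keep moving at this speed
    elif clear_distance_m >= stopping_distance_m(speed_mps):
        return "slow down"     # reduce speed until the margin holds again
    else:
        return "stop"          # not enough room: bring the vehicle to a stop

print(choose_response(speed_mps=10.0, clear_distance_m=80.0))  # continue
print(choose_response(speed_mps=10.0, clear_distance_m=30.0))  # slow down
print(choose_response(speed_mps=10.0, clear_distance_m=12.0))  # stop
```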

Bryan Salesky
We’ve talked in very broad strokes about what safe means, and what some of the methods are as to how to understand safety and so on. There are actually a lot of standards out there that help to take what we’ve just been talking about and define it in a very principled manner. What are some of the standards that apply today to automated vehicles? Not necessarily full self-driving, but just automated vehicles in general.

Phil Koopman
Sure, and there’s going to be a bunch of numbers streaming out, and the disturbing thing is I actually know these numbers, right?

Alex Roy
You better back that up, Holmes.

Phil Koopman
Be careful what you wish for.

Bryan Salesky
Don’t tempt him.

Phil Koopman
So the basic one is ISO 26262. Whoever named that standard had a sense of humor, because there is no 26261. They just decided they were going to have fun with that number.

Bryan Salesky
I was always wondering how they come up with these numbers.

Phil Koopman
As far as I know…

Alex Roy
For the civilians, can you explain what that is?

Phil Koopman
Yeah. So ISO, International Standards Organization, 26262 has to do with the computers inside conventional vehicles, and whether or not, if something goes wrong, they do a safe thing. Usually…like a shutdown. So for your engine control or your steering or your brakes, those all have computers. Oh, by the way, on a hybrid car, when you press the brake pedal, most of the time it’s just a suggestion to the computer. If you follow this standard, it’s supposed to make sure that that always works, or if it doesn’t work, it knows it’s not working and does something safe, like go to a hydraulic backup for your brakes. So that’s a baseline standard.

Bryan Salesky
I can think of several cars that don’t conform to that, right now.

Phil Koopman
There are a number of car companies which do not conform to that standard. There’s a couple who say they do. There you are. Okay. That standard has been through a couple of revisions already.

Alex Roy
If only humans could conform to that standard, we wouldn’t be in business.

Phil Koopman
What else do we got? There’s another one called SOTIF, Safety Of The Intended Functionality. S-O-T-I-F, and that’s ISO 21448.

Bryan Salesky
That’s brand new.

Phil Koopman
That’s about a year old and it’s not actually for full self driving cars.

Alex Roy
We mean for a level four.

Phil Koopman
Oh don’t get me started on the levels.

Bryan Salesky
We’ll come back to that.

Alex Roy
I just want to make a note. I just want to be clear because you’re a man of specificity. Yeah. And I like to know what I’m listening to.

Bryan Salesky
Did you always interrupt the professor in class?

Alex Roy
No, but we’re here to learn and he’s talking above some folks including myself from time to time.

Bryan Salesky
21448 SOTIF, go on.

Phil Koopman
Okay. So the SOTIF standard is about ADAS, Advanced Driver Assistance Systems. So when you see these fancy cool things that try to keep you in your lane, or smart cruise control that uses radar to keep distance from the car in front of you, those are ADAS systems, and this standard has a way of doing that. And I should circle back, because the difference between these two is really important. The first one, 26262, is ‘here’s an engineering methodology.’

Methodology means a set of methods. It doesn’t mean much, but it’s an engineering methodology. Okay? It’s a set of steps you follow. They call it the V because, when you draw it in a picture, it’s a letter V. You have to do the requirements, then you have to do the high-level design and detailed design. You have to write the code and you have to test, test, test. And it says do all these things, generate all these pieces of paper, all that stuff. There’s a list of hazards, the bad things that can go wrong, and you show that you’ve mitigated all the hazards, so that anything that goes wrong, you’ll handle. And it says, “Here are the steps you follow. When you’re done, we presume you’ve done a good engineering job, and the more critical the failure, the more engineering you have to throw at it.”

And all the traditional safety standards feel like this. Define how bad the problem is, define how often it’s going to happen, and then you throw budget or engineering at it to make sure that if it does happen, it’ll be mitigated. Sometimes you have to put in two or three things, so if one fails, the other one takes over, and so on. So that’s 26262. The problem with that is it’s all about internal failures. And so there may be something your system wasn’t designed to do. So if your system doesn’t know what kangaroos are, okay, and it’s supposed to estimate distance, it turns out kangaroos don’t put their feet on the ground all the time…and this was actually a problem for one company. They call them functional insufficiencies. It’s a fancy name. It just means stuff you didn’t think of.

The SOTIF standard, 21448, is about trying to go out and find those, and I’m going to have some fun with it. I have a lot of respect for these folks, but I’m going to play this one for laughs, right? You remember when Pokemon Go came out? My daughter started going outside for walks and we couldn’t explain why. We figured out that Pokemon Go had come out. She was going around trying to collect them all, right? So the SOTIF standard is a game of Pokemon Go. You go around and you say, “Oh, it didn’t work for this.” You try and collect them all until you think you have them all.

Bryan Salesky
But there are unintended consequences, Phil.

Phil Koopman
Yeah.

Alex Roy
Like what you said earlier, describing lane keeping as “it tries to keep you in the lane.” It would be great if that was a product description when people sell cars.

Phil Koopman
I know the terminology thing.

Alex Roy
’What does this do?’ ‘It tries to keep you in the lane’ That would be a good one…

Phil Koopman
That’s just my description. Yep. It’s like a game of Pokemon Go. You collect them all, and then when you think you have them all, you’re done. And for something like driver assistance, this actually probably works. For fully automated, that’s the heavy tail, the unknown unknowns, the rare cases. But let me explain why it works, because this segues into the more recent things. The reason it probably works is, if you have automatic emergency braking…for that to invoke, the driver had to have made a mistake, or something unexpected happened. So the starting point is the driver was supposed to brake and didn’t, so we’re going to help. That’s AEB.

Alex Roy
Although, commercial AEB does have ghost braking.

Phil Koopman
Oh yeah. But no, I’m going to get there. Okay. And so what they do is they say it’s the driver’s responsibility to be safe, and AEB is going to help you out. Now, when you build those systems, and Alex, you talked about ghost braking, there’s a thing called false positives and false negatives. A false positive is you saw something, but nothing was there. A false negative is something’s there and it misses it. And a ghost…

Alex Roy
It’s much better to have a false positive than a false negative.

Phil Koopman
Well it depends, and that’s the point.

Bryan Salesky
It depends on what you’re optimizing.

Phil Koopman
That’s what you’re optimizing. So if you’re AEB and you’re on a highway, a false positive gets you run over by the big rig behind you. Right? And a false negative, well, that was the driver’s fault. The driver was supposed to see it. So for an AEB-type system, and I’ll make up some numbers, right, but let’s say you tune it so you only get one out of 1,000, one out of 10,000 as a ghost, you basically want no ghosts at all. But there’s a trade-off: every time you get rid of a ghost, sometimes it doesn’t see something really there that it detected weakly.

Bryan Salesky
It’s so nice to discuss false positives and negatives in the context of a public good.

Alex Roy
But that’s the thing: to understand what is acceptable, you have to really look at what it was intended to do. If it’s just ringing a bell in the cabin, you can deal with so many false positives. On the other hand, if you’re doing a really hard brake at many Gs, and you’re on a highway, and there’s somebody behind you, maybe that was not the best decision.

Bryan Salesky
So let me wrap that up. Here’s the other question about this: AEB, when it was first released, performed at X level and it improved safety overall, but AEB today is obviously performing at X plus 50%, better than 20 years ago, and safety has improved. So if your starting point is better than not having the technology at all, surely we’re on the path to a better world.

Phil Koopman
Let me finish up the false positives and false negatives, because it relates to exactly what you’re saying. You don’t want false positives, because if you get any, it’s over, right? So with false negatives you basically say, ‘I want no false positives. I’m willing to accept false negatives.’ Nine times out of 10, and I’m making up a number here, it stops, and one time out of 10 it doesn’t, because you don’t want any ghosts…you’ve said anything that’s close to a ghost, now that’s a ghost, we’re going to ignore it even though it’s a real thing. But it’s okay, because the way you look at it is, nine times out of 10 you save the driver and the occupants. And the other one out of 10, well, it was the driver’s fault.
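
The tuning trade-off Phil describes can be shown with a toy detector threshold in Python. The detection scores and labels below are fabricated for illustration only.

```python
# Toy illustration of the tuning trade-off: raising the detection threshold removes
# "ghosts" (false positives) but also throws away weak detections of real objects
# (false negatives). Scores and ground-truth labels are made up.

detections = [
    (0.95, True), (0.90, True), (0.70, True), (0.55, True), (0.35, True),   # real objects
    (0.60, False), (0.30, False), (0.20, False),                            # ghosts
]

def count_errors(threshold: float):
    false_positives = sum(1 for score, is_real in detections if score >= threshold and not is_real)
    false_negatives = sum(1 for score, is_real in detections if score < threshold and is_real)
    return false_positives, false_negatives

for threshold in (0.25, 0.50, 0.65):
    fp, fn = count_errors(threshold)
    print(f"threshold {threshold:.2f}: false positives = {fp}, false negatives = {fn}")

# A driver-assistance tuning pushes the threshold up (no ghosts, some misses);
# a fully self-driving system cannot afford much of either kind of error.
```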

Bryan Salesky
The bottom line is for it to trigger you’re already in a state in which the human failed to perform some action…and so now the safety net doesn’t need to be perfect because look, if it fires, then, “Hey, it did a great job.” It shouldn’t have gotten in that state to begin with.

Phil Koopman
If it only fires half the time, you just save half the people.

Bryan Salesky
It’s still added value, right?

Phil Koopman
Now here’s, here’s where this turns into a problem if you’re in a fully self driving car, right? Now…the driver is asleep in the back.

Alex Roy
Level four.

Phil Koopman
A fully self driving car where, where the driver is not in the car, or asleep in the back.

Alex Roy
That’s not real.

Phil Koopman
It doesn’t matter. Okay. All right, and you get a false positive. Well, it’s still an issue, right? But maybe it’s smart enough to know there’s no big rig, and you tune it. Right? But if you get a false negative and the occupants get killed, it’s not the driver’s fault anymore. It’s the technology’s fault, and so you expect these to be tuned differently…all of a sudden the cost of missing something, which used to be no big deal because you were busy saving lives, changes. Now, instead of saving lives, you’re killing people…hypothetically…and this is an issue. The technology has to be used in a very different way if the car’s in charge, because there’s no human to say, “Well, a human should have caught that.” That’s where this—

Alex Roy
This is exactly the heart of why driver assistance is a fundamentally different development problem than full self-driving.

Phil Koopman
So when you see data from driver assistance, I’ve heard people say, “Well…driver assistance has its problems. What does that mean for self-driving?” The answer is…nothing.

Alex Roy
Nothing, nothing.

Phil Koopman
Because it’s tuned completely differently.

Bryan Salesky
Yeah. Full self-driving has this requirement that all of its detection systems need to be, and this is industry jargon, high precision and high recall. Oftentimes you can give on one of those dimensions, and this gets to the false positive/false negative rate.

Alex Roy
Can you explain what high recall means?

Bryan Salesky
I don’t want to go into academics.

Phil Koopman
Is that total recall, or is that the movie?

Bryan Salesky
Yeah, it’s the movie.

Alex Roy
Throw me a bone Bryan.

Bryan Salesky
For the purpose of this conversation, we’ll talk about it along the two axes of false positives and false negatives. Which rate are you okay with? What we’re saying is, in the driver assistance world, they’re optimizing to reduce the number of false positives, so that it’s absolutely clear, before it fires and takes some safe action, that the human has made a mistake. But with self-driving, neither is acceptable.
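
Since the jargon never gets unpacked on air, here is the arithmetic behind “precision” and “recall” in a short Python sketch, with made-up counts: precision suffers when there are ghosts (false positives), recall suffers when real things are missed (false negatives).

```python
# Precision and recall, stated as arithmetic. Counts are invented for illustration.

true_positives  = 98   # real objects the system detected
false_positives = 1    # "ghosts": detections with nothing really there
false_negatives = 2    # real objects the system missed

precision = true_positives / (true_positives + false_positives)  # few ghosts -> high precision
recall    = true_positives / (true_positives + false_negatives)  # few misses -> high recall

print(f"precision = {precision:.3f}")   # driver assistance prioritizes this
print(f"recall    = {recall:.3f}")      # full self-driving needs this to be high as well
```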

Phil Koopman
Okay. So now Alex, I’m going to circle back about the levels. There are three levels that matter.

Alex Roy
Don’t talk down to me Dr. Koopman. No one does that to me.

Phil Koopman
He’s smiling at me. It’s fine.

Bryan Salesky
We have 15 more standards we need to get through. You’re like the noisy kid in class; we can’t even get to the…

Alex Roy
I’m representing the civilians. So we want to get into these things when they work. Understood.

Phil Koopman
So let me give you the three levels that matter because the one through five are confusing for everyone. Right? Especially level three; every time I talk to someone they have a different definition. So I’m not going there. There are three levels I care about. All right. First one, it’s the human’s fault, right?

Alex Roy
Agree.

Phil Koopman
Second level, it’s always the car’s fault. Period. Okay.

Alex Roy
Yes.

Phil Koopman
The middle one is, it’s the car’s fault, but we plan to blame the human.

Alex Roy
Repeat that again.

Phil Koopman
It’s the car’s fault, but we’re going to blame the human anyway. Let me say what I mean by that because it’s actually maybe deeper than it sounds. The issue is that it’s, you can’t expect humans to be perfectly vigilant for hours and hours and hours. Humans just aren’t wired that way and so if you make an unreasonable ask, if you expect humans to be superhuman, you’re setting yourself up for problems.

Bryan Salesky
And this is why with our safety drivers, they have mandatory breaks. They’re incredibly trained and skilled. We give them various monitoring systems that aid them. I mean we augment their capabilities in significant ways because that is their job in fact to test level four cars. But if you take a completely untrained person without all of those augmentations and without the training, it’s not good.

Phil Koopman
Bryan, what you’re saying is exactly the answer that makes me happy. When people say, “We’re going to just take people off the street, put them in.” After about 15 minutes, it gets tough.

Bryan Salesky
You can’t do that.

Phil Koopman
You just can’t do it.

Alex Roy
Or in my case, about 90 seconds. Yeah.

Bryan Salesky
Are you saying you have a bit of an attention deficit disorder?

Alex Roy
Well, that’s another different episode. Please go on, doctor.

Bryan Salesky
All right, so we just talked about two standards. We talked about 26262 we talked about SOTIF, and there’s another standard that you’re instrumental in helping to develop. Do you want to talk about that?

Phil Koopman
I don’t want to leave one out: there’s the SaFAD white paper, the Safety First for Automated Driving white paper, from Europe. Well, where do we start?

Alex Roy
One I hadn’t heard of.

Phil Koopman
It’s actually pretty good reading. So, I’m working on a standard called UL 4600. UL as in Underwriters Laboratories, 4600…it’s now to the point where we have a voting committee and we’re getting hundreds and hundreds of comments. But I was really closely involved with the initial draft, and now I’m part of the team that’s trying to make it happen.

Bryan Salesky
So what does it cover? What’s it about?

Phil Koopman
It is about completely autonomous vehicles. So it assumes that there is no human that you’re asking to be responsible for safety during driving. It turns out humans do show up; for example, someone has to make sure that the maintenance gets done, right? Things like that. But there’s no human driver, and so that makes it distinct from the other standards. And it’s not a ‘here’s how to do some stuff’ standard. The other two standards are ‘follow this method and you’ll be okay.’ This one is, well, ‘you’re probably still going to use those other standards, and probably some other stuff too. But how do you know you’re done? How do you know you’ve got all the Pokemon?’ It’s called a safety case.

A safety case is an argument saying: like the fault trees I talked about, here’s what safe means, here’s why you should believe we’re safe, and here’s some evidence to prove that what we’re saying makes sense. And so it’s a long inventory of…if you want to do a safety case, if you want to argue about car safety, here are all the things you have to do.

And the social media tag is, “Did you think of that?” And a lot of it is, what we found…the medical industry found that different companies were all making the same mistakes. They all had the same blind spots, because people don’t think of things. So part of the standard is long lists of things of ‘did you think of that?’ And I have a couple of great co-authors. I have Uma Ferrell, who’s an FAA DER. What that means is she can sign off on safety on aircraft for the FAA. And Frank Fratrik, who did that for Army vehicles.

And I’ve done automotive and chemical processes, and a bunch of things on the design side of safety. And so we all put our heads together, and there is a long list of stuff in the standard we’ve actually seen, and there’s no reason you should learn that the hard way. So part of it is just a long list of ‘make sure you don’t make these mistakes’ that other people have made, and it isn’t even car people. Maybe it’s rocket people that have made the mistake, or airplane people.
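
To show the shape of what a safety case looks like, here is a minimal, hypothetical sketch in Python: a top-level claim, sub-claims that form the argument, and evidence backing each one. The claims and evidence names are invented; this is not text from UL 4600, only an illustration of the structure being described.

```python
# Hypothetical safety-case skeleton: a claim is supported either by direct evidence
# or by a set of sub-claims that are all themselves supported.

from dataclasses import dataclass, field

@dataclass
class Claim:
    statement: str
    evidence: list = field(default_factory=list)     # e.g. test reports, analyses, audits
    subclaims: list = field(default_factory=list)

    def supported(self) -> bool:
        """A claim holds if it has direct evidence or all of its sub-claims hold."""
        if self.evidence:
            return True
        return bool(self.subclaims) and all(c.supported() for c in self.subclaims)

safety_case = Claim(
    "The vehicle is acceptably safe in its operational domain",
    subclaims=[
        Claim("Hazards have been identified and mitigated",
              evidence=["hazard analysis report", "fault tree review"]),
        Claim("Perception handles known rare cases",
              evidence=["scenario test results"]),
        Claim("Field feedback loop adds newly discovered cases",
              evidence=["incident triage procedure"]),
    ],
)

print(safety_case.supported())  # True only if every branch is backed by something
```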

Bryan Salesky
And what it asks at the end of the day is that you show evidence as to how you did that through a safety case.

Phil Koopman
Right. It’s even less prescriptive than that. People call standards prescriptive. You have to follow these steps. It isn’t like that. It’s, “Hey, did you think of this?” And you might say, “Does not apply.” Okay, fine, sure.

Bryan Salesky
But it’s facilitating. It’s helping you get to that.

Phil Koopman
It’s a memory aid to say, did your engineers apply the effort, did they think of all the things they should think of, so they don’t have blind spots? Now, it’s not going to be perfect, but it gets you the first 99%, so you can’t make what I’ll call stupid mistakes, the ‘you should’ve thought of that, you didn’t, what were you thinking?’ kind. It sets the bar a lot higher than what we’ve seen in a lot of companies across many industries.

Bryan Salesky
And so you combine all of these standards, and there are more that we could go on about. There are tons of them, right? And we can draw lessons from standards that are meant for emergency stopping systems in the heavy equipment industry, and it really does go on and on. But a lot of these standards say the same thing, applied to a different industry. I think what’s interesting is a lot of people say, “Well, there’s no good standard for autonomous driving, or no good standard for fill-in-the-blank.” Look, at the end of the day, at least part of that sentence is correct: there is no one standard. You actually need to pull lessons and draw from the field of safety, and integrate that together into a safety program for whatever it is that you’re doing. And I mean, isn’t that the case for just about any industry?

Phil Koopman
Sure. And the point of 4600 is it gives you a bucket to put all that stuff in and make sure you haven’t left something out. That’s the point.

Bryan Salesky
And it helps organize it.

Phil Koopman
It helps you organize. It helps you explain it to someone else so they believe you, and it’s a way to say, “Hey, independent assessor, I have it together. Take a look, look for yourself.” And he has a checklist to go through and say, “Oh yeah, look you thought everything that we know of, and you’re looking for new stuff, and you’re going to add it as you get it. Okay, great.”

Bryan Salesky
So Phil, we have a sizable fleet testing on roads today, we’re in a bunch of cities. We get a lot of learning from that.

Phil Koopman
I see them all the time. When I drive to work, I go by three, four or five, six…many cars.

Alex Roy
You follow them sometimes too.

Bryan Salesky
So that’s really instrumental to any company’s ability to develop self-driving cars. You need this real-world data because it inspires everything from edge cases to all sorts of different scenarios that help us test and refine our algorithms. It’s an important part of how we develop the technology. But there’s a right way and a wrong way to do it. Maybe tell us a little bit about what you hope companies like Argo are doing in terms of their on-road testing.

Phil Koopman
Well, I think a big distinction that needs to be made is there’s data collection, there’s testing, and there’s deployment. Data collection you can do with a car that’s just driven by a human with some equipment on it…and that’s a car driven by a human. It’s the same as any other car, it just has some cameras and other stuff on it. So let’s set those aside, because a lot of the testing you’re doing is actually on test platforms. I don’t call them self-driving cars, because they’re test platforms. In fact there’s a great joke about this: how do you know it’s a self-driving car? There are two drivers in the front seat.

Bryan Salesky
I’ve seen that joke. Absolutely. Yeah. Someone was asking me, someone was asking me the other day he was like, “Is your system a level four system today?” I said, “Well, actually no, it’s a level two system in some ways.”

Phil Koopman
It’s a test platform.

Bryan Salesky
It’s very much a test platform. That’s fine. But I mean, to simplify all that, there are two stages, two huge stages: development and deployment, right? We are squarely in the development stage.

Phil Koopman
People need to keep that straight, because these are not self-driving cars. They’re test platforms with humans in charge of being safe, so I’m going to talk about those. Then there are the deployed things where, in principle, there’s no human, or only a human on remote. There’s no human there. They’re just driving around. That’s not where we are today, for the most part. Right?

Bryan Salesky
Correct.

Phil Koopman
Okay. So chances are, unless you live in a few blocks in Tempe, Arizona, you’re seeing a test vehicle, right? That’s okay. So people get hung up about, well, is the autonomy trustworthy? Well, that’s actually not the point for safety on these, because the point is you’re testing stuff that isn’t done yet. It’s not ready yet. That’s why you’re testing. That’s why you have two people in the front, to make sure it’s safe. So then people say, “Well, if the autonomy makes a mistake, should I be worried about that?”

Phil Koopman
It’s like, “No, no. That’s not the question you want to ask.” The question you want to ask is, is the safety driver paying attention? Is the safety driver going to be able to tell when something goes wrong? Is the safety driver going to be able to intervene effectively and bring the vehicle to a safe state? That’s what you want to talk about. So when you hear about a mishap, that there’s a crash or something, and people say, “Well, the autonomy made a mistake, therefore people should conclude it’s not safe,” it has nothing to do with it. The question to ask is not ‘why did it not see something?’ The question to ask is ‘why wasn’t the safety driver paying attention?’

Bryan Salesky
That’s right. Or did something else go wrong that prevented the safety driver from being able to do their job?

Phil Koopman
Or maybe the big red button is, I don’t know, not connected up to the, to the right place the way it should be. I know Bryan…we’ve never seen that.

Bryan Salesky
Yes, exactly. So that’s exposing a whole new layer of design and engineering that we actually don’t talk a lot about, which is the fact that we put a huge amount of engineering and rigor into the self-driving system, speaking very broadly. But what I mean is we put a lot of engineering into the takeover mechanisms, making sure that if the safety driver touches any of the control surfaces in the car…if they twist the steering wheel with a certain amount of force, if they press the brake, if they press the throttle…the vehicle will return control to the driver and allow them to go ahead and take whatever action they need to take. Those mechanisms are in and of themselves safety-critical, and we want the vehicle to return control safely, in a very, very small amount of time, so they can take whatever action they deem appropriate.
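
As a rough illustration of what such a takeover check might look like, here is a simplified Python sketch. The thresholds, signal names, and decision logic are invented for this example; it is not Argo’s implementation.

```python
# Simplified takeover logic: if the safety driver touches any control surface hard
# enough, or presses the big red button, autonomy disengages and control returns
# to the human. All thresholds below are invented for illustration.

STEER_TORQUE_NM    = 2.0    # wheel torque that counts as "the human wants control"
BRAKE_PEDAL_PCT    = 5.0    # any meaningful brake application
THROTTLE_PEDAL_PCT = 10.0   # any meaningful throttle application

def should_disengage(steer_torque_nm: float, brake_pct: float, throttle_pct: float,
                     big_red_button: bool) -> bool:
    """Return True if the vehicle must hand control back to the safety driver now."""
    return (
        big_red_button
        or abs(steer_torque_nm) >= STEER_TORQUE_NM
        or brake_pct >= BRAKE_PEDAL_PCT
        or throttle_pct >= THROTTLE_PEDAL_PCT
    )

# The check runs every control cycle; note the button works regardless of mode.
print(should_disengage(0.1, 0.0, 0.0, big_red_button=False))  # False: keep driving autonomously
print(should_disengage(3.5, 0.0, 0.0, big_red_button=False))  # True: driver twisted the wheel
print(should_disengage(0.0, 0.0, 0.0, big_red_button=True))   # True: big red button always wins
```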

Phil Koopman
That’s the complicated explanation. The simple explanation is there’s a big red button and when you slap it, you want it to work.

Bryan Salesky
That’s right.

Phil Koopman
And Bryan, I know from our previous life at Carnegie Mellon…

Alex Roy
It didn’t work one time, did it?

Phil Koopman
Well, no, no. Actually Bryan and I worked together on this, and our job was to make sure it worked.

Bryan Salesky
Yeah. I mean, there was a time in this silly race that we did, during some testing that was happening, where it actually worked exactly as intended…and that was the one time in which it didn’t work.

Phil Koopman
Which says, “Be careful of what you ask for?”

Bryan Salesky
Which is, it didn’t work. But here’s the deal and this is why it gets, so this is why the definition of safety is so important. Someone decided that if the vehicle was in a particular state, that button should not operate…and that was wrong.

Alex Roy
Can you be more specific as to what that state was?

Bryan Salesky
If the vehicle was not in autonomous mode, that the red button didn’t need to do anything.

Phil Koopman
If another person was driving, the red button didn’t do anything, which is what it thought. But that wasn’t what was happening.

Bryan Salesky
That’s exactly right. Yeah. That’s so…it’s a whole, anyway, there’s a whole…

Alex Roy
It’s funny though. In a second you have consequences—

Bryan Salesky
What’s funny is how complicated something so simple can become, but this is the very reason why one must follow these types of processes.

Phil Koopman
Yes. You would think that a red button is simple. And I can tell you, Bryan, from that project we did together, it is surprisingly complex.

Alex Roy
The title of this episode probably should be “Safety: Why the Red Button Isn’t Always That Simple.”

Bryan Salesky
So anyway…continue about the on-road test operations. You actually wrote a paper about this, about, look, we’re still in development, not in deployment yet, and short of deployment, here’s what I hope these companies do. That’s basically what the paper says. What are some other elements that are important?

Phil Koopman
Let me start with where people come from, because I just reviewed another draft standard on this topic, and I won’t go into the numbers. It concentrated on the technology and basically said, “Well, the people are trained, so it’ll be fine.” It’s like, “No, no, the people are the hard part.” Getting them to pay attention, and making sure you don’t paint them into a corner. The classic is, all right, hypothetical: you’re going to run off a cliff, and we’ll tell the driver it’s the driver’s problem when the front wheels are off the cliff. It’s too late, right?

Bryan Salesky
That’s right.

Phil Koopman
And then we’re going to blame the driver for not stopping. Well, that’s ridiculous. And yet some of what you see is very aggressive testing of these vehicles, and I go back to what I call expecting human drivers to be superhuman. They’re not superhuman. That’s just not it.

Alex Roy
I could vouch for that, right?

Phil Koopman
So you need to make sure that they’re paying attention, but you need to make sure they have enough time to react and notice. So you may see these cars being more conservative than human drivers because the job is harder. It isn’t just stop before you hit. It’s self-driving car, “Is this what it’s supposed to be doing?” “We’re getting awfully close.” And it’s like, “Oh, now something went wrong.” “Okay.” “Is my situational awareness okay?” “Do I know who’s around me?” “What should I be doing?” “All right, I’m going to hit this, the brake pedal.” And that’s going to take more time than just a normal driver. So you may see these cars being more timid because that’s what it takes to be safe.

Bryan Salesky
I think that’s right. And the cars don’t actually need to be timid all the time. And this is the key, right? Look, there are customers in these cars and there are people all around these cars that expect them to behave in a certain way. You could actually create more problems by creating this overly cautious super robotic vehicle that operates unexpectedly and that’s not good.

Phil Koopman
Yeah. This is why they keep getting rear-ended. And here’s my speculation: I’ve been rear-ended three times myself, so maybe I drive like these cars do. Every time it was at a stop sign where the guy behind me thought I should be going, and I’m like, “No, I’m not going.” That’s happened to me three times.

Bryan Salesky
In just about all of the cases that I’m familiar with, these incidents are because the autonomous vehicle detected something that the humans around it didn’t. And this happens all the time, right?

Alex Roy
Like that thing coming out of the alley.

Bryan Salesky
That’s right.

Phil Koopman
Yeah, but it had to have seen one of those.

Bryan Salesky
But that was actually a surprise type of event. You wouldn’t necessarily be expecting that to happen. What we see, though, is there’s a pedestrian walking through a crosswalk and we will get honked at aggressively. Why are you stopped there? They just assume we’re stopped for no reason, and frequently it’s because they couldn’t see that person, whether because of an occlusion or because they weren’t paying attention. This happens frequently, and I guess the thing I want everyone to understand is that these vehicles are detecting and tracking hundreds, even thousands of things at the same time. It sees way more than you do all around that intersection, and it’s making decisions about where those things might go, taking the more conservative path to make sure we don’t collide with them. Humans are very good at picking out, from all the noise in a particular intersection, the two or three things they think are most relevant.

Phil Koopman
They’re almost always right and humans are remarkably good at that.

Bryan Salesky
But the “almost” part is the rub, right? When they don’t get it right, it’s a problem. And this is where the autonomous vehicle could really help.

Alex Roy
It’s funny. The first few times I got a ride in an AV and it hit the brakes, I couldn’t see why, and I got really mad. I was a journalist, and I’m like, “This thing sucks.” And then on the second ride I saw what it had been stopping for, and I realized something. It made me reset my expectations, because I know that I judge other motorists for stopping at things I can’t see, because I assume they’re wrong. I’m right and they’re wrong, and you have to invert that.

Phil Koopman
That’s a really human thing to do.

Alex Roy
You have to invert that trust. Like, I trust that I know better than other drivers even though I don’t know what they know. That inverts with AVs, because once it’s been demonstrated to you that they do see more than humans do, then you start having to put faith in something completely different, which is their judgment, and humans are very inconsistent about judgment. They’re very consistent about separating signal from noise, but generally pretty poor in judgment, I find.

Phil Koopman
Well, we’re different people, Alex. The first time I rode in a self-driving car I was pretty nervous, and I had a safety case for that first ride. Do you remember this, Bryan? My first safety case was that Bryan was sitting next to me, and he was going to go if I was going to go.

Alex Roy
So that’s fair. That’s the funny thing.

Phil Koopman
He was in charge of the project, so he knew what was going on.

Alex Roy
Which makes the safety case for getting into a taxi with a stranger driving you absolutely nuts.

Bryan Salesky
All right, so Phil, you have a company, Edge Case Research. Tell us about it.

Phil Koopman
Edge Case Research. Mike Wagner and I worked together at NREC…and Bryan, of course we worked with you at NREC, the National Robotics Engineering Center at Carnegie Mellon. We decided we were going to take our technology for stress-testing robots and try to commercialize it. We found out initially that it didn’t work out as a product, because we broke all the robots in the first five minutes and there’s no repeat business for that tool. But that’s okay, because we do consulting services too. So when the Army has ground vehicles, we’re the safety team for the Army, and for a number of regular non-military companies as well.

Bryan Salesky
But the question is, were you able to break it? Of course you were able to break it. I think the thing companies are looking for is: did you break it in a way that they weren’t aware of, or that they hadn’t thought of?

Phil Koopman
Oh yeah. And sometimes they’d fix it, and sometimes they’d go into denial, and it’s really interesting learning all these things. But we’ve progressed, and we found out that the really interesting area was perception. This is where the self-driving car detects objects: it uses a camera, and does it see a person or not? And we came up with a technique for stress testing that will determine, “Hey, that person you think you saw? You almost didn’t see him. And even worse, later on you’re not going to see him.” We predict it, and it’s not just random. We found systematic problems.

As an example, we found that one of the… It’s publicly available research, right? On someone’s test car…we found it’s really bad at detecting yellow coats, yellow clothing. Now why would you care? Well, construction workers, for one. And you would think…I have this yellow raincoat I bought because I wasn’t going to buy the black one. It turns out that’s camouflage for some of this technology. And why is that? Well, machine learning can only see things it’s seen before, and how many people wear yellow out on the street?

Not many, as it turns out. So you could see 99.99% of people, and the 0.01% you miss is all the people in yellow, or in coats, or construction workers. 99.99 sounds great, yeah, but if there are systematic biases, that’s a problem. So what we built is a tool called Hologram that actually sniffs through huge amounts of data and says, “Did you realize you almost didn’t see this one? It almost didn’t see this one.” And, “Hey, I bet there’s a gap in your training data, and you need to pay more attention to people in wheelchairs, or yellow raincoats, or people with bare legs.” I guess it was winter training data, so bare legs just look like vertical brown things, like trees.
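To see how an aggregate score can hide a systematic gap of the kind Phil describes, here is a small worked example in Python. The population shares and detection rates are made-up illustrative numbers, not Hologram output or anyone’s real perception data.

```python
# Illustrative numbers only (not real perception data): how a high
# aggregate detection rate can hide a subpopulation the system almost
# never sees, e.g. pedestrians in yellow raingear.

def aggregate_rate(groups):
    """groups: list of (population_share, detection_rate)."""
    return sum(share * rate for share, rate in groups)

groups = [
    (0.9999, 0.9999),  # "typical" pedestrians, detected almost always
    (0.0001, 0.00),    # rare subpopulation, detected essentially never
]

print(f"aggregate detection rate: {aggregate_rate(groups):.4%}")
# ~99.98% overall -- looks great, yet one group is missed every single
# time. Averaging over the whole population hides the systematic gap;
# you only find it by slicing the metric by subpopulation.
for i, (share, rate) in enumerate(groups):
    print(f"group {i}: share={share:.4%}, detection rate={rate:.2%}")
```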

Alex Roy
Well, this is why the more experience, the better, right? And the more diversity in the data, the better.

Phil Koopman
So yes, the more diversity, the better. But the other thing that’s really important is that there are subpopulations, really niche subpopulations, where if you’re systematically bad, if you always get them wrong, you won’t notice unless you’re paying attention to those particular situations.

Bryan Salesky
It’s also why we’re in so many cities…to get the people in shorts, get the people…

Phil Koopman
You get all the crazy stuff, yes.

Bryan Salesky
You get crazy everything. Yes. It’s so important.

Phil Koopman
But then, you can’t just mix it all together and say, “Well, we’ve got 99.99% of people.” Wait a minute, does that include people looking at their cell phones, and people in Halloween costumes, and all these other things?

Bryan Salesky
But this is the other thing, and this is another trap. I might get too technical or nuanced here, but there’s a concept of building resilient systems; there’s a fascinating paper on it, written years ago, that will put most people to sleep. If you’re building algorithms that are brittle, that are fragile, that break just because a human had a posture you’ve never seen before, you’re never going to win in this field. And this is the key, right? You have to be building systems that have, I don’t know what words you want to put around it, “resiliency margin.” We have to be building algorithms that are not brittle, because otherwise you’re never going to be able to handle these surprise cases.

Phil Koopman
So that’s great. That’s a part of it. There are a couple of other things you want to do. There’s the zoo of weird things, but the thing is, what’s weird to machine learning may not be weird to you. Bare legs would never have occurred to me. Yellow raincoats would never have occurred to me. But the machine learning was brittle there. So part of it is building up the zoo of all the things you might see, and being really good at saying, “It says it’s sure, but it’s not sure.”

Bryan Salesky
Yeah. And in my view, this is also why it’s not just machine learning. If you’re building a detection capability that only uses machine learning…look, no one tool is perfect. Where you get resiliency is by deploying the multiple methods that we’ve learned over a long period of time. That’s the state of the art.

Phil Koopman
Yeah, that’s true. And another thing…well, people say, “If I have really good radar and lidar, I know where the objects are, I know where the free space is, so I’ll be safe.” That doesn’t go far enough. We’re from Pittsburgh, so I have to use the quote: you don’t skate to where the puck is, you skate to where the puck’s going to be.

Bryan Salesky
Nice. Well done.

Phil Koopman
You don’t drive to where the free space is. You drive to where the free space is going to be. And that means you have to predict what happens next, and if the things you’re predicting are people, good luck with that.
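As a toy illustration of “drive to where the free space is going to be,” the sketch below extrapolates a tracked object forward in time. A constant-velocity model is far simpler than what a real prediction stack does, especially for people; the numbers are invented.

```python
# A toy sketch of predicting where free space will be: extrapolate
# each tracked object before choosing a path. Real systems use far
# richer models, especially for pedestrians; this is constant velocity.

def predict_position(x, y, vx, vy, horizon_s):
    """Constant-velocity extrapolation over a time horizon (meters, m/s, s)."""
    return x + vx * horizon_s, y + vy * horizon_s

# A pedestrian 10 m ahead and 3 m to the side, walking toward our lane
# at 1.5 m/s: that spot is free *now*, but not two seconds from now.
print(predict_position(10.0, 3.0, 0.0, -1.5, horizon_s=2.0))  # (10.0, 0.0)
```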

Bryan Salesky
Yeah. Machine learning gets all the press these days. It’s one of a small number of buzzwords that get thrown around. But there’s a lot more to building these things than just these pattern-matching, pattern-sensing systems or whatever. I just pissed off a bunch of machine learning people. That’s okay.

Phil Koopman
I mean, there’s more to it. Let me finish the thought: there’s more to it than machine learning magic.

Bryan Salesky
That’s what I’m trying to say. Right…

Phil Koopman
Which by the way is an awesome technology for what it’s good at. But you have to know your limitations.

Bryan Salesky
That’s exactly right, and this is also why no single sensor is used alone. It’s why camera-only is going to be very difficult. I’m not saying you’ll never get there, but with what we know today, in order to build resiliency and not have a fragile system, you have to deploy multiple methods and multiple sensors in order to sense and navigate the world accurately.

Phil Koopman
And now comes the safety guy with the bucket of cold water. People say, “We have four sensors, it’ll be great. What are the chances of all of them failing on the same thing?” The answer is not zero, and that’s a problem.

Bryan Salesky
It’s called common mode failure, which if you’re an engineer and don’t know what that is, you should go look it up right now.

Alex Roy
I would argue, not to name names, that if my life depends on a machine’s control, I want as many working sensors as exist.

Phil Koopman
So having a lot of sensors is good. Okay. But what if there’s a common mode, common cause failure? And I didn’t make up this example: a detached truck tire tread. It’s made out of rubber, so the LIDAR gets absorbed. If it’s fabric-belted, there’s no radar cross-section to speak of. And if it’s lying on fresh asphalt, you don’t see it either. So that’s a common mode failure.

Alex Roy
Just for everyone, for the civilian: a radar cross-section is the reflection off metal, and the tire tread doesn’t have metal, so you’re not going to see it.

Phil Koopman
So that makes it hard to see it.

Alex Roy
But LIDAR will see it, though.

Phil Koopman
Well, no. If it’s infrared LIDAR and it’s the right kind of rubber, it absorbs the beam. It’s like stealth material.

Alex Roy
So it really is invisible.

Phil Koopman
It’s invisible. It’s a black hole.

Alex Roy
Suboptimal.

Phil Koopman
And it’s a couple of hundred pounds of rubber that’ll tear out the bottom end of your car. It’s a big deal.
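A rough way to see why the independence assumption matters, using made-up probabilities rather than any measured failure rates:

```python
# Illustrative arithmetic only: why "we have several sensors, surely
# they won't all fail at once" breaks down when failures share a
# common cause. All numbers are invented for the example.

p_miss = 0.01          # assumed chance each sensor alone misses the object
n_sensors = 3          # e.g. camera, radar, lidar

# If the misses were truly independent, missing on all three at once
# would be vanishingly rare:
p_all_miss_independent = p_miss ** n_sensors
print(f"independent assumption: {p_all_miss_independent:.0e}")  # 1e-06

# But a detached tire tread can defeat all three for the *same*
# physical reason (rubber absorbs lidar, no radar cross-section,
# low contrast on fresh asphalt). Model that as a shared-cause event:
p_common_cause = 0.001
p_all_miss = p_common_cause + (1 - p_common_cause) * p_all_miss_independent
print(f"with a common cause:    {p_all_miss:.1e}")  # ~1e-03, a thousand times worse
```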

Bryan Salesky
But this is why diversity is so important, and also thinking through all these Edge Cases and validating the system against those scenarios.

Alex Roy
There’s literally no amount of money I want to save that would put me in a car that didn’t have every available sensor. I literally don’t care.

Phil Koopman
Does that include your eyeballs? Because then you’re not in a fully automated car.

Alex Roy
I don’t trust myself. I know it, which is why I don’t drive faster.

Bryan Salesky
So let’s cap all this off with another thought, which is: we could throw out our favorite Edge Cases all day long on this show.

Phil Koopman
Like an endless list.

Bryan Salesky
An endless…well, it is an endless list, but this is exactly the thing, right? We only need to encounter something once to see it, record it, and validate the system against it over and over again, and thus continually make the system better. You can’t say the same thing about your 16-year-old driver, or really any human driver, because we know that over time skills increase and then there’s a point where they start to atrophy. The beautiful thing about autonomous vehicles, if you’re building them correctly, is that they continue to learn and they will continue to get better.

Phil Koopman
Fair enough. But I’d say: almost. Let me explain why.

Alex Roy
You have to notice it. If you don’t notice it, it doesn’t count.

Bryan Salesky
No, of course. Yes. So everything I just said assumes we saw it, but this is why we have that closed loop.

Phil Koopman
That’s harder than you might think. Sometimes people don’t notice a lot.

Bryan Salesky
I do this for a living, and it is hard. But that’s why we have a closed-loop process: to find these things, take them back, and add them to the regression test set. In order to build these systems, you have to have a solid test process that includes this notion that you are seeing these things, that you have systems to find these Edge Cases, and then you close the loop around them.

Phil Koopman
The methodical engineering part is that you’re looking aggressively instead of waiting. You have to go find it instead of waiting for it to happen.

Bryan Salesky
That’s exactly right. And this is an important point that I think folks don’t understand: when there is a failure, or when an autonomous vehicle doesn’t do exactly the right thing…the beautiful part is, if it gets noticed, and it will absolutely get noticed, especially when we go to market, we take that feedback and we modify or iterate the system to be able to handle whatever that scenario was. And the beautiful thing is the next time it encounters it, or even something like it, it’s going to be able to do the right thing. This is how, over time, we’re building a smarter and smarter driver. Oh, and by the way, the lessons learned on that one vehicle can be deployed across the whole fleet, so the fleet continues to learn as well. This is an important part of why, over time, building the computer driver is much more robust than building a human driver.
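A minimal sketch of the closed-loop idea Bryan describes, with hypothetical class and scenario names; a real pipeline would replay logged sensor data through the actual software stack rather than call a stand-in function.

```python
# Hypothetical sketch of a closed-loop regression process: every
# surprising event that gets noticed becomes a permanent test, and
# every new software build is replayed against the accumulated set.

class RegressionSuite:
    def __init__(self):
        self.scenarios = []           # every edge case ever captured

    def add(self, scenario):
        """Close the loop: a noticed failure becomes a permanent test."""
        self.scenarios.append(scenario)

    def run(self, driver_under_test):
        """Replay the full history against a new software build."""
        return [s for s in self.scenarios if not driver_under_test(s)]

suite = RegressionSuite()
suite.add({"name": "pedestrian occluded at crosswalk"})
suite.add({"name": "detached truck tire tread"})

# Stand-in for the software build being evaluated.
new_build = lambda scenario: True
print(suite.run(new_build))  # [] -> no regressions on past edge cases
```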

Phil Koopman
So that’s fair. I would say the trickiness, the hard part, is that the first time you still have to be safe enough, and for now the plan is we have humans monitoring the system, right? But when you deploy, when you’re done with the safety drivers, somebody makes the decision that it’s time to actually go to production. You have to make a different argument…that even then, when something weird happens, you’ll be safe…which could be as simple as: we don’t know how to deal with it, but we know something weird just happened, so let’s phone home and do something safe in the meantime.

Bryan Salesky
No, it’s absolutely true. I will say, though, we have more than just the two people in the car looking for errors, issues, inconsistencies and so forth. We actually have…at least at Argo…a fairly sizable team that’s combing through this footage, even looking for events that weren’t necessarily flagged or noticed by the operators.

Phil Koopman
Which is great, because you don’t want a near miss to go unnoticed.

Bryan Salesky
That’s right. They have a lot going on in the car; we can’t rely on their four eyes entirely. We also run systems across that data to find things they maybe didn’t catch, things that don’t seem quite right. And these anomaly finders are part of the sauce that goes into building a really robust perception system.

Phil Koopman
So that’s a great engineering process. Let me ask you a question.

Bryan Salesky
Sure.

Phil Koopman
How are you going to know when it’s time to pull the trigger and say, “We did enough?”

Bryan Salesky
Well, this is where we cut off the episode. No one has the answer to that. And I think…you mentioned…what was it, a billion? What did we say planes were again?

Phil Koopman
Billion hours.

Bryan Salesky
A billion hours.

Phil Koopman
Per catastrophic failure.

Bryan Salesky
Per catastrophic failure. Do you believe that the system needs to be tested to that level before people will trust or accept a self driving car?

Phil Koopman
Well, you can’t test that much. It’s not possible. I went to a computation website, Wolfram Alpha, and I asked how far a billion miles is, and it said 25 round trips on every paved road in the world. And by the way, when you make a change…

Bryan Salesky
That’s pretty cool.

Phil Koopman
Yeah, yeah it was fun. When you make a change, you reset to zero and you start over again. And so yeah…

Bryan Salesky
It’s not going to work.

Phil Koopman
Which is why safety in every field isn’t about testing it and testing it and testing it, right? It’s about doing good engineering, and the testing proves your engineering was good. The testing isn’t the point. The engineering is the point. The testing is just there to make sure your engineering actually worked out the way you thought.
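To put rough numbers on why brute-force testing can’t carry that load, here is some back-of-the-envelope arithmetic. The fleet size and daily mileage are assumptions chosen only to show the order of magnitude, not anyone’s real figures.

```python
# Back-of-the-envelope arithmetic (assumed fleet numbers): why you
# cannot simply "test your way" to a billion-mile claim, especially
# if every software change resets the clock.

target_miles = 1_000_000_000      # the order of magnitude under discussion
fleet_size = 1_000                # assumed test vehicles
miles_per_vehicle_per_day = 300   # assumed daily utilization

fleet_miles_per_day = fleet_size * miles_per_vehicle_per_day
days_needed = target_miles / fleet_miles_per_day
print(f"{days_needed:,.0f} days (~{days_needed / 365:.1f} years) per software version")
# ~3,333 days, roughly nine years -- and a single code change would,
# in principle, send you back to zero. Hence testing can only confirm
# good engineering; it cannot substitute for it.
```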

Alex Roy
Let me tell you guys…maybe I’m crazy, but if I saw an AV operating safely, I would love to get in that thing, because every day I observe people driving horribly. And I love driving, I totally believe in freedom, but given that humans are not retested every year for their entire lives, please just give me a taste of a reliably operating AV. I would absolutely give it a shot.

Phil Koopman
Yeah. I think they have the potential to be great. The catch is they fail differently than humans do, and machines are pretty good at giving people a false sense of security. All right?

Bryan Salesky
Absolutely.

Phil Koopman
And people find it hard to believe that things can be defective if they seem to work. I’ll give you an example. Has your cell phone ever crashed? Alex, have you ever had to reboot your cell phone?

Alex Roy
My Android phone? Yeah. More than my Apple iPhone crashes.

Bryan Salesky
That’s Phil going back to an earlier conversation.

Phil Koopman
Can you make it do it again?

Alex Roy
Can I make it…

Phil Koopman
Can you on command make it crash?

Alex Roy
Yes. My Android phone, not my Apple phone.

Phil Koopman
Okay. Well, I can’t make mine crash on command, but if it doesn’t, does that mean there’s nothing wrong with it? People drive a car for 5,000 miles, nothing goes wrong, so it must be perfect. Well, yeah, with mechanical things failures tend to be reproducible. With computers, when they go wrong, they go wrong, and it’s hard to make it happen again.

Bryan Salesky
So Alex, I have to ask, because now people are going to be wondering: what do you do to your Android phone to reliably crash it?

Alex Roy
If you run Waze, the GPS app with the police locations, in the background, and you run a dash cam app simultaneously, my phone will crash after like an hour. I think it’s a storage situation.

Bryan Salesky
Yeah, could be. It could also be peculiar to whatever else is installed. The problem is, it’s easy to blame it on Android, but you don’t actually know if it’s your dash cam app. You don’t know if there’s some interplay between them.

Alex Roy
But I do know that when I do these same apps on my iPhone, it fails differently. And the iPhone, one of the apps will just close, but the phone won’t crash. Whereas on the Android phone, the whole thing just locks up.

Bryan Salesky
All right. Speaking of failures, let’s go to the braking story. Okay, so you had a what? A Dodge vehicle at one point in time.

Phil Koopman
So it was a Dodge 024, which is the Charger with a small engine.

Bryan Salesky
Love it.

Alex Roy
Why?

Bryan Salesky
Come on now and what happened Phil?

Phil Koopman
I was driving home. I had a day job at the Navy base in Newport, Rhode Island; I was a submarine officer way back when.

Bryan Salesky
Thank you for your service.

Phil Koopman
Oh, thanks. I was driving home after a day’s work, and I’d just had my brakes worked on. There was air in my brake lines, so they bled the brake lines, and I didn’t understand why there was air in there; it was a newish car, only a couple of years old. So they bled it, they fixed it all up, and I get onto the bridge over Narragansett Bay. Now this is a serious bridge, I think it’s four lanes, a 55-mile-an-hour speed limit, and it’s more than a hundred feet in the air because Navy destroyers go under it. So it’s a big arch, and you have to keep your foot on the gas all the way up.

Alex Roy
It’s a state called acceleration.

Phil Koopman
Well, I was accelerating because I was young and stupid, and I get to the top, and it’s like, okay, we’re about to take the roller coaster ride down the other side, which I did every day. Time to put the brakes on. So I put my foot on the brakes, and the brake pedal goes to the floor, and nothing happens.

Bryan Salesky
Literally nothing happens.

Phil Koopman
And I pressed it a couple of times, out of…well, that’s weird, right?

Bryan Salesky
Out of academic curiosity.

Phil Koopman
And it didn’t go anywhere, because the brake pedal was flat against the floorboards. Fortunately, I’m paranoid, and you know what they say about paranoia: just because you’re paranoid doesn’t mean they’re not out to get you. I treated the world that way even back then, so my parking brake worked. This is the early eighties, and usually the mechanical cable is all rusted out, but I used mine every time so it would be there when I needed it. That was the day. It was a manual, so I downshifted very aggressively.

I used my parking brake and came to a stop, and at the end of the bridge, at the bottom of the hill, there were toll booths, at rush hour, on a bridge over the bay. When I got it back to the mechanics, what had happened was this: there was a metal casting that held two different reservoirs of brake fluid, because it was a two-loop system, and the casting had a defect. That’s apparently how the air had been getting in. I guess when they removed the air, the braking force was higher, and it split the casting horizontally across both pools of fluid. Both of them drained out through the crack in the middle of the casting, and I had no brake fluid. That’s what happened.

Bryan Salesky
Yeah, that’s not good. But we talk about failures, right? I mean, you assume if you have four brakes, you have some redundancy.

Phil Koopman
But there are only two hydraulic lines, and there’s only one casting, and a crack in the casting is a common mode failure that takes out both of them. That’s exactly what happened.

Bryan Salesky
Common mode failure. And that, friends, is why we must do safety analysis for common mode failures.

Phil Koopman
Absolutely.

Bryan Salesky
All right. Very good. Well, thank you Phil for coming. This was very educational as always and it was super informative.

Phil Koopman
Bryan and Alex, thanks for having me. It was a lot of fun having this discussion.

Alex Roy
I really enjoyed it. I’ll save my questions for you for when you come back. So Bryan, what do you think my mother or your mother would say if they listened to that episode? Do you think they would be satisfied with how safety is being addressed?

Bryan Salesky
Well, I don’t know if they’d be satisfied. I hope they walked away with a little more principle behind the word “safety” and what it means. It’s such a broad, somewhat nebulous topic. My hope is that we were able to lay the groundwork for what it means, at least as it’s applied to the automated vehicle industry, but also for the amount of rigor and diligence that goes into the safety engineering of a product.

Alex Roy
The part that really struck me was that if a car doesn’t leave the garage, it doesn’t serve its purpose, but the moment it does…something could happen.

Bryan Salesky
There are risks.

Alex Roy
If we only thought about the risks, no one would leave the house…but society could not function otherwise. So I think we should revisit the topic periodically, maybe next year, and talk about what safety meant when we did this and what it means 12 months from now. That would be fascinating. If you want to learn more about the No Parking Podcast, check us out at www.noparkingpodcast.com. Follow us on Twitter at No Parking Pod. Bryan does not do social media.

Bryan Salesky
Not at present.

Alex Roy
But I’m going to ask him every week. I’m Alex Roy 144 on all platforms. You can check out Phil Koopman’s company at edgecaseresearch.com, and if you want to be on our show or recommend a topic, please contact us at guests@noparkingpodcast.com. See you next time.