If You Can Drive in Jerusalem, You Can Drive (Almost) Anywhere
The following is an opinion editorial provided by Amnon Shashua, senior vice president at Intel Corporation and the chief executive officer and chief technology officer of Mobileye, an Intel company.
The first phase of the Intel and Mobileye 100-car autonomous vehicle (AV) fleet has begun operating in the challenging and aggressive traffic conditions of Jerusalem. We are putting the fleet on the road to demonstrate the power of the Mobileye approach, to prove that the Responsibility-Sensitive Safety (RSS) model increases safety, and to integrate key learnings into our products and customer projects. In the coming months, the fleet will expand to the U.S. and other regions. While our AV fleet is not the first on the road, it represents a novel approach that challenges conventional wisdom in multiple areas. Leveraging over 20 years of experience in computer vision and artificial intelligence, our vehicles are proving the Mobileye-Intel solution is the most efficient and effective.
The key differentiator of our system is that it is designed from the beginning to meet important goals of safety and economic scalability. Specifically, we target a vehicle that gets from point A to point B faster, more smoothly and less expensively than a human-driven vehicle; can operate in any geography; and achieves a verifiable, transparent 1,000-fold safety improvement over a human-driven vehicle without the need for billions of miles of validation testing on public roads.
Why Jerusalem?
The obvious answer is that Mobileye is based in Israel. That makes it convenient, but we also wanted to demonstrate that the technology can work in any geography and under all driving conditions. Jerusalem is notorious for aggressive driving. Roads are not perfectly marked, merges are complicated, and people don’t always use crosswalks. You can’t have an autonomous car traveling at an overly cautious speed, congesting traffic or potentially causing an accident. You must drive assertively and make quick decisions like a local driver.
This environment has allowed us to test the cars and technology while refining the driving policy as we go. Driving policy, also known as planning or decision-making, makes all other challenging aspects of designing AVs seem easy. Many goals need to be optimized, some of which are at odds with each other: to be extremely safe without being overly cautious; to drive with a human-like style (so as to not surprise other drivers) but without making human errors. To achieve this delicate balance, the Mobileye AV fleet separates the system that proposes driving actions from the system that approves (or rejects) the actions. Each system is fully operational in the current fleet.
Balancing the conflicting goals of safety and assertiveness
The part of our driving policy system that proposes actions is trained offline to optimize an assertive, smooth and human-like driving style. This is proprietary software developed using artificial intelligence-based reinforcement learning techniques. This system is the largest advancement demonstrated in the fleet, and you can see the impressive results in the event visuals. But just like a responsible human driver, in order to feel confident enough to drive assertively, this “driver” needs to understand the boundary where assertive driving becomes unsafe. To enable this important understanding, the AI system is governed by a formal safety envelope that we call Responsibility-Sensitive Safety.
RSS is a model that formalizes the common-sense principles of what it means to drive safely into a set of mathematical formulas that a machine can understand (safe following/merging distances, right of way, and caution around obstructed objects, for example). If the AI-based software proposes an action that would violate one of these common-sense principles, the RSS layer rejects the decision.
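As a rough illustration of how one such principle can be turned into a formula and used to approve or reject a proposed action, the sketch below computes the minimum safe following distance in the form given in the published RSS paper: the rear car may accelerate for a response time, then brakes at a guaranteed minimum rate, while the car in front brakes as hard as it can. The parameter values, the `RssParams` and `approve` names, and the single-rule check are assumptions made for this example only; they are not the fleet's software or its calibrated settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RssParams:
    """Illustrative values only; none of these numbers come from the article."""
    response_time: float = 1.0  # seconds before the rear car starts braking
    max_accel: float = 3.0      # worst-case rear-car acceleration during the response time, m/s^2
    min_brake: float = 4.0      # braking the rear car is guaranteed to apply afterwards, m/s^2
    max_brake: float = 8.0      # hardest braking the front car might apply, m/s^2


def safe_following_distance(v_rear: float, v_front: float, p: RssParams) -> float:
    """Minimum safe gap in meters between a rear car and the car ahead (speeds in m/s)."""
    # Speed the rear car could reach if it accelerates for the whole response time.
    v_after_response = v_rear + p.response_time * p.max_accel
    gap = (
        v_rear * p.response_time
        + 0.5 * p.max_accel * p.response_time ** 2
        + v_after_response ** 2 / (2 * p.min_brake)  # rear car's stopping distance
        - v_front ** 2 / (2 * p.max_brake)           # front car's stopping distance
    )
    return max(gap, 0.0)  # a negative result means any gap is safe


def approve(gap_after_action: float, v_rear: float, v_front: float,
            p: RssParams = RssParams()) -> bool:
    """Accept a proposed action only if the gap it leaves is still safe."""
    return gap_after_action >= safe_following_distance(v_rear, v_front, p)


if __name__ == "__main__":
    # Both cars at 20 m/s (72 km/h): the required gap is 62.625 m with these parameters.
    print(safe_following_distance(20.0, 20.0, RssParams()))
    print(approve(70.0, 20.0, 20.0))  # True: the proposed action keeps a safe gap
    print(approve(50.0, 20.0, 20.0))  # False: the proposed action would be rejected
```

Keeping the check separate from whatever proposes the action mirrors the split described above: the proposer can be as assertive as it likes, because nothing it suggests is executed unless the check passes.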
Put simply, the AI-based driving policy is how the AV gets from point A to point B; RSS is what prevents the AV from causing dangerous situations along the way. RSS enables safety that can be verified within the system’s design without requiring billions of miles driven by unproven vehicles on public roads. Our fleet currently implements Mobileye’s view of the appropriate safety envelope, but we have shared this approach publicly and look to collaborate on an industry-led standard that is technology neutral (i.e., can be used with any AV developer’s driving policy).
Current sensor setup: only cameras. Why?
During this initial phase, the fleet is powered only by cameras. In a 360-degree configuration, each vehicle uses 12 cameras, with eight cameras providing long-range surround view and four cameras utilized for parking. The goal in this phase is to prove that we can create a comprehensive end-to-end solution from processing only the camera data. We characterize an end-to-end AV solution as consisting of a surround view sensing state capable of detecting road users, drivable paths and the semantic meaning of traffic signs/lights; the real-time creation of HD-maps as well as the ability to localize the AV with centimeter-level accuracy; path planning (i.e., driving policy); and vehicle control.
The camera-only phase is our strategy for achieving what we refer to as “true redundancy” of sensing. True redundancy refers to a sensing system consisting of multiple independently engineered sensing systems, each of which can support fully autonomous driving on its own. This is in contrast to fusing raw sensor data from multiple sources together early in the process, which in practice results in a single sensing system. True redundancy provides two major advantages. First, the amount of data required to validate the perception system is massively lower (square root of 1 billion hours vs. 1 billion hours), as depicted in the attached graphic. Second, if one of the independent systems fails, the vehicle can continue operating safely, in contrast to a vehicle with a low-level fused system that needs to cease driving immediately. A useful analogy to the fused system is a string of Christmas tree lights where the entire string fails when one bulb burns out.
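The square-root figure can be reproduced with a few lines of arithmetic, assuming the two sensing systems fail independently of each other (an assumption of this sketch; the article does not spell out the derivation). If the combined system must fail no more than once per 1 billion hours, and a combined failure requires both subsystems to fail in the same hour, then each subsystem only has to be shown to fail no more than once per roughly 31,600 hours. The function name below is hypothetical.

```python
TARGET_HOURS_BETWEEN_FAILURES = 1e9  # figure quoted in the text


def per_system_hours(target_hours: float, n_independent: int = 2) -> float:
    """Hours between failures each subsystem must demonstrate.

    Assumes failures are statistically independent, so per-hour failure
    probabilities multiply: (1 / h) ** n = 1 / target_hours.
    """
    return target_hours ** (1.0 / n_independent)


hours_each = per_system_hours(TARGET_HOURS_BETWEEN_FAILURES)
print(f"{hours_each:,.0f} hours per subsystem")  # 31,623 hours per subsystem
```

The saving depends entirely on the independence assumption, which is why the subsystems are described as independently engineered rather than fused at the raw-data level.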
The radar/lidar layer will be added in the coming weeks as the second phase of our development, and synergies among sensing modalities can then be used to increase the “comfort” of driving.
Computing hardware on the road: today vs. tomorrow
The end-to-end compute system in the AV fleet is powered by four Mobileye EyeQ®4s. An EyeQ4 SoC delivers 2.5 tera operations per second (TOP/s) (for deep networks with an 8-bit representation) at 6 watts of power. Produced in 2018, the EyeQ4 is Mobileye’s latest SoC; this year will see four production launches, with an additional 12 production launches slated for 2019. The SoC targeting fully autonomous driving is the Mobileye EyeQ®5, whose engineering samples are due later this year. An EyeQ5 has 24 TOP/s and is roughly 10 times more powerful than an EyeQ4. In production we are planning for three EyeQ5s to power a full L4/L5 AV. Therefore, the system on the road today includes approximately one-tenth of the computing power we will have available in our next-gen EyeQ5-based compute system beginning in early 2019.
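For readers who want to check the comparison, the quoted figures can be multiplied out directly. This is only back-of-the-envelope arithmetic on raw TOP/s, using the chip counts and per-chip numbers stated above; the variable names are made up for the example, and raw TOP/s is just one measure of computing power.

```python
EYEQ4_TOPS = 2.5   # per-chip figure quoted in the text
EYEQ5_TOPS = 24.0  # per-chip figure quoted in the text

fleet_tops = 4 * EYEQ4_TOPS       # four EyeQ4s in the current fleet
production_tops = 3 * EYEQ5_TOPS  # three EyeQ5s planned for production

per_chip_ratio = EYEQ5_TOPS / EYEQ4_TOPS        # EyeQ5 relative to EyeQ4
system_fraction = fleet_tops / production_tops  # current fleet relative to production

print(f"fleet: {fleet_tops} TOP/s, production: {production_tops} TOP/s")
print(f"per-chip ratio: {per_chip_ratio:.1f}x, system fraction: {system_fraction:.2f}")
```

On raw TOP/s alone, the current four-chip system comes to 10 TOP/s against 72 TOP/s planned, a fraction of about 0.14, which is the same order of magnitude as the "approximately one-tenth" stated above.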
The Mobileye-Intel approach is contrary to common industry practice, which is to over-subscribe the computing needs during R&D (i.e., “give me infinite computing power for development”) and then later try to optimize to reduce costs and power consumption. We, on the other hand, are executing a more effective strategy by under-subscribing the computing needs so that we maintain our focus on developing the most efficient algorithms for the sensing state, driving policy and vehicle control.
We certainly have much work ahead of us, but I’m extremely proud of the Mobileye and Intel development teams for their hard work and ingenuity to enable this first significant step. Our goal, in support of our automaker customers, is to bring this system to series production in L4/L5 vehicles by 2021.