Tele-Operation is Not Autonomous Driving

Game controller for drive-by-wire autonomous development vehicle

Remote Tele-Operation has recently come to be viewed as a possible safety fallback option, situated squarely within the context of a futuristic, fully autonomous, Level 5 deployment scenario of millions of autonomous vehicles operating at scale across modern cities and beyond.

Based on a consensus view within the autonomous vehicle industry, this fully automated transport paradigm is at least two decades away, and many who forecast this timeline place it further out still, perhaps two generations. Many supporting enablers will have to evolve rapidly to make such a future possible: regulation, insurance, infrastructure, connectivity, and lifestyle changes are only a few that must advance alongside the underlying automated driving technology.

More importantly, however, autonomous driving technology itself will have to develop and deepen: it must be refined, verifiably safe, commercially viable, and fully validated to deliver robust performance at Level 5 automation, requiring no human intervention at any stage. At present, several open technology challenges remain on the path to this goal. While these challenges apply in different degrees to the various self-driving car programs around the world, depending on their technology maturity, there is consensus across the board that they remain unsolved.

An example of one such open challenge is 3D/HD map dependency. These maps cost thousands of dollars per mile to create, requiring petabytes of data collection, millions of hours of painstaking human annotation, and pre-driving of every road, worldwide. It is currently assumed by most practitioners and regulators that all current autonomous driving approaches will have to wait for detailed 3D/HD maps to be created around the world to enable everywhere-all-the-time autonomous mobility. However, the technological and commercial challenges of a globally scaled 3D/HD mapping effort, of making, annotating, regularly updating, and wirelessly transmitting such maps to millions of autonomous vehicles, have not even been framed properly, let alone solved.

Another example of an open challenge for autonomous vehicles is robust scene perception. The current dominant approaches apply increasing levels of AI to solve what is, in statistical terms, the 'long-tail' problem. At present, very few self-driving programs around the world could rightfully claim perception that is robust, does not falter in edge cases, operates in any type of weather, and can be deployed in any geography.

Given the strategic opportunity and economic value to society of successfully deployed self-driving technology, estimated at multiple trillions of dollars, many hastily deployed first-mover initiatives in the self-driving car industry are now showing signs of having hit the technology wall, especially with respect to these open challenges.

Amidst a global race to achieve an early lead ahead of other nations, regulatory frameworks in many countries and regions are evolving rapidly to promote technology development and advanced trials. Regulation in this field has been a delicate balancing act between supporting rapid innovation and ensuring public safety, as a diverse set of early technologies for varying use cases is trialled on public roads.

Against the backdrop of the current stage of autonomous driving technology and its open challenges, largely experimental regulatory codes, and the strategic imperative of being first to the finish line, it is not surprising that more than a few 'opportunistic technologies' are being offered up as a panacea to circumvent the open challenges.

Foremost among the technologies used to veil the true state of autonomous driving, and to work around the open technology challenges, is Tele-Operation.

Tele-Operation of highly automated road vehicles, whether done remotely or through a joystick held by someone in the back seat, hardly moves the needle on usefulness to society or the evolution of the technology. It only makes the dream of a fully automated future of transport ever more distant.

There is no technological imperative today, at this stage of technology development, for regulators in any country to permit or encourage the remote control of cars or other vehicles on public roads. As yet, no self-driving program anywhere in the world has come close to developing a fully autonomous, Level 5 self-driving vehicle for all weather and all geographies. State-of-the-art trials of autonomous vehicles around the world are limited to geo-fenced zones, in fair weather conditions, on a subset of routes, with frequent intervention by safety drivers at the wheel to correct autonomous driving behaviour. If such programs are permitted to rely on joystick control from the back seat, or remote control from a nearby facility where they may have a testing base, we could all end up grappling with the unintended proliferation of technologies that are nothing more than radio-controlled cars appearing to be autonomous.

Mandating a safety driver in the driver's seat who is able to directly assume human control of an autonomous vehicle in 'open sight' is the only established, safe, and transparent approach for both testing the technology and showcasing what has been developed. Permitting joystick control of supposedly 'highly automated vehicles' can easily encourage non-transparent practices, result in unverifiable technology claims, and create a smoke-and-mirrors effect. Such practices damage the prospects of all self-driving car programs, including the most advanced ones. One would never be able to establish whether various competing approaches were in reality operating an autonomous driving service or a largely radio-controlled service under a veneer of autonomous driving.

PathFinder – a mapping-free, go-anywhere, autonomous path planner

PathFinder in action on muddy roads without lane markings or curb edges

Imagine you have to go from a bedroom to the kitchen in a new house, completely blindfolded. You would first need to practise the route a couple of times without the blindfold; then, under the blindfold, you would keep touching the walls along the way as you make your turns, so as not to crash into things. When you walk the path without the blindfold you are creating your 'localisation map', and when you touch the walls under the blindfold to get a sense of your position within the room, you are localising using your map memory.

When we drive, we carry out several tasks without being conscious of the complexity of the instant decisions we make to ensure we drive safely. An Autonomous Driving System (ADS) needs to replicate this very complex performance, and it is often challenging to understand what's really happening under the hood.

Path planning enables a highly automated vehicle to select viable path trajectories in real time, on an ongoing basis, as the vehicle traverses from one position to another. In finding and following a path, the ADS must be able to detect where the drivable free space is, segment it accurately, know its own precise position with respect to its environment, and then calculate a viable path to follow within the total drivable free space while maintaining its position along that path.
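To make the loop above concrete, here is a toy sketch of one planning step: find the widest run of drivable free space ahead and steer toward its centre. Everything here (the one-row occupancy grid, the function names, the steering convention) is an illustrative assumption for this post, not actual ADS code.

```python
# Toy path-planning step: detect free space, then pick a viable direction.
# 1 = obstacle cell, 0 = drivable free cell in a single look-ahead row.

def detect_free_space(occupancy_row):
    """Return (start, end) column indices of the widest run of free cells."""
    best = (0, 0)
    start = None
    for i, cell in enumerate(occupancy_row + [1]):  # sentinel closes a trailing run
        if cell == 0 and start is None:
            start = i
        elif cell != 0 and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best

def plan_step(occupancy_row, vehicle_col):
    """Steer toward the centre of the widest free corridor."""
    lo, hi = detect_free_space(occupancy_row)
    target = (lo + hi) // 2
    return max(-1, min(1, target - vehicle_col))  # -1: left, 0: straight, 1: right

row = [1, 0, 0, 0, 0, 1, 1, 0, 1]
print(plan_step(row, 4))  # → -1 (steer left, toward the widest corridor)
```

A real planner works over 2D or 3D space and smooth trajectories rather than a single grid row, but the structure is the same: segment free space first, then choose a path within it.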

The ability of the ADS to find its precise current position with respect to its environment is called 'localisation'. The industry state of the art for real-time localisation relies on 'localisation maps'. These maps are different from the navigation maps we use every day. A localisation map is a detailed feature memory based on high-precision data collected by manually driving the route beforehand. It contains rich information about scene features and structure, such as lane edges, lane markings, landmarks like trees and buildings, road signs, and traffic lights.

Autonomous cars need to localise to within 10-15 centimetres of their true position in real time as they traverse a path, to make sure they don't drift. Localisation maps, being a high-precision memory, make this possible. The challenge of this approach is two-fold: first, autonomous cars today can drive only where they have been driven before manually for data collection (the practice runs without the blindfold); second, it is nearly impossible to scale these maps worldwide, over nearly 200 million kilometres of road networks. Add to this the fact that the world keeps changing all the time, with road works, changes in road layouts, new buildings and so on, so updated versions are constantly needed, and the maps must be created for driving in both directions. Imagine what would happen in our analogy if the room layout was changed and the furniture moved around: you would be unable to get to the kitchen blindfolded because you would struggle to figure out your position. You would have to go back and practise the route again without the blindfold.
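The core of map-based localisation can be sketched in miniature: match currently observed landmarks against the stored map and use the residuals to estimate the vehicle's drift. The landmark representation, function name, and averaging scheme below are illustrative assumptions, not a real localiser.

```python
# Toy map-based localisation: estimate vehicle drift by matching observed
# landmarks (x, y) against a stored localisation map.

def estimate_offset(map_landmarks, observed, max_match_dist=1.0):
    """For each observation, find the nearest map landmark; the mean residual
    approximates the vehicle's position error relative to the map."""
    dx_sum = dy_sum = n = 0
    for ox, oy in observed:
        nearest = min(map_landmarks, key=lambda m: (m[0]-ox)**2 + (m[1]-oy)**2)
        if (nearest[0]-ox)**2 + (nearest[1]-oy)**2 <= max_match_dist**2:
            dx_sum += nearest[0] - ox
            dy_sum += nearest[1] - oy
            n += 1
    if n == 0:
        return None  # cannot localise: the map no longer matches the scene
    return (dx_sum / n, dy_sum / n)

stored = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
seen   = [(0.2, -0.1), (10.2, -0.1), (20.2, -0.1)]  # vehicle drifted slightly
print(estimate_offset(stored, seen))  # ≈ (-0.2, 0.1)
```

Note the `None` case: when the scene has changed (the furniture has moved, in the analogy above), no landmarks match and localisation fails, which is exactly why these maps must be kept fresh.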

Interestingly enough, human drivers operate very differently from an ADS when it comes to localisation. Human drivers don’t need centimetre-precise prior information on where everything is around them – often a GPS satellite navigation system is more than enough for us to navigate busy urban streets. Human drivers can drive on roads they have never driven on before without detailed prior map data – autonomous cars today struggle with this challenge.

The reason human drivers can drive with such flexibility comes down to our incredible environmental perception. In a fraction of a second, we perceive where we are in the road context, what is around us, what the road looks like, where the traffic lights are, and how other cars are navigating through a junction, and we can drive safely on the basis of that perception.

Our technical inspiration comes from how the human mind processes visual data to perceive the world, and we have built a visual cognition engine that can match this performance for autonomous driving. This means our autonomous car doesn't need prior high-definition localisation maps to drive: it drives by seeing and understanding its environment. We are proud to unveil, for the first time, another world-beating capability: our autonomous car can drive where there are no maps. Our Vision AI is a generalisable cognition and perception capability. It makes it possible for our ADS to perceive the scene as humans do, keep the vehicle localised with respect to its current position, and safely follow the chosen path trajectory. To plan the path as the vehicle drives, our ADS calculates not one but several concurrent path trajectories, based on highly accurate detection of drivable free space and of all stationary and moving obstacles. The most viable trajectory from amongst the possible ones is selected in real time, and those that become infeasible are automatically dropped from the set of possibilities.
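The select-and-prune idea described above can be sketched as follows. The cost function, feasibility test, and trajectory representation are all illustrative assumptions; a real planner would use smooth, dynamically feasible trajectories and a richer cost.

```python
# Toy concurrent-trajectory selection: drop infeasible candidates, then pick
# the one whose endpoint is closest to the goal.

def feasible(traj, obstacles, clearance=1.0):
    """A trajectory is infeasible if any waypoint comes too close to an obstacle."""
    return all((wx-ox)**2 + (wy-oy)**2 >= clearance**2
               for wx, wy in traj for ox, oy in obstacles)

def select_trajectory(candidates, obstacles, goal):
    """Prune infeasible candidates, then select the most viable remaining one."""
    viable = [t for t in candidates if feasible(t, obstacles)]
    if not viable:
        return None  # no safe path: the vehicle should stop
    return min(viable, key=lambda t: (t[-1][0]-goal[0])**2 + (t[-1][1]-goal[1])**2)

left     = [(0, 1), (-1, 2), (-1, 3)]
straight = [(0, 1), (0, 2), (0, 3)]   # blocked by the obstacle at (0, 2)
right    = [(0, 1), (1, 2), (1, 3)]
best = select_trajectory([left, straight, right], obstacles=[(0, 2)], goal=(0, 4))
```

Running the pruning and selection at every planning cycle is what lets the set of possibilities shrink and grow dynamically as obstacles appear and clear.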

We have been testing PathFinder for nearly 15 months, in all sorts of varied and difficult scenarios, clocking nearly 8,000 miles of driving on completely off-road paths, highways, rural roads with no lane markings, residential neighbourhood roads, and urban city-centre layouts. PathFinder and Vision AI work together to tell our autonomous car what's around it and how it should drive through its environment to safely avoid obstacles and make its way to the end destination. PathFinder is able to pick out a safe and viable trajectory each time, no matter the scenario.

Here, we share a few visualisations of PathFinder's outputs from a bird's-eye view looking down. The little square at the bottom right always represents the autonomous vehicle, and the red dots represent the minimal-proximity obstacles, matching up with the perception outputs from Vision AI in the video image. The dynamically changing blue bars represent the real-time detection of drivable free space, matching the detections in the video. The green dotted lines are 'localisation' markers for the vehicle, and the orange group of lines represents all possible paths our autonomous vehicle can traverse, with the differently coloured line being the one chosen as the most viable to follow.

We chose Park Street through Woburn Safari Park, connecting Woburn to the M1, as a test case. The road runs straight through the park but is just wide enough for two vehicles to pass in opposite directions; it has no lane markings, no clear road edges or curbs, small wooden bollards dotted along both sides and, most importantly, roaming deer that can cross the road at any time. Notice how PathFinder automatically and dynamically changes the trajectory outputs when opposing traffic approaches our vehicle. We had no localisation maps or GPS data for the road, yet the output perfectly created a driving corridor within which the vehicle could localise itself. The point-of-view output of the path planning system in the video shows that impressive capability.

We had to push the limits to test our system's performance in the most challenging conditions: driving on a really narrow rural B-road barely wide enough for two cars. It had rained earlier that day, the road edges were just wet mud, and of course there were no markings of any sort and no clear curbs. These roads are tricky even for experienced human drivers. It was a moment of sheer pride and joy for us to see PathFinder navigate a totally unmapped environment.

We have been refining and enhancing the capabilities of PathFinder and Vision AI over the last six months and are now getting ready to demonstrate how they work for fully autonomous driving on public roads. We will be releasing some really awesome stuff over the coming months, so watch this space.

Perception in snowy conditions for autonomous cars

When Storm Emma started forming over southern England in late February 2018, we weren't expecting 22 inches of snow to cover the ground within just a week. As much of an inconvenience as it was for the general population of the UK, we considered it a unique and timely opportunity to test the quality of our world-beating Vision AI system for autonomous car perception.

We wanted to use the most basic sensing capability, an off-the-shelf consumer-grade camera, to test our system in the most challenging driving conditions. As we all know very well, driving on snow-covered roads is a huge challenge even for human drivers, particularly where salting is infeasible, such as on residential neighbourhood roads and rural lanes. We set ourselves the goal of testing our Vision AI system for detecting the ground surface and segmenting the drivable free space in the most challenging set of conditions: snow-covered residential neighbourhood roads and rural lanes. This meant we would never see the full road surface clearly, most of the road and lane markings would be snowed over, there would be slush on the road with lots of tyre tracks, we wouldn't be able to see the road curbs and lane edges, and almost everything on the ground would look white.

This is probably one of the hardest sets of conditions one can throw at a perception engine tasked with detecting where the road surface is and where the autonomous system can drive. Our Vision AI has two key features that put it beyond the state of the art in autonomous perception. First, Vision AI is a generalisable perception system that works out of the box: you turn the system on and it starts to do what it is supposed to do, without the need for any data-driven training. Second, it is technically sophisticated enough to detect and segment the ground surface and drivable free space in conditions where humans have to make inferences and guesses about where the ground might be. For example, when we are unable to see the road clearly due to snow cover, we tend to follow the tracks left by road users who have driven before us, without needing to see the entire road surface. Replicating this performance in an autonomous perception system requires technically very advanced capabilities.

To our delight, not once did Vision AI let us down. We drove over the entire period from Emma's forming to its dissipation (nearly six days), clocking over 250 miles of driving and perception data collection, and Vision AI performed like an expert road-surface detector, clearly segmenting roundabout junctions, lanes partly occluded by parked vehicles, slush, and black driving tracks on an otherwise uniformly white surface.

When you see the video clips of Vision AI at work, you will notice how clean and accurate the performance is. The surface conditions are feature-sparse, meaning there isn't much to detect and make sense of, yet the system provided a very high-fidelity output. We keep an eye on how the field of autonomous perception is advancing and keenly review the video footage released publicly by our peers in the industry. We wouldn't be off the mark in saying that this is a world first in terms of the publicly available evidence of the state and technical sophistication of autonomous perception capabilities.

We have broken new ground in pushing the technical boundaries and have been constantly refining the capabilities of Vision AI throughout this year. We are hoping the UK might give us another opportunity this year to test the advances we have achieved in Vision AI performance over the last 8-10 months.

What really is “Perception” for autonomous vehicles

Perception is the term used to describe the visual cognition process for autonomous cars. Perception software modules are responsible for acquiring raw sensor data from on-vehicle sensors such as cameras, LIDAR, and RADAR, and converting this raw data into scene understanding for the autonomous vehicle.

Raw pixel data fed as input to perception

Scene understanding derived from perception

The human visual cognition system is remarkable. Human drivers are able to instantly tell what is around them: the important elements in a busy traffic scenario, the locations of relevant traffic signs and traffic lights, the likely responses of other road users, alongside a plethora of other pertinent information. The human brain derives all of this insight in a split second, using only the visual information acquired by our eyes. This visual cognition ability generalises across numerous types of traffic scenarios in different cities, and even countries. As human drivers, we can easily apply our knowledge from one place to another.

However, visual cognition is incredibly challenging for machines, and building a generalisable visual cognition system is currently the biggest open challenge across the fields of autonomous driving, machine learning, robotics, and computer vision. So, how does perception work for autonomous cars?

Perception technologies can be broken down into two main categories: computer vision approaches and machine learning approaches. Computer vision techniques seek to address problems formally, using an explicit mathematical formulation to describe the problem, and usually rely on numerical optimisation to find the best solution to that formulation. Machine learning techniques, on the other hand, such as convolutional neural networks (CNNs), take a data-driven approach: ground-truth data is used to 'learn' the best solution to a particular problem by identifying common features in the data associated with the correct response. For example, a CNN trained to identify pedestrians in camera images will extract features that are commonly present in the training data associated with the appearance of pedestrians, such as their shape, size, position, and colour. Both approaches have their merits and disadvantages, and autonomous vehicles rely on a combination of these techniques to build a rich scene understanding of their environment.
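The contrast between the two paradigms can be shown with a deliberately tiny example: deciding whether a pixel belongs to the road surface by brightness. The task, thresholds, and function names are purely illustrative assumptions; real systems operate on whole images, not single values.

```python
# Two paradigms for the same toy task: is a pixel part of the road surface?

# Computer-vision style: the rule is stated explicitly by the engineer.
def is_road_explicit(pixel_brightness):
    return 60 <= pixel_brightness <= 140  # hand-tuned model of asphalt brightness

# Machine-learning style: the rule is fitted to labelled ground-truth examples.
def learn_threshold(samples):
    """samples: list of (brightness, is_road) ground-truth pairs.
    Learns the midpoint between the brightest road pixel and the darkest
    brighter non-road pixel - a one-parameter 'model'."""
    road = [b for b, label in samples if label]
    other = [b for b, label in samples if not label]
    return (max(road) + min(x for x in other if x > max(road))) / 2

data = [(80, True), (100, True), (120, True), (200, False), (220, False)]
print(learn_threshold(data))  # → 160.0, the learned decision boundary
```

The learned rule is only as good as its training data: retrain it on snowy-road samples and the boundary moves, which is exactly the generalisation limitation described above.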

Perception is very challenging for autonomous vehicles because it is incredibly difficult to build a generalisable and robust model to describe complex traffic environments, either explicitly or through data. Autonomous vehicles can encounter strange and previously unseen obstacles, new types of traffic signs, or obstacles of a known type in a strange configuration such as a group of children wearing Halloween costumes.

Challenging obstacles

Similar challenges are present in identifying where it is safe to drive. Deriving a safe driving corridor is fairly straightforward in the presence of well-maintained lane markings on roads that an autonomous vehicle has frequently driven on. But performing the same task on a new road without lane markings, or with a different style of lane markings, is a much tougher challenge. There is huge variety in road geometry and road surface types across the world, from motorways to dirt roads, and for a truly automated future, autonomous vehicles will have to be able to contend with all of these conditions.

Challenging roads

The challenge of perception is further compounded in adverse weather or at night, when raw sensor data becomes degraded and the perception system must parse noisier data to make sense of the environment.

Difficulty of perception in low light and adverse weather

Computer vision-based perception approaches usually offer fair performance and are typically generalisable across a wide set of scenarios and conditions, depending on the robustness of the underlying mathematical formulation. Machine learning-based approaches, on the other hand, are limited by the data used to train the system: whilst good performance is achieved when real-world conditions match the training data, performance degrades significantly when the real world looks different from what the machine learning system has been taught to recognise.

This begs the question: if perception is so challenging, and computer vision and machine learning have limitations in performance and generalisability, how are autonomous cars today able to contend with real-world driving scenarios? The answer is mapping. Autonomous cars take the burden away from on-vehicle perception by using a prior 3D survey of roads with annotations identifying important road features. This 3D map, sometimes referred to as a high-definition (HD) map, contains detailed information about each centimetre of every road an autonomous vehicle will operate on, including the precise position of lane markings, curbs, traffic lights, traffic signs, buildings, and other environmental features.

By utilising an HD map, autonomous vehicles only need to perceive the dynamic elements of a scene, such as pedestrians, other vehicles, and cyclists, for which CNNs are well suited and provide good performance under most scenarios. Computer vision can then be relied upon as a redundant perception technology in case a CNN failure occurs because a strange obstacle is present or an unknown scenario develops.

However, a simple question then comes to the fore: what happens if autonomous vehicles don't have access to HD maps, or the HD maps are outdated? How can an autonomous vehicle drive in these scenarios, when it has to rely only on its on-board perception?

At Propelmee, our technologies answer these questions…


What really is “Mapping” for autonomous vehicles

“Mapping” is a term commonly associated with autonomous vehicles. We generally think of maps as bird's-eye-view representations of roads and geographies which highlight important features like the locations of buildings, roadside infrastructure, and places of interest. People use maps for navigation, to answer the question “how do I get there?”. Autonomous vehicles, however, use maps in very different ways.

Autonomous vehicles are unable to navigate with simple high-level goals such as “take the next left” or “turn right at the end of the road”. While these instructions are very simple for human drivers to follow, translating them into autonomous driving actions is a complicated task – that's where maps come in. Autonomous cars utilise high-definition (HD), three-dimensional maps of the environment to know the centimetre-precise road layout beforehand. These maps can carry annotations for the locations of lane markings, traffic lights, traffic signs, and other important road features, as well as the exact path an autonomous vehicle should travel along to “take the next left”, for example. These paths are usually annotated by expert operators and define the preferred behaviour of an autonomous car.
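To make the idea tangible, here is a sketch of the kind of annotated content an HD map tile might carry, and how a high-level instruction resolves to a pre-annotated path. The field names and structure are assumptions for illustration only, not any real map format.

```python
# Illustrative HD map tile: static scene annotations plus expert-annotated
# manoeuvre paths. Coordinates are metres in a local tile frame.

hd_map_tile = {
    "tile_id": "demo_tile_001",
    "lane_markings": [
        {"type": "dashed", "points": [(0.0, 1.75), (50.0, 1.75)]},
        {"type": "solid",  "points": [(0.0, -1.75), (50.0, -1.75)]},
    ],
    "traffic_lights": [{"position": (48.0, 2.5, 5.1)}],
    "preferred_paths": {
        # expert-annotated centreline for the manoeuvre "take the next left"
        "next_left": [(0.0, 0.0), (30.0, 0.0), (40.0, 3.0), (45.0, 10.0)],
    },
}

def manoeuvre_path(tile, instruction):
    """Translate a high-level instruction into a pre-annotated drive path."""
    return tile["preferred_paths"].get(instruction)

path = manoeuvre_path(hd_map_tile, "next_left")  # list of waypoints to follow
```

With this structure, “take the next left” stops being an interpretation problem and becomes a lookup, which is precisely the burden the map lifts from on-vehicle perception.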

In essence, an HD map tells a car exactly what the static scene looks like, where important road features are, as well as typical driving manoeuvres to negotiate a specific part of the road network. Localisation is an important step which autonomous cars use to figure out their exact position in an HD map by matching their live sensor data with the stored map data. It’s like remembering a place you’ve visited beforehand. This allows the autonomous car to know where it is in the map and utilise all of the prior information stored within the map.

To create HD maps, autonomous cars need to pre-drive routes to collect the map data and create a type of “digital scene memory”. This raw survey data is enhanced with human input to highlight important road features for use in autonomous driving. This is a time-consuming process, as these annotations are performed on each LIDAR (laser scanner) and camera frame, and multiple sensors produce many frames during each second of driving.

The other challenge, besides the annotation effort, is keeping HD maps fresh and updated. Imagine driving back to a place you've visited before, except the road structure and environment have changed. Your memory of that place no longer matches what it now looks like, but as a human driver you can continue to drive in an exploratory mode. For autonomous cars, that's not really possible: the autonomous car can no longer match its “digital scene memory” to its live perception of the environment and is unable to locate itself in the HD map. Even if the autonomous car is able to localise itself, the HD map will no longer be representative of the scene and can't be relied upon to guide autonomous driving. That's why it is critically important for autonomous cars to do frequent mapping runs to ensure that their “digital scene memory” of places is fresh and accounts for all changes such as roadworks, diversions, or infrastructure upgrades.
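The freshness problem can be sketched as a matching score: what fraction of the stored “digital scene memory” is re-observed in live sensor data? A low score signals a stale map. The threshold, feature representation, and names below are illustrative assumptions.

```python
# Toy map-freshness check: score how well live observations agree with the
# stored scene memory; a low score means the map needs a re-survey.

def map_match_score(stored_features, live_features, tol=0.5):
    """Fraction of stored features re-observed (within tolerance) in live data."""
    if not stored_features:
        return 0.0
    hits = sum(
        1 for sx, sy in stored_features
        if any((sx-lx)**2 + (sy-ly)**2 <= tol**2 for lx, ly in live_features)
    )
    return hits / len(stored_features)

stored = [(0, 0), (5, 0), (10, 0), (15, 0)]
fresh  = [(0.1, 0.0), (5.0, 0.1), (10.1, 0.1), (15.0, 0.0)]
stale  = [(0.1, 0.0), (7.5, 3.0)]  # roadworks changed the scene

print(map_match_score(stored, fresh))  # high score: map still matches
print(map_match_score(stored, stale))  # low score: time for a mapping run
```

Real systems fold a check like this into localisation itself: if too few map features match the live scan, the pose estimate is declared unreliable and the map flagged for an update.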

So, autonomous cars need HD maps of all the roads they are to drive on, and require that these maps are kept updated. It's a massive challenge given the size of global road networks and the required frequency of map updates. Some companies are taking on this challenge of mapping out the world.

We are taking a different approach…