ACM SIGSPATIAL, the International Conference on Advances in Geographic Information Systems, has come and gone. I wasn't able to make the trip to attend, but Xin Chen generously took some time to walk me through a few of the insights he shared with the crowd in his keynote: "HD Live Maps for Automated Driving: An AI Approach".
Xin, what's your role with HERE?
I am an engineering director in the Highly Automated Driving division. I lead the "HD Perception" team focused on automating HD live map product creation by developing state-of-the-art machine learning and 3D technologies.
How long have you been working in the field on Automated Driving?
I have worked at HERE since completing my doctorate at Notre Dame 12 years ago. I have worked in the Highly Automated Driving (HAD) division since it was formed a few years ago. Before HAD, I worked in the Research organization, then spent a couple of years each in the Platform and Core Map organizations.
To prepare you for this field, what did you study while at Notre Dame?
I did my Master's and PhD in Computer Science and Engineering.
They had a pretty good football season this year didn't they?
Yeah, ND was in the playoffs with a perfect regular season! I went to the ND-Stanford and ND-Northwestern games this year with my wife Maria, who is also an alum. We have five ND degrees between us!
HD Live Maps
Looking at your Keynote title, for somebody that's not familiar with the term, what do you mean by HD Live Maps? How is that different from normal navigation maps that we use everyday to get from A to B?
Traditional navigation maps contain road topology, road centerline geometry, and road-level attributes. These traditional maps are referred to as a Road Model -- one of the three layers in HD maps.
Another layer in HD maps is the HD Lane Model, which contains lane topology data and lane-level geometry and attributes at centimeter-level precision. The third layer, called the HD Localization Model, includes various features that support localization strategies to pinpoint a self-driving car to the exact lane and longitudinal location on an HD map. Essentially, we add two map layers to the Road Model layer available in a traditional map -- the HD Lane Model and the HD Localization Model -- resulting in much richer and more precise content.
The HERE HD Live Map is a cloud-based service composed of various tiled mapping layers that are highly accurate and continuously updated to support connected ADAS and highly or fully automated driving solutions.
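To make the three-layer structure concrete, here is a toy sketch of how the layers might be modeled in code. All class and field names are my own illustrative assumptions, not HERE's actual schema:

```python
from dataclasses import dataclass

# Toy model of the three HD map layers described above.
# All names are illustrative -- not HERE's actual schema.

@dataclass
class RoadModel:
    """Traditional navigation layer: topology, centerlines, road attributes."""
    road_topology: dict       # road-segment connectivity graph
    centerlines: list         # road centerline polylines
    road_attributes: dict     # e.g. speed limit, road class

@dataclass
class HDLaneModel:
    """Lane topology plus lane-level geometry and attributes (cm precision)."""
    lane_topology: dict       # lane-to-lane connectivity
    lane_geometry: list       # lane boundary polylines
    lane_attributes: dict     # e.g. lane type, travel direction

@dataclass
class HDLocalizationModel:
    """Features (signs, poles, barriers) a car matches against to localize."""
    features: list            # landmark positions and types

@dataclass
class HDMapTile:
    """One tile of the cloud-based HD Live Map: all three layers together."""
    road_model: RoadModel
    lane_model: HDLaneModel
    localization_model: HDLocalizationModel
```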
How does having an HD Map help a self-driving car? Wouldn't computer vision be enough? A human driver travels to new places without an HD map.
HD maps save human lives. Sensors and Artificial Intelligence are never perfect, and their mistakes cost lives. HD maps mitigate risk by reducing the margin for error. They assist in planning beyond the sensor range, and they enhance the ability of sensors and AI to understand the environment, especially in uncommon and harsh conditions. Most automated driving solutions treat the HD map as the hub connecting the key components of sensing, perception, and planning. To use rich HD map information, the car also needs to precisely pinpoint itself on the map -- it needs localization. A GPS-based solution can have errors of a few meters, which may be well beyond a lane width. Matching real-time car perception against the HD map greatly enhances localization accuracy.
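As a rough illustration of that map-matching idea (a toy sketch, not HERE's localization algorithm): assume the car has already matched a few perceived landmarks to landmarks in the HD map. With known correspondences, the best translation correction in a least-squares sense is simply the mean per-landmark offset:

```python
# Minimal sketch (not HERE's algorithm): refine a coarse GPS pose by
# aligning landmarks the car perceives against landmarks in the HD map.
# With known correspondences, the least-squares translation correction
# is the mean of the per-landmark offsets.

def refine_position(gps_estimate, perceived, map_landmarks):
    """gps_estimate: (x, y) with meter-level error.
    perceived: landmark positions as seen from gps_estimate (map frame).
    map_landmarks: true positions of the same landmarks in the HD map.
    Returns the corrected (x, y)."""
    n = len(perceived)
    dx = sum(m[0] - p[0] for p, m in zip(perceived, map_landmarks)) / n
    dy = sum(m[1] - p[1] for p, m in zip(perceived, map_landmarks)) / n
    return (gps_estimate[0] + dx, gps_estimate[1] + dy)

# GPS says we are at (100.0, 50.0) but is off by ~2 m laterally.
gps = (100.0, 50.0)
# Two signs as perceived relative to the (wrong) GPS pose:
perceived = [(110.0, 52.0), (120.0, 48.0)]
# The HD map stores those signs 2 m further along y than where we "saw" them:
map_lm = [(110.0, 54.0), (120.0, 50.0)]
print(refine_position(gps, perceived, map_lm))  # -> (100.0, 52.0)
```

A production system would estimate a full pose (including heading) and fuse the result in a filter; this only shows why map matching can beat raw GPS.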
How do we make HD maps? What are some of the challenges?
Making centimeter-level, high-precision, high-accuracy maps on a global scale and keeping them constantly up to date is technically challenging and expensive. HD maps have never been successfully deployed before on a large scale, so we are pioneers in the space, working closely with our customers and partners to iterate on integrating HD maps into automated driving systems.
The cost to collect data, manage data, develop software (algorithms, tooling, pipeline, process), and operate production is prohibitive for many. We need a scalable production solution so that we don't break the bank. High accuracy, global coverage, near real-time freshness, production scalability, and map interoperability are common challenges.
Wait, how can a map be interoperable?
Interoperability means one HD map rules all: either the HD map content or the production platform can be adopted by heterogeneous automated driving solutions and by different customers and partners. It is not realistic to custom-build an HD map for each automated driving solution or each customer/partner -- there are too many of them, and most are still in the R&D stage, with software and hardware evolving rapidly.
What are some of the Artificial Intelligence strategies you are using?
Specifically computer vision, 3D data analysis, and machine learning. These approaches work together to significantly automate the HD map creation process.
The technology we use to build HD maps and the automated driving techniques in sensing and perception are two sides of the same coin. The former can afford to do post-processing with tight quality control to build out a base HD map, while the latter requires real-time performance. Mapping sensors are typically more advanced and sophisticated, at a higher cost, than those on self-driving cars.
What sensors do you have access to for mapping?
We have DGPS, an IMU, multiple industrial high-resolution cameras, and a multi-beam LIDAR scanner. The imagery and LIDAR point clouds are geo-referenced by DGPS and IMU, and we extract HD map attributes from them. We also leverage other sources of data such as satellite and aerial imagery.
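As a simplified 2D illustration of that geo-referencing step (real systems work in 3D with the full IMU orientation; this is only a sketch): a point measured in the sensor frame is rotated by the vehicle's heading and translated by its DGPS position to land in world coordinates.

```python
import math

# Illustrative 2D geo-referencing of a LIDAR return: DGPS gives the
# vehicle position, the IMU gives its heading; a point in the sensor
# frame is rotated by the heading, then translated by the position.

def georeference(point_sensor, vehicle_pos, heading_rad):
    x, y = point_sensor
    c, s = math.cos(heading_rad), math.sin(heading_rad)
    world_x = vehicle_pos[0] + c * x - s * y
    world_y = vehicle_pos[1] + s * x + c * y
    return (world_x, world_y)

# A point 5 m ahead of a car at (1000, 2000) heading 90 degrees:
print(georeference((5.0, 0.0), (1000.0, 2000.0), math.pi / 2))
# -> (1000.0, 2005.0)
```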
LIDAR can be very expensive compared to something simple like cameras. Why use LIDAR?
LIDAR is still too expensive to deploy on a large scale but the trend is that it will become cheaper and smaller like cameras.
I think that LIDAR is a must to make HD maps that guarantee centimeter-level precision. Stereo-based computer vision cannot reconstruct 3D to that precision consistently, and the errors grow quadratically with the distance from the object to the cameras. For example, if you have 1 cm of error at a 10-meter range from a camera, the error will be 1 meter at a 100-meter distance. A typical LIDAR scanner, by contrast, can maintain 1-2 centimeter accuracy within a couple of hundred meters, which covers most roadside objects.
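Xin's 1 cm / 1 m figures follow directly from the quadratic error model: stereo depth is Z = f·b/d (focal length f, baseline b, disparity d), so a fixed disparity error produces a depth error proportional to Z². The sketch below simply scales his reference figure with the square of the distance:

```python
# Back-of-the-envelope check of the quadratic stereo error growth:
# a fixed disparity error gives a depth error that scales with Z^2,
# so we scale a reference error (1 cm at 10 m) quadratically.

def stereo_depth_error(distance_m, ref_distance_m=10.0, ref_error_m=0.01):
    """Scale the reference error quadratically with distance."""
    return ref_error_m * (distance_m / ref_distance_m) ** 2

print(stereo_depth_error(10.0))   # -> 0.01 (1 cm at 10 m)
print(stereo_depth_error(100.0))  # -> 1.0  (10x the distance, 100x the error)
```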
A camera is also necessary and important, as it provides color information that LIDAR doesn't have, along with wider view angles and longer ranges. Most machine learning algorithms, including deep neural nets, are developed for 2D images. ADAS and highly automated vehicles have cameras rather than LIDAR, and their camera-based real-time feature detection is a great source for updating HD maps due to its large quantity and high frequency.
Getting back to software, you just mentioned deep neural nets, how do you use deep learning?
It is a subcategory of the machine learning approach that I mentioned. We use it to automatically extract features such as lane markings, signs, barriers and poles from imagery and LIDAR point cloud data.
How does deep learning compare to computer vision?
Deep learning can be used to solve many computer vision problems, and it is scalable over many different features. It often can achieve higher accuracy than traditional computer vision algorithms as long as there is sufficient training data and computing power. When training data is not sufficient for deep learning, traditional computer vision or machine learning can still help.
Do you build the AI algorithms in-house or do you leverage other open-source or commercial services?
My team develops computer vision, deep neural nets, machine learning and 3D data analysis in house with the state-of-the-art techniques.
Our "secret sauce" is our proprietary data and, really, our know-how in mining that data, developed over my time at HERE. We have unique datasets that no one else has. My team applies and develops state-of-the-art technologies customized to our datasets and optimized for HD mapping. With large-scale training data it might be straightforward to reach 70 or 80 percent accuracy with an off-the-shelf deep neural net framework, but we strive to bring it up another 10 to 20 percent, which requires deep expertise and experience.
I believe that AI technology is also strategically critical for a mapping company. I have over 50 patent applications and I know a few colleagues at HERE who have even more patents than I do. Many other engineering organizations within HERE also use AI to create navigation maps, traffic intelligence, indoor mapping, mobility services and AR.
My team works on building a machine learning services platform in house with a goal to democratize AI at HERE so that all engineers in the company, regardless of their machine learning background, can use this platform to train, evaluate, deploy and share their machine learning models.
Does everything for that platform run in the cloud or do you also need edge processing?
I have a team in Boulder that works on edge perception and localization. Their mission is to create reference implementations showing how to use our HD maps for localization. It covers all key components of self-driving except planning and control. It is built into consumer devices such as dashcams and smartphones for real-time perception of the road environment and localization of the car on HD maps.
How does one get into the field -- who are the people on your team capable of doing this type of work?
There are mainly three types of people on my team: research engineers, who develop algorithms and push the envelope of the current state of the art; software engineers, who implement the algorithms in robust and efficient C++ and Python software and maintain a common code base and library; and production engineers, who develop infrastructure and applications to run the AI software at large scale in the cloud or on the edge.
We have many PhD-level research engineers from schools such as CMU, MIT, and Purdue, and several hard-core C++ developers who came from the fintech industry, from places like Citadel.
How do you keep your skill set current with state-of-the-art AI technologies, since the field is fast-evolving and dynamic? I'm a CMU grad myself but, dating myself a bit, at the time we were still trying to figure out OCR.
Teaching forces me to really keep my skills current. I have developed and taught two AI courses at the Illinois Institute of Technology and Northwestern University every year since 2010. The syllabus evolves a great deal with each course - new algorithms, techniques, industry trends and software emerge every year.
What sort of topics do you cover in your courses or what is unique about them compared to some of the popular Coursera or Udacity ones you can find online?
I developed a course called “Geospatial Vision and Visualization” where students learn computer vision, big data analysis and machine learning in the geospatial intelligence context. Another course is “Biometrics,” which was my PhD thesis topic. The students learn AI techniques through face, iris and fingerprint recognition. Biometrics is one of the most successful productizations of machine learning technology, while geospatial intelligence touches every aspect of human life, from navigation mapping, traffic, and energy efficiency to homeland security and self-driving – and each demands machine learning for scalability and quality. Both improve human lives and at the same time raise ethical and privacy concerns that have significant social impact.
My teaching philosophy is to let students learn machine learning in a real-world context: using real-world data to consider real-world examples and solve real-world problems in class. I want to equip students with the knowledge, experience, and skill set highly desired by their future employers in dynamic industry settings.
I took a look at the syllabus and it looks interesting, covering lots of topics like probe data, image enhancement, etc. Do you find that teaching helps you in other ways beyond keeping your knowledge fresh?
Absolutely. Many of my former students are my colleagues now at HERE across different departments and sites from Chicago to Mumbai. Teaching fosters a good relationship with local universities and several faculty and PhD students have worked on HERE-sponsored research projects in the past as a result.
Illinois Institute of Technology is ranked #5 – only behind Carnegie Mellon University, Stanford University, and University of California–Berkeley – by The San Francisco Business Times among universities that produce the most professionals focused on the driverless car industry. I hope that my course has contributed to this growth! Giving back is also a core value at HERE, and I am proud to have taught over 1000 students in the AI domain in the past 8 years.
How can the developer community benefit from your work or get involved in the space?
There are many ways that we collaborate with the developer community interested in HD maps. For example, we provide HD map samples, and we also work with a popular R&D programming platform to integrate an HD map interface into their libraries. There are also industry collaborations such as the OneMap Alliance.
I have also advanced our work through formal collaboration with universities such as Northwestern, Columbia, Notre Dame and Carnegie Mellon. I have organized a few challenges and competitions in the research community to analyze mapping data.
I am always looking for top talent to join our world-class team.
Thanks Xin, we really appreciate you sharing your insights and the thought leadership you provide in the field.