Towards more accurate 3D object detection for robots and self-driving cars
Researchers have developed a network that combines 3D LiDAR and 2D image data to enable a more robust detection of small objects.
Robotics and autonomous vehicles are among the most rapidly growing domains in the technological landscape, with the potential to make work and transportation safer and more efficient. Since both robots and self-driving cars need to accurately perceive their surroundings, 3D object detection methods are an active area of study. Most 3D object detection methods employ LiDAR sensors to create 3D point clouds of their environment. Simply put, LiDAR sensors use laser beams to rapidly scan and measure the distances of objects and surfaces around the source. However, using LiDAR data alone can lead to errors due to the high sensitivity of LiDAR to noise, especially in adverse weather conditions such as rainfall.
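To make this concrete: each LiDAR return is essentially a measured range along a known beam direction, and a point cloud is just those measurements converted to Cartesian coordinates. The short Python sketch below illustrates that conversion; the function name, angles, and values are illustrative, not taken from any particular sensor or from the paper.

```python
# A minimal sketch of how LiDAR returns become a 3D point cloud: each
# measurement is a range plus the beam's azimuth and elevation angles,
# converted to Cartesian x, y, z coordinates.
import numpy as np

def ranges_to_point_cloud(ranges, azimuth, elevation):
    """Convert per-beam ranges (meters) and beam angles (radians)
    into an (N, 3) array of x, y, z points."""
    x = ranges * np.cos(elevation) * np.cos(azimuth)
    y = ranges * np.cos(elevation) * np.sin(azimuth)
    z = ranges * np.sin(elevation)
    return np.stack([x, y, z], axis=-1)

# Example: three returns from a single scan line.
pts = ranges_to_point_cloud(
    ranges=np.array([12.4, 12.6, 30.1]),
    azimuth=np.radians([-0.2, 0.0, 0.2]),
    elevation=np.radians([1.0, 1.0, 1.0]),
)
print(pts.shape)  # (3, 3)
```

Noise in the measured ranges (for instance, spurious returns from raindrops) corrupts every downstream coordinate, which is why LiDAR-only detection degrades in bad weather.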
To tackle this issue, scientists have developed multi-modal 3D object detection methods that combine 3D LiDAR data with 2D RGB images taken by standard cameras. While the fusion of 2D images and 3D LiDAR data leads to more accurate 3D detection results, it still faces its own set of challenges, with accurate detection of small objects remaining difficult. The problem mainly lies in properly aligning the semantic information extracted independently from the 2D and 3D datasets, which is hard due to issues such as imprecise calibration or occlusion.
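The alignment problem the article describes shows up in the projection step that fusion methods rely on: 3D LiDAR points are mapped into the 2D image using calibration matrices, so any calibration error misplaces the corresponding pixels. Below is a hedged sketch in the style of the KITTI calibration pipeline; the matrix names follow common convention and the values are placeholders, not real calibration data.

```python
# Projecting LiDAR points into a camera image with KITTI-style matrices:
# Tr is a 4x4 LiDAR-to-camera transform, P a 3x4 camera projection.
import numpy as np

def project_lidar_to_image(points, Tr, P):
    """points: (N, 3) LiDAR xyz -> (N, 2) pixel coordinates (u, v)."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = Tr @ homo.T                                          # (4, N) camera frame
    uvw = P @ cam                                              # (3, N) image plane
    return (uvw[:2] / uvw[2]).T                                # perspective divide

Tr = np.eye(4)                                  # placeholder extrinsics
P = np.array([[700.0, 0.0, 620.0, 0.0],
              [0.0, 700.0, 190.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])            # placeholder intrinsics
print(project_lidar_to_image(np.array([[1.0, 0.0, 10.0]]), Tr, P))
```

Even a few pixels of error in Tr or P can move a small object's projected points onto a neighboring object entirely, which is one reason small objects are the hardest case for fusion.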
Against this backdrop, a research team led by Professor Hiroyuki Tomiyama from Ritsumeikan University, Japan, has developed an innovative approach to make multi-modal 3D object detection more accurate and robust. The proposed scheme, called “Dynamic Point-Pixel Feature Alignment Network” (DPPFA-Net), is described in their paper published in IEEE Internet of Things Journal on 3 November 2023.
The model comprises an arrangement of multiple instances of three novel modules: the Memory-based Point-Pixel Fusion (MPPF) module, the Deformable Point-Pixel Fusion (DPPF) module, and the Semantic Alignment Evaluator (SAE) module. The MPPF module is tasked with performing explicit interactions between intra-modal features (2D with 2D and 3D with 3D) and cross-modal features (2D with 3D). The use of the 2D image as a memory bank reduces the difficulty in network learning and makes the system more robust against noise in 3D point clouds. Moreover, it promotes the use of more comprehensive and discriminative features.
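The article does not include implementation details, but the memory-bank idea can be sketched as a cross-attention step in which point features query the 2D image feature map. The PyTorch snippet below is an illustrative reconstruction from the description above, not the authors' code; all module names and dimensions are assumptions.

```python
# Point features attend over flattened 2D image features used as a
# fixed "memory bank", then fuse the result back via a residual.
import torch
import torch.nn as nn

class PointPixelCrossAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, point_feats, pixel_feats):
        # point_feats: (B, N_points, dim) 3D branch features (queries)
        # pixel_feats: (B, N_pixels, dim) flattened 2D feature map (memory)
        fused, _ = self.attn(point_feats, pixel_feats, pixel_feats)
        # The residual keeps the 3D features usable even when the image
        # memory is uninformative (e.g., occluded regions).
        return point_feats + fused

fuse = PointPixelCrossAttention()
pts = torch.randn(2, 1024, 64)      # e.g., 1024 sampled points
pix = torch.randn(2, 60 * 80, 64)   # e.g., a 60x80 feature map
print(fuse(pts, pix).shape)         # torch.Size([2, 1024, 64])
```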
In contrast, the DPPF module performs interactions only at pixels in key positions, which are determined via a smart sampling strategy. This allows for feature fusion in high resolutions at a low computational complexity. Finally, the SAE module helps ensure semantic alignment between both data representations during the fusion process, which mitigates the issue of feature ambiguity.
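Again as an illustrative reconstruction rather than the paper's exact design, the DPPF idea of fusing only at key positions can be sketched with learned 2D offsets and grid sampling; the offset head, shapes, and scaling are assumptions.

```python
# Each point predicts a few 2D offsets around its projected location and
# gathers image features only there, instead of attending over all pixels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableSampler(nn.Module):
    def __init__(self, dim=64, n_samples=4):
        super().__init__()
        self.n_samples = n_samples
        self.offset_head = nn.Linear(dim, n_samples * 2)  # per-point 2D offsets

    def forward(self, point_feats, proj_uv, feat_map):
        # point_feats: (B, N, dim); proj_uv: (B, N, 2) in [-1, 1] grid coords
        # feat_map: (B, dim, H, W) 2D image features
        B, N, _ = proj_uv.shape
        offsets = 0.1 * torch.tanh(self.offset_head(point_feats))  # keep offsets small
        locs = proj_uv.unsqueeze(2) + offsets.view(B, N, self.n_samples, 2)
        sampled = F.grid_sample(feat_map, locs, align_corners=False)
        # sampled: (B, dim, N, S) -> average the S key positions per point
        return sampled.mean(dim=-1).permute(0, 2, 1)               # (B, N, dim)

ds = DeformableSampler()
out = ds(torch.randn(2, 1024, 64),
         torch.rand(2, 1024, 2) * 2 - 1,    # projected points in [-1, 1]
         torch.randn(2, 64, 60, 80))
print(out.shape)                             # torch.Size([2, 1024, 64])
```

Because features are gathered at only a handful of positions per point, the fusion can run on high-resolution feature maps without attending over every pixel, which is the low-complexity property the article describes.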
The researchers tested DPPFA-Net by comparing it to the top performers on the widely used KITTI Vision Benchmark. Notably, the proposed network achieved average precision improvements as high as 7.18% under different noise conditions. To further test the capabilities of their model, the team created a new noisy dataset by introducing artificial multi-modal noise in the form of rainfall to the KITTI dataset. The results show that the proposed network performed better than existing models not only in the face of severe occlusions but also under various levels of adverse weather conditions. “Our extensive experiments on the KITTI dataset and challenging multi-modal noisy cases reveal that DPPFA-Net reaches a new state-of-the-art,” remarks Prof. Tomiyama.
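The article does not specify the authors' rain-noise model, but the kind of LiDAR-side corruption such a test involves can be approximated with random point dropout and range jitter, as in this generic stand-in (rates and function names are illustrative, not the paper's):

```python
# Rain-like LiDAR corruption for robustness testing: randomly drop
# points (lost returns) and jitter the survivors (scattering noise).
import numpy as np

def add_rain_like_noise(points, drop_rate=0.1, jitter_std=0.02, rng=None):
    """points: (N, 3) -> corrupted copy with fewer, noisier points."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(len(points)) > drop_rate               # simulate lost returns
    return points[keep] + rng.normal(0.0, jitter_std, (keep.sum(), 3))

cloud = np.random.default_rng(1).uniform(-20, 20, (1000, 3))
print(add_rain_like_noise(cloud).shape)  # roughly (900, 3)
```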
There are various ways in which accurate 3D object detection methods could improve our lives. Self-driving cars, which rely on such techniques, have the potential to reduce accidents and improve traffic flow and safety. Furthermore, the implications for the field of robotics should not be understated. “Our study could facilitate a better understanding and adaptation of robots to their working environments, allowing a more precise perception of small targets,” explains Prof. Tomiyama. “Such advancements will help improve the capabilities of robots in various applications.” Another use for 3D object detection networks is the pre-labeling of raw data for deep-learning perception systems, which would greatly reduce the cost of manual annotation and accelerate developments in the field.
Overall, this study is a step in the right direction towards making autonomous systems more perceptive and better able to assist humans in their activities.
***
Reference
DOI: https://doi.org/10.1109/JIOT.2023.3329884
About Ritsumeikan University, Japan
Ritsumeikan University is one of the most prestigious private universities in Japan. Its main campus is in Kyoto, where inspiring settings await researchers. With an unwavering objective to generate social symbiotic values and emergent talents, it aims to emerge as a next-generation research university. It will enhance researcher potential by providing support best suited to the needs of young and leading researchers, according to their career stage. Ritsumeikan University also endeavors to build a global research network as a “knowledge node” and disseminate achievements internationally, thereby contributing to the resolution of social/humanistic issues through interdisciplinary research and social implementation.
Website: http://en.ritsumei.ac.jp/
Ritsumeikan University Research Report: https://www.ritsumei.ac.jp/research/radiant/eng/
About Professor Hiroyuki Tomiyama from Ritsumeikan University, Japan
Professor Hiroyuki Tomiyama received B.E., M.E., and D.E. degrees in computer science from Kyushu University in 1994, 1996, and 1999, respectively. He joined the College of Science and Engineering at Ritsumeikan University in 2010, where he works as a Full Professor. He specializes in embedded and cyber-physical systems, autonomous drones, biochip synthesis, and the automation and optimization of electronic designs. He has published over 110 papers on these subjects as well as several books.
Funding information
This work is partly supported by JSPS KAKENHI Grant Number 20K23333 and partly commissioned by NEDO (Project Number JPNP22006).
JOURNAL
IEEE Internet of Things Journal
METHOD OF RESEARCH
Computational simulation/modeling
SUBJECT OF RESEARCH
Not applicable
ARTICLE TITLE
Dynamic Point-Pixel Feature Alignment for Multi-modal 3D Object Detection
***
Intuitive and self-learning robots
The Department of Industrial Engineering coordinates an 8-million-euro research project in the field of collaborative robotics. The European funding will help build devices that are capable of changing their behaviour without human intervention.
“We started from a very specific machine learning problem: generating safe and sensible behaviour when devices find themselves outside the world of their training data, that is, the data their artificial intelligence algorithms learned from,” explains Matteo Saveriano. In particular, the project addresses the problem of a robot modifying a previously learned task in complete autonomy, without the intervention of an operator.
The project will have application settings in two sectors: the automotive industry and heavy industry. In the first case, the system will be tested on assembling, disassembling and recycling the batteries of electric vehicles. The robotic device will be instructed to install a battery; the ultimate goal of the project is for the robot to use this knowledge to explore the needs of the setting in which it operates and take the battery apart.
In the heavy mechanical industry, the system will instead be used to facilitate intelligent interaction between an operator, the robotic device and an automated overhead crane. In this case, the goal is to automate the cranes used for lifting and moving large, heavy metal loads from one point to another at a company facility. Currently, an operator performs this work, driving the overhead crane from an uncomfortable and dangerous position where accidents are likely to occur. The robotic device will therefore replace the human in risky or unpleasant tasks, while operators continue to supervise the work along the entire production process. “Our idea is to relieve operators of repetitive and heavy tasks so that they can use the intellectual abilities of human intelligence, such as imagination and problem-solving, that machines lack. We would like to let robots handle the materials and put people in a safer position. In this way, humans would supervise the work of machines and make sure it is as precise as possible,” adds Saveriano.
The project is in line with the policies of the European Union on the circular and green economy. One of the great problems of the automotive industry is the recycling of battery components. Today, batteries are manufactured by different companies with different assembly techniques, and the sorting, disposal and reuse of these materials is a very complex task because no two batteries are the same. Inverse aims to automate this process by developing highly innovative solutions and flexible learning techniques that adapt quickly to the components of each battery.
In addition, the robotic devices will be able to measure a product's energy efficiency, the greenhouse gases emitted in producing it, the materials it uses and its recycling rate, making a positive contribution to the disposal of industrial waste.
But the project also has other sustainability goals. The researchers want a device trained for assembly to be usable, with small changes, for disassembly, in order to scientifically demonstrate that this is more convenient than training a device from scratch.
The Inverse project
Inverse is the result of the collaboration among the Idra interdepartmental laboratories of robotics of the University of Trento, which also include the Department of Information Engineering and Computer Science.
Of the ten partners of the project, six are universities and research institutes: the Create consortium of Federico II University of Naples, the University of Vienna, the Technical Research Centre of Finland, the German Aerospace Center (DLR), and the Universities of Mondragon (Spain) and Bogazici (Turkey). The other four partners are from the industrial sector: Centro Ricerche Fiat, KroneCrane AG, Steinbeis Europa Zentrum and the MTU Civitta Foundation.
This is a Research and Innovation Action (RIA) project that has obtained eight million euros in funding from the European Union. The project will start in January. The researchers' goal is to create, by 2027, the first prototype of a robot tested in a laboratory environment that simulates an industrial setting.