Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting. However, for the actual driving task, the global context of the 3D scene is key, e.g. a change in traffic light state can affect the behavior of a vehicle geometrically distant from that traffic light. Geometry alone may therefore be insufficient for effectively fusing representations in end-to-end driving models. In this work, we demonstrate that existing sensor fusion approaches under-perform in the presence of a high density of dynamic agents and complex scenarios, which require global contextual reasoning, such as handling traffic oncoming from multiple directions at uncontrolled intersections. Therefore, we propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention. We experimentally validate the efficacy of our approach in urban settings involving complex scenarios using the CARLA urban driving simulator. Our approach achieves state-of-the-art driving performance while reducing collisions by 80% compared to geometry-based fusion.
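The core idea is to fuse intermediate image and LiDAR feature maps with transformer attention rather than geometric projection alone. As a rough illustration of that mechanism, here is a minimal PyTorch sketch: both feature maps are flattened into tokens, concatenated, and passed through a transformer encoder so that every token can attend across modalities. The `AttentionFusion` module, its dimensions, and its layer counts are illustrative assumptions, not the authors' exact architecture; see the linked paper for the real design.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Minimal sketch of attention-based image/LiDAR feature fusion.

    Hypothetical module: channel width, head count, and depth are
    illustrative choices, not the configuration used in the paper.
    """

    def __init__(self, dim: int = 256, heads: int = 4, depth: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Learned embeddings telling the encoder which modality a token came from.
        self.modality_embed = nn.Parameter(torch.zeros(2, dim))

    def forward(self, img_feat: torch.Tensor, lidar_feat: torch.Tensor):
        # img_feat, lidar_feat: (B, C, H, W) intermediate CNN feature maps.
        img_tokens = img_feat.flatten(2).transpose(1, 2)      # (B, Hi*Wi, C)
        lidar_tokens = lidar_feat.flatten(2).transpose(1, 2)  # (B, Hl*Wl, C)
        tokens = torch.cat([img_tokens + self.modality_embed[0],
                            lidar_tokens + self.modality_embed[1]], dim=1)
        # Self-attention over the concatenated sequence lets every image
        # token attend to every LiDAR token, and vice versa.
        fused = self.encoder(tokens)
        # Split the sequence back and restore the spatial layout so the
        # fused features can feed the next stage of each branch.
        n = img_tokens.shape[1]
        img_out = fused[:, :n].transpose(1, 2).reshape_as(img_feat)
        lidar_out = fused[:, n:].transpose(1, 2).reshape_as(lidar_feat)
        return img_out, lidar_out

# Example usage with toy feature maps.
fusion = AttentionFusion(dim=256)
img = torch.randn(1, 256, 8, 8)    # image branch features
lidar = torch.randn(1, 256, 8, 8)  # LiDAR bird's-eye-view branch features
img_f, lidar_f = fusion(img, lidar)
print(img_f.shape, lidar_f.shape)  # torch.Size([1, 256, 8, 8]) twice
```

Because each image token can attend to each LiDAR token and vice versa, a cue visible only to the camera, such as a traffic light, can influence features for geometrically distant regions of the LiDAR view. That is exactly the global context the abstract argues geometry-based fusion misses.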

http://www.cvlibs.net/publications/Prakash2021CVPR.pdf
https://ap229997.github.io/projects/transfuser/
