Robust depth perception in degraded visual environments is crucial for autonomous aerial systems. Thermal imaging cameras, which capture infrared radiation, are robust to visual degradation. However, due to the lack of a large-scale dataset, the use of thermal cameras for unmanned aerial system (UAS) depth perception has remained largely unexplored. This paper presents a stereo thermal depth perception dataset for autonomous aerial perception applications. The dataset consists of stereo thermal images, LiDAR, IMU, and ground-truth depth maps captured in urban and forest settings under diverse conditions such as day, night, rain, and smoke. We benchmark representative stereo depth estimation algorithms, offering insights into their performance in degraded conditions. Models trained on our dataset generalize well to unseen smoky conditions, highlighting the robustness of stereo thermal imaging for depth perception. We aim for this work to enhance robotic perception in disaster scenarios, allowing for exploration and operations in previously unreachable areas.
A pair of stereo thermal cameras, a LiDAR, and an inertial measurement unit (IMU) are mounted on an unmanned aerial vehicle (UAV) platform, which supports data collection during both handheld experiments and UAV flights. The stereo thermal pair faces forward with a 24.6 cm baseline, and the LiDAR is positioned on top of the UAV. An onboard NVIDIA® Jetson AGX Orin™ computer is connected to the sensors. Setup and coordinate system for each sensor:
Sensor specifications:
Depth Map Generation:
The resulting depth map is in the left camera frame, and stereo disparity can be obtained with the provided calibration, supporting both monocular and stereo depth estimation. Each trajectory is a closed loop with the same initial and final position, making the dataset suitable for evaluating loop closure and accumulated drift in mapping and localization.
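Since the depth maps live in the left camera frame of a rectified pair, converting them to disparity only needs the rectified focal length and the stereo baseline from the provided calibration. Below is a minimal sketch; the focal length here is a placeholder, and the real value comes from the dataset's calibration files (the 24.6 cm baseline is from the sensor setup above):

```python
import numpy as np

# Placeholder rectified calibration values; the actual intrinsics come
# from the calibration files shipped with the dataset.
FOCAL_LENGTH_PX = 400.0   # rectified focal length in pixels (assumed)
BASELINE_M = 0.246        # stereo baseline: 24.6 cm

def depth_to_disparity(depth_m: np.ndarray) -> np.ndarray:
    """Convert a metric depth map (left camera frame) to disparity.

    For a rectified stereo pair, disparity d relates to depth Z by
    d = f * B / Z, where f is the focal length in pixels and B is the
    baseline in meters. Pixels without depth (zeros) stay at zero.
    """
    disparity = np.zeros_like(depth_m, dtype=np.float32)
    valid = depth_m > 0
    disparity[valid] = FOCAL_LENGTH_PX * BASELINE_M / depth_m[valid]
    return disparity
```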
The processed FIReStereo dataset contains 204,594 stereo thermal images in total across all environments: 29% in an urban environment, 15% in a mixed environment, and 56% in a wilderness environment with dense trees. 84% of the images were collected during the day and the rest at night. Obstacles were measured at a median depth of 7.40 m, with quartiles q1 = 5.17 m and q3 = 10.52 m, which falls within the typical range for UAS obstacle avoidance; the histogram on the right shows the distribution of distances to objects. 42% of the stereo thermal pairs are smokeless, while 58% contain smoke. Of the smokeless images, 35,706 have corresponding ground-truth depth maps.
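For reference, obstacle-distance statistics like the quartiles above can be reproduced by pooling the valid (non-zero) pixels of the ground-truth depth maps and taking quantiles. A minimal sketch, assuming the maps are stored as 16-bit PNGs in millimeters (adjust the scale factor to the dataset's actual storage convention):

```python
import numpy as np
import cv2
from pathlib import Path

def depth_quartiles(depth_dir: str) -> tuple[float, float, float]:
    """Pool valid depth pixels across all maps; return (q1, median, q3) in meters.

    Assumes 16-bit PNG depth maps in millimeters; change the 1000.0
    divisor if the dataset stores depth in a different unit.
    """
    samples = []
    for png in Path(depth_dir).glob("*.png"):
        raw = cv2.imread(str(png), cv2.IMREAD_UNCHANGED)
        if raw is None:
            continue  # skip unreadable files
        depth_m = raw.astype(np.float32) / 1000.0
        samples.append(depth_m[depth_m > 0])
    pooled = np.concatenate(samples)
    q1, med, q3 = np.percentile(pooled, [25, 50, 75])
    return q1, med, q3
```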
Our data varies in environmental conditions and amount of clutter, spanning suburban settings, sparse trees, and dense trees. Depth data is available for the first two locations, while the latter two are intended for testing purposes. A detailed description of each sequence in the four locations can be found in the dataset text file.
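One natural way to use this layout is to train on the two locations that ship with depth maps and hold out the other two for qualitative testing. A minimal sketch of such a split, with hypothetical location names and a hypothetical root/<location>/left|right directory layout standing in for the actual sequence structure described in the dataset text file:

```python
from pathlib import Path

# Hypothetical location names; substitute the actual sequence
# directories listed in the dataset text file.
TRAIN_LOCATIONS = ["location_a", "location_b"]  # depth maps available
TEST_LOCATIONS = ["location_c", "location_d"]   # testing only, no depth

def list_stereo_pairs(root: str, locations: list[str]) -> list[tuple[Path, Path]]:
    """Collect (left, right) thermal image pairs for the given locations,
    assuming a root/<location>/left and root/<location>/right layout."""
    pairs = []
    for loc in locations:
        left_dir = Path(root) / loc / "left"
        right_dir = Path(root) / loc / "right"
        for left in sorted(left_dir.glob("*.png")):
            right = right_dir / left.name
            if right.exists():  # keep only complete stereo pairs
                pairs.append((left, right))
    return pairs
```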
Notably, flames, fire embers, and objects relevant to disaster response are visible in the Firesgl collection, making it useful for developing algorithms such as ember detection for wildfire monitoring.
We benchmarked five representative stereo depth estimation models to evaluate how well our new dataset supports robust depth estimation for UAS navigation in cluttered environments. More details and quantitative results can be found in the paper.
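Such benchmarks typically compare predicted disparity against ground truth over valid pixels. Below is a minimal sketch of two common stereo metrics, end-point error (EPE) and the bad-pixel rate; the exact metric set used in the paper may differ:

```python
import numpy as np

def stereo_metrics(pred_disp: np.ndarray,
                   gt_disp: np.ndarray,
                   bad_thresh: float = 3.0) -> dict:
    """End-point error and bad-pixel rate over valid ground-truth pixels.

    EPE is the mean absolute disparity error; the bad-pixel rate is the
    fraction of valid pixels whose error exceeds `bad_thresh` pixels.
    """
    valid = gt_disp > 0
    err = np.abs(pred_disp[valid] - gt_disp[valid])
    return {
        "epe": float(err.mean()),
        f"bad-{bad_thresh:g}": float((err > bad_thresh).mean()),
    }
```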
Fast-ACVNet is used to generate qualitative results, as it is best suited for running on a low Size, Weight, Power, and Cost (SWaP-C) system while maintaining performance similar to the more resource-intensive models. We observe that the model trained on our dataset is able to estimate depth for outdoor thermal images with challenging objects, such as thin tree branches and poles, which were previously difficult to capture.
We further evaluate the trained model in unseen environments with highly dense smoke. Results show that the model trained on smokeless data generalizes to these smoke-filled environments.
@article{firestereo,
author={Dhrafani, Devansh and Liu, Yifei and Jong, Andrew and Shin, Ukcheol and He, Yao and Harp, Tyler and Hu, Yaoyu and Oh, Jean and Scherer, Sebastian},
journal={IEEE Robotics and Automation Letters},
title={FIReStereo: Forest InfraRed Stereo Dataset for UAS Depth Perception in Visually Degraded Environments},
year={2025},
volume={10},
number={4},
pages={3302-3309},
doi={10.1109/LRA.2025.3536278}
}