eSkiTB: A Synthetic Event-based Dataset for Tracking Skiers

(Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2026)

Arizona State University


Abstract

Event cameras are bio-inspired sensors that asynchronously capture per-pixel brightness changes at microsecond precision, offering low latency, high temporal resolution, and resilience to motion blur—critical for tracking fast-moving objects. Despite their advantages, the lack of standardized benchmarks and datasets for event-based tracking has limited their integration into computer vision pipelines. This paper introduces eSkiTB, a specialized benchmark for event-based tracking in winter sports, focusing on ski jumpers, freestyle skiers, and alpine skiers. Derived from broadcast footage covering varied lighting, weather, and clutter conditions, eSkiTB includes 300 sequences with 240 for training, 30 for validation, and 30 for testing. Using the v2e event simulator under the iso-informational constraint, we ensure neuromorphic fidelity while preserving temporal resolution. We evaluate state-of-the-art event and frame-based trackers, demonstrating the efficacy of spiking neural networks and the value of fine-tuning on domain-specific data. eSkiTB advances neuromorphic vision research and establishes a foundation for event-based tracking in extreme-motion scenarios.


Dataset Pipeline


We convert broadcast RGB videos to events using the v2e simulator under the iso-informational constraint, ensuring no neural interpolation artifacts. The pipeline includes upsampling to 1000 fps via frame repetition, adaptive color-to-grayscale conversion using the CIE luminance formula, and cubic spline interpolation for smooth bounding box trajectories at 1 ms resolution. Events are stored in HDF5 format with microsecond-level timestamps (t, x, y, p), providing high temporal precision while maintaining compatibility with standard neuromorphic processing tools. This approach preserves the temporal fidelity of event cameras without introducing synthetic motion interpolation.
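The bounding-box densification step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the annotation layout (per-frame timestamps plus (x, y, w, h) boxes at the broadcast frame rate) and the concrete numbers are assumptions for demonstration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical sparse annotations: timestamps in seconds at the original
# broadcast frame rate (25 fps here) and one (x, y, w, h) box per frame.
frame_times = np.array([0.00, 0.04, 0.08, 0.12])
boxes = np.array([
    [100.0, 50.0, 40.0, 80.0],
    [110.0, 52.0, 40.0, 80.0],
    [121.0, 55.0, 41.0, 81.0],
    [133.0, 59.0, 41.0, 82.0],
])

# Fit one cubic spline per box coordinate (axis=0 interpolates each of the
# four coordinates over time), then resample on a 1 ms grid as in the paper.
spline = CubicSpline(frame_times, boxes, axis=0)
t_ms = np.arange(121) / 1000.0  # 0 ms .. 120 ms, inclusive
boxes_1ms = spline(t_ms)

print(boxes_1ms.shape)  # (121, 4): one smooth box every millisecond
```

The spline passes exactly through the annotated boxes at the original frame times, so densification adds intermediate boxes without perturbing the ground truth.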



Dataset Statistics


eSkiTB contains 300 sequences distributed across three skiing disciplines: Alpine (AL), Freestyle (FS), and Ski Jumping (JP). The dataset is split into 240 training sequences, 30 validation sequences, and 30 test sequences, ensuring comprehensive coverage of diverse scenarios including varying speeds, camera angles, weather conditions, and occlusion patterns. Each sequence contains synchronized event data at 1280×720 resolution with bounding box annotations interpolated at 1 ms intervals. The dataset captures challenging tracking scenarios with athletes moving at speeds up to 140 km/h, rapid camera movements, complex backgrounds with spectators and gates, and extreme lighting variations from bright snow to shadowed slopes.



Why Event Cameras for Ski Tracking?


Ski tracking presents unique challenges that conventional RGB cameras struggle with: extreme speeds (up to 140 km/h), rapid illumination changes from bright snow to dark shadows, severe motion blur during fast movements, and compression artifacts in broadcast footage. Event cameras address these limitations through their bio-inspired design: (1) High Temporal Resolution - Microsecond-level event capture eliminates motion blur by registering brightness changes independently per pixel; (2) Low Latency - Asynchronous operation enables real-time tracking with minimal delay; (3) High Dynamic Range - Superior performance across extreme lighting conditions (120+ dB vs 60 dB for standard cameras); (4) Efficient Encoding - Sparse event representation focuses computational resources on motion-relevant regions. The figure above demonstrates how RGB frames suffer from motion blur and compression artifacts, while event representations maintain clean, high-fidelity motion information.
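The sparse encoding in point (4) is often made concrete by binning events into a dense two-channel "event frame" before feeding a tracker. The sketch below assumes the dataset's (t, x, y, p) layout with polarity in {0, 1}; the tiny sensor size and event values are illustrative, not from the dataset.

```python
import numpy as np

# Hypothetical event stream in (t, x, y, p) order: microsecond timestamps,
# pixel coordinates, and polarity (1 = brightness increase, 0 = decrease).
events = np.array([
    (10, 3, 1, 1),
    (25, 3, 1, 1),
    (40, 7, 2, 0),
    (55, 7, 2, 0),
    (70, 3, 1, 0),
], dtype=[('t', 'u8'), ('x', 'u2'), ('y', 'u2'), ('p', 'u1')])

H, W = 4, 8  # tiny sensor for illustration (eSkiTB sequences are 1280x720)

# Accumulate positive and negative events into separate channels.
# np.add.at performs unbuffered accumulation, so repeated events at the
# same pixel are all counted.
frame = np.zeros((2, H, W), dtype=np.int32)
np.add.at(
    frame,
    (events['p'].astype(int), events['y'].astype(int), events['x'].astype(int)),
    1,
)

print(frame[1, 1, 3])  # 2 positive events at pixel (x=3, y=1)
print(frame[0, 2, 7])  # 2 negative events at pixel (x=7, y=2)
```

Because only pixels that changed brightness contribute, the resulting tensor is mostly zeros, which is what concentrates computation on motion-relevant regions.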



Benchmark Results


We evaluate state-of-the-art event-based and frame-based trackers on eSkiTB. SDTrack (spiking transformer) achieves 0.711 IoU after fine-tuning on eSkiTB (+0.399 improvement over pretrained baseline), demonstrating the effectiveness of spiking neural networks for event-based tracking. In high-clutter scenarios, SDTrack reaches 0.685 IoU, outperforming generic STARK RGB by +20.0 points. Frame-based trackers also benefit from domain adaptation: STARK fine-tuned on eSkiTB events achieves 0.795 IoU, while STARK fine-tuned on ski-specific RGB data reaches 0.829 IoU, establishing the upper bound. These results highlight two key findings: (1) event-based representations provide robust tracking under extreme motion and lighting conditions, and (2) domain-specific fine-tuning is critical for both event and frame-based methods to handle ski tracking challenges.
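The IoU scores above follow the standard intersection-over-union definition; a minimal reference implementation for (x, y, w, h) boxes, assuming axis-aligned boxes in pixel coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-Union for axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis (zero if the boxes are disjoint).
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 (perfect overlap)
print(iou((0, 0, 10, 10), (5, 0, 10, 10)))  # 50/150 ~= 0.333
```

On this scale, a "+20.0 points" gap corresponds to a 0.200 difference in mean IoU.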



Qualitative Results


Fine-tuned trackers demonstrate robust performance across challenging scenarios. The figure above shows tracking results on test sequences featuring: (1) Occlusions - Skiers passing behind gates, trees, or other obstacles; (2) High Clutter - Dense backgrounds with spectators, advertising banners, and complex textures; (3) Scale Variation - Dramatic changes in target size due to camera zoom and perspective; (4) Harsh Weather - Snowfall, fog, and low-light conditions. The visualizations demonstrate how event-based tracking maintains accurate bounding box predictions even when RGB-based methods fail due to motion blur or lighting challenges. Green boxes indicate successful tracking with high IoU overlap, while challenging frames showcase the robustness of fine-tuned models.



Temporal Analysis


We analyze tracking performance over time to understand model robustness across sequence duration. The IoU-over-time curves reveal that fine-tuned models maintain stable tracking performance throughout extended sequences, while baseline models show degradation over time. Event-based trackers exhibit particularly strong temporal consistency due to their ability to handle motion blur and lighting changes. The analysis demonstrates that domain-specific training not only improves average IoU but also reduces performance variance across different sequence segments, indicating better generalization to the full range of ski tracking challenges present in eSkiTB.
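The quantities behind such an analysis, a mean IoU-over-time curve, its cross-sequence variance, and per-segment averages, can be sketched as follows. The random traces stand in for real per-frame IoU logs and are purely illustrative; sequence count and length are assumptions.

```python
import numpy as np

# Hypothetical per-frame IoU traces for several sequences, resampled to a
# common length so curves can be averaged position-by-position.
rng = np.random.default_rng(0)
n_seq, n_frames = 5, 100
ious = np.clip(rng.normal(0.7, 0.05, size=(n_seq, n_frames)), 0.0, 1.0)

# Mean IoU-over-time curve and per-position variance across sequences:
# a flat mean with low variance indicates stable, consistent tracking.
mean_curve = ious.mean(axis=0)
var_curve = ious.var(axis=0)

# Segment-level summary: split each sequence into quarters and compare
# average IoU per segment to detect degradation over sequence duration.
segments = ious.reshape(n_seq, 4, n_frames // 4).mean(axis=2)
print(segments.shape)  # (5, 4): one mean IoU per sequence per quarter
```

Comparing the four segment columns (early vs. late) is one simple way to quantify the drift that baseline models exhibit and fine-tuned models avoid.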



BibTeX

@inproceedings{vinod2026eskitb,
  title     = {eSkiTB: A Synthetic Event-based Dataset for Tracking Skiers},
  author    = {Vinod, Krishna and Vishal, Joseph Raj and Chanda, Kaustav and Ramesh, Prithvi Jai and Yang, Yezhou and Chakravarthi, Bharatesh},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026}
}