C. Zhang, J. Liu, E. Shahabi, W. Pan and C. D. Santina, "End-to-End Learning of Soft-Robot Dynamics from Event-Based Camera Streams," 2026 IEEE 9th International Conference on Soft Robotics (RoboSoft), Kanazawa, Japan, 2026, pp. 438-445, doi: 10.1109/RoboSoft67810.2026.11522882.

Abstract: Modeling and predicting the motion of soft robots remains challenging due to their infinite-dimensional and highly nonlinear dynamics. A promising direction is to learn dynamics directly from high-dimensional sensory streams. Yet standard RGB cameras suffer from motion blur, which makes fast transients hard to capture and biases learning toward steady-state behaviors. Here, we turn to event-based cameras, which provide asynchronous, high-frequency visual information better suited to capturing dynamic deformations. We propose a learning architecture that encodes two-channel event frames from a DVS sensor through a convolutional autoencoder while jointly learning a compact latent representation of the robot’s dynamics. Within this space, we test several latent models, including a novel spiking-harmonic latent oscillator network (snLON), in which spiking neurons capture the event structure of the data stream and drive a latent oscillator network that represents the underlying mechanical dynamics. Validated in both simulation and real-world experiments, the proposed framework predicts long-horizon soft-robot motion with high accuracy and consistency from a single initial event frame and control sequence.
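
The abstract does not say how the two-channel event frames are built. As a minimal sketch, assuming each DVS event is a tuple (x, y, t, p) with polarity p in {0, 1}, and that a frame is a per-pixel event count over a fixed time window with one channel per polarity, the accumulation could look like the following. The function name `events_to_frame`, the tuple layout, and the count-based binning are illustrative assumptions, not the authors' implementation.

```python
def events_to_frame(events, height, width, t_start, t_end):
    """Accumulate events with t_start <= t < t_end into a 2 x H x W count frame.

    Assumed (hypothetical) event layout: (x, y, t, p), where p is 0 for
    OFF (brightness decrease) and 1 for ON (brightness increase).
    """
    # frame[p][y][x] holds the number of events of polarity p at pixel (x, y).
    frame = [[[0] * width for _ in range(height)] for _ in range(2)]
    for x, y, t, p in events:
        inside_window = t_start <= t < t_end
        inside_sensor = 0 <= x < width and 0 <= y < height
        if inside_window and inside_sensor and p in (0, 1):
            frame[p][y][x] += 1
    return frame


# Toy stream: three events fall inside the 10 ms window, one falls outside.
events = [
    (0, 0, 0.001, 1),
    (0, 0, 0.002, 1),
    (2, 1, 0.003, 0),
    (1, 1, 0.020, 1),  # later than t_end, so it is ignored
]
frame = events_to_frame(events, height=2, width=3, t_start=0.0, t_end=0.010)
```

In practice the same accumulation would typically be done on arrays and fed to the convolutional encoder as a 2-channel image; plain lists are used here only to keep the sketch dependency-free.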