A novel framework integrating transformer-based video recognition methods with functional data analysis
Project with Haolun Shi
This project explores a novel framework integrating Transformer-based video recognition methods with Functional Data Analysis (FDA) to enhance the understanding and modeling of complex spatiotemporal patterns in video data. Transformer architectures have recently demonstrated remarkable performance in various natural language and image processing tasks, owing to their ability to capture long-range dependencies through self-attention. By introducing concepts from FDA, where data are treated as continuous functions rather than discrete points, our approach seeks to improve feature representation and reduce computational overhead.
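To make the FDA idea concrete, the following is a minimal sketch and not the project's actual method. It assumes per-frame feature vectors are already available (here simulated), treats each feature dimension as a function of time, and projects it onto a small Fourier basis by least squares. The names `fourier_basis` and `to_functional_tokens`, the choice of a Fourier basis, and the sizes `T`, `D`, `K` are all illustrative assumptions.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate a Fourier basis at points t in [0, 1); returns (len(t), n_basis)."""
    cols = [np.ones_like(t)]          # constant term
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * k * t))
        k += 1
    return np.stack(cols, axis=1)

def to_functional_tokens(features, n_basis):
    """Hypothetical helper: map (T, D) per-frame features to (n_basis, D) coefficients."""
    T = features.shape[0]
    t = np.linspace(0.0, 1.0, T, endpoint=False)   # normalized frame times
    B = fourier_basis(t, n_basis)                  # (T, n_basis) design matrix
    # Least-squares fit of every feature dimension at once.
    coef, *_ = np.linalg.lstsq(B, features, rcond=None)
    return coef, B

# Simulated data: T frames, D feature dimensions, K basis functions (all assumed).
rng = np.random.default_rng(0)
T, D, K = 64, 8, 7
t = np.linspace(0.0, 1.0, T, endpoint=False)
smooth = np.sin(2 * np.pi * t)[:, None] * rng.normal(size=(1, D))
features = smooth + 0.01 * rng.normal(size=(T, D))   # smooth signal plus small noise

coef, B = to_functional_tokens(features, K)
recon = B @ coef                                     # functional reconstruction, (T, D)
print(coef.shape)                                    # K coefficient vectors instead of T frames
```

Under these assumptions, a sequence of `T` frame features is summarized by `K` coefficient vectors. Because self-attention cost grows quadratically with sequence length, attending over `K` such tokens rather than `T` frames is one way a functional representation could reduce computation. Whether the project uses this particular basis or tokenization is not stated in the description.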