3D Fractal Pre-training for Action Recognition
Author
Term
4th semester
Education
Publication year
2025
Submitted on
2025-06-04
Pages
76
Abstract
Recent advances in computer vision have underscored the importance of large, diverse datasets for deep learning, but collecting real-world video for action recognition remains costly and difficult. This thesis explores synthetic data generated from 3D fractal geometry as a scalable, privacy-preserving resource for pre-training action recognition models. A pipeline based on Iterated Function Systems (IFS) was developed to generate diverse 3D fractal point clouds, transform them into synthetic videos, and construct large datasets for neural network pre-training. Experiments with a ResNet-50 backbone and Temporal Shift Module (TSM) show that pre-training on fractal-based datasets significantly outperforms training from scratch. Systematic studies reveal that dataset size, transformation strength, color augmentation, and fractal geometry control all affect downstream performance. Data-driven methods for controlling fractal structure, such as condition number constraints and SVM-informed weighting of singular values, further enhance the visual diversity and quality of the data. While 3D fractal pre-training does not yet surpass the strongest 2D fractal baselines, it narrows the gap and demonstrates the practical potential of formula-driven synthetic data for scalable action recognition in domains with limited real data.
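As a rough illustration of the idea of IFS-generated 3D point clouds with a condition number constraint, the following minimal sketch may help. It is not the thesis's pipeline: the function names (`random_rotation`, `sample_map`, `generate_point_cloud`), the parameter ranges, the uniform choice among maps, and the threshold `max_cond=5.0` are all assumptions made for illustration. Each affine map is built as A = U diag(s) V^T from sampled singular values, so its condition number s_max / s_min is bounded by construction, and points are sampled with the standard chaos game iteration.

```python
import math
import random


def random_rotation(rng):
    """Sample a 3x3 rotation matrix from a random unit quaternion."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        n = math.sqrt(sum(c * c for c in q))
        if n > 1e-12:
            break
    w, x, y, z = (c / n for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def sample_map(rng, max_cond=5.0, max_scale=0.9):
    """Sample one contractive affine map (A, b) with cond(A) <= max_cond.

    A = U diag(s) V^T, so the singular values s are chosen directly.
    The ranges below are illustrative assumptions, not thesis settings.
    """
    s_max = rng.uniform(0.4, max_scale)           # largest singular value < 1
    s_min = rng.uniform(s_max / max_cond, s_max)  # enforces the constraint
    s_mid = rng.uniform(s_min, s_max)
    sigma = [[s_max, 0.0, 0.0], [0.0, s_mid, 0.0], [0.0, 0.0, s_min]]
    u, v = random_rotation(rng), random_rotation(rng)
    v_t = [list(row) for row in zip(*v)]
    a = matmul(matmul(u, sigma), v_t)
    b = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    return a, b, s_max / s_min


def generate_point_cloud(n_maps=4, n_points=2000, burn_in=20,
                         seed=0, max_cond=5.0):
    """Run the chaos game: repeatedly apply a randomly chosen map."""
    rng = random.Random(seed)
    maps = [sample_map(rng, max_cond) for _ in range(n_maps)]
    p = [0.0, 0.0, 0.0]
    points = []
    for i in range(n_points + burn_in):
        a, b, _ = rng.choice(maps)  # uniform map selection (an assumption)
        p = [sum(a[r][c] * p[c] for c in range(3)) + b[r] for r in range(3)]
        if i >= burn_in:            # discard early points far from the attractor
            points.append(p)
    return points, maps


if __name__ == "__main__":
    cloud, ifs = generate_point_cloud(seed=42)
    print(len(cloud), "points;", len(ifs), "maps")
```

Because every map has a largest singular value below 1, the iteration is contractive and the point cloud stays bounded. Tightening `max_cond` pushes each map toward uniform scaling, while loosening it allows flatter, more needle-like structures. Rendering such clouds into video frames would be a separate step that this sketch does not cover.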