Beyond CNNs: A Study on Fisher Vectors and Their Fusion for Few-Shot Generalization
Abstract
While convolutional neural networks (CNNs) have become the standard in modern visual learning, classical representations such as Fisher Vectors (FVs) are often overlooked in contemporary few-shot learning research. In this study, we revisit Fisher Vectors both as standalone representations and in fusion with CNN features to assess their effectiveness in low-data regimes. We conduct controlled experiments on few-shot classification tasks (5-shot, 10-shot, and 15-shot) using the benchmark datasets CIFAR-10, CIFAR-100, and miniImageNet. Our approach extracts Fisher Vector and CNN features independently and evaluates their individual and combined performance via a simple feature-concatenation strategy followed by classification. The results, visualized through comparative accuracy bar graphs, indicate that Fisher Vectors remain competitive in few-shot settings and can significantly enhance performance when fused with CNN embeddings. These findings suggest that classical feature encodings still hold value and can offer complementary benefits when integrated with deep representations in data-constrained learning scenarios.
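To make the fusion step concrete, the sketch below illustrates one way the described pipeline could be realized: local descriptors are encoded as improved Fisher Vectors over a fitted Gaussian mixture model, normalized, and concatenated with precomputed CNN embeddings before a linear classifier is trained on a few-shot support set. The vocabulary size, the use of scikit-learn's GaussianMixture and LogisticRegression, and the randomly generated stand-in features are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of Fisher Vector / CNN feature fusion for a few-shot episode.
# Assumptions: 16-component GMM vocabulary, scikit-learn classifier, and random
# stand-in arrays in place of real local descriptors and CNN embeddings.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import normalize

def fisher_vector(local_descriptors, gmm):
    """Encode a (T x D) set of local descriptors as an improved Fisher Vector."""
    T, _ = local_descriptors.shape
    gamma = gmm.predict_proba(local_descriptors)               # (T, K) posteriors
    w, mu, var = gmm.weights_, gmm.means_, gmm.covariances_    # diagonal covariances
    sigma = np.sqrt(var)
    parts = []
    for k in range(gmm.n_components):
        diff = (local_descriptors - mu[k]) / sigma[k]          # whitened residuals
        g = gamma[:, k:k + 1]
        # Gradients w.r.t. means and standard deviations (Perronnin et al., 2010)
        g_mu = (g * diff).sum(axis=0) / (T * np.sqrt(w[k]))
        g_sigma = (g * (diff ** 2 - 1.0)).sum(axis=0) / (T * np.sqrt(2.0 * w[k]))
        parts.extend([g_mu, g_sigma])
    fv = np.concatenate(parts)
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                     # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)                   # L2 normalization

# --- Hypothetical 5-way 5-shot support set ---
rng = np.random.default_rng(0)
n_images, n_patches, d_local, d_cnn = 25, 64, 32, 512
local_feats = rng.normal(size=(n_images, n_patches, d_local))  # stand-in local descriptors
cnn_feats = rng.normal(size=(n_images, d_cnn))                 # stand-in CNN embeddings
labels = np.repeat(np.arange(5), 5)

gmm = GaussianMixture(n_components=16, covariance_type="diag", random_state=0)
gmm.fit(local_feats.reshape(-1, d_local))                      # visual vocabulary

fv_feats = np.stack([fisher_vector(x, gmm) for x in local_feats])
fused = np.hstack([normalize(fv_feats), normalize(cnn_feats)]) # simple concatenation

clf = LogisticRegression(max_iter=1000).fit(fused, labels)     # few-shot classifier
```

In an actual experiment, the stand-in arrays would be replaced by real local descriptors (for example, dense SIFT or shallow network activations) and embeddings from a pretrained CNN backbone, with the classifier evaluated on the corresponding query set.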