In Neural Networks: the official journal of the International Neural Network Society, by Jiaxu Leng, Qianru Chen, Taiyue Chen, Feng Gao, Ji Gan, Changjun Gu, Xinbo Gao
Few-shot object detection is a challenging task that aims to quickly adapt detectors to novel objects using only a minimal number of annotated examples. Although promising results have been achieved, performance still declines significantly when the number of shots decreases sharply. We argue that this shot sensitivity stems from the critical under-utilization of both internal few-shot data and external common knowledge bases. The key question is therefore how to extract more discriminative notions to compensate for the insufficient task-specific information in the limited novel dataset. We propose a novel heterogeneous Graph Capsule Network for Few-Shot object Detection, named GCapNet-FSD. Specifically, we design a heterogeneous graph that combines high-level visual capsule neurons from the internal few-shot data with stable semantic embeddings from an external, easily available corpus to obtain more discriminative task-specific representations. As a result, GCapNet-FSD is stable and robust across shot settings. It outperforms current works in the 1-shot setting on every split, by up to +3.7% on PASCAL VOC07&12 and +0.4% on the challenging COCO benchmark. Extensive experiments on both PASCAL VOC07&12 and MS COCO demonstrate that GCapNet-FSD delivers shot-stable detection performance and achieves significantly better performance at lower shots.
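The idea of combining visual nodes with semantic nodes in one heterogeneous graph can be illustrated with a toy sketch. This is not the GCapNet-FSD architecture: the abstract does not specify the capsule layers, the graph construction, or the update rule, so everything below is an assumption made for illustration. The function name `fuse`, the `mix` parameter, the cosine-similarity edge weights, and the three-dimensional vectors are all hypothetical, and the sketch assumes both node types already live in the same embedding space.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(visual_nodes, semantic_nodes, mix=0.5):
    """One round of semantic-to-visual message passing on a bipartite graph.

    Hypothetical illustration only: edge weights are a softmax over cosine
    similarities, and each visual node is blended with its weighted
    semantic message using the assumed mixing factor `mix`.
    """
    names = list(semantic_nodes)
    fused = {}
    for v_name, v in visual_nodes.items():
        # Edge weights from this visual node to every semantic node.
        weights = softmax([cosine(v, semantic_nodes[n]) for n in names])
        # Weighted sum of the semantic embeddings (the incoming message).
        message = [
            sum(w * semantic_nodes[n][i] for w, n in zip(weights, names))
            for i in range(len(v))
        ]
        # Residual-style blend of visual feature and semantic message.
        fused[v_name] = [(1 - mix) * vi + mix * mi for vi, mi in zip(v, message)]
    return fused

# Made-up visual features (stand-ins for capsule outputs) and
# made-up class embeddings (stand-ins for corpus-derived word vectors).
visual = {"roi_0": [1.0, 0.0, 0.0], "roi_1": [0.0, 1.0, 0.0]}
semantic = {"cat": [0.9, 0.1, 0.0], "bus": [0.0, 0.8, 0.2]}
fused = fuse(visual, semantic)
```

In this toy setting, each visual node is pulled toward the semantic embedding it most resembles, which is one simple way external knowledge could sharpen a representation learned from very few shots. Setting `mix=0.0` returns the visual features unchanged.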