'Disengage AND Integrate': Personalized Causal Network for Gaze Estimation.

Published in IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society) by Yi Tian, Xiyun Wang, Sihui Zhang, Wanru Xu, Yi Jin, and Yaping Huang

TLDR

  • The study proposes PCNet, a novel network that discards a user's irrelevant personalized information while integrating the relevant information, yielding generalizable gaze estimation

Abstract

The gaze estimation task aims to predict a 3D gaze direction or a 2D gaze point from a face or eye image. To improve the generalization of gaze estimation models to unseen users, existing methods either disentangle the personalized information of all subjects from their gaze features or integrate unrefined personalized information into blended embeddings. Neither methodology is rigorous, and their performance remains unsatisfactory. In this paper, we put forward a comprehensive perspective named 'Disengage AND Integrate' for handling personalized information, which holds that for a specified user, irrelevant personalized information should be discarded while relevant information should be retained. Accordingly, a novel Personalized Causal Network (PCNet) for generalizable gaze estimation is proposed. PCNet adopts a two-branch framework consisting of a subject-deconfounded appearance sub-network (SdeANet) and a prototypical personalization sub-network (ProPNet). The SdeANet explores causalities among facial images, gazes, and personalized information, and extracts a subject-invariant appearance-aware feature from each image by means of causal intervention. The ProPNet characterizes customized personalization-aware features of arbitrary users with the help of a prototype-based subject identification task. Furthermore, the whole PCNet is optimized in a hybrid episodic training paradigm, which further improves its adaptability to new users. Experiments on three challenging datasets, covering both within-domain and cross-domain gaze estimation tasks, demonstrate the effectiveness of our method.
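The two-branch design described above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not the authors' implementation: linear projections replace the actual SdeANet and ProPNet sub-networks, the feature dimensions and prototype count are invented, and the soft prototype assignment merely mimics the idea of refining personalization-aware features against per-subject prototypes before fusing them with the appearance feature.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_APP, D_PER, N_PROTO = 128, 64, 32, 5  # hypothetical sizes

W_app = rng.standard_normal((D_IMG, D_APP)) * 0.1   # stand-in for SdeANet
W_per = rng.standard_normal((D_IMG, D_PER)) * 0.1   # stand-in for ProPNet encoder
prototypes = rng.standard_normal((N_PROTO, D_PER))  # per-subject prototypes
W_gaze = rng.standard_normal((D_APP + D_PER, 3)) * 0.1  # gaze regressor

def sdeanet(x):
    # Subject-invariant appearance-aware feature (linear stand-in
    # for the deconfounded appearance sub-network).
    return x @ W_app

def propnet(x):
    # Encode, then softly assign to subject prototypes; the weighted
    # prototype mixture acts as the personalization-aware feature.
    z = x @ W_per
    logits = -np.sum((z[None, :] - prototypes) ** 2, axis=-1)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ prototypes

def pcnet_forward(x):
    # 'Disengage AND Integrate': fuse the appearance branch with the
    # refined personalization branch, then regress a 3D gaze direction.
    f = np.concatenate([sdeanet(x), propnet(x)])
    g = f @ W_gaze
    return g / np.linalg.norm(g)  # unit-norm 3D gaze vector

x = rng.standard_normal(D_IMG)  # toy "image feature"
gaze = pcnet_forward(x)
print(gaze.shape)  # (3,)
```

The point of the sketch is only the data flow: the appearance feature is computed per image, the personalization feature is anchored to subject prototypes, and both contribute to the final gaze prediction.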

Overview

  • The study proposes a novel approach, 'Disengage AND Integrate', to handle personalized information in gaze estimation models.
  • The model, called PCNet, consists of a two-branch framework: SdeANet for subject-invariant appearance-aware features and ProPNet for personalization-aware features.
  • The PCNet is optimized in a hybrid episodic training paradigm to improve its adaptability to new users.
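The episodic training idea in the last bullet can be sketched as follows. The subject IDs, split sizes, and sampling scheme below are illustrative assumptions, not the paper's protocol: each episode samples a few subjects and splits their images into a support set and a query set, so the model repeatedly simulates adapting to "new" users during training.

```python
import random

def sample_episode(data_by_subject, n_subjects=3, n_support=2,
                   n_query=2, seed=None):
    """Sample one episode: pick subjects, then split each subject's
    samples into a support set (seen) and a query set (treated as
    'new user' data). Sizes are illustrative, not the paper's values."""
    rng = random.Random(seed)
    subjects = rng.sample(sorted(data_by_subject), n_subjects)
    support, query = [], []
    for s in subjects:
        items = rng.sample(data_by_subject[s], n_support + n_query)
        support += [(s, x) for x in items[:n_support]]
        query += [(s, x) for x in items[n_support:]]
    return support, query

# Toy dataset: 5 subjects with 6 images each.
data = {f"subj{i}": [f"img{i}_{j}" for j in range(6)] for i in range(5)}
support, query = sample_episode(data, seed=0)
print(len(support), len(query))  # 6 6
```

In an episodic loop, the model would be evaluated (and the loss computed) on the query half of each episode, which is what encourages adaptability to unseen subjects.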

Comparative Analysis & Findings

  • The study compares PCNet with existing methods on three challenging datasets for generalizable gaze estimation.
  • PCNet outperforms existing methods on both within-domain and cross-domain gaze estimation tasks.
  • The results indicate that discarding irrelevant personalized information while integrating relevant information improves gaze estimation performance.

Implications and Future Directions

  • The study's findings have implications for the development of gaze estimation models that can generalize well to new users and improve their performance in real-world applications.
  • Future research directions could explore the use of PCNet in other computer vision tasks, such as facial recognition and emotion recognition.
  • The study also opens up possibilities for exploring novel techniques for handling personalized information in gaze estimation models, such as multi-task learning and transfer learning.