A Flask Framework for Visual Attention-Prompted Prediction

Song, James (Spring 2024)

Permanent URL: https://etd.library.emory.edu/concern/etds/xw42n932t

Abstract

Deep Neural Networks (DNNs) have demonstrated remarkable performance in the field of Computer Vision (CV), showing promising potential in various areas. However, the 'black box' nature of DNNs makes it challenging to ensure the interpretability of such models, making it difficult to trust them in high-stakes fields. In response, visual explanation-guided learning uses human-annotated explanations during training to guide a DNN model's reasoning process, making it more reasonable and trustworthy. However, this approach requires a large number of explanation annotations, which are costly to prepare; this limitation has motivated the emergence of visual attention-prompted prediction. This approach applies visual attention guidance at the application stage rather than during training, eliminating the need for a large number of visual explanations and enabling direct guidance from the end user. To facilitate this process, we propose a visual attention-prompted prediction framework that provides a user-friendly application for real-time visual attention annotation and for comparing predictions made with and without explanation guidance. Although the proposed framework can work with any convolutional neural network (CNN) provided by the user, we also supply a pre-trained CNN model for an out-of-the-box experience. The provided model incorporates a novel co-training process for the prompted and non-prompted models, encouraging the non-prompted model to follow a reasoning process similar to that of the prompted model. Extensive experiments on four real-world datasets demonstrate the effectiveness of our provided model in situations where visual prompts are scarce. Detailed instructions for using our framework are also provided.
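To make the comparison workflow concrete, below is a minimal sketch of a Flask endpoint in the spirit of the proposed framework. It is an illustrative assumption rather than the thesis's actual implementation: the /predict route name, the use of torchvision's resnet18 as a stand-in for the co-trained prompted and non-prompted models, and the element-wise input masking used as the prompting mechanism are all hypothetical.

# Minimal sketch of an attention-prompted prediction endpoint.
# Assumptions (not from the thesis): route name, resnet18 placeholders,
# and element-wise masking as the prompting mechanism.
import io

import torch
import torchvision.transforms as T
from flask import Flask, jsonify, request
from PIL import Image
from torchvision.models import resnet18

app = Flask(__name__)

# Two backbones: the "prompted" model consumes attention-masked inputs,
# the "non-prompted" model consumes raw inputs. In the thesis these are
# co-trained; untrained placeholders are loaded here for illustration.
prompted_model = resnet18(num_classes=2).eval()
nonprompted_model = resnet18(num_classes=2).eval()

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])

@app.route("/predict", methods=["POST"])
def predict():
    # "image" is the input photo; "mask" is an optional user-drawn
    # attention annotation (bright = attend, dark = ignore).
    image = Image.open(io.BytesIO(request.files["image"].read())).convert("RGB")
    x = preprocess(image).unsqueeze(0)  # shape (1, 3, 224, 224)

    with torch.no_grad():
        # Prediction without any explanation guidance.
        result = {"non_prompted": nonprompted_model(x).softmax(-1).tolist()}
        if "mask" in request.files:
            mask_img = Image.open(io.BytesIO(request.files["mask"].read())).convert("L")
            mask = preprocess(mask_img).unsqueeze(0)  # shape (1, 1, 224, 224)
            # Gate the input by the attention prompt and run the prompted model.
            result["prompted"] = prompted_model(x * mask).softmax(-1).tolist()
    return jsonify(result)

if __name__ == "__main__":
    app.run(debug=True)

A client can POST an image alone, or an image together with a mask, and compare the two returned probability vectors, e.g. curl -F image=@photo.png -F mask=@mask.png http://localhost:5000/predict (filenames hypothetical).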

Table of Contents

1 Introduction

2 Background

2.1 Attention-guided Learning

2.2 Attention Prompt

3 Approach

3.1 Problem Formulation

3.2 Implementation

3.3 Overall Framework

3.3.1 Loading Model and Data

3.3.2 Annotation

3.3.3 Result & Evaluation

3.4 Model

3.4.1 ResNet

3.4.2 Visual Attention-Prompted Learning

4 Experiments

4.1 Dataset

4.2 Experimental Setup

4.2.1 Comparison Methods

4.2.2 Implementation Details

5 Analysis

5.1 Results

5.2 Instruction for the Visual Attention-Prompted Prediction Framework

6 Conclusion

Bibliography

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
