Applied Sciences (Jul 2023)

Amortized Bayesian Meta-Learning with Accelerated Gradient Descent Steps

  • Zhewei Zhang,
  • Xuejing Li,
  • Shengjin Wang

DOI
https://doi.org/10.3390/app13158653
Journal volume & issue
Vol. 13, no. 15
p. 8653

Abstract

Recent meta-learning models often learn priors from observed tasks using a network optimized via stochastic gradient descent (SGD), which usually requires many training steps to converge. In this paper, we propose an accelerated Bayesian meta-learning structure with a stochastic inference network (ABML-SIN). The proposed model aims to improve the speed and efficiency of the Bayesian meta-learning training procedure. Current meta-learning approaches rarely converge within a few gradient descent steps, owing to the small number of training samples. We therefore introduce an accelerated gradient descent learning network based on a teacher–student architecture to learn the meta-latent variable θt for task t. With this amortized fast inference network, the meta-learner learns the task-specific latent θt within a few training steps, which improves its learning speed. To refine the latent variables generated by the transductive amortization network of the meta-learner, the SIN, followed by a conventional SGD-optimized network, is introduced as the student–teacher network that updates the parameters online. The SIN extracts the local latent variables and accelerates the convergence of the meta-learning network. Our experiments on simulated data demonstrate that the proposed method generalizes and scales to unseen samples, and it produces competitive or superior uncertainty estimates on few-shot learning tasks on two widely adopted 2D datasets with fewer training epochs than state-of-the-art meta-learning approaches. Furthermore, the parameters generated by the SIN act as perturbations on the latent weights, increasing the likelihood of accelerating the meta-learner's training. Extensive qualitative experiments show that our method performs well across different meta-learning tasks in both simulated and real-world settings.
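
A minimal sketch of the amortized-plus-refinement idea described in the abstract, assuming a PyTorch implementation: an inference network (the student) maps a task's support set to an initial task latent θt in a single forward pass, and a handful of SGD steps (standing in for the SGD-optimized teacher refinement) adjust the latent on the task loss. The network sizes, the toy sinusoid regression task, and helper names such as infer_task_latent are illustrative assumptions, not the authors' ABML-SIN code.

# Minimal sketch (not the authors' ABML-SIN implementation): an amortized
# inference network predicts a task-specific latent theta_t from a support
# set in one forward pass, and a few SGD steps refine it on the task loss.
# All sizes, the 1D regression task, and the step count are assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 16

class AmortizedInferenceNet(nn.Module):
    """Student: maps a support set (x, y) to an initial task latent."""
    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, latent_dim))

    def forward(self, x_support, y_support):
        # Encode each (x, y) pair, then average over the support set.
        pairs = torch.cat([x_support, y_support], dim=-1)
        return self.encoder(pairs).mean(dim=0)

class TaskDecoder(nn.Module):
    """Predicts y from x, conditioned on the task latent theta_t."""
    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1 + latent_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x, theta_t):
        z = theta_t.expand(x.shape[0], -1)
        return self.net(torch.cat([x, z], dim=-1))

def infer_task_latent(student, decoder, x_s, y_s, refine_steps=3, lr=0.1):
    """Amortized initialization followed by a few SGD refinement steps."""
    theta_t = student(x_s, y_s).detach().requires_grad_(True)
    for _ in range(refine_steps):
        loss = nn.functional.mse_loss(decoder(x_s, theta_t), y_s)
        grad, = torch.autograd.grad(loss, theta_t)
        theta_t = (theta_t - lr * grad).detach().requires_grad_(True)
    return theta_t

# Toy usage on a single sinusoid regression task (illustrative only).
student, decoder = AmortizedInferenceNet(), TaskDecoder()
x_s = torch.linspace(-1, 1, 5).unsqueeze(-1)
y_s = torch.sin(3 * x_s)
theta_t = infer_task_latent(student, decoder, x_s, y_s)
print(decoder(x_s, theta_t).shape)  # torch.Size([5, 1])

The key design point this sketch illustrates is that the per-task adaptation cost is dominated by one forward pass plus a few gradient steps on a low-dimensional latent, rather than many SGD steps over the full model parameters.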

Keywords