IEEE Access (Jan 2024)

SE-HCL: Schema Enhanced Hybrid Curriculum Learning for Multi-Turn Text-to-SQL

  • Yiyun Zhang,
  • Sheng'an Zhou,
  • Gengsheng Huang

DOI
https://doi.org/10.1109/ACCESS.2024.3365522
Journal volume & issue
Vol. 12
pp. 39902 – 39912

Abstract

Existing multi-turn Text-to-SQL approaches mainly use data in a randomized order when training the model, ignoring the rich structural information contained in the dialog and schema. In this paper, we propose to use curriculum learning (CL) to better leverage the curriculum structure of schema, query, and dialog for multi-turn question-query pairs. We design a model-agnostic framework named Schema Enhanced Hybrid Curriculum Learning (SE-HCL) for multi-turn Text-to-SQL to help models gain a full contextual semantic understanding. Concretely, we measure the difficulty of the data from both a structural and a model perspective. In terms of data structure, we mainly consider the number of turns of the question and the complexity of the schema and SQL query. Accordingly, we design a data course module that dynamically adjusts the difficulty of the data based on the convergence of the model and on our schema enhancement method. In terms of the model, we propose a scoring module that judges the difficulty of a question based on whether the model can solve it effectively. Finally, we combine both aspects in a hybrid curriculum that determines the flow of model training. Our experiments show that the proposed method improves SQL generation performance over previous state-of-the-art models on SparC and CoSQL, especially for hard and long-turn questions.
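
The abstract does not give formulas, so the following is only a minimal sketch of the general idea of a hybrid curriculum, not the SE-HCL implementation. It assumes a structural difficulty built from the dialog turn, a rough SQL complexity count, and schema size, mixed with a model-side score (here a per-example loss). Every name, weight, and the selection rule (`Example`, `structural_difficulty`, `hybrid_difficulty`, `curriculum_subset`, `alpha`, `competence`) is hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Example:
    question: str
    sql: str
    turn: int            # 1-based position of the question in its dialog
    schema_columns: int  # number of columns in the database schema
    model_loss: float    # assumed per-example loss from the current model

# Assumption: these clauses serve as a crude proxy for SQL complexity.
SQL_KEYWORDS = ("JOIN", "GROUP BY", "ORDER BY", "HAVING",
                "UNION", "INTERSECT", "EXCEPT")

def structural_difficulty(ex: Example) -> float:
    """Illustrative structural score: turn + SQL complexity + schema size."""
    sql = ex.sql.upper()
    keyword_hits = sum(sql.count(k) for k in SQL_KEYWORDS)
    nested = max(sql.count("SELECT") - 1, 0)  # extra SELECTs beyond the first
    return ex.turn + keyword_hits + 2 * nested + ex.schema_columns / 10

def hybrid_difficulty(ex: Example, alpha: float = 0.5) -> float:
    """Mix the structural view and the model view (alpha is an assumption)."""
    return alpha * structural_difficulty(ex) + (1 - alpha) * ex.model_loss

def curriculum_subset(examples, competence: float, alpha: float = 0.5):
    """Return the easiest fraction of the data; competence lies in [0, 1]."""
    competence = min(max(competence, 0.0), 1.0)
    ranked = sorted(examples, key=lambda e: hybrid_difficulty(e, alpha))
    k = max(1, math.ceil(competence * len(ranked)))
    return ranked[:k]
```

In a training loop, `competence` would grow as the model converges, so that early epochs see only short-turn, simple queries and later epochs see the full data set.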

Keywords