From f03f8c5de81ca552d5b1f778cab0e70b91ff0d88 Mon Sep 17 00:00:00 2001
From: Rucliuzenghao <18253019739@163.com>
Date: Wed, 18 Sep 2024 11:27:51 +0800
Subject: [PATCH] update_21

---
 .../index.md | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/content/publication/21-Efficient Model Store and Reuse in an OLML Database System/index.md b/content/publication/21-Efficient Model Store and Reuse in an OLML Database System/index.md
index 7a0d959..11c662c 100644
--- a/content/publication/21-Efficient Model Store and Reuse in an OLML Database System/index.md
+++ b/content/publication/21-Efficient Model Store and Reuse in an OLML Database System/index.md
@@ -26,13 +26,7 @@ publication_types: ['journal-article']
 publication: in *JCST*
 publication_short: ""
 
-abstract: Deep learning has shown significant improvements on various machine learning tasks by introducing a wide
-spectrum of neural network models. Yet, for these neural network models, it is necessary to label a tremendous amount
-of training data, which is prohibitively expensive in reality. In this paper, we propose OnLine Machine Learning (OLML)
-database which stores trained models and reuses these models in a new training task to achieve a better training effect
-with a small amount of training data. An efficient model reuse algorithm AdaReuse is developed in the OLML database.
-Specifically, AdaReuse firstly estimates the reuse potential of trained models from domain relatedness and model quality,
-through which a group of trained models with high reuse potential for the training task could be selected efficiently. Then,multi selected models will be trained iteratively to encourage diverse models, with which a better training effect could be achieved by ensemble. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show AdaReuse could improve the training effect significantly compared with models training from scratch when the training data is limited. Based on AdaReuse, we implement an OLML database prototype system which could accept a training task as an SQL-like query and automatically generate a training plan by selecting and reusing trained models. Usability studies are conducted to illustrate the OLML database could properly store the trained models, and reuse the trained models efficiently in new training tasks.
+abstract: Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network models. Yet, these neural network models require a tremendous amount of labeled training data, which is prohibitively expensive in reality. In this paper, we propose an OnLine Machine Learning (OLML) database, which stores trained models and reuses them in new training tasks to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected efficiently. Then, multiple selected models are trained iteratively to encourage model diversity, with which a better training effect can be achieved by ensembling. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse improves the training effect significantly compared with models trained from scratch when training data is limited. Based on AdaReuse, we implement an OLML database prototype system that accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained models. Usability studies are conducted to illustrate that the OLML database can properly store trained models and reuse them efficiently in new training tasks.
 
 # Summary. An optional shortened abstract.
 summary: ""