AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification

Huang, Yongxin; Wang, Kexin; Dutta, Sourav; Patel, Raj Nath; Glavaš, Goran; Gurevych, Iryna

Computer Science > Computation and Language

arXiv:2311.00408 (cs)

[Submitted on 1 Nov 2023]

Title: AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification

Authors: Yongxin Huang, Kexin Wang, Sourav Dutta, Raj Nath Patel, Goran Glavaš, Iryna Gurevych

Abstract: Recent work has found that few-shot sentence classification based on pre-trained Sentence Encoders (SEs) is efficient, robust, and effective. In this work, we investigate strategies for domain-specialization in the context of few-shot sentence classification with SEs. We first establish that unsupervised Domain-Adaptive Pre-Training (DAPT) of a base Pre-trained Language Model (PLM) (i.e., not an SE) substantially improves the accuracy of few-shot sentence classification by up to 8.4 points. However, applying DAPT on SEs, on the one hand, disrupts the effects of their (general-domain) Sentence Embedding Pre-Training (SEPT). On the other hand, applying general-domain SEPT on top of a domain-adapted base PLM (i.e., after DAPT) is effective but inefficient, since the computationally expensive SEPT needs to be executed on top of a DAPT-ed PLM of each domain. As a solution, we propose AdaSent, which decouples SEPT from DAPT by training a SEPT adapter on the base PLM. The adapter can be inserted into DAPT-ed PLMs from any domain. We demonstrate AdaSent's effectiveness in extensive experiments on 17 different few-shot sentence classification datasets. AdaSent matches or surpasses the performance of full SEPT on DAPT-ed PLM, while substantially reducing the training costs. The code for AdaSent is available.
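The decoupling described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the `Backbone` and `Adapter` classes, the elementwise arithmetic, and the domain names are all hypothetical stand-ins, and real adapters are small trainable layers inside a transformer rather than a single vector. The sketch only shows the structural idea that adapter parameters live apart from backbone parameters, so one adapter trained on the base PLM can be combined with several domain-adapted backbones without repeating SEPT.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]


@dataclass(frozen=True)
class Backbone:
    """Hypothetical stand-in for a PLM: one scale factor per hidden dimension."""
    name: str
    scale: Vector

    def forward(self, x: Vector) -> Vector:
        return [s * v for s, v in zip(self.scale, x)]


@dataclass(frozen=True)
class Adapter:
    """Hypothetical stand-in for a SEPT adapter: a residual correction."""
    delta: Vector

    def forward(self, h: Vector) -> Vector:
        # Residual form: the adapter adds to the backbone output instead of replacing it.
        return [v + d * v for v, d in zip(h, self.delta)]


def encode(backbone: Backbone, adapter: Adapter, x: Vector) -> Vector:
    """Run the backbone, then the plugged-in adapter."""
    return adapter.forward(backbone.forward(x))


# Assumption: these values pretend the adapter was trained once on the base PLM
# with the backbone frozen (the single, expensive SEPT run).
sept_adapter = Adapter(delta=[0.5, -0.5])

# Assumption: each domain backbone pretends to be the base PLM after DAPT.
domain_backbones: Dict[str, Backbone] = {
    "medical": Backbone("medical-dapt", [2.0, 1.0]),
    "legal": Backbone("legal-dapt", [1.0, 3.0]),
}

# The same adapter is reused for every domain; no per-domain SEPT is needed.
embeddings = {
    domain: encode(backbone, sept_adapter, [1.0, 1.0])
    for domain, backbone in domain_backbones.items()
}

sept_runs_full = len(domain_backbones)  # full SEPT: one run per domain
sept_runs_adapter = 1                   # adapter approach: one run in total
```

In this picture, adding a new domain costs one DAPT run for the backbone, while the SEPT adapter is reused unchanged.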

Comments:	Accepted at the EMNLP 2023 main conference
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2311.00408 [cs.CL]
	(or arXiv:2311.00408v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.00408

Submission history

From: Yongxin Huang
[v1] Wed, 1 Nov 2023 10:00:15 UTC (200 KB)
