https://arxiv.org/pdf/2305.08848.pdf

In this paper, we propose Super In-Context Learning (SuperICL), which allows black-box LLMs to work with locally fine-tuned smaller models, resulting in superior performance on supervised tasks. Our experiments demonstrate that SuperICL can improve performance beyond state-of-the-art fine-tuned models while addressing the instability problem of in-context learning. Furthermore, SuperICL can enhance the capabilities of smaller models, such as multilinguality and interpretability.

Despite the impressive performance of these recently released models, their size and the limited accessibility of their model weights make it difficult to fine-tune them with supervised data, which is an effective way to adapt the models to specific tasks.

To address these limitations, we propose Super In-Context Learning (SuperICL), a novel approach that enables black-box language models (e.g., GPT-3.5) to work with locally fine-tuned smaller models (e.g., RoBERTa; Liu et al., 2019), resulting in improved performance on supervised tasks. SuperICL is designed to overcome the challenges of poor performance and instability of ICL.
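To make the idea more concrete, below is a minimal sketch of how a SuperICL-style prompt could be assembled: the plug-in model's prediction and confidence are attached to each in-context demonstration and to the test input, and the black-box LLM produces the final label. The function names, the toy classifier, and the exact prompt wording are illustrative assumptions, not the paper's actual template.

```python
# Sketch of SuperICL-style prompt construction (not the authors' code).
# `plugin_predict` stands in for a locally fine-tuned small model
# (e.g., RoBERTa) that returns a label and a confidence score.

from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, gold label)


def build_supericl_prompt(
    demos: List[Example],
    test_input: str,
    plugin_predict: Callable[[str], Tuple[str, float]],
) -> str:
    """Expose the plug-in model's predictions (with confidence) next to the
    gold labels of the demonstrations, then ask the black-box LLM to decide
    the final label for the test input."""
    lines = []
    for text, gold in demos:
        pred, conf = plugin_predict(text)
        lines.append(f"Input: {text}")
        lines.append(f"Plug-in prediction: {pred} (confidence: {conf:.2f})")
        lines.append(f"Label: {gold}")
        lines.append("")
    pred, conf = plugin_predict(test_input)
    lines.append(f"Input: {test_input}")
    lines.append(f"Plug-in prediction: {pred} (confidence: {conf:.2f})")
    lines.append("Label:")  # the black-box LLM completes this line
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy stand-in for a fine-tuned classifier.
    def toy_plugin(text: str) -> Tuple[str, float]:
        return ("positive", 0.93) if "great" in text.lower() else ("negative", 0.71)

    demos = [
        ("The movie was great fun.", "positive"),
        ("A dull, lifeless film.", "negative"),
    ]
    prompt = build_supericl_prompt(demos, "Great pacing but a weak ending.", toy_plugin)
    print(prompt)  # send this prompt to the black-box LLM (e.g., GPT-3.5)
```

The key design point is that the LLM sees both the plug-in's prediction and its confidence, so it can either defer to the small model or override it when the confidence is low or the prediction conflicts with the context.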


Different from these works, SuperICL demonstrates that smaller models can be integrated into large language models for supervised tasks. While orthogonal to these prior works, SuperICL reduces the need to select optimal examples from the training set, because the plug-in model is fine-tuned on the entire training set.
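As a concrete illustration of fine-tuning the plug-in model on the entire training set, here is a minimal sketch using Hugging Face Transformers. The choices of roberta-base and SST-2 are assumptions for illustration, not necessarily the paper's exact configuration.

```python
# Minimal sketch: fine-tune a small plug-in classifier on the full training
# set (no demonstration selection involved). Model/dataset choices are
# illustrative, not the paper's exact setup.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)


def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)


encoded = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="plugin-roberta-sst2",
        num_train_epochs=3,
        per_device_train_batch_size=32,
    ),
    train_dataset=encoded["train"],       # the entire training set
    eval_dataset=encoded["validation"],
)
trainer.train()
```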

Different from these works, our work operates in a classic supervised learning setting and demonstrates that even tasks like text classification, which is sometimes considered "solved" by smaller language models, can still benefit from being combined with a large language model.