What Problem Does ServiceNow Solve with RAG?
I once briefly looked into ServiceNow's products and learned that the company mainly builds enterprise process tooling around ITSM and low-code.
The context is an enterprise company that deploys several GenAI applications that currently rely or will rely on RAG: Flow Generation, Playbook Generation, and Code Generation. Workflows are step-by-step processes that automate one or more goals while playbooks contain workflows and other UI components such as forms. The objective of these applications is to generate domain-specific workflows, playbooks, and code from textual input.
The background here is an enterprise that has deployed (or plans to deploy) many GenAI applications that rely on RAG: Flow generation, Playbook generation, and Code generation. A workflow is a step-by-step process that automates one or more goals, while a playbook contains workflows plus other UI components such as forms. The goal of these GenAI applications is to generate domain-specific workflows, playbooks, and code from textual input.
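To make the RAG part concrete, here is a minimal, self-contained sketch of how retrieval can ground flow generation. The step catalog, the word-overlap scorer, and the helper names (STEP_CATALOG, retrieve_steps, build_generation_prompt) are hypothetical illustrations, not ServiceNow's actual pipeline; a real system would use a dense retriever over the platform's own catalog and send the prompt to an LLM.

```python
# Toy in-memory catalog of workflow steps; a real deployment would index the
# platform's own step / component catalog with a dense retriever.
STEP_CATALOG = {
    "send_email":    "Send Email - sends a notification email to a recipient",
    "create_task":   "Create Task - creates a task record assigned to a user or group",
    "ask_approval":  "Ask For Approval - requests approval from a user before continuing",
    "update_record": "Update Record - updates fields on an existing record",
}

def retrieve_steps(query: str, k: int = 2) -> list[str]:
    """Rank catalog entries by naive word overlap with the query (stand-in for a dense retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        STEP_CATALOG.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [desc for _, desc in scored[:k]]

def build_generation_prompt(query: str) -> str:
    """Ground the LLM prompt in retrieved, domain-specific steps (the 'RAG' part)."""
    context = "\n".join(f"- {s}" for s in retrieve_steps(query))
    return (
        "You generate workflows using only the steps listed below.\n"
        f"Available steps:\n{context}\n\n"
        f"User request: {query}\nWorkflow:"
    )

print(build_generation_prompt("send an approval request and then email the requester"))
```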
The problem we are trying to solve is then: how to adapt the retrieval step to a specific domain and to a variety of retrieval tasks?
The problem ServiceNow wants to solve is: how to adapt the retrieval step of the RAG pipeline to domain-specific knowledge, and to a variety of retrieval tasks.
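One common way to serve "a variety of retrieval tasks" with a single embedding model is to prepend a task-specific instruction to every query, which is exactly what a long context window makes comfortable. A small sketch of that idea follows; the instruction texts and the "Instruct:/Query:" format are assumptions borrowed from instruction-tuned retrievers, not the paper's exact wording.

```python
# Hypothetical task instructions; the paper's actual instruction wording is not shown here.
TASK_INSTRUCTIONS = {
    "step":  "Given a description of an automation goal, retrieve the workflow step that accomplishes it.",
    "field": "Given a description of a data attribute, retrieve the matching table field.",
    "table": "Given a description of a business record, retrieve the matching table.",
}

def build_query(task: str, user_text: str) -> str:
    """Prepend the task instruction so one retriever can serve several retrieval tasks."""
    return f"Instruct: {TASK_INSTRUCTIONS[task]}\nQuery: {user_text}"

print(build_query("step", "send an approval email to the requester's manager"))
```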
For our solution, we fine-tune mGTE because of its large context length (8,192 tokens), allowing it to receive long instructions, and because of its multilingual capabilities. We compare it with BM25, a simple yet powerful term-frequency method, and with recent open source multilingual embedding models: mE5 and mGTE.
ServiceNow's approach is to fine-tune the mGTE model: its 8k-token context length lets it accept long instructions, and it is multilingual. They compare the fine-tuned model against BM25 and the off-the-shelf mE5 and mGTE models.
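Below is a minimal sketch of what fine-tuning mGTE for domain retrieval could look like with the sentence-transformers library. The checkpoint name Alibaba-NLP/gte-multilingual-base, the toy (query, positive) pairs, and the in-batch-negatives loss are assumptions on my part; the paper's actual training data and objective may differ.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Hypothetical (query, positive passage) pairs; real training data would be
# mined from domain catalogs (step / field / table descriptions).
train_pairs = [
    ("retrieve step: send an approval email to the requester's manager",
     "Step: Send Email - sends a notification email to a chosen recipient"),
    ("retrieve field: the column storing incident priority",
     "Field: priority - integer column on the incident table"),
]
examples = [InputExample(texts=[q, p]) for q, p in train_pairs]
loader = DataLoader(examples, shuffle=True, batch_size=2)

# Assumed public mGTE base checkpoint; 8,192-token context window.
model = SentenceTransformer("Alibaba-NLP/gte-multilingual-base", trust_remote_code=True)

# In-batch-negatives contrastive loss, a common choice for retriever fine-tuning.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("mgte-domain-retriever")
```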
Evaluation of the Fine-Tuned Model
Multi-Task Retrieval
Table 3 shows that downsampling the data causes an improvement of 8% in step retrieval with a small loss in field retrieval, on a fine-tuned mGTE-base model.
These findings suggest that when fine-tuning a retriever for RAG, one needs to be careful about the dataset make-up.
Downsampling improves step retrieval by 8% while causing a small drop in field retrieval. This shows that when working with domain-specific data, you need to watch for imbalance in the data distribution (see the sketch below).
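As a hedged sketch of the kind of per-task downsampling this points at, the helper below caps the number of training examples per retrieval task. The function name downsample_by_task, the example schema, and the cap value are hypothetical; the paper's exact recipe is not reproduced here.

```python
import random
from collections import defaultdict

def downsample_by_task(examples, cap):
    """Cap the number of training examples kept for any single retrieval task.

    examples: list of dicts like {"task": "step", "query": ..., "positive": ...}
    cap: maximum number of examples to keep per task.
    """
    by_task = defaultdict(list)
    for ex in examples:
        by_task[ex["task"]].append(ex)

    balanced = []
    for task, items in by_task.items():
        if len(items) > cap:
            items = random.sample(items, cap)  # drop surplus examples at random
        balanced.extend(items)
    random.shuffle(balanced)
    return balanced
```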
Comparison with Baseline Models
In the OOD setting (outside the training data's domain), the fine-tuned model is compared against the baselines: its retrieval recall across the different scenarios beats BM25, mE5 small/base/large, and mGTE-base, with improvements of 16% to 26%.
On the multilingual side, since the base model mGTE-base supports multiple languages and the training data was also made multilingual via translation, recall accuracy improves across languages as well.
As for generalization, the model also improves recall accuracy in the workflow scenario, showing that it generalizes beyond the tasks it was tuned on.
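For reference, a recall@k helper of the kind typically used to compare retrievers such as BM25, mE5, and the fine-tuned model; the exact metric definition used in the paper is assumed rather than quoted.

```python
def recall_at_k(ranked_ids_per_query, gold_ids_per_query, k=10):
    """Fraction of queries for which at least one gold item appears in the top-k results."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_ids_per_query, gold_ids_per_query)
        if any(g in ranked[:k] for g in gold)
    )
    return hits / len(gold_ids_per_query)

# Hypothetical usage: score two retrievers on the same queries and gold labels, e.g.
# recall_at_k(bm25_rankings, gold_labels, k=10) vs. recall_at_k(mgte_rankings, gold_labels, k=10)
```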