How to Fine-Tune Specialized Large Models with Unsloth

A comparison of fine-tuning, RAG, and distillation, and how they differ.

AI程序猿人

2,470 views · Published 2025-02-20 11:50:40

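Learning objectives:

● Understand the basic process of fine-tuning a large model

● Understand the methods and tools used for fine-tuning

● Compare fine-tuning, distillation, and RAG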

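Tools and prerequisites

● Dataset

● Fine-tuning tools: Unsloth and LLaMA-Factory

● Pretrained model and method: for example DeepSeek R1 with LoRA or QLoRA

● Runtime environment: a GPU environment (private or public cloud)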

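Basic concepts

● What is fine-tuning

Fine-tuning takes an existing pretrained model and further adjusts and optimizes its parameters on a task-specific dataset, producing a new model that is better adapted to that particular task.

● What fine-tuning is for

○ Updating knowledge: introduce new domain-specific information.

○ Customizing behavior: adjust the model's tone, personality, or response style.

○ Optimizing for a task: improve accuracy and relevance for a specific use case.

● Fine-tuning methods

● Datasets and dataset formats

A dataset is the collection of data used to train the model.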

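The full fine-tuning process

● Prepare the environment and dependencies

# Operating system
Linux, or Windows via WSL
# NVIDIA GPU and CUDA
An NVIDIA GPU released in 2018 or later
CUDA compute capability 7.0+
# Dependencies and other requirements
The device must support xformers, torch, bitsandbytes, and triton
# For memory estimates, see: https://docs.unsloth.ai/get-started/beginner-start-here/unsloth-requirements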
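
The GPU requirement can be checked before installing anything else. The following is a minimal sketch, assuming the 7.0 threshold above refers to the CUDA compute capability of the card and that PyTorch may or may not be installed yet; the helper names `meets_minimum` and `detect_capability` are made up for this illustration.

```python
# Minimal sketch: compare a GPU's CUDA compute capability with the 7.0 minimum.
MIN_CAPABILITY = (7, 0)  # assumption: the "7.0+" requirement is a compute capability


def meets_minimum(capability, minimum=MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability is at least `minimum`."""
    return tuple(capability) >= tuple(minimum)


def detect_capability():
    """Return the (major, minor) capability of GPU 0, or None if it cannot be read."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    else:
        verdict = "OK" if meets_minimum(cap) else "below the 7.0 minimum"
        print(f"Compute capability {cap[0]}.{cap[1]}: {verdict}")
```

Keeping the comparison in a separate function makes it possible to check a card's published capability by hand, for example `meets_minimum((6, 1))` for an older GPU.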

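● Install the fine-tuning tool and dependencies

# Install unsloth
pip install --upgrade pip  # check your Python environment and upgrade pip
pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git" # install from the GitHub repository; the cu121-torch240 extra targets CUDA 12.1 with torch 2.4.0

📢 Note: installation errors vary from one environment to another. It is best to start from a fresh, up-to-date system, for example Ubuntu upgraded to the latest version and patches; otherwise it is easy to waste a lot of time.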

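● Prepare the dataset

Examples are shown below: first the fields of an instruction-style record, then a multi-turn conversation record.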

"Instruction": "Task we want the model to perform."
"Input": "Optional, but useful, it will essentially be the user's query."
"Output": "The expected result of the task and the output of the model."
{
  "conversations": [
    {
      "from": "human",
      "value": "Can you help me make pasta carbonara?"
    },
    {
      "from": "assistant",
      "value": "Would you like the traditional Roman recipe, or a simpler version?"
    },
    {
      "from": "human",
      "value": "The traditional version please"
    },
    {
      "from": "assistant",
      "value": "The authentic Roman carbonara uses just a few ingredients: pasta, guanciale, eggs, Pecorino Romano, and black pepper. Would you like the detailed recipe?"
    }
  ]
}
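
The two layouts above carry the same information, so one can be converted into the other. A minimal sketch, assuming the field names shown above (`Instruction`, `Input`, `Output`, `conversations`, `from`, `value`); the function name and the sample record are invented for illustration.

```python
import json


def instruction_to_conversation(record):
    """Convert one instruction-style record into the conversation layout."""
    prompt = record["Instruction"]
    if record.get("Input"):  # "Input" is optional, so only append it when present
        prompt = f"{prompt}\n\n{record['Input']}"
    return {
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "assistant", "value": record["Output"]},
        ]
    }


# Made-up sample record, used only to show the shape of the result.
sample = {
    "Instruction": "Translate the sentence into French.",
    "Input": "Good morning.",
    "Output": "Bonjour.",
}
converted = instruction_to_conversation(sample)
print(json.dumps(converted, ensure_ascii=False, indent=2))
```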

● Data format and tokenization
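
Before tokenization, each record is usually rendered into a single training string. The sketch below assumes an Alpaca-style prompt template and a placeholder end-of-sequence marker; the article does not name a template, so `PROMPT_TEMPLATE`, `EOS_TOKEN`, and `format_example` are illustrative assumptions rather than the author's setup.

```python
# Alpaca-style prompt template (an assumption: the article does not name one).
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

EOS_TOKEN = "</s>"  # placeholder; with a real tokenizer use tokenizer.eos_token


def format_example(record, eos_token=EOS_TOKEN):
    """Render one instruction-style record as a single training string."""
    text = PROMPT_TEMPLATE.format(
        instruction=record["Instruction"],
        input=record.get("Input", ""),
        output=record["Output"],
    )
    return text + eos_token  # the EOS marker shows the model where a reply ends


# Made-up sample record, used only to show the rendered text.
sample = {
    "Instruction": "Translate the sentence into French.",
    "Input": "Good morning.",
    "Output": "Bonjour.",
}
training_text = format_example(sample)
print(training_text)
```

A tokenizer then maps this string to token IDs; with a Hugging Face tokenizer that would typically be `tokenizer(training_text)["input_ids"]`.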

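● Choose a model and a fine-tuning method

○ Choosing a model

The official Unsloth documentation recommends the Llama 3.1 instruct models first; many variants are available on ModelScope and Hugging Face, so you can pick your own. In practice the choice depends on the modality (image or text), the amount of data, and the available resources. Spelling all of that out would be tedious, so it is summarized in a table for you to check against.

The two tables come from the official Unsloth documentation. In our own tests we currently use DeepSeek V3 and R1; you can of course also use the Llama or Qwen versions. It depends on the kind of training you want to do. The community offers plenty of options, so look up the details yourself.

○ Choosing a method

Unsloth officially recommends QLoRA and LoRA; which one fits depends on the amount of data and the domain of your knowledge base.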
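
To see why these methods are cheap, recall that LoRA freezes the original weight matrix and trains two small low-rank matrices instead, and QLoRA does the same while keeping the frozen base weights in 4-bit precision. The arithmetic below is a sketch with illustrative numbers (a 4096 x 4096 projection and rank 16 are assumptions, not values from the article).

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_out, d_in):
    """Parameters in the full (frozen) weight matrix."""
    return d_out * d_in


d_model = 4096  # illustrative hidden size
rank = 16       # illustrative LoRA rank

lora = lora_trainable_params(d_model, d_model, rank)
full = full_params(d_model, d_model)
print(f"LoRA adapter: {lora:,} params; full matrix: {full:,} params "
      f"({100 * lora / full:.2f}% of the full matrix)")
```

With these numbers the adapter holds 131,072 parameters against 16,777,216 in the full matrix, which is under 1% of the weights for that layer.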

● Configure the parameters

The LoRA parameters are described in the original documentation, linked here:

https://docs.unsloth.ai/get-started/beginner-start-here/lora-parameters-encyclopedia
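
As a rough illustration of what such a configuration looks like, the sketch below collects a set of LoRA settings in a dictionary and passes them to Unsloth's `FastLanguageModel.get_peft_model`. The specific values are common starting points and are assumptions, not settings given in this article; `attach_lora` and `lora_scaling` are hypothetical helper names, and `attach_lora` needs Unsloth and a GPU to actually run.

```python
# Illustrative LoRA settings (assumed starting points, not values from the article).
LORA_KWARGS = dict(
    r=16,                # rank of the low-rank update matrices
    lora_alpha=16,       # scaling numerator; the update is scaled by lora_alpha / r
    lora_dropout=0.0,    # dropout applied to the adapter inputs
    bias="none",         # do not train bias terms
    target_modules=[     # attention and MLP projections to adapt
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",  # trades compute for lower memory use
    random_state=3407,   # seed for reproducibility
)


def lora_scaling(kwargs=LORA_KWARGS):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return kwargs["lora_alpha"] / kwargs["r"]


def attach_lora(model):
    """Wrap an already loaded model with LoRA adapters (requires unsloth and a GPU)."""
    from unsloth import FastLanguageModel  # imported lazily: heavy, GPU-only dependency
    return FastLanguageModel.get_peft_model(model, **LORA_KWARGS)


print(f"LoRA scaling factor: {lora_scaling()}")
```

Raising `r` adds capacity and memory cost, while the ratio `lora_alpha / r` controls how strongly the adapter influences the frozen weights.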

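● Run the fine-tuning and evaluate

With the data and environment ready and the parameters set, run the program to start fine-tuning, and monitor progress by watching the training loss.

# What is training loss
The loss value of the model on the training dataset during training; it measures the difference between the model's predictions and the actual labels.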
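
For language models this loss is typically the cross-entropy, that is, the average negative log of the probability the model assigned to each correct token. A minimal sketch in plain Python, with made-up probabilities standing in for an early and a late training step:

```python
import math


def cross_entropy(probs_for_correct_token):
    """Mean negative log-likelihood over the per-token probabilities given."""
    total = sum(math.log(p) for p in probs_for_correct_token)
    return -total / len(probs_for_correct_token)


# Made-up probabilities assigned to the correct tokens at two points in training.
early_step = cross_entropy([0.10, 0.20, 0.05])  # model is mostly wrong
late_step = cross_entropy([0.80, 0.90, 0.70])   # model is mostly right

print(f"early loss: {early_step:.3f}, late loss: {late_step:.3f}")
```

A training loss that falls over time means the model is assigning higher probability to the training targets; it says nothing by itself about performance on unseen data.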

Evaluation

● Export and merge the fine-tuned model

Save the model locally:

model.save_pretrained_merged("merged_model", tokenizer, save_method = "merged_16bit",)  # save in 16-bit format

Then run in a terminal:

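# Note: these steps follow the older llama.cpp layout; recent checkouts build with CMake and name the script convert_hf_to_gguf.py
git clone --recursive https://github.com/ggerganov/llama.cpp
make clean -C llama.cpp
make all -j -C llama.cpp
pip install gguf protobuf

# Replace FOLDER with the merged model directory and OUTPUT with the target .gguf file
python llama.cpp/convert-hf-to-gguf.py FOLDER --outfile OUTPUT --outtype f16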
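
When the export is scripted, the conversion command can be assembled in Python. This is a sketch under stated assumptions: the script path is a parameter because the file name differs between llama.cpp versions, `merged_model` is the directory written by the save step above, and the output file name and the helper `gguf_convert_command` are made up for illustration.

```python
import shlex


def gguf_convert_command(model_dir, outfile, outtype="f16",
                         script="llama.cpp/convert-hf-to-gguf.py"):
    """Build the argument list for the Hugging Face to GGUF conversion script."""
    return ["python", script, model_dir, "--outfile", outfile, "--outtype", outtype]


cmd = gguf_convert_command("merged_model", "merged_model-f16.gguf")
print(shlex.join(cmd))  # shows the command without running it

# To actually run the conversion:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when the model directory contains spaces.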