GraphRAG-Ollama: Building a Local, Accurate Global Question-Answering System!

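Introduction

RAG is currently one of the best-known techniques around large language models: it retrieves facts from an external knowledge base so that the large language model (LLM) can work with the most accurate and up-to-date information. But RAG is not perfect, and using it well still poses many challenges. For example, RAG fails when a global question is asked about an entire document collection. Such a question is essentially a query-focused summarization task rather than an explicit retrieval task, while RAG first has to retrieve a few chunks through an index. In addition, because of the LLM's limited context window, intermediate and relational information is inevitably lost.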

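To address these problems, Microsoft proposed the Graph RAG approach, which uses an LLM to build a graph-based text index in two stages: first it derives an entity knowledge graph from the source documents, then it pre-generates community summaries for all groups of closely related entities. Given a question, each community summary is used to generate a partial response, and all partial responses are then summarized into the final response. For a class of global sensemaking questions over datasets in the 1-million-token range, Microsoft reports that Graph RAG substantially improves the comprehensiveness and diversity of generated answers over a naive RAG baseline.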
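The question-answering flow described above (one partial response per community summary, then a final summarization) can be sketched as a small map-reduce. This is only a sketch under assumptions: `global_answer`, the prompt wording, and `fake_llm` are hypothetical names made up for illustration, not the GraphRAG implementation, and `fake_llm` stands in for a real model call.

```python
from typing import Callable, List


def global_answer(
    question: str,
    community_summaries: List[str],
    llm: Callable[[str], str],
) -> str:
    """Answer a global question by map-reduce over community summaries."""
    # Map step: each community summary produces one partial response.
    partial_answers = [
        llm(f"Summary:\n{summary}\n\nQuestion: {question}\nPartial answer:")
        for summary in community_summaries
    ]
    # Reduce step: summarize all partial responses into the final response.
    joined = "\n".join(f"- {answer}" for answer in partial_answers)
    return llm(f"Partial answers:\n{joined}\n\nQuestion: {question}\nFinal answer:")


# Hypothetical stand-in for a model: records each prompt and returns a numbered reply.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"answer{len(calls)}"


result = global_answer("What are the main themes?", ["summary A", "summary B", "summary C"], fake_llm)
print(len(calls), result)
```

With three community summaries the model is called four times: three map calls and one reduce call. This also hints at why the approach is token-hungry.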

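However, Graph RAG uses an LLM to extract graph entities and summaries from the source documents and to build the graph index, which consumes a very large number of tokens. We did the math: with GPT-4o, for a document of roughly 50,000 characters, the Graph RAG sample code uses about 270,000 tokens to build the graph text index and about 10,000 tokens per question. A single test run is expected to cost 2 to 4 US dollars, which is far too expensive!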
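The arithmetic behind that estimate can be reproduced roughly as follows. The token counts come from the paragraph above; the two per-million-token prices are placeholder assumptions for illustration only (the article does not state which prices it used), so check current pricing before relying on the result.

```python
def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a number of tokens at a given price per one million tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens


INDEX_TOKENS = 270_000  # from the article: building the graph index
QUERY_TOKENS = 10_000   # from the article: one question

# ASSUMPTION: placeholder prices, not taken from the article.
ASSUMED_INPUT_PRICE = 5.0    # USD per million input tokens
ASSUMED_OUTPUT_PRICE = 15.0  # USD per million output tokens

total_tokens = INDEX_TOKENS + QUERY_TOKENS
# Lower bound: every token billed as input. Upper bound: every token billed as output.
low = cost_usd(total_tokens, ASSUMED_INPUT_PRICE)
high = cost_usd(total_tokens, ASSUMED_OUTPUT_PRICE)
print(f"{total_tokens} tokens cost between ${low:.2f} and ${high:.2f}")
```

Under these assumed prices the bounds bracket the article's figure of 2 to 4 US dollars; the real cost depends on the actual split between input and output tokens.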

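Best Practice

To make it easier for more people to try Graph RAG, this article runs a local model + Ollama + GraphRAG on the free Notebook compute provided by the ModelScope community.

Reference project: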

https://github.com/TheAiSingularity/graphrag-local-ollama

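Code Walkthrough

The project mainly modifies the file graphrag-local-ollama/graphrag/llm/openai/openai_embeddings_llm.py, changing the embedding call from the OpenAI format to the Ollama format. You can also clone the official code and make the modification below yourself, or use a library that supports the OpenAI embedding API format, such as Text-embedding-inference.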

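import ollama  # added for the Ollama embedding call; the file's existing imports stay as they are

class OpenAIEmbeddingsLLM(BaseLLM[EmbeddingInput, EmbeddingOutput]):
    """Embeddings LLM that keeps the OpenAI class name but calls Ollama."""

    _client: OpenAIClientTypes
    _configuration: OpenAIConfiguration

    def __init__(self, client: OpenAIClientTypes, configuration: OpenAIConfiguration):
        self._client = client
        self._configuration = configuration

    async def _execute_llm(
        self, input: EmbeddingInput, **kwargs: Unpack[LLMInput]
    ) -> EmbeddingOutput | None:
        # Kept from the original OpenAI code path; not used by the Ollama call below.
        args = {
            "model": self._configuration.model,
            **(kwargs.get("model_parameters") or {}),
        }
        embedding_list = []
        for inp in input:
            # The embedding model name is hard-coded here, so the model set in the
            # configuration is ignored; edit this line if you use another model.
            embedding = ollama.embeddings(model="nomic-embed-text", prompt=inp)
            embedding_list.append(embedding["embedding"])
        return embedding_list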
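For readers who want to see what the Ollama-format call looks like without the `ollama` Python package, here is a minimal standard-library sketch. It assumes Ollama is listening on its default local address and uses the `/api/embeddings` REST endpoint; the helper names (`build_embedding_request`, `embed_one`, `embed_all`) are hypothetical and are not part of the project.

```python
import json
import urllib.request
from typing import Callable, List, Sequence

OLLAMA_URL = "http://localhost:11434"  # ASSUMPTION: Ollama's default local address


def build_embedding_request(
    model: str, prompt: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build the POST request for Ollama's embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed_one(model: str, prompt: str) -> List[float]:
    """Send one text to a running Ollama server and return its embedding vector."""
    with urllib.request.urlopen(build_embedding_request(model, prompt)) as resp:
        return json.loads(resp.read())["embedding"]


def embed_all(
    texts: Sequence[str],
    model: str,
    embed: Callable[[str, str], List[float]] = embed_one,
) -> List[List[float]]:
    """Embed texts one by one, mirroring the loop in the patched class."""
    return [embed(model, text) for text in texts]
```

Unlike the patched class, this sketch takes the model name as a parameter instead of hard-coding it. Calling `embed_all(["some text"], "nomic-embed-text")` requires a running Ollama server with that model pulled.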

Model Configuration

Install Ollama

# Download the Ollama installation package directly from ModelScope
modelscope download --model=modelscope/ollama-linux --local_dir ./ollama-linux
# Run the Ollama installation script
cd ollama-linux
sudo chmod 777 ./ollama-modelscope-install.sh
./ollama-modelscope-install.sh

For the embedding model, use nomic-embed-text, which is available directly from Ollama

ollama pull nomic-embed-text  #embedding

For the LLM, use Mistral-7B-Instruct-v0.3 from ModelScope

Model link:

https://modelscope.cn/models/LLM-Research/Mistral-7B-Instruct-v0.3-GGUF

modelscope download --model=LLM-Research/Mistral-7B-Instruct-v0.3-GGUF --local_dir . Mistral-7B-Instruct-v0.3.fp16.gguf

Create the ModelFile

FROM /mnt/workspace/Mistral-7B-Instruct-v0.3.fp16.gguf
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
TEMPLATE """{{- if .Messages }}
{{- range $index, $_ := .Messages }}
{{- if eq .Role "user" }}
{{- if and (eq (len (slice $.Messages $index)) 1) $.Tools }}[AVAILABLE_TOOLS] {{ $.Tools }}[/AVAILABLE_TOOLS]
{{- end }}[INST] {{ if and $.System (eq (len (slice $.Messages $index)) 1) }}{{ $.System }}
{{ end }}{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if .Content }} {{ .Content }}
{{- else if .ToolCalls }}[TOOL_CALLS] [
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{- end }}]
{{- end }}</s>
{{- else if eq .Role "tool" }}[TOOL_RESULTS] {"content": {{ .Content }}} [/TOOL_RESULTS]
{{- end }}
{{- end }}
{{- else }}[INST] {{ if .System }}{{ .System }}
{{ end }}{{ .Prompt }}[/INST]
{{- end }} {{ .Response }}
{{- if .Response }}</s>
{{- end }}"""
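To make the template easier to read, the following sketch mimics what its simplest branch (a single prompt with no message history) appears to render. It is an approximation written by hand from the template text above, not Ollama's template engine, and `render_single_turn` is a hypothetical helper name.

```python
from typing import Optional


def render_single_turn(prompt: str, system: Optional[str] = None, response: str = "") -> str:
    """Approximate the prompt/response branch of the Mistral template above."""
    text = "[INST] "
    if system:
        # The system prompt goes inside the [INST] block, followed by a newline.
        text += system + "\n"
    text += prompt + "[/INST]"
    # The template puts one space between [/INST] and the response.
    text += " " + response
    if response:
        # A completed response is closed with the end-of-sequence marker.
        text += "</s>"
    return text


print(repr(render_single_turn("What is GraphRAG?", system="Answer briefly.")))
```

The two `PARAMETER stop` lines in the ModelFile tell Ollama to stop generating when the model emits `[INST]` or `[/INST]`, which keeps the model from starting a new turn on its own.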

Create the model

ollama create mymistral --file ./ModelFile

Clone the GraphRAG (Ollama version) repo and install it

git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
cd graphrag-local-ollama/
pip install -e .

Create the input folder

Copy the sample data into ./ragtest/input. You can also add your own data; only the .txt format is currently supported.

mkdir -p ./ragtest/input
cp input/* ./ragtest/input
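If your own data folder mixes file types, a small script can stage only the .txt files, since that is the only supported format. The sketch below is one possible way to do it; `stage_txt_files` and the demo file names are hypothetical, and the demo runs inside a temporary directory so it does not touch a real ragtest folder.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def stage_txt_files(src_dir: str, dest_dir: str) -> List[str]:
    """Copy only the .txt files from src_dir into dest_dir and return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).glob("*.txt")):
        shutil.copy2(path, dest / path.name)
        copied.append(path.name)
    return copied


# Demo in a throwaway directory: one .txt file is copied, the .pdf file is skipped.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "docs"
    src.mkdir()
    (src / "book.txt").write_text("hello", encoding="utf-8")
    (src / "notes.pdf").write_bytes(b"%PDF")
    demo_copied = stage_txt_files(str(src), str(Path(tmp) / "ragtest" / "input"))

print(demo_copied)
```

For the real setup you would call something like `stage_txt_files("my_docs", "./ragtest/input")`, where `my_docs` is a placeholder for your own folder.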

Initialization

Initialize the ragtest folder and place the configuration file in it

python -m graphrag.index --init --root ./ragtest
mv settings.yaml ./ragtest

You can change the LLM and the embedding model in the configuration file as needed.
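As an illustration of such a change, the sketch below applies model overrides to a settings dictionary. It assumes the configuration keeps the LLM name under `llm.model` and the embedding model name under `embeddings.llm.model`; verify this against your own settings.yaml. The helper `with_local_models` is hypothetical. Loading and saving the YAML file itself (for example with PyYAML) is left out to keep the example dependency-free.

```python
import copy
from typing import Any, Dict


def with_local_models(
    settings: Dict[str, Any], llm_model: str, embedding_model: str
) -> Dict[str, Any]:
    """Return a copy of the settings with the LLM and embedding model names replaced."""
    updated = copy.deepcopy(settings)
    # ASSUMPTION: key layout llm.model and embeddings.llm.model.
    updated.setdefault("llm", {})["model"] = llm_model
    updated.setdefault("embeddings", {}).setdefault("llm", {})["model"] = embedding_model
    return updated


# Illustrative starting values; your settings.yaml may differ.
example = {
    "llm": {"model": "mistral"},
    "embeddings": {"llm": {"model": "nomic-embed-text"}},
}
patched = with_local_models(example, "mymistral", "nomic-embed-text")
print(patched)
```

Here `mymistral` is the model name created earlier with `ollama create`, and `nomic-embed-text` is the embedding model pulled earlier.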

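Run indexing and build the graph:

This step is quite demanding on the LLM: if the JSON output of the LLM is unstable, graph creation will be interrupted. We tried several models along the way, and Mistral's JSON output was comparatively stable.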

python -m graphrag.index --root ./ragtest
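Because indexing depends on stable JSON output, it can be worth checking a candidate model before starting a long run. The sketch below measures how often a generator returns a parseable JSON object. It is a quick check of our own, not part of GraphRAG; `parse_json_object`, `json_success_rate`, and the canned outputs are hypothetical, and in practice `generate` would wrap a call to your local model.

```python
import json
from typing import Callable, Optional


def parse_json_object(text: str) -> Optional[dict]:
    """Return the parsed JSON object, or None if the text is not a JSON object."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


def json_success_rate(generate: Callable[[str], str], prompt: str, trials: int = 10) -> float:
    """Call the generator several times and return the share of valid JSON objects."""
    ok = sum(parse_json_object(generate(prompt)) is not None for _ in range(trials))
    return ok / trials


# Hypothetical canned outputs standing in for model replies: two valid, two invalid.
canned = iter(['{"entity": "Scrooge"}', "Sure! Here is the JSON:", '{"a": 1}', "[1, 2]"])
demo_rate = json_success_rate(lambda prompt: next(canned), "Extract entities as JSON", trials=4)
print(demo_rate)
```

A model that scores well below 1.0 on a check like this is likely to interrupt graph creation.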

Run a query (only the global method is currently supported)

python -m graphrag.query --root ./ragtest --method global "What is machine learning?"

You can also use the following Python code to visualize the generated graphml file

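from pygraphml import GraphMLParser

# Parse the graph written by the indexing run (*** stands for your own output run folder)
parser = GraphMLParser()
g = parser.parse("./graphrag-local-ollama/ragtest/output/***/artifacts/summarized_graph.graphml")
# Draw the parsed graph
g.show()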
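If pygraphml is not installed, or you only want a quick sanity check of the result, the graphml file can also be inspected with the Python standard library. The sketch below counts nodes and edges. It assumes the file uses the standard GraphML XML namespace; `count_nodes_and_edges` and the inline `SAMPLE` document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# ASSUMPTION: the file declares the standard GraphML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def count_nodes_and_edges(graphml_text: str) -> Tuple[int, int]:
    """Return (number of nodes, number of edges) found in a GraphML document."""
    root = ET.fromstring(graphml_text)
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return len(nodes), len(edges)


# Tiny hand-written GraphML document used only for this demo.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="A"/>
    <node id="B"/>
    <edge source="A" target="B"/>
  </graph>
</graphml>"""

print(count_nodes_and_edges(SAMPLE))
```

To use it on a real run, read the summarized_graph.graphml file as text and pass the contents to `count_nodes_and_edges`.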

Source: https://mp.weixin.qq.com/s/BVw7rQz82SFHvTkmXMhjYQ