DeepSeek R1[1]是一個(gè)功能強(qiáng)大且用途廣泛的 AI 模型，它憑借先進(jìn)的推理能力、成本效益和開(kāi)源可用性向 OpenAI 等老牌企業(yè)發(fā)起了挑戰(zhàn)。雖然它有一些局限性，但其創(chuàng)新的方法和強(qiáng)大的性能使其成為開(kāi)發(fā)人員、研究人員和企業(yè)的寶貴工具。對(duì)于那些有興趣探索其功能的人來(lái)說(shuō)，該模型及其精簡(jiǎn)版本可以在 Hugging Face 和 GitHub 等平臺(tái)上獲得。

由受 GPU 限制的中國(guó)團(tuán)隊(duì)訓(xùn)練，它在數(shù)學(xué)、編碼甚至一些相當(dāng)復(fù)雜的推理方面表現(xiàn)出色。最有趣的是，它是一個(gè)“精簡(jiǎn)”模型，這意味著它比它所基于的巨型模型更小、更高效。這很重要，因?yàn)樗谷藗冊(cè)趯?shí)際使用和構(gòu)建它時(shí)更加實(shí)用。

本文我們將介紹

如何在自己的設(shè)備上運(yùn)行開(kāi)源 DeepSeek 模型
如何使用最新的 DeepSeek 模型創(chuàng)建與 OpenAI 兼容的 API 服務(wù)

我們將使用 LlamaEdge[2]（Rust + Wasm 技術(shù)棧）來(lái)開(kāi)發(fā)和部署這個(gè)模型的應(yīng)用程序。無(wú)需安裝復(fù)雜的 Python 包或 C++ 工具鏈[3]！了解我們選擇這項(xiàng)技術(shù)的原因[4]。

在自己的設(shè)備上運(yùn)行 DeepSeek-R1-Distill-Llama-8B 模型

第一步：通過(guò)以下命令行安裝WasmEge[5]。

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1

第二步：下載量化過(guò)的DeepSeek-R1-Distill-Llama-8B-GGUF[6]模型文件。這可能需要一定時(shí)間，因?yàn)槟Ｐ偷拇笮?5.73 GB。

curl -LO https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B-GGUF/resolve/main/DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf`

第三步：下載 LlamaEdge API 服務(wù)器應(yīng)用程序。它也是一個(gè)跨平臺(tái)的便可移植的 Wasm 應(yīng)用程序，可以在許多 CPU 和 GPU 設(shè)備上運(yùn)行。

curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm

第四步: 下載chatbot UI，以便在瀏覽器中與 DeepSeek-R1-Distill-Llama-8B 模型進(jìn)行交互。

curl -LO https://github.com/LlamaEdge/chatbot-ui/releases/latest/download/chatbot-ui.tar.gztar xzf chatbot-ui.tar.gzrm chatbot-ui.tar.gz

接下來(lái)，使用以下命令行為模型啟動(dòng) LlamaEdge API 服務(wù)器。

wasmedge --dir .:. --nn-preload default:GGML:AUTO:DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf \ llama-api-server.wasm \ --prompt-template llama-3-chat \ --ctx-size 8096

然后，打開(kāi)瀏覽器訪問(wèn) http://localhost:8080[7] 開(kāi)始聊天！

或者可以向模型發(fā)送 API 請(qǐng)求。

curl -X POST http://localhost:8080/v1/chat/completions \  -H 'accept:application/json' \  -H 'Content-Type: application/json' \  -d '{'messages':[{'role':'system', 'content': 'You are a helpful assistant.'}, {'role':'user', 'content': 'What is the capital of France?'}], 'model': 'DeepSeek-R1-Distill-Llama-8B'}'  {'id':'chatcmpl-68158f69-8577-4da2-a24b-ae8614f88fea','object':'chat.completion','created':1737533170,'model':'default','choices':[{'index':0,'message':{'content':'The capital of France is Paris.\n</think>\n\nThe capital of France is Paris.<｜end▁of▁sentence｜>','role':'assistant'},'finish_reason':'stop','logprobs':null}],'usage':{'prompt_tokens':34,'completion_tokens':18,'total_tokens':52}}

為 DeepSeek-R1-Distill-Llama-8B 創(chuàng)建與 OpenAI 兼容的 API 服務(wù)

LlamaEdge 是輕量級(jí)的，不需要守護(hù)進(jìn)程或 sudo 進(jìn)程即可運(yùn)行。它可以輕松嵌入到您自己的應(yīng)用程序中！通過(guò)支持聊天和 embedding 模型，LlamaEdge 可以成為本地計(jì)算機(jī)上應(yīng)用程序內(nèi)部的 OpenAI API 替代品！

接下來(lái)，我們將展示如何為 DeepSeek-R1 模型以及 embedding 模型啟動(dòng)完整的 API 服務(wù)器。API 服務(wù)器將具有 chat/completions 和 embeddings 端點(diǎn)。除了上一節(jié)中的步驟之外，我們還需要：

第五步：下載 embedding 模型。

curl -LO https://huggingface.co/second-state/Nomic-embed-text-v1.5-Embedding-GGUF/resolve/main/nomic-embed-text-v1.5.f16.gguf

然后，我們可以使用以下命令行啟動(dòng)具有聊天和 embedding 模型的 LlamaEdge API 服務(wù)器。更詳細(xì)的說(shuō)明，請(qǐng)查看文檔——啟動(dòng) LlamaEdge API 服務(wù)[8]。

wasmedge --dir .:. \   --nn-preload default:GGML:AUTO:DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf \   --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \   llama-api-server.wasm -p llama-3-chat,embedding \     --model-name DeepSeek-R1-Distill-Llama-8B,nomic-embed-text-v1.5.f16 \     --ctx-size 8192,8192 \     --batch-size 128,8192 \     --log-prompts --log-stat

最后，可以按照這些教程將 LlamaEdge API 服務(wù)器作為 OpenAI 的替代與其他 Agent 框架集成。具體來(lái)說(shuō)，在你的應(yīng)用或 Agent 配置中使用以下值來(lái)替換 OpenAI API。

Config option 值 Base API URL http://localhost:8080/v1 --- --- 模型名稱(chēng) (大模型) DeepSeek-R1-Distill-Llama-8B 模型名稱(chēng)(文本 embedding) nomic-embed

就是這樣啦！立即訪問(wèn) LlamaEdge 倉(cāng)庫(kù)并構(gòu)建你的第一個(gè) AI Agent！如果覺(jué)得有意思，請(qǐng)?jiān)诖颂帪槲覀兊?/span>repo[9]加注星標(biāo)。在運(yùn)行此模型時(shí)有任何問(wèn)題，也可以請(qǐng)前往該 repo 提出問(wèn)題或與我們預(yù)約演示，以跨設(shè)備運(yùn)行自己的 LLM！

參考資料

[1]

DeepSeek R1: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B

[2]

LlamaEdge: https://link.zhihu.com/?target=https%3A//github.com/second-state/LlamaEdge/

[3]

工具鏈: https://zhida.zhihu.com/search?content_id=249066187&content_type=Article&match_order=1&q=%E5%B7%A5%E5%85%B7%E9%93%BE&zhida_source=entity

[4]

選擇這項(xiàng)技術(shù)的原因: https://www.secondstate.io/articles/wasm-runtime-agi/

[5]

WasmEge: https://github.com/WasmEdge/WasmEdge

[6]

DeepSeek-R1-Distill-Llama-8B-GGUF: https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B-GGUF/

[7]

http://localhost:8080: http://localhost:8080/

[8]

啟動(dòng) LlamaEdge API 服務(wù): https://llamaedge.com/docs/user-guide/openai-api/intro/

[9]

repo: https://github.com/LlamaEdge/LlamaEdge

本站僅提供存儲(chǔ)服務(wù)，所有內(nèi)容均由用戶(hù)發(fā)布，如發(fā)現(xiàn)有害或侵權(quán)內(nèi)容，請(qǐng)點(diǎn)擊舉報(bào)。

精品伊人久久大香线蕉,开心久久婷婷综合中文字幕,杏田冲梨,人妻无码aⅴ不卡中文字幕

在自己的設(shè)備上運(yùn)行 DeepSeek-R1-Distill-Llama-8B 模型

為 DeepSeek-R1-Distill-Llama-8B 創(chuàng)建與 OpenAI 兼容的 API 服務(wù)