Running Large AI Models Locally: A Chinese Translation of the Ollama Documentation
Preface
This is a translation of the README.md from the Ollama GitHub project. The other documents it references have not been translated, but this one is sufficient for deploying large models locally.
Ollama
Get up and running with large language models.
macOS
Download
Windows (preview)
Download
Linux
curl -fsSL https://ollama.com/install.sh | sh
Manual install instructions
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
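A minimal sketch of using the image (the volume and port flags below follow the image's Docker Hub instructions; the container and volume names are just examples):

# start the Ollama server in a container, persisting models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# run a model inside the running container
docker exec -it ollama ollama run llama3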
Libraries
- ollama-python
- ollama-js
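As a small sketch, both official clients can be installed from the usual package registries (package names assumed from the projects' own instructions):

# Python client
pip install ollama
# JavaScript client
npm install ollama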
Quickstart
To run and chat with Llama 3 locally:
ollama run llama3
Model library
Ollama supports a list of models available at ollama.com/library.
Here are some example models that can be downloaded:

Model | Parameters | Size | Download
--- | --- | --- | ---
Llama 3 | 8B | 4.7GB | ollama run llama3
Llama 3 | 70B | 40GB | ollama run llama3:70b
Phi 3 Mini | 3.8B | 2.3GB | ollama run phi3
Phi 3 Medium | 14B | 7.9GB | ollama run phi3:medium
Gemma | 2B | 1.4GB | ollama run gemma:2b
Gemma | 7B | 4.8GB | ollama run gemma:7b
Mistral | 7B | 4.1GB | ollama run mistral
Moondream 2 | 1.4B | 829MB | ollama run moondream
Neural Chat | 7B | 4.1GB | ollama run neural-chat
Starling | 7B | 4.1GB | ollama run starling-lm
Code Llama | 7B | 3.8GB | ollama run codellama
Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored
LLaVA | 7B | 4.5GB | ollama run llava
Solar | 10.7B | 6.1GB | ollama run solar

Note: You should have at least 8 GB of RAM to run the 7B models, 16 GB for the 13B models, and 32 GB for the 33B models.
Customize a model
Import from GGUF
Ollama supports importing GGUF models in the Modelfile:
1. Create a file named Modelfile, with a FROM instruction that points to the local file path of the model you want to import.

   FROM ./vicuna-33b.Q4_0.gguf

2. Create the model in Ollama:

   ollama create example -f Modelfile

3. Run the model:

   ollama run example
Import from PyTorch or Safetensors
See the import guide for more information on importing models. (Not translated into Chinese.)
Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model:
ollama pull llama3
Create a Modelfile:

FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory. For more information on working with a Modelfile, see the Modelfile documentation. (Not translated into Chinese.)
CLI Reference
Create a model
ollama create is used to create a model from a Modelfile.
ollama create mymodel -f ./Modelfile
Pull a model
ollama pull llama3
This command can also be used to update a local model; only the diff will be pulled.
Remove a model
ollama rm llama3
Copy a model
ollama cp llama3 my-model
Multiline input
For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models
To use a multimodal model such as LLaVA, include the path to an image in the prompt:

>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt as an argument

$ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
List the models on your computer
ollama list
Start Ollama
ollama serve is used to start Ollama without running the desktop application.
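As a sketch, assuming the OLLAMA_HOST environment variable documented elsewhere in the Ollama docs (not covered in this translation), the standalone server can be bound to a non-default address like this:

# listen on all interfaces instead of the default 127.0.0.1:11434
OLLAMA_HOST=0.0.0.0:11434 ollama serve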
Building
See the developer guide.
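As a rough sketch of what that guide covers (the exact steps and prerequisites are in the guide itself, which is not translated here), a local binary is typically built from a source checkout with Go:

# generate the native code dependencies, then build the ollama binary
go generate ./...
go build .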
Running local builds
Next, start the server:
./ollama serve
Finally, in a separate shell, run a model:
./ollama run llama3
REST API
Ollama has a REST API for running and managing models.
Generate a response

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
Chat with a model

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'

See the API documentation for all endpoints.
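By default these endpoints stream a series of JSON objects. As a small sketch, assuming the stream parameter described in the API documentation, a single aggregated response can be requested instead:

# ask for one complete JSON response rather than a stream of partial objects
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'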
Community Integrations
Web & Desktop
- Open WebUI
- Enchanted (macOS native)
- Hollama
- Lollms-Webui
- LibreChat
- Bionic GPT
- HTML UI
- Saddle
- Chatbot UI
- Chatbot UI v2
- Typescript UI
- Minimalistic React UI for Ollama Models
- Ollamac
- big-AGI
- Cheshire Cat assistant framework
- Amica
- chatd
- Ollama-SwiftUI
- Dify.AI
- MindMac
- NextJS Web Interface for Ollama
- Msty
- Chatbox
- WinForm Ollama Copilot
- NextChat with Get Started Doc
- Alpaca WebUI
- OllamaGUI
- OpenAOE
- Odin Runes
- LLM-X (Progressive Web App)
- AnythingLLM (Docker + MacOs/Windows/Linux native app)
- Ollama Basic Chat: Uses HyperDiv Reactive UI
- Ollama-chats RPG
- QA-Pilot (Chat with Code Repository)
- ChatOllama (Open Source Chatbot based on Ollama with Knowledge Bases)
- CRAG Ollama Chat (Simple Web Search with Corrective RAG)
- RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
- StreamDeploy (LLM Application Scaffold)
- chat (chat web app for teams)
- Lobe Chat with Integrating Doc
- Ollama RAG Chatbot (Local Chat with multiple PDFs using Ollama and RAG)
- BrainSoup (Flexible native client with RAG & multi-agent automation)
- macai (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
- Olpaka (User-friendly Flutter Web App for Ollama)
- OllamaSpring (Ollama Client for macOS)
- LLocal.in (Easy to use Electron Desktop Client for Ollama)
Terminal
- oterm
- Ellama Emacs client
- Emacs client
- gen.nvim
- ollama.nvim
- ollero.nvim
- ollama-chat.nvim
- ogpt.nvim
- gptel Emacs client
- Oatmeal
- cmdh
- ooo
- shell-pilot
- tenere
- llm-ollama for Datasette’s LLM CLI.
- typechat-cli
- ShellOracle
- tlm
- podman-ollama
- gollama
Database
- MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
- chromem-go with example
Package managers
- Pacman
- Helm Chart
- Guix channel
Libraries
- LangChain and LangChain.js with example
- LangChainGo with example
- LangChain4j with example
- LangChainRust with example
- LlamaIndex
- LiteLLM
- OllamaSharp for .NET
- Ollama for Ruby
- Ollama-rs for Rust
- Ollama4j for Java
- ModelFusion Typescript Library
- OllamaKit for Swift
- Ollama for Dart
- Ollama for Laravel
- LangChainDart
- Semantic Kernel - Python
- Haystack
- Elixir LangChain
- Ollama for R - rollama
- Ollama for R - ollama-r
- Ollama-ex for Elixir
- Ollama Connector for SAP ABAP
- Testcontainers
- Portkey
- PromptingTools.jl with an example
- LlamaScript
Mobile
- Enchanted
- Maid
Extensions & Plugins
- Raycast extension
- Discollama (Discord bot inside the Ollama discord channel)
- Continue
- Obsidian Ollama plugin
- Logseq Ollama plugin
- NotesOllama (Apple Notes Ollama plugin)
- Dagger Chatbot
- Discord AI Bot
- Ollama Telegram Bot
- Hass Ollama Conversation
- Rivet plugin
- Obsidian BMO Chatbot plugin
- Cliobot (Telegram bot with Ollama support)
- Copilot for Obsidian plugin
- Obsidian Local GPT plugin
- Open Interpreter
- Llama Coder (Copilot alternative using Ollama)
- Ollama Copilot (Proxy that allows you to use ollama as a copilot like Github copilot)
- twinny (Copilot and Copilot chat alternative using Ollama)
- Wingman-AI (Copilot code and chat alternative using Ollama and HuggingFace)
- Page Assist (Chrome Extension)
- AI Telegram Bot (Telegram bot using Ollama in backend)
- AI ST Completion (Sublime Text 4 AI assistant plugin with Ollama support)
- Discord-Ollama Chat Bot (Generalized TypeScript Discord Bot w/ Tuning Documentation)
- Discord AI chat/moderation bot (chat/moderation bot written in Python; uses Ollama to create personalities)
- Headless Ollama (Scripts to automatically install the ollama client & models on any OS for apps that depend on the ollama server)
Supported backends
- llama.cpp project founded by Georgi Gerganov.