Ubuntu setup (needed on every new Runpod)
```shell
sudo apt update
sudo apt install -y locales
sudo locale-gen zh_TW
sudo locale-gen zh_TW.UTF-8
sudo update-locale LANG="zh_TW.UTF-8" LANGUAGE="zh_TW"
echo "export LC_ALL=zh_TW.UTF-8" >> ~/.bashrc
sudo apt-get install -y nvtop
bash   # start a fresh shell so the locale settings take effect
```
Model download (pick one of the two)
Ollama
```shell
curl -fsSL https://ollama.com/install.sh | sh
export OLLAMA_MODELS=/workspace/ollama/models
ollama serve &> /dev/null &
```
Check that a small model actually runs on the GPU
```shell
ollama run qwen3:4b "hello"
```
If this takes a long time, the model is not running on the GPU.
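Rather than judging by speed alone, you can ask Ollama directly where the model is resident; `ollama ps` shows a PROCESSOR column that reads e.g. "100% GPU" when the weights are on the GPU. This sketch is guarded so it is a no-op on a machine where Ollama is not installed:

```shell
# Report where each loaded model's weights live; "100% CPU" in the
# PROCESSOR column means the GPU is not being used.
if command -v ollama >/dev/null 2>&1; then
  ollama ps
else
  echo "ollama not installed"
fi
```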
Pull the large model and set its context length
```shell
ollama run qwen3-coder-next:q4_K_M
>>> /set parameter num_ctx 256000
>>> /save qwen3-coder-next-256k
>>> /bye
ollama run qwen3-coder-next-256k "hello"
```
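The interactive `/set parameter` + `/save` steps can also be done non-interactively with a Modelfile, which is easier to script; this sketch assumes the same base tag and target name as above, and guards the `ollama create` call so it only runs where Ollama is installed:

```shell
# Bake the 256k context length into a derived model via a Modelfile,
# instead of the interactive /set + /save flow.
cat > /tmp/Modelfile <<'EOF'
FROM qwen3-coder-next:q4_K_M
PARAMETER num_ctx 256000
EOF

# Guarded: only attempt the build when ollama is actually installed.
if command -v ollama >/dev/null 2>&1; then
  ollama create qwen3-coder-next-256k -f /tmp/Modelfile
fi
```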
vLLM
```shell
pip install vllm==0.15.0
pip install hf_transfer
export HF_HOME=/workspace/huggingface
VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve cyankiwi/Qwen3-Coder-Next-AWQ-4bit \
  --port 8000 \
  --tensor-parallel-size 1 \
  --max-model-len 262144 \
  --gpu-memory-utilization 0.80 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder &
```
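vLLM can take several minutes to load a model of this size, so it helps to poll the OpenAI-compatible endpoint before sending any requests. A minimal helper, assuming the port used above:

```shell
# Block until the vLLM server answers on its OpenAI-compatible endpoint.
# Usage: wait_for_vllm && curl http://localhost:8000/v1/models
wait_for_vllm() {
  local url="${1:-http://localhost:8000/v1/models}"
  local tries="${2:-120}"            # ~10 minutes at 5s per attempt
  for _ in $(seq 1 "$tries"); do
    curl -sf "$url" >/dev/null && return 0
    sleep 5
  done
  echo "vLLM did not become ready" >&2
  return 1
}
```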
Test vLLM
```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cyankiwi/Qwen3-Coder-Next-AWQ-4bit",
    "messages": [{"role": "user", "content": "Say hello in 3 languages."}],
    "max_tokens": 256
  }'
```
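The raw JSON reply is noisy; the assistant's text lives at `.choices[0].message.content` and can be pulled out with a one-liner. Shown here on a trimmed sample response rather than live server output:

```shell
# Extract the assistant message from a chat/completions response.
# $response is a trimmed sample, not live server output.
response='{"choices":[{"message":{"role":"assistant","content":"Hello / Bonjour / Hola"}}]}'
echo "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
# → Hello / Bonjour / Hola
```

Pipe the `curl` output through the same `python3 -c` command to clean up a live response.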
Using opencode
Automatic install
```shell
curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
curl -fsSL https://opencode.ai/install | bash
export PATH="$HOME/.local/bin:$PATH"
bash
opencode   # run opencode once
```
Manual download and install
```shell
wget https://github.com/anomalyco/opencode/releases/download/v1.2.10/opencode-linux-x64.tar.gz
tar zxvf opencode-linux-x64.tar.gz
mkdir -p /workspace/bin
mv opencode /workspace/bin/
rm opencode-linux-x64.tar.gz
/workspace/bin/opencode   # run opencode once (full path, since the binary was moved)
```
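Since the binary now lives in /workspace/bin, it is convenient to put that directory on PATH so a plain `opencode` works from any directory, and to persist it in ~/.bashrc for future shells (a sketch; adjust if your shell init differs):

```shell
# Make /workspace/bin visible to this shell and to future ones.
export PATH="/workspace/bin:$PATH"
grep -qx 'export PATH="/workspace/bin:$PATH"' ~/.bashrc 2>/dev/null \
  || echo 'export PATH="/workspace/bin:$PATH"' >> ~/.bashrc
echo "$PATH" | tr ':' '\n' | grep -x /workspace/bin
```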
Config file for Ollama
```shell
mkdir -p ~/.config/opencode
vim ~/.config/opencode/opencode.json
```

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": { "baseURL": "http://127.0.0.1:11434/v1" },
      "models": {
        "qwen3:4b": { "name": "qwen3:4b" }
      }
    }
  },
  "model": "ollama/qwen3:4b",
  "small_model": "ollama/qwen3:4b"
}
```
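If you would rather not edit the file in vim, the same config can be written non-interactively with a heredoc and then sanity-checked as valid JSON; this just reproduces the Ollama config above:

```shell
# Write the opencode config without an editor, then verify it parses.
mkdir -p ~/.config/opencode
cat > ~/.config/opencode/opencode.json <<'EOF'
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": { "baseURL": "http://127.0.0.1:11434/v1" },
      "models": {
        "qwen3:4b": { "name": "qwen3:4b" }
      }
    }
  },
  "model": "ollama/qwen3:4b",
  "small_model": "ollama/qwen3:4b"
}
EOF

# Fail loudly if the file is not valid JSON.
python3 -m json.tool ~/.config/opencode/opencode.json >/dev/null && echo "config OK"
```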
Config file for vLLM
```shell
mkdir -p ~/.config/opencode
vim ~/.config/opencode/opencode.json
```

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "vLLM (local)",
      "options": { "baseURL": "http://localhost:8000/v1" },
      "models": {
        "cyankiwi/Qwen3-Coder-Next-AWQ-4bit": {
          "name": "Qwen3-Coder-Next AWQ 4bit",
          "contextWindow": 200000,
          "maxTokens": 8192
        }
      }
    }
  },
  "model": "vllm/cyankiwi/Qwen3-Coder-Next-AWQ-4bit",
  "small_model": "vllm/cyankiwi/Qwen3-Coder-Next-AWQ-4bit"
}
```
Test against the Linux kernel source
Download
```shell
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.10.251.tar.xz
tar Jxvf linux-5.10.251.tar.xz
rm linux-5.10.251.tar.xz
cd linux-5.10.251
```
Test
```shell
opencode -p "What is this project for?"
```



