Xorbits Inference(Xinference) is an open-source project to run language models on your own machine. You can use it to serve open-source LLMs like Llama-3 locally.Documentation Index
Fetch the complete documentation index at: https://doc.chathub.gg/llms.txt
Use this file to discover all available pages before exploring further.
Preparation
Follow the instructions at Using Xinference to setup Xinference and run thellama-2-chat model.
Configuration

- API Host:
http://127.0.0.1:9997 - API Key: random strings
- Model:
llama-3-chat
Troubleshooting
- Only models with
chatin their name are supported.
