> ## Documentation Index
> Fetch the complete documentation index at: https://doc.chathub.gg/llms.txt
> Use this file to discover all available pages before exploring further.

# Xinference

Xorbits Inference(Xinference) is an open-source project to run language models on your own machine. You can use it to serve open-source LLMs like Llama-3 locally.

## Preparation

Follow the instructions at [Using Xinference](https://inference.readthedocs.io/en/latest/getting_started/using_xinference.html) to setup Xinference and run the `llama-2-chat` model.

## Configuration

<img src="https://mintcdn.com/chathub-5dac209e/ozs4YL2c0N1LB4gO/images/custom-bots/xinference.png?fit=max&auto=format&n=ozs4YL2c0N1LB4gO&q=85&s=596f81958ad2134e953193ae2a19552a" alt="" width="1302" height="1060" data-path="images/custom-bots/xinference.png" />

* **API Host**: `http://127.0.0.1:9997`
* **API Key**: random strings
* **Model**: `llama-3-chat`

You can find all the available models at [https://inference.readthedocs.io/en/latest/models/builtin/llm/index.html](https://inference.readthedocs.io/en/latest/models/builtin/llm/index.html)

## Troubleshooting

* Only models with `chat` in their name are supported.