
Nitro with Pal Chat

This guide demonstrates how to use Nitro with Pal Chat, enabling local AI chat capabilities on mobile devices.

What is Pal - AI Chat Client?

Pal is a mobile app available on the App Store. It offers a customizable chat playground and supports various AI models, including GPT-4 Turbo, GPT-4 Vision, DALL-E 3, Claude 2, PaLM, OpenRouter, and locally hosted LLMs.

Using Pal with Nitro

1. Start Nitro server

Open your terminal:

Run Nitro
nitro
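To confirm the server started correctly, you can query Nitro's health-check endpoint (this sketch assumes the default port 3928 and the `/healthz` route; adjust if you changed them):

```shell
# Check that the Nitro server is up and responding.
# A healthy server returns an HTTP 200 response.
curl http://localhost:3928/healthz
```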

2. Download Model

Use these commands to download and save the Llama 2 7B Chat model:

Get a model
mkdir model && cd model
wget -O llama-2-7b-model.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf

For more GGUF models, see TheBloke on Hugging Face.

3. Run the Model

To load the model, use the following command:

Load model to the server
curl http://localhost:3928/inferences/llamacpp/loadmodel \
-H 'Content-Type: application/json' \
-d '{
"llama_model_path": "model/llama-2-7b-chat.Q5_K_M.gguf",
"ctx_len": 512,
"ngl": 100,
}'
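Before configuring Pal Chat, you can verify the loaded model responds from the command line. This sketch assumes Nitro's OpenAI-compatible chat completion route at `/v1/chat/completions` on the default port 3928:

```shell
# Send a test chat completion to the locally loaded model.
# The request body follows the OpenAI messages format.
curl http://localhost:3928/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello! Who are you?"}
    ]
  }'
```

If this returns a completion, the server and model are ready for Pal Chat.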

4. Configure Pal Chat

In the OpenAI API Key field, enter any placeholder text (e.g. key-xxxxxx); the key itself is not used.

Under Pal Chat's advanced settings, set the "provide custom host" field to your LAN IPv4 address (a series of numbers such as 192.xxx.x.xxx).

For instructions, see: How to find your IP
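If you are unsure of your machine's LAN address, the following commands print the local IPv4 address (interface names are typical defaults and may differ on your system):

```shell
# Print this machine's LAN IPv4 address (Linux).
hostname -I | awk '{print $1}'
# macOS equivalent (interface name may vary): ipconfig getifaddr en0
# Windows: run `ipconfig` and read the "IPv4 Address" line.
```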


5. Chat with the Model

Once the setup is complete, you can start chatting with the model using Pal Chat.

Further Usage

For a more convenient experience, you can use Jan, which integrates Nitro.