About Nitro

Nitro is a high-efficiency C++ inference engine for edge computing, powering Jan. It is lightweight and embeddable, ideal for product integration.

Learn more on GitHub.

Why Nitro?

  • Fast Inference: Built on top of the cutting-edge inference library llama.cpp, modified to be production-ready.
  • Lightweight: Only 3MB, ideal for resource-sensitive environments.
  • Easily Embeddable: Simple integration into existing applications, offering flexibility.
  • Quick Setup: Approximately 10-second initialization (see the quick-start sketch after this list).
  • Enhanced Web Framework: Built on the Drogon C++ web framework for efficient web service handling.
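
To make the setup flow concrete, here is a minimal quick-start sketch. It assumes the nitro binary has already been downloaded and is run from the current directory, that the server listens on the default port 3928, and it uses a placeholder model path; the loadmodel endpoint and its parameters (llama_model_path, ctx_len, ngl) follow Nitro's llama.cpp API and may differ between versions.

Start Nitro and load a model
./nitro

curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H "Content-Type: application/json" \
  -d '{
    "llama_model_path": "/path/to/model.gguf",
    "ctx_len": 2048,
    "ngl": 100
  }'

Once a model is loaded, the OpenAI-style chat completion requests shown below can be served entirely locally.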

OpenAI-compatible API

Nitro's compatibility with OpenAI's API structure is a notable advantage. Its command format for inference calls closely mirrors that of OpenAI, facilitating an easy transition for users.

For instance, compare the Nitro inference call:

Nitro chat completion
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'

OpenAI API chat completion
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'
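
Aside from the local endpoint URL and the absence of an API key, the two requests are identical. Streaming is expected to work the same way: this is a sketch assuming Nitro honors the OpenAI-style "stream": true flag and returns the response chunk by chunk; exact streaming behavior may vary by version.

Nitro streaming chat completion
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "Write a haiku about llamas."}
    ],
    "stream": true
  }'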

Cross-Platform

Nitro runs on Windows, macOS, and Linux, on both CPU and GPU.

Multi-modal Capabilities

  • Coming Soon: Expansion to multi-modal functionality, enabling Nitro to process and generate images and audio.
  • Features to Expect:
    • Large Language-and-Vision Assistant.
    • Speech recognition and transcription.

Architecture

  • Overview: Nitro's architecture is designed for scalability and efficiency, utilizing a modular framework that supports diverse AI functionalities.
  • Detailed Specifications: For an in-depth understanding of Nitro's internal workings, components, and design philosophy, refer to our Architecture Specifications.

Support

GitHub Issue Tracking

  • Report Problems: Encounter an issue with Nitro? File a GitHub issue. Please include detailed error logs and steps to reproduce the problem.

Discord Community

  • Join the Conversation: Discuss Nitro development and seek peer support in our #nitro-dev channel on Discord.

Contributing

How to Contribute

Nitro welcomes contributions in various forms, not just coding. Here are some ways you can get involved:

  • Understand Nitro: Start with the Getting Started guide. Found an issue or have a suggestion? Open an issue to let us know.

  • Feature Development: Engage with community feature requests. Bring ideas to life by opening a pull request for features that interest you.

Acknowledgements

  • drogon: The fast C++ web framework
  • llama.cpp: Inference of LLaMA model in pure C/C++