About Nitro

Nitro is a high-efficiency C++ inference engine for edge computing, powering Jan. It is lightweight and embeddable, ideal for product integration.

Learn more on GitHub.

Why Nitro?

  • Fast Inference: Built on top of the cutting-edge inference library llama.cpp, modified to be production-ready.
  • Lightweight: Only 3MB, ideal for resource-sensitive environments.
  • Easily Embeddable: Simple integration into existing applications, offering flexibility.
  • Quick Setup: Approximately 10-second initialization (see the quick-start sketch after this list).
  • Enhanced Web Framework: Built on the Drogon C++ web framework for efficient web service handling.
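
To make the setup flow concrete, here is a minimal quick-start sketch. It assumes the nitro binary has already been downloaded and is run from the current directory, that the server listens on the default port 3928, and it uses a placeholder model path; the loadmodel endpoint and its parameters (llama_model_path, ctx_len, ngl) follow Nitro's llama.cpp API and may differ between versions.

Start Nitro and load a model
./nitro

curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H "Content-Type: application/json" \
  -d '{
    "llama_model_path": "/path/to/model.gguf",
    "ctx_len": 2048,
    "ngl": 100
  }'

Once a model is loaded, the OpenAI-style chat completion requests shown below can be served entirely locally.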

OpenAI-compatible API

Nitro's compatibility with OpenAI's API structure is a notable advantage. Its command format for inference calls closely mirrors that of OpenAI, facilitating an easy transition for users.

For instance, compare the Nitro inference call:

Nitro chat completion
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'

OpenAI API chat completion
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'
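
Aside from the local endpoint URL and the absence of an API key, the two requests are identical. Streaming is expected to work the same way: this is a sketch assuming Nitro honors the OpenAI-style "stream": true flag and returns the response chunk by chunk; exact streaming behavior may vary by version.

Nitro streaming chat completion
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "Write a haiku about llamas."}
    ],
    "stream": true
  }'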

Cross-Platform

Nitro runs on Windows, macOS, and Linux, on both CPU and GPU.

Multi-modal Capabilities

  • Coming Soon: Expansion to multi-modal functionality, enabling Nitro to process and generate images and audio.
  • Features to Expect:
    • Large Language-and-Vision Assistant.
    • Speech recognition and transcription.

Architecture

  • Overview: Nitro's architecture is designed for scalability and efficiency, utilizing a modular framework that supports diverse AI functionalities.
  • Detailed Specifications: For an in-depth understanding of Nitro's internal workings, components, and design philosophy, refer to our Architecture Specifications.

Support

GitHub Issue Tracking

  • Report Problems: Encounter an issue with Nitro? File a GitHub issue. Please include detailed error logs and steps to reproduce the problem.

Discord Community

  • Join the Conversation: Discuss Nitro development and seek peer support in our #nitro-dev channel on Discord.

Contributing

How to Contribute

Nitro welcomes contributions in various forms, not just coding. Here are some ways you can get involved:

  • Understand Nitro: Start with the Getting Started guide. Found an issue or have a suggestion? Open an issue to let us know.

  • Feature Development: Engage with community feature requests. Bring ideas to life by opening a pull request for features that interest you.

Acknowledgements

  • drogon: The fast C++ web framework
  • llama.cpp: Inference of LLaMA model in pure C/C++