llama.cpp (Cortex)

Overview

Jan uses llama.cpp for running local AI models. You can find its settings in Settings > Local Engine > llama.cpp:


These settings are intended for advanced users. You may want to review them when:

  • Your AI models are running slowly or not working
  • You've installed new hardware (like a graphics card)
  • You want to tinker & test performance with different backends

Engine Version and Updates

  • Engine Version: View the currently installed version of the llama.cpp engine
  • Check Updates: Check whether a newer engine version is available and install it

Available Backends

Jan offers different backend variants for llama.cpp based on your operating system and hardware. You can:

  • Download different backends as needed
  • Switch between backends for different hardware configurations
  • View currently installed backends in the list

⚠️ Choose the backend that matches your hardware. Using the wrong variant may cause performance issues or prevent models from loading.

CUDA Support (NVIDIA GPUs)

  • llama.cpp-avx-cuda-11-7
  • llama.cpp-avx-cuda-12-0
  • llama.cpp-avx2-cuda-11-7
  • llama.cpp-avx2-cuda-12-0
  • llama.cpp-avx512-cuda-11-7
  • llama.cpp-avx512-cuda-12-0
  • llama.cpp-noavx-cuda-11-7
  • llama.cpp-noavx-cuda-12-0
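
To choose between the cuda-11-7 and cuda-12-0 variants, check which CUDA version your NVIDIA driver supports. Below is a minimal sketch in Python (used here only for illustration); it shells out to the standard nvidia-smi tool, whose banner typically reports the highest CUDA version the installed driver can handle, so running nvidia-smi directly in a terminal works just as well:

    import re
    import subprocess

    # Ask the NVIDIA driver which CUDA version it supports; nvidia-smi's
    # banner typically includes a fragment like "CUDA Version: 12.2".
    try:
        output = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
    except FileNotFoundError:
        output = ""  # nvidia-smi not installed -> no usable NVIDIA driver

    match = re.search(r"CUDA Version:\s*([\d.]+)", output)
    if match:
        version = tuple(int(part) for part in match.group(1).split("."))
        # Pick the newest CUDA backend variant the driver can handle.
        suffix = "cuda-12-0" if version >= (12, 0) else "cuda-11-7"
        print(f"Driver supports CUDA {match.group(1)}; use a {suffix} variant")
    else:
        print("No NVIDIA driver detected; use a CPU or Vulkan backend")

If no NVIDIA driver is present, or it is older than CUDA 11.7, use a CPU or Vulkan backend instead.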

CPU Only

  • llama.cpp-avx
  • llama.cpp-avx2
  • llama.cpp-avx512
  • llama.cpp-noavx
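
To see which of these instruction sets your CPU supports, check its feature flags. Below is a minimal sketch in Python for Linux; it reads /proc/cpuinfo, which does not exist on Windows or macOS, so on those systems consult your CPU's spec sheet or a tool such as CPU-Z instead:

    # Collect the feature flags the kernel reports for the first CPU core.
    flags = set()
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
                break

    # avx512f is the foundational AVX-512 feature flag.
    for name, flag in [("AVX", "avx"), ("AVX2", "avx2"), ("AVX-512", "avx512f")]:
        print(f"{name}: {'yes' if flag in flags else 'no'}")

If none of the flags are reported, pick the noavx variant.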

Other Accelerators

  • llama.cpp-vulkan

Notes

  • For detailed hardware compatibility, please visit our guide for Windows.
  • AVX, AVX2, and AVX-512 are CPU instruction sets. For best performance, use the most advanced instruction set your CPU supports.
  • CUDA versions should match your installed NVIDIA drivers.