Candle

Link

Description

Candle is a minimalist, Rust-based machine learning framework maintained by HuggingFace and designed for efficient serverless inference. Though relatively new, Candle already runs efficiently on both GPU and CPU, using compute primitive libraries such as cuDNN on NVIDIA GPUs and Intel MKL on CPUs, and it makes it easy to add custom kernels for state-of-the-art algorithms such as Flash-Attention. By removing all Python dependencies while retaining an easy-to-use, torch-like syntax, Candle can make production workloads considerably more efficient and allows models to be deployed as lightweight binaries. It also supports WASM, so models can run entirely in the browser.

Features

Efficient Serverless Inference
Lightweight Binaries
Easy Integration of HuggingFace Model Weights
WASM Support
