Provides 'R' bindings to the 'GGML' tensor library for machine learning, designed primarily for 'Vulkan' GPU acceleration with full CPU fallback. Requires 'Vulkan' 1.2+ and uses legacy pipeline barriers (avoiding 'Synchronization2' due to 'RADV' performance issues); supports 'Push Descriptors' ('VK_KHR_push_descriptor') to eliminate descriptor-pool overhead when available. 'Vulkan' support is auto-detected at build time on Linux (when 'libvulkan-dev' and 'glslc' are installed) and on Windows (when the 'Vulkan' 'SDK' is installed and the 'VULKAN_SDK' environment variable is set); all operations fall back to the CPU transparently when no GPU is available. Supports tensors of up to 5 dimensions natively (GGML_MAX_DIMS = 5).

Implements tensor operations, neural network layers, 'quantization', and a 'Keras'-like sequential model API for building and training networks. Includes the 'AdamW' (Adam with weight decay) and 'SGD' (stochastic gradient descent) optimizers with 'MSE' (mean squared error) and cross-entropy losses. Also provides a dynamic, 'PyTorch'-style 'autograd' engine with data-parallel training via 'dp_train()', broadcast arithmetic, 'f16' (half-precision) support on 'Vulkan' GPUs, and a multi-head attention layer for building Transformer architectures.

Supports 'ONNX' model import via a built-in, zero-dependency 'protobuf' parser: load pretrained 'ONNX' models from 'PyTorch', 'TensorFlow', or other frameworks and run inference on a 'Vulkan' GPU or the CPU. Covers 50+ 'ONNX' ops, including convolutions, attention primitives, normalization, quantized ops, shape operations, 'ScatterElements' (with 'Vulkan' 'atomicAdd' for GNN scatter-add), and fused custom ops (RelPosBias2D for 'BoTNet') — sufficient to run real-world models such as 'RoBERTa', 'BERT', 'GPT-NeoX', 'SqueezeNet', 'Inception v3', 'BAT-ResNeXt', 'BoTNet', and an 'MNIST' classifier out of the box.
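As a rough sketch of the 'Keras'-like workflow the description above outlines — every function name below is an illustrative assumption, not the package's documented API; consult the reference manual and the "Keras-like API" vignette for the actual signatures:

```r
# HYPOTHETICAL sketch only: these function names are invented for
# illustration and are not taken from the ggmlR documentation.
library(ggmlR)

# Build a small sequential classifier, compile it with the AdamW
# optimizer and cross-entropy loss (both provided by the package),
# and fit it on numeric training data. The shape of this API
# (sequential model -> compile -> fit) mirrors the Keras pattern
# the description references.
model <- ggml_sequential()
model <- ggml_layer_dense(model, units = 64, activation = "relu")
model <- ggml_layer_dense(model, units = 10)
model <- ggml_compile(model, optimizer = "adamw", loss = "cross_entropy")
model <- ggml_fit(model, x_train, y_train, epochs = 5)
```

Whatever the real entry points are, the description implies the same computation runs unchanged on the 'Vulkan' GPU when one is detected at build time, and on the CPU otherwise.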
Reads 'GGUF' files natively: load pretrained weights from any 'GGUF'-compatible source ('llama.cpp', 'Hugging Face') with automatic weight conversion and metadata access. Uses a dedicated weight-buffer architecture for zero-overhead repeated inference — weights are loaded to the GPU once and never re-transferred. Serves as the backend for 'LLM' (large language model) inference via 'llamaR' and Stable Diffusion image generation via 'sd2R'. See <https://github.com/ggml-org/ggml> for more information about the underlying library.
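The 'GGUF' loading path described above might look roughly like the following — again, the function names here are hypothetical placeholders for illustration, not the package's documented API:

```r
# HYPOTHETICAL sketch only: invented names illustrating the described
# GGUF workflow (load once, inspect metadata, run inference repeatedly).
library(ggmlR)

# Load pretrained weights from a llama.cpp / Hugging Face GGUF file.
# Per the description, weights are transferred to the GPU once and
# kept in a dedicated weight buffer across calls.
ctx  <- gguf_load("model.gguf")

# GGUF files carry key-value metadata (architecture, tokenizer, etc.);
# the description says this metadata is accessible from R.
meta <- gguf_metadata(ctx)

# Repeated inference reuses the resident GPU weights with no
# re-transfer, which is the "zero-overhead" property claimed above.
out  <- ggml_infer(ctx, input)
```

The design choice the description highlights — a weight buffer separate from per-call activations — is what makes serving 'llamaR' and 'sd2R' efficient: only activations move per inference call.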
| Version: | 0.7.0 |
| Depends: | R (≥ 4.1.0) |
| Suggests: | Rcpp, testthat (≥ 3.0.0) |
| Published: | 2026-04-06 |
| DOI: | 10.32614/CRAN.package.ggmlR |
| Author: | Yuri Baramykov [aut, cre], Georgi Gerganov [ctb, cph] (Author of the GGML library), Jeffrey Quesnelle [ctb, cph] (Contributor to ops.cpp), Bowen Peng [ctb, cph] (Contributor to ops.cpp), Mozilla Foundation [ctb, cph] (Author of llamafile/sgemm.cpp) |
| Maintainer: | Yuri Baramykov <lbsbmsu at mail.ru> |
| BugReports: | https://github.com/Zabis13/ggmlR/issues |
| License: | MIT + file LICENSE |
| URL: | https://github.com/Zabis13/ggmlR |
| NeedsCompilation: | yes |
| SystemRequirements: | C++17, GNU make, libvulkan-dev, glslc (optional, for GPU on Linux), 'Vulkan' 'SDK' (optional, for GPU on Windows) |
| Materials: | README, NEWS |
| CRAN checks: | ggmlR results |
| Reference manual: | ggmlR.html, ggmlR.pdf |
| Vignettes: | Autograd Engine (source), Data Parallel Training (source), Embedding ggmlR (source), GPU Vulkan Backend (source), Keras-like API (source), ONNX Import (source), Quantization (source) |
| Package source: | ggmlR_0.7.0.tar.gz |
| Windows binaries: | r-devel: ggmlR_0.6.7.zip, r-release: ggmlR_0.6.7.zip, r-oldrel: ggmlR_0.6.7.zip |
| macOS binaries: | r-release (arm64): ggmlR_0.6.7.tgz, r-oldrel (arm64): ggmlR_0.6.3.tgz, r-release (x86_64): ggmlR_0.6.7.tgz, r-oldrel (x86_64): ggmlR_0.6.7.tgz |
| Old sources: | ggmlR archive |
| Reverse depends: | llamaR |
| Reverse imports: | sd2R |
| Reverse linking to: | llamaR, sd2R |
| Reverse suggests: | cayleyR |
Please use the canonical form https://CRAN.R-project.org/package=ggmlR to link to this page.