Home
MLX-LM-LORA is a toolkit for fine-tuning large language models (LLMs) locally on Apple M-series (Silicon) machines, leveraging the MLX-LM backend. It supports all MLX-compatible models (LLaMA 3, Phi, Mistral, Gemma, Qwen, etc.) and allows efficient adapter-style tuning as well as full-model fine-tuning. Training can be invoked via a single CLI (mlx_lm_lora.train) or via YAML configs/Jupyter notebooks (see examples).
Installation (Apple Silicon): Install the package with pip install mlx-lm-lora. The package uses Apple’s Accelerate/Metal framework under the hood, so no special emulation is needed to run on M1/M2/M3 Macs. Ensure you have a recent macOS.
The main command is mlx_lm_lora.train. For example, to start a simple LoRA fine-tuning run, you could use:
```shell
mlx_lm_lora.train \
  --model mlx-community/Josiefied-Qwen3-8B-abliterated-v1-4bit \
  --train \
  --data mlx-community/wikisql \
  --iters 600
```

By default this runs LoRA adapter training. You can specify a YAML config file instead with --config config.yaml; command-line flags always override YAML entries.
Training types: LoRA (default), DoRA, and full fine-tuning, set with --train-type lora|dora|full:

- LoRA: Low-Rank Adaptation (the default).
- DoRA: Weight-Decomposed Low-Rank Adaptation (see below).
- Full: full fine-tuning of all model weights.
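To make the LoRA/full distinction concrete, here is a minimal conceptual sketch (plain Python, not the MLX-LM-LORA internals): instead of updating the full weight matrix W, LoRA trains two small low-rank matrices A and B, and the effective weight becomes W + B·A. All names and shapes below are illustrative.

```python
# Conceptual LoRA sketch: W stays frozen; only the low-rank factors
# B (d_out x r) and A (r x d_in) are trained.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Element-wise sum of two same-shaped matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

d_out, d_in, r = 3, 4, 1                    # rank r << min(d_out, d_in)

W = [[0.0] * d_in for _ in range(d_out)]    # frozen base weight (zeros for demo)
B = [[1.0] for _ in range(d_out)]           # trainable, d_out x r
A = [[0.5, 0.5, 0.5, 0.5]]                  # trainable, r x d_in

W_eff = add(W, matmul(B, A))                # adapted weight: W + B @ A

# Only r * (d_in + d_out) adapter parameters are trained instead of
# d_in * d_out full-weight parameters.
print(W_eff)
```

With full fine-tuning (--train-type full), all of W is updated directly; with LoRA, only the far smaller A and B are.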
Training modes:

- sft: standard supervised fine-tuning
- dpo: Direct Preference Optimization
- cpo: Contrastive Preference Optimization
- orpo: Odds Ratio Preference Optimization
- grpo: Group Relative Policy Optimization
- online dpo: Online Direct Preference Optimization
- xpo: Exploratory Preference Optimization

These are set via --train-mode sft|dpo|cpo|orpo|grpo. Each mode uses a different loss/objective (detailed below).
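As a flavor of what these objectives look like, here is a conceptual sketch of the standard per-example DPO loss (not the toolkit's implementation): given sequence log-probabilities of a chosen and a rejected response under the policy and a frozen reference model, DPO minimizes -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))).

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss from sequence log-probabilities (textbook form)."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# The loss shrinks as the policy prefers the chosen response more strongly
# than the reference model does (illustrative numbers):
better = dpo_loss(-10.0, -30.0, -20.0, -20.0)  # policy favours the chosen response
worse = dpo_loss(-30.0, -10.0, -20.0, -20.0)   # policy favours the rejected one
print(better < worse)  # True
```

The other preference modes (cpo, orpo, grpo, xpo) vary this idea: different margins, reference-free variants, or group-relative baselines.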
MLX-LM-LORA supports several optimizers: Adam, AdamW, QHAdam, and Muon. Select one with --optim; the default is Adam.
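For orientation, this is what a single Adam update does; the sketch below is textbook Adam for one scalar parameter, not the MLX implementation.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = b1 * m + (1 - b1) * grad           # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)              # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

p, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):                      # three steps on the gradient of p**2
    p, m, v = adam_step(p, 2 * p, m, v, t)
print(p)
```

AdamW differs by applying weight decay directly to the parameters rather than through the gradient; QHAdam and Muon are further variants of this moment-based update.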
YAML configs and Notebooks: All examples here are shown as CLI commands for brevity, but you can equivalently put the same settings in a YAML config (--config) or drive training from a Jupyter notebook. The example config syntax is documented in the repo under the examples path.
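As a sketch only, a YAML config mirroring the CLI run above might look like the following; the key names here are an assumption (a one-to-one mapping of CLI flags), and the authoritative schema is the one documented in the repo's examples.

```yaml
# Hypothetical config sketch; key names assume they mirror the CLI flags.
# Check the examples in the repo for the exact schema.
model: mlx-community/Josiefied-Qwen3-8B-abliterated-v1-4bit
train: true
data: mlx-community/wikisql
iters: 600
train-type: lora
train-mode: sft
```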
⚙️ MLX-LM-LORA is proudly built on top of MLX-LM and optimized exclusively for Apple Silicon.
Train with state-of-the-art fine-tuning algorithms like LoRA, DoRA, DPO, ORPO, GRPO, and CPO.
Made with ❤️ by Gökdeniz Gülmez · Powered by MLX