Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion end-to-end-use-cases/RAFT-Chatbot/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,7 @@ Once the RAFT dataset is ready in JSON format, we can start fine-tuning. Unfortu
```bash
export PATH_TO_ROOT_FOLDER=./raft-8b
export PATH_TO_RAFT_JSON=recipes/use_cases/end2end-recipes/raft/output/raft.jsonl
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --context_length 8192 --num_epochs 1 --batch_size_training 1 --model_name meta-Llama/Meta-Llama-3-8B-Instruct --dist_checkpoint_root_folder $PATH_TO_ROOT_FOLDER --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "recipes/finetuning/datasets/raft_dataset.py" --use-wandb --run_validation True --custom_dataset.data_path $PATH_TO_RAFT_JSON
torchrun --nnodes 1 --nproc_per_node 4 getting-started/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --context_length 8192 --num_epochs 1 --batch_size_training 1 --model_name meta-Llama/Meta-Llama-3-8B-Instruct --dist_checkpoint_root_folder $PATH_TO_ROOT_FOLDER --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "getting-started/finetuning/datasets/raft_dataset.py" --use-wandb --run_validation True --custom_dataset.data_path $PATH_TO_RAFT_JSON
```

For more details on multi-GPU fine-tuning, please refer to the [multigpu_finetuning.md](../../getting-started/finetuning/multigpu_finetuning.md) in the finetuning recipe.
Expand Down
2 changes: 1 addition & 1 deletion getting-started/finetuning/datasets/README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Datasets and Evaluation Metrics

The provided fine tuning scripts allows you to select between three datasets by passing the `dataset` arg to the `llama_cookbook.finetuning` module or [`recipes/quickstart/finetuning/finetuning.py`](../finetuning.py) script. The current options are `grammar_dataset`, `alpaca_dataset`and `samsum_dataset`. Additionally, we integrate the OpenAssistant/oasst1 dataset as an [example for a custom dataset](custom_dataset.py) Note: Use of any of the datasets should be in compliance with the dataset's underlying licenses (including but not limited to non-commercial uses)
The provided fine tuning scripts allows you to select between three datasets by passing the `dataset` arg to the `llama_cookbook.finetuning` module or [`getting-started/finetuning/finetuning.py`](../finetuning.py) script. The current options are `grammar_dataset`, `alpaca_dataset`and `samsum_dataset`. Additionally, we integrate the OpenAssistant/oasst1 dataset as an [example for a custom dataset](custom_dataset.py) Note: Use of any of the datasets should be in compliance with the dataset's underlying licenses (including but not limited to non-commercial uses)

* [grammar_dataset](https://huggingface.co/datasets/jfleg) contains 150K pairs of english sentences and possible corrections.
* [alpaca_dataset](https://github.com/tatsu-lab/stanford_alpaca) provides 52K instruction-response pairs as generated by `text-davinci-003`.
Expand Down
8 changes: 4 additions & 4 deletions getting-started/finetuning/finetune_vision_model.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,19 +10,19 @@ We created an example script [ocrvqa_dataset.py](./datasets/ocrvqa_dataset.py) t
For **full finetuning with FSDP**, we can run the following code:

```bash
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "recipes/quickstart/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding
torchrun --nnodes 1 --nproc_per_node 4 getting-started/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "getting-started/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding
```

For **LoRA finetuning with FSDP**, we can run the following code:

```bash
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "recipes/quickstart/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding --use_peft --peft_method lora
torchrun --nnodes 1 --nproc_per_node 4 getting-started/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "getting-started/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding --use_peft --peft_method lora
```

For **finetuning with LLM freeze using FSDP**, we can run the following code:

```bash
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "recipes/quickstart/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding --freeze_LLM_only True
torchrun --nnodes 1 --nproc_per_node 4 getting-started/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" --custom_dataset.file "getting-started/finetuning/datasets/ocrvqa_dataset.py" --run_validation True --batching_strategy padding --freeze_LLM_only True
```
**Note**: `--batching_strategy padding` is needed as the vision model will not work with `packing` method.

Expand All @@ -34,7 +34,7 @@ For more details about local inference with the fine-tuned checkpoint, please re

In order to use a custom dataset, please follow the steps below:

1. Create a new dataset python file under `recipes/quickstart/finetuning/dataset` folder.
1. Create a new dataset python file under `getting-started/finetuning/datasets` folder.
2. In this python file, you need to define a `get_custom_dataset(dataset_config, processor, split, split_ratio=0.9)` function that handles the data loading.
3. In this python file, you need to define a `get_data_collator(processor)` function that returns a custom data collator that can be used by the Pytorch Data Loader.
4. This custom data collator class must have a `__call__(self, samples)` function that converts the image and text samples into the actual inputs that vision model expects.
Expand Down
2 changes: 1 addition & 1 deletion getting-started/finetuning/multigpu_finetuning.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ Here we use a slurm script to schedule a job with slurm over multiple nodes.

```bash

sbatch recipes/quickstart/finetuning/multi_node.slurm
sbatch getting-started/finetuning/multi_node.slurm
# Change the num nodes and GPU per nodes in the script before running.

```
Expand Down
2 changes: 1 addition & 1 deletion getting-started/inference/local_inference/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,7 @@ In case you have fine-tuned your model with pure FSDP and saved the checkpoints
This is helpful if you have fine-tuned you model using FSDP only as follows:

```bash
torchrun --nnodes 1 --nproc_per_node 8 recipes/quickstart/finetuning/finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --fsdp_config.pure_bf16
torchrun --nnodes 1 --nproc_per_node 8 getting-started/finetuning/finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --fsdp_config.pure_bf16
```
Then convert your FSDP checkpoint to HuggingFace checkpoints using:
```bash
Expand Down