Commit ca5c06c

docs: add runtime configuration best practices guide (#690)
Signed-off-by: Ayush-Patel-56 <ayushpatel2731@gmail.com> Signed-off-by: Ayush Patel <ayushpatel2731@gmail.com>
1 parent 5e0bbd2 commit ca5c06c

4 files changed

Lines changed: 153 additions & 1 deletion

docs/en/TOC.md

Lines changed: 2 additions & 1 deletion
@@ -10,7 +10,8 @@
1010
+ Get Started
1111
- [Quick Start](userguide/get_started.md)
1212
- [Installation](userguide/install.md)
13-
- [Trubleshooting](userguide/troubleshooting.md)
13+
- [Configuration Best Practices](userguide/config_best_practices.md)
14+
- [Troubleshooting](userguide/troubleshooting.md)
1415
+ Dataset
1516
+ Creation
1617
- [Accelerate Data Accessing(via POSIX)](samples/accelerate_data_accessing.md)
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
# Fluid Configuration Guide: Best Practices and Tuning
2+
3+
This guide covers Fluid's main configuration options. Fluid works out of the box with sensible defaults, but production-grade performance requires tuning for your specific storage backend and workload characteristics.
4+
5+
## 1. Dataset: The Foundation
6+
7+
The `Dataset` resource defines **where** your data lives and **how** it should be accessed.
8+
9+
### Key Considerations
10+
* **Mount Point Naming**: When mounting multiple sources, use explicit `name` fields. Fluid uses these names to create the internal directory structure. Without them, you risk path collisions if two sources have similar root structures.
11+
* **Read-Only vs. Read-Write**: For most AI training workloads, set `readOnly: true` on each mount. This lets the caching engine (such as Alluxio) optimize for read-heavy traffic and skip the consistency checks that writes require.
12+
13+
| Config Point | Why it matters |
14+
| :--- | :--- |
15+
| `spec.placement: Exclusive` | **Performance Isolation.** Prevents other datasets from "stealing" cache space on the same node. Essential for low-latency requirements. |
16+
| `spec.nodeAffinity` | **Disk Type Targeting.** If your cluster has a mix of HDD and NVMe nodes, use affinity to ensure Fluid only caches data on the high-speed nodes. |
17+
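The considerations above can be combined in a single `Dataset`. The following is a minimal sketch, not a definitive manifest: the dataset name, the bucket paths, and the `example.com/disk-type` node label are hypothetical, and the credentials and endpoint options a real object-store mount needs are omitted.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: training-data                 # hypothetical dataset name
spec:
  mounts:
    - name: images                    # explicit name keeps each source in its own directory
      mountPoint: s3://example-bucket/images   # hypothetical source
      readOnly: true
    - name: labels
      mountPoint: s3://example-bucket/labels   # hypothetical source
      readOnly: true
  placement: Exclusive                # do not share cache nodes with other datasets
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: example.com/disk-type       # hypothetical node label
              operator: In
              values:
                - nvme                # cache only on NVMe-backed nodes
```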
18+
---
19+
20+
## 2. AlluxioRuntime: High-Performance Caching
21+
22+
Alluxio is the "engine" for most Fluid deployments. Its configuration determines your data-plane throughput.
23+
24+
### Tuning the Memory Tier (MEM)
25+
For the fastest possible access, back the cache with `/dev/shm` (a RAM-backed tmpfs).
26+
* **Best Practice**: Set `mediumtype: MEM` on the `tieredstore` level and point its `path` at `/dev/shm`.
27+
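* **Gotcha**: Data cached in `/dev/shm` counts toward memory usage, so if the node runs out of RAM the Alluxio Worker may be OOMKilled. Always set `resources.limits.memory` slightly higher than your total `quota` to leave headroom for the worker process itself.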
28+
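As an illustration of the memory tier and the headroom rule, here is a sketch of an `AlluxioRuntime`. The name, replica count, and sizes are assumptions chosen for the example; the right gap between `quota` and the memory limit depends on your worker's JVM settings.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: training-data                 # hypothetical; matches the Dataset it serves
spec:
  replicas: 2                         # illustrative worker count
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 4Gi                    # cache capacity per worker (illustrative)
        high: "0.95"
        low: "0.7"
  worker:
    resources:
      limits:
        memory: 5Gi                   # quota plus assumed headroom for the worker process
```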
29+
### JVM Heap Management
30+
Because Alluxio runs on the JVM, `jvmOptions` are critical. If you have millions of small files, the Alluxio master needs more heap space to track metadata.
31+
```yaml
32+
# Example: Increasing Master Heap for large metadata
33+
master:
34+
  jvmOptions:
35+
    - "-Xms4g"
36+
    - "-Xmx4g"
37+
```
38+
39+
---
40+
41+
## 3. JuiceFSRuntime: Cloud-Native POSIX
42+
43+
JuiceFS is excellent for environments where POSIX compliance is a hard requirement.
44+
45+
### Metadata vs. Data
46+
JuiceFS separates metadata (Redis/MySQL/TiKV) from data (S3/OSS).
47+
* **Optimization**: Use the `attr-cache` option in `spec.fuse.options`. Setting this to `60s` or higher can drastically reduce the load on your metadata service during repetitive tasks like `ls -R`.
48+
* **Worker Caching**: Size the worker's local cache (capacity and directories) through `spec.tieredstore.levels`, and point it at a dedicated disk so the cache cannot fill the node's root partition. The `cache-size` and `cache-dir` keys in `spec.worker.options` (written without leading dashes) are deprecated in favor of `tieredstore.levels`; expect them only in older configurations.
49+
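Both JuiceFS recommendations can be sketched in one `JuiceFSRuntime`. The runtime name, cache path, and sizes are hypothetical. The `attr-cache` value is written here as plain seconds; whether a unit suffix such as `60s` is accepted is assumed to depend on the JuiceFS release, so check the mount options of the version you deploy.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: JuiceFSRuntime
metadata:
  name: jfs-demo                      # hypothetical runtime name
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: SSD
        path: /mnt/jfs-cache          # hypothetical dedicated cache disk, not the root partition
        quota: 40Gi                   # upper bound on local cache size (illustrative)
        low: "0.1"
  fuse:
    options:
      attr-cache: "60"                # attribute cache timeout in seconds
```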
50+
---
51+
52+
## 4. JindoRuntime: Alibaba Cloud Optimization
53+
54+
If you run on ACK (Alibaba Cloud Container Service for Kubernetes), JindoRuntime provides native optimizations for OSS.
55+
56+
* **Credential Management**: Avoid hardcoding AK/SK in the YAML. Use `hadoopConfig` to reference a ConfigMap containing `core-site.xml` with your OSS credentials.
57+
* **Log Bloat**: Jindo can be chatty. Set `spec.fuse.logConfig` to `level: warn` for stable production environments to save disk space on logs.
58+
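A sketch of the `hadoopConfig` approach is shown below. The ConfigMap and runtime names are hypothetical, the `core-site.xml` body is left as a placeholder rather than guessing at property names, and the log settings are omitted because their exact keys should be taken from the JindoRuntime reference for your Fluid version.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: oss-hadoop-conf               # hypothetical; must be in the runtime's namespace
data:
  core-site.xml: |
    <configuration>
      <!-- OSS endpoint and credential properties go here -->
    </configuration>
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: oss-demo                      # hypothetical runtime name
spec:
  replicas: 1
  hadoopConfig: oss-hadoop-conf       # name of the ConfigMap above
```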
59+
---
60+
61+
## 5. ThinRuntime: The "Universal" Adapter
62+
63+
ThinRuntime is intended for storage systems that don't have a dedicated Fluid controller (e.g., NFS, Ceph).
64+
65+
* **Standardization**: Leverage `ThinRuntimeProfile`. It allows you to define the "how-to-mount" logic once and reuse it across multiple datasets.
66+
* **Health Probes**: ThinRuntime relies on an external FUSE binary, so always define a `livenessProbe`. Kubernetes can then restart the FUSE pod automatically when the mount point goes stale or fails with "Transport endpoint is not connected".
67+
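The two ThinRuntime practices can be sketched together: a reusable `ThinRuntimeProfile` that carries the mount logic and a liveness probe, and a `ThinRuntime` that references it. Everything specific here is an assumption for illustration: the image, entrypoint script, probe command, and mount path are hypothetical, and the probe assumes the FUSE image ships a `mountpoint` utility.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntimeProfile
metadata:
  name: nfs-profile                   # hypothetical profile name
spec:
  fileSystemType: nfs
  fuse:
    image: registry.example.com/nfs-fuse   # hypothetical image
    imageTag: v0.1
    imagePullPolicy: IfNotPresent
    command:
      - /usr/local/bin/entrypoint.sh  # hypothetical mount script
    livenessProbe:
      exec:
        command:
          - sh
          - -c
          - mountpoint -q /path/to/fuse-mount   # hypothetical path; fails when the mount is gone
      initialDelaySeconds: 30
      periodSeconds: 15
      failureThreshold: 3
---
apiVersion: data.fluid.io/v1alpha1
kind: ThinRuntime
metadata:
  name: nfs-demo                      # hypothetical; matches the Dataset it serves
spec:
  profileName: nfs-profile            # reuse the profile across datasets
```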
68+
---
69+
70+
## Common Production Checklist
71+
72+
1. **Resource Limits**: Never run workers without `limits`. A caching engine will try to consume all the memory and disk it can reach.
73+
2. **Pull Secrets**: If your images are in a private registry, `imagePullSecrets` must be defined at the spec level so the Master, Worker, and Fuse pods can all pull successfully.
74+
3. **Tiered Locality**: Use `storage-network` labels if your storage and compute are on separate network planes to avoid cross-switch bottlenecking.
75+
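The first two checklist items might look like the fragment below on an `AlluxioRuntime`. This is a sketch under assumptions: the Secret name and all resource figures are hypothetical, and spec-level `imagePullSecrets` is taken from the checklist above, so confirm it is available in the Fluid version you run.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: training-data                 # hypothetical runtime name
spec:
  imagePullSecrets:
    - name: private-registry-cred     # hypothetical Secret; applies to master, worker, and fuse pods
  worker:
    resources:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 5Gi                   # illustrative values; size to your cache quota
  fuse:
    resources:
      limits:
        cpu: "2"
        memory: 2Gi
```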

docs/zh/TOC.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
+ 入门
1515
- [安装](userguide/install.md)
1616
- [快速开始](userguide/get_started.md)
17+
- [配置最佳实践](userguide/config_best_practices.md)
1718
- [问题诊断](userguide/troubleshooting.md)
1819
+ 数据集使用
1920
+ 创建
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
# Fluid 配置指南:最佳实践与性能调优
2+
3+
本文档旨在深入探讨 Fluid 的各项配置参数。虽然 Fluid 提供了开箱即用的默认值,但在生产环境中,针对特定的存储后端和工作负载特性进行调优是确保高性能的关键。
4+
5+
## 1. Dataset: 核心基础
6+
7+
`Dataset` 资源定义了数据的**来源**以及**访问方式**
8+
9+
### 关键注意事项
10+
* **挂载点命名**: 在挂载多个数据源时,请务必指定清晰的 `name` 字段。Fluid 会根据这些名称构建内部目录结构。如果不指定名称,当多个数据源具有相似的根目录结构时,可能会发生路径冲突。
11+
* **只读与读写**: 对于大多数 AI 训练任务,建议将挂载点设置为 `readOnly: true`。这允许像 Alluxio 这样的缓存引擎针对纯读流量进行优化,并避免维护写入一致性带来的额外开销。
12+
13+
| 配置项 | 核心价值 |
14+
| :--- | :--- |
15+
| `spec.placement: Exclusive` | **性能隔离。** 防止同一节点上的其他数据集“挤占”缓存空间,是低延迟要求的保障。 |
16+
| `spec.nodeAffinity` | **精准定位。** 如果集群中包含 HDD 和 NVMe 混合节点,通过亲和性确保 Fluid 只在高速节点上配置缓存。 |
17+
18+
---
19+
20+
## 2. AlluxioRuntime: 高性能分布式缓存
21+
22+
Alluxio 是 Fluid 中应用最广泛的缓存引擎,其配置直接决定了数据层(Data-Plane)的吞吐量。
23+
24+
### 内存级缓存调优 (MEM)
25+
为了获得极速访问,通常使用 `/dev/shm`(内存盘)。
26+
* **最佳实践**: 确保 `tieredstore` 层级设置中,介质类型指向 `MEM`
27+
* **风险提示**: 如果节点内存不足,Alluxio Worker 可能会因 OOM 被 kill。务必将 `resources.limits.memory` 设置为略高于 `配额`
28+
29+
### JVM 堆内存管理
30+
由于 Alluxio 基于 Java 开发,`jvmOptions` 至关重要。如果存在数百万个小文件,Master 节点需要更多的堆内存来跟踪元数据。
31+
```yaml
32+
# 示例:为元数据较多的场景增加 Master 堆内存
33+
master:
34+
  jvmOptions:
35+
    - "-Xms4g"
36+
    - "-Xmx4g"
37+
```
38+
39+
---
40+
41+
## 3. JuiceFSRuntime: 云原生 POSIX 存储
42+
43+
JuiceFS 非常适合那些对 POSIX 兼容性有硬性要求的环境。
44+
45+
### 元数据与性能
46+
JuiceFS 将元数据与数据物理隔离。
47+
* **优化建议**: 利用 `spec.fuse.options` 中的 `attr-cache` 选项。将其设置为 `60s` 或更长,可以显著减轻元数据服务在执行 `ls -R` 等高频扫描任务时的压力。
48+
* **空间配额**: 优先通过 `spec.tieredstore.levels` 规划本地缓存目录与容量,限制本地磁盘占用,防止存储填满宿主机根分区。避免继续在 `spec.worker.options` 中使用 `cache-size` / `cache-dir` 这类已弃用配置。
49+
50+
---
51+
52+
## 4. JindoRuntime: 阿里云生态优化
53+
54+
在阿里云 ACK 环境中,JindoRuntime 针对 OSS 提供了原生加速。
55+
56+
* **凭据安全**: 避免在 YAML 中硬编码 AK/SK。推荐使用 `hadoopConfig` 引用包含 `core-site.xml` 的 ConfigMap。
57+
* **日志控制**: Jindo 在默认情况下日志量可能较大。生产环境中建议设置 `spec.fuse.logConfig` 为 `level: warn`,以节省节点日志存储空间。
58+
59+
---
60+
61+
## 5. ThinRuntime: 通用适配器
62+
63+
ThinRuntime 专为尚未内置在 Fluid 中的存储系统(如 NFS、Ceph)而设计。
64+
65+
* **标准化部署**: 充分利用 `ThinRuntimeProfile`。您可以一次性定义挂载逻辑,并在多个 Dataset 中复用。
66+
* **健康检查**: 由于 ThinRuntime 依赖外部 FUSE 进程,务必定义 `livenessProbe`。这能确保在挂载点出现“传输端点未连接”等异常时,Kubernetes 能自动重启 FUSE Pod。
67+
68+
---
69+
70+
## 生产环境 Checklist
71+
72+
1. **资源配额**: 严禁在不设置 `limits` 的情况下运行 Worker。缓存引擎通常会倾向于耗尽所有可用资源。
73+
2. **镜像密钥**: 如果镜像存储在私有仓库,必须在 Spec 级配置 `imagePullSecrets`,以确保所有组件 Pod(Master, Worker, Fuse)都能成功拉取镜像并启动。
74+
3. **分层本地性**: 如果计算节点与存储节点位于不同的网络平面,建议结合网络标签(storage-network)使用,以避免跨核心交换机的流量瓶颈。
75+
