| 1 | +# CacheRuntime Integration Guide |
| 2 | + |
| 3 | +# Installation |
| 4 | + |
| 5 | +* Install a Fluid version that supports CacheRuntime. |
| 6 | + |
| 7 | + |
| 8 | +```shell |
| 9 | +helm repo add fluid https://fluid-cloudnative.github.io/charts |
| 10 | + |
| 11 | +helm repo update |
| 12 | + |
| 13 | +helm search repo fluid --devel |
| 14 | + |
| 15 | +helm install fluid fluid/fluid --devel --version xxx -n fluid-system |
| 16 | +``` |
| 17 | + |
| 18 | +# Integration |
| 19 | + |
| 20 | +## Step 1. Plan Cluster Topology |
| 21 | + |
| 22 | +First, you need to plan a cluster topology: |
| 23 | + |
| 24 | +* Determine the topology type and which components are included: |
| 25 | + |
| 26 | + |
| 27 | +  * MasterSlave: Master/Worker/Client |
| 28 | + |
| 29 | +  * P2P/DHT: Worker/Client |
| 30 | + |
| 31 | +  * ClientOnly: Client |
| 32 | + |
| 33 | + |
| 34 | +* Determine the form and configuration of each component: |
| 35 | + |
| 36 | + |
| 37 | +  * Stateful/Stateless: determines the workload type |
| 38 | + |
| 39 | +  * Standalone/Active-Standby/Cluster |
| 40 | + |
| 41 | + |
| 42 | +The tables below show example settings for deploying several common cache topology types. |
| 43 | + |
| 44 | +* MasterSlave: CubeFS/Alluxio |
| 45 | + |
| 46 | + |
| 47 | +| Topology | Variant | Settings | |
| 48 | +| --- | --- | --- | |
| 49 | +| Master | | * workLoadType: apps/v1/StatefulSet<br> <br>* Image configuration<br> <br>* Startup command<br> <br>* UFS mount command<br> <br>* HeadlessService needs to be created<br> <br>* Authentication keys need to be mounted | |
| 50 | +| Worker: Used for single worker role definition | | * workLoadType: apps/v1/StatefulSet<br> <br>* Image configuration<br> <br>* Startup command<br> <br>* HeadlessService needs to be created<br> <br>* Authentication keys do NOT need to be mounted<br> <br>* TieredStore needs to be configured | |
| 51 | +| Client | Fuse | * Role: Posix client<br> <br>* workLoadType: apps/v1/DaemonSet<br> <br>* Image configuration<br> <br>* Startup command<br> <br>* Authentication parameters do NOT need to be mounted<br> <br>* TieredStore is NOT supported | |
| 52 | + |
| 53 | +* P2P Worker: JuiceFS |
| 54 | + |
| 55 | + |
| 56 | +| Topology | Settings | |
| 57 | +| --- | --- | |
| 58 | +| Worker: Used for single worker role definition | * workLoadType: apps/v1/StatefulSet<br> <br>* Image configuration<br> <br>* Startup command<br> <br>* HeadlessService<br> <br>* Authentication parameters need to be mounted<br> <br>* TieredStore is supported | |
| 59 | +| Client | * Role: Fuse client<br> <br>* workLoadType: apps/v1/DaemonSet<br> <br>* Image configuration<br> <br>* Startup command<br> <br>* Service is NOT required<br> <br>* Authentication parameters need to be mounted<br> <br>* TieredStore is supported | |
| 60 | + |
| 61 | +## Step 2. Prepare Cache System Template |
| 62 | + |
| 63 | +A cache system template in Fluid contains the following parts: |
| 64 | + |
| 65 | +```text |
| 66 | +├── Name # Referenced by runtimeClassName in CacheRuntime |
| 67 | +├── FileSystemType # File system type, used for mount readiness verification |
| 68 | +├── Topology |
| 69 | +│   ├── Master[component] |
| 70 | +│   ├── Worker[component] |
| 71 | +│   └── Client[component] |
| 72 | +└── ExtraResources |
| 73 | +    └── ConfigMaps |
| 74 | +``` |
| 75 | + |
| 76 | +Each component in Topology mainly contains the following fields: |
| 77 | + |
| 78 | +| Field | Sub-field | Description | Recommendation | |
| 79 | +| --- | --- | --- | --- | |
| 80 | +| WorkloadType | | The workload type of this component | For stateful components such as Master/Worker, StatefulSet is the most common choice because it works well with the stable DNS names provided by a Headless Service.<br>If the Client is a Fuse client that provides Posix access to pods on its node, DaemonSet is generally used.<br>If the Client is an SDK proxy running as a centralized stateless application, a Deployment with a ClusterIP Service is generally used. | |
| 81 | +| Options | | Default options; overridden by user settings | | |
| 82 | +| Template | | Native PodTemplateSpec field | | |
| 83 | +| Service | | Currently only Headless is supported | | |
| 84 | +| Dependencies | EncryptOption | Whether this component needs Fluid to mount the access keys defined in the Dataset for accessing data sources [Not supported in current version]. | | |
| 85 | +| Dependencies | ExtraResources | Whether this component needs to mount additional ConfigMaps (the dependent ConfigMap information is defined in the ExtraResources field of CacheRuntimeClass). | | |
| 86 | +| ExecutionEntries | MountUFS | For the Master-Worker architecture, the underlying file system mount operation to execute once the Master is Ready. | | |
| 87 | +| ExecutionEntries | ReportSummary | The operation the cache system defines for obtaining cache information metrics [Not supported in current version]. | | |
| 88 | + |
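The StatefulSet recommendation above relies on standard Kubernetes behavior: each StatefulSet pod is named `<statefulset-name>-<ordinal>`, and its governing Headless Service gives it a DNS record of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The following is a minimal sketch of how a component script could compute its peers' addresses. The helper name `peer_addresses` and the example resource names are hypothetical, and the sketch assumes ordinals start at 0 and the default `cluster.local` cluster domain.

```python
from typing import List


def peer_addresses(statefulset: str, service: str, namespace: str,
                   replicas: int, cluster_domain: str = "cluster.local") -> List[str]:
    """Return the stable DNS name of every pod in a StatefulSet.

    Assumes pod ordinals run from 0 to replicas - 1 (the Kubernetes default).
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    for address in peer_addresses("demofs-master", "svc-demofs-master", "default", 2):
        print(address)
```

Because these names do not change when a pod is rescheduled, a Worker or Client can be configured with them once at startup.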
| 89 | +### Step 2.1 Prepare K8s-adapted Native Images and Define Component workloadType and PodTemplate |
| 90 | + |
| 91 | +Start with the native images: configure each component's **workloadType** and **PodTemplate**, deploy a fixed cache system in the K8s cluster by hand, start the cache processes manually inside the pods, and confirm that the system is accessible locally. The purpose of this step is to identify which K8s resources are needed and to prepare the base images. |
| 92 | + |
| 93 | +### Step 2.2 Clarify What Configurations CacheRuntime Should Provide for Components |
| 94 | + |
| 95 | +Mainly clarify the following settings: |
| 96 | + |
| 97 | +* Service |
| 98 | + |
| 99 | +* Dependencies |
| 100 | + |
| 101 | + |
| 102 | +### Step 2.3 Confirm Default ENV Provided by Fluid CacheRuntime for Components, for Use by Scripts Inside Containers |
| 103 | + |
| 104 | +| ENV | Description | |
| 105 | +| --- |----------------------------------| |
| 106 | +| FLUID_DATASET_NAME | Dataset name, generally used to isolate cache groups from one another | |
| 107 | +| FLUID_DATASET_NAMESPACE | Namespace where the dataset is located | |
| 108 | +| FLUID_RUNTIME_CONFIG_PATH | Runtime configuration path provided by Fluid | |
| 109 | +| FLUID_RUNTIME_MOUNT_PATH | Mainly used by the Client: the target path where the client performs the mount | |
| 110 | +| FLUID_RUNTIME_COMPONENT_TYPE | Indicates whether the current component is master, worker, or client | |
| 111 | +| FLUID_RUNTIME_COMPONENT_SVC_NAME | If the component defines a service, this value is the service name | |
| 112 | + |
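As an illustration of how a script inside a container might consume these variables, here is a minimal sketch. The `FluidEnv` class and `load_fluid_env` function are hypothetical names. Treating the dataset name, namespace, config path, and component type as required, and the mount path and service name as optional, is an assumption made for this example rather than documented Fluid behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these four variables are always set; the other two may be empty.
REQUIRED = (
    "FLUID_DATASET_NAME",
    "FLUID_DATASET_NAMESPACE",
    "FLUID_RUNTIME_CONFIG_PATH",
    "FLUID_RUNTIME_COMPONENT_TYPE",
)


@dataclass(frozen=True)
class FluidEnv:
    dataset_name: str
    dataset_namespace: str
    config_path: str
    component_type: str
    mount_path: str    # empty when the component does not mount anything
    service_name: str  # empty when the component defines no service


def load_fluid_env(environ: Mapping[str, str] = os.environ) -> FluidEnv:
    """Collect the Fluid-provided variables, failing early if one is missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return FluidEnv(
        dataset_name=environ["FLUID_DATASET_NAME"],
        dataset_namespace=environ["FLUID_DATASET_NAMESPACE"],
        config_path=environ["FLUID_RUNTIME_CONFIG_PATH"],
        component_type=environ["FLUID_RUNTIME_COMPONENT_TYPE"],
        mount_path=environ.get("FLUID_RUNTIME_MOUNT_PATH", ""),
        service_name=environ.get("FLUID_RUNTIME_COMPONENT_SVC_NAME", ""),
    )
```

Failing early on a missing variable keeps a misconfigured pod from starting the cache process with partial settings.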
| 113 | +### Step 2.4 Create a CacheRuntimeClass: Example and Field Descriptions |
| 114 | + |
| 115 | +```yaml |
| 116 | +apiVersion: data.fluid.io/v1alpha1 |
| 117 | +kind: CacheRuntimeClass |
| 118 | +metadata: |
| 119 | +  name: demofs |
| 120 | +fileSystemType: $fsType |
| 121 | +topology: |
| 122 | +  master: |
| 123 | +    workloadType: # Create master with StatefulSet workload |
| 124 | +      apiVersion: apps/v1 |
| 125 | +      kind: StatefulSet |
| 126 | +    service: # Create a Headless Service for master; only supported when workloadType is StatefulSet |
| 127 | +      headless: {} |
| 128 | +    dependencies: |
| 129 | +      encryptOption: {} # Not supported in the current version |
| 130 | +    podTemplateSpec: |
| 131 | +      spec: |
| 132 | +        restartPolicy: Always |
| 133 | +        containers: |
| 134 | +          - name: master |
| 135 | +            image: $image |
| 136 | +            args: |
| 137 | +              - /bin/sh |
| 138 | +              - -c |
| 139 | +              - custom-endpoint.sh |
| 140 | +            imagePullPolicy: IfNotPresent |
| 141 | +  worker: |
| 142 | +    workloadType: # Create worker with StatefulSet workload |
| 143 | +      apiVersion: apps/v1 |
| 144 | +      kind: StatefulSet |
| 145 | +    service: |
| 146 | +      headless: {} # Create a Headless Service for worker; only supported when workloadType is StatefulSet |
| 147 | +    dependencies: {} |
| 148 | +    podTemplateSpec: |
| 149 | +      spec: |
| 150 | +        restartPolicy: Always |
| 151 | +        containers: |
| 152 | +          - name: worker |
| 153 | +            image: $image |
| 154 | +            args: |
| 155 | +              - /bin/sh |
| 156 | +              - -c |
| 157 | +              - custom-endpoint.sh |
| 158 | +            imagePullPolicy: IfNotPresent |
| 159 | +  client: |
| 160 | +    workloadType: # Create client with DaemonSet workload |
| 161 | +      apiVersion: apps/v1 |
| 162 | +      kind: DaemonSet |
| 163 | +    dependencies: |
| 164 | +      encryptOption: {} # Provide the client with the encryptOptions declared by the user in the Dataset |
| 165 | +    podTemplateSpec: |
| 166 | +      spec: |
| 167 | +        restartPolicy: Always |
| 168 | +        containers: |
| 169 | +          - name: client |
| 170 | +            image: $image |
| 171 | +            securityContext: # The client usually needs privileged mode to operate the fuse device |
| 172 | +              privileged: true |
| 173 | +              runAsUser: 0 |
| 174 | +            args: |
| 175 | +              - /bin/sh |
| 176 | +              - -c |
| 177 | +              - custom-endpoint.sh |
| 178 | +            imagePullPolicy: IfNotPresent |
| 179 | +``` |
| 180 | + |
| 181 | +### Step 2.5 User Creates Runtime |
| 182 | + |
| 183 | +```yaml |
| 184 | +apiVersion: data.fluid.io/v1alpha1 |
| 185 | +kind: Dataset |
| 186 | +metadata: |
| 187 | +  name: demofs |
| 188 | +  namespace: default |
| 189 | +spec: |
| 190 | +  placement: Shared |
| 191 | +  accessModes: |
| 192 | +    - ReadWriteMany |
| 193 | +  mounts: |
| 194 | +    - name: demo |
| 195 | +      mountPoint: "demofs:///" |
| 196 | +      options: |
| 197 | +        key1: value1 |
| 198 | +        key2: value2 |
| 199 | +      encryptOptions: |
| 200 | +        - name: token |
| 201 | +          valueFrom: |
| 202 | +            secretKeyRef: |
| 203 | +              name: jfs-secret |
| 204 | +              key: token |
| 205 | +        - name: access-key |
| 206 | +          valueFrom: |
| 207 | +            secretKeyRef: |
| 208 | +              name: jfs-secret |
| 209 | +              key: access-key |
| 210 | +        - name: secret-key |
| 211 | +          valueFrom: |
| 212 | +            secretKeyRef: |
| 213 | +              name: jfs-secret |
| 214 | +              key: secret-key |
| 215 | +--- |
| 216 | +apiVersion: data.fluid.io/v1alpha1 |
| 217 | +kind: CacheRuntime |
| 218 | +metadata: |
| 219 | +  name: demofs |
| 220 | +  namespace: default |
| 221 | +spec: |
| 222 | +  runtimeClassName: demofs |
| 223 | +  master: |
| 224 | +    options: # master options |
| 225 | +      key1: value1 |
| 226 | +      key2: value2 |
| 227 | +    replicas: 2 # master replica count |
| 228 | +  worker: |
| 229 | +    options: # worker options |
| 230 | +      key1: value1 |
| 231 | +      key2: value2 |
| 232 | +    replicas: 2 # worker replica count |
| 233 | +    tieredStore: |
| 234 | +      levels: # worker cache configuration |
| 235 | +        - quota: 40Gi |
| 236 | +          low: "0.5" |
| 237 | +          high: "0.8" |
| 238 | +          path: "/cache-data" |
| 239 | +          medium: |
| 240 | +            emptyDir: # Use tmpfs as cache medium |
| 241 | +              medium: Memory |
| 242 | +  client: |
| 243 | +    options: |
| 244 | +      key1: value1 |
| 245 | +      key2: value2 |
| 246 | +    volumeMounts: # Can configure volumes and corresponding volumeMounts |
| 247 | +      - name: demo |
| 248 | +        mountPath: /mnt |
| 249 | +  volumes: |
| 250 | +    - name: demo |
| 251 | +      persistentVolumeClaim: |
| 252 | +        claimName: test |
| 253 | + |
| 254 | +``` |
| 255 | + |
| 256 | +### Step 2.6 Confirm RuntimeConfig Provided by Fluid CacheRuntime for Components, Parse Parameters to Start Containers |
| 257 | +> Starting from the native image, you can modify the entrypoint script so that it first parses RuntimeConfig, generates the corresponding configuration files, and then starts the cache system process. |
| 258 | +> For a reference, see the integration example under test/gha-e2e/curvine in the official repository. |
| 259 | + |
| 260 | +In CacheRuntime, Fluid handles all control-plane processes. To serve data, however, the cache system itself needs **topology**, **data source**, **authentication**, and **cache information**. Fluid provides this information to each component through a configuration file tailored to the component's role. The process inside the component is responsible for parsing this configuration, setting environment variables, generating the data engine's configuration files, and performing any other preparation. Once preparation is complete, the data engine process can be started. The example below shows the structure to parse: |
| 261 | + |
| 262 | +* Using the resources above as an example, the Config that Fluid maintains and mounts into the Master/Worker/Client looks like this: |
| 263 | + |
| 264 | + |
| 265 | +```json |
| 266 | +{ |
| 267 | +  "mounts": [ |
| 268 | +    { |
| 269 | +      "mountPoint": "s3://test", |
| 270 | +      "options": { |
| 271 | +        "access": "minioadmin", |
| 272 | +        "endpoint_url": "http://minio:9000", |
| 273 | +        "path_style": "true", |
| 274 | +        "region_name": "us-east-1", |
| 275 | +        "secret": "minioadmin" |
| 276 | +      }, |
| 277 | +      "name": "minio", |
| 278 | +      "path": "/minio" |
| 279 | +    } |
| 280 | +  ], |
| 281 | +  "accessModes": [ |
| 282 | +    "ReadWriteMany" |
| 283 | +  ], |
| 284 | +  "targetPath": "/runtime-mnt/cache/default/curvine-demo/cache-fuse", |
| 285 | +  "master": { |
| 286 | +    "enabled": true, |
| 287 | +    "name": "curvine-demo-master", |
| 288 | +    "options": { |
| 289 | +      "key1": "master-value1" |
| 290 | +    }, |
| 291 | +    "replicas": 1, |
| 292 | +    "service": { |
| 293 | +      "name": "svc-curvine-demo-master" |
| 294 | +    } |
| 295 | +  }, |
| 296 | +  "worker": { |
| 297 | +    "enabled": true, |
| 298 | +    "name": "curvine-demo-worker", |
| 299 | +    "options": { |
| 300 | +      "key1": "worker-value1" |
| 301 | +    }, |
| 302 | +    "replicas": 1, |
| 303 | +    "service": { |
| 304 | +      "name": "svc-curvine-demo-worker" |
| 305 | +    } |
| 306 | +  }, |
| 307 | +  "client": { |
| 308 | +    "enabled": true, |
| 309 | +    "name": "curvine-demo-client", |
| 310 | +    "options": { |
| 311 | +      "key1": "value1" |
| 312 | +    }, |
| 313 | +    "service": { |
| 314 | +      "name": "" |
| 315 | +    } |
| 316 | +  } |
| 317 | +} |
| 318 | +``` |
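The parsing step described in this section can be sketched as follows. This is a minimal illustration, not the logic of any shipped entrypoint: the function name `summarize_runtime_config` and the shape of the returned dictionary are hypothetical, and `SAMPLE` is a trimmed copy of the example Config above. In a real container the JSON text would come from the file Fluid mounts; how `FLUID_RUNTIME_CONFIG_PATH` maps to that file is not covered here.

```python
import json
from typing import Any, Dict

# Trimmed copy of the example Config shown above (worker omitted on purpose).
SAMPLE = """
{
  "mounts": [
    {"mountPoint": "s3://test", "options": {"endpoint_url": "http://minio:9000"},
     "name": "minio", "path": "/minio"}
  ],
  "targetPath": "/runtime-mnt/cache/default/curvine-demo/cache-fuse",
  "master": {"enabled": true, "name": "curvine-demo-master",
             "options": {"key1": "master-value1"}, "replicas": 1,
             "service": {"name": "svc-curvine-demo-master"}},
  "client": {"enabled": true, "name": "curvine-demo-client",
             "options": {"key1": "value1"}, "service": {"name": ""}}
}
"""


def summarize_runtime_config(raw: str, component: str) -> Dict[str, Any]:
    """Extract the fields one component needs from the RuntimeConfig JSON text.

    `component` is one of "master", "worker", or "client".
    """
    config = json.loads(raw)
    section = config.get(component)
    if not section or not section.get("enabled", False):
        raise ValueError(f"component {component!r} is missing or disabled")
    return {
        "name": section["name"],
        "options": section.get("options", {}),
        "replicas": section.get("replicas"),  # None when the field is absent
        "service": section.get("service", {}).get("name", ""),
        "mounts": [
            (mount["name"], mount["mountPoint"], mount.get("path", ""))
            for mount in config.get("mounts", [])
        ],
        "target_path": config.get("targetPath", ""),
    }


if __name__ == "__main__":
    print(summarize_runtime_config(SAMPLE, "master"))
```

An entrypoint script could call such a helper first, write the data engine's own configuration files from the result, and only then start the engine process.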