1 change: 1 addition & 0 deletions docs/model-serving/storage/overview.md
@@ -28,6 +28,7 @@ KServe can serve models from various storage locations, including:
- **[Git](./providers/git.md)** - Serve models from git repositories.
- **[Persistent Volume Claims (PVC)](./providers/pvc.md)** - Use Kubernetes PVCs to store and serve models.
- **[Hugging Face](./providers/hf.md)** - Directly serve models from the Hugging Face model hub.
- **[ModelScope](./providers/ms.md)** - Directly serve models from the ModelScope model hub.
- **[OCI Images](./providers/oci.md)** - Package and serve models as OCI container images using Modelcars.

## Storage Initializer
122 changes: 122 additions & 0 deletions docs/model-serving/storage/providers/ms.md
@@ -0,0 +1,122 @@
---
title: ModelScope
description: Deploy InferenceService with models from ModelScope Hub in KServe, including support for both public and private models.
---

# Deploy InferenceService with models from ModelScope Hub

To deploy a model from the ModelScope Hub, set the `storageUri` field of the `InferenceService` YAML using the following format, where the `:${REVISION}` suffix is optional:

```
ms://${NAMESPACE}/${MODEL}:${REVISION}
```

For example: `ms://qwen/Qwen2-0.5B-Instruct`
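As a rough illustration of the URI scheme (`parse_ms_uri` is a hypothetical helper, not part of KServe's storage initializer), the components can be split like this:

```python
from urllib.parse import urlparse

def parse_ms_uri(uri: str):
    """Split an ms:// storage URI into (namespace, model, revision).

    Illustrative sketch only; KServe performs its own parsing internally.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "ms":
        raise ValueError(f"not a ModelScope URI: {uri}")
    namespace = parsed.netloc                # e.g. "qwen"
    model_part = parsed.path.lstrip("/")     # e.g. "Qwen2-0.5B-Instruct:v1.0.0"
    model, _, revision = model_part.partition(":")
    return namespace, model, revision or None

print(parse_ms_uri("ms://qwen/Qwen2-0.5B-Instruct"))
# → ('qwen', 'Qwen2-0.5B-Instruct', None)
```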

[ModelScope](https://www.modelscope.cn) is one of the largest model hubs in China, hosting popular models such as Qwen, DeepSeek, and many others.

## Public ModelScope Models

If no credentials are provided, an anonymous client is used to download the model from the ModelScope repository.

## Private ModelScope Models

KServe supports authenticating with an `MS_TOKEN` when downloading models. Create a Kubernetes `Secret` that stores the ModelScope token:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: storage-config
type: Opaque
data:
  MS_TOKEN: bXN0X1ZOVXdSV0FHQmtJeFpmTEx1a3NlR3lvVVZvbnVOaUR1VU0=
```
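Values under `data` must be base64-encoded. A quick way to produce (or verify) the encoded value, using a placeholder token:

```python
import base64

# Placeholder token; substitute your real ModelScope access token.
token = "mst_example_token"

# Kubernetes Secret `data` values must be base64-encoded.
encoded = base64.b64encode(token.encode()).decode()
print(encoded)  # paste this value into the Secret's data.MS_TOKEN field
```

Alternatively, `kubectl create secret generic storage-config --from-literal=MS_TOKEN=<token>` performs the encoding for you.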

## Deploy InferenceService with Models from ModelScope Hub

### Option 1: Use Service Account with Secret Ref
Create a Kubernetes `ServiceAccount` that references the ModelScope token secret, and set `serviceAccountName` in the `InferenceService` spec.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: msserviceacc
secrets:
  - name: storage-config
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: modelscope-qwen
spec:
  predictor:
    serviceAccountName: msserviceacc  # Option 1 for authenticating with MS_TOKEN
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=qwen
        - --model_dir=/mnt/models
      storageUri: ms://qwen/Qwen2-0.5B-Instruct
      resources:
        limits:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
```

### Option 2: Use Environment Variable with Secret Ref
Create a Kubernetes secret containing the ModelScope token and reference it through the `MS_TOKEN` environment variable in the `InferenceService` spec.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: modelscope-qwen
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=qwen
        - --model_dir=/mnt/models
      storageUri: ms://qwen/Qwen2-0.5B-Instruct
      resources:
        limits:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
      env:
        - name: MS_TOKEN  # Option 2 for authenticating with MS_TOKEN
          valueFrom:
            secretKeyRef:
              name: storage-config
              key: MS_TOKEN
              optional: false
```

## Check the InferenceService Status

```bash
kubectl get inferenceservices modelscope-qwen
```

:::tip[Expected Output]

```bash
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
modelscope-qwen http://modelscope-qwen.default.example.com True 100 modelscope-qwen-predictor-default-47q2g 7d23h
```

:::
1 change: 1 addition & 0 deletions sidebars.ts
@@ -298,6 +298,7 @@ const sidebars: SidebarsConfig = {
label: 'Supported Providers',
items: [
'model-serving/storage/providers/hf',
'model-serving/storage/providers/ms',
'model-serving/storage/providers/azure',
'model-serving/storage/providers/s3/s3',
'model-serving/storage/providers/gcs',