docs: add ModelScope storage provider documentation #640
Open: xrwang8 wants to merge 4 commits into kserve:main from xrwang8:add-modelscope-docs
+124
−0
Commits
45c5e21 docs: add ModelScope storage provider documentation (xrwang8)
1f24ace Update docs/model-serving/storage/providers/ms.md (xrwang8)
828713a Update docs/model-serving/storage/providers/ms.md (xrwang8)
4e1d2dd Update docs/model-serving/storage/providers/ms.md (xrwang8)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| --- | ||
| title: ModelScope | ||
| description: Deploy InferenceService with models from ModelScope Hub in KServe, including support for both public and private models. | ||
| --- | ||
|
|
||
| # Deploy InferenceService with models from ModelScope Hub | ||
|
|
||
| To deploy a model from ModelScope Hub, set the `storageUri` field in the `InferenceService` YAML using the following format. | ||
|
|
||
| ``` | ||
| ms://${NAMESPACE}/${MODEL}:${REVISION}(optional) | ||
| ``` | ||
|
|
||
| For example: `ms://qwen/Qwen2-0.5B-Instruct` | ||
|
|
||
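As an illustration of the URI format above, the following shell sketch splits an `ms://` URI into its namespace, model, and optional revision. The function name `parse_ms_uri` is hypothetical and is not part of KServe; it only mirrors the documented format and makes no claim about how KServe's storage initializer parses the URI.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a KServe API): split ms://NAMESPACE/MODEL[:REVISION].
parse_ms_uri() {
  local uri="$1"
  local rest="${uri#ms://}"            # strip the scheme prefix
  if [ "$rest" = "$uri" ]; then        # prefix was absent, so nothing was stripped
    echo "not an ms:// URI: $uri" >&2
    return 1
  fi
  local repo="${rest%%:*}"             # everything before the first ':'
  local revision=""
  if [ "$repo" != "$rest" ]; then      # a ':' was present, so a revision follows
    revision="${rest#*:}"
  fi
  local namespace="${repo%%/*}"        # text before the first '/'
  local model="${repo#*/}"             # text after the first '/'
  echo "namespace=$namespace model=$model revision=${revision:-(default)}"
}

parse_ms_uri "ms://qwen/Qwen2-0.5B-Instruct"
# prints: namespace=qwen model=Qwen2-0.5B-Instruct revision=(default)
```

When the revision is omitted, the sketch reports `(default)` as a placeholder; which revision is actually downloaded in that case is not stated in this document.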
| [ModelScope](https://www.modelscope.cn) is one of the largest model hubs in China, hosting popular models such as Qwen, DeepSeek, and many others. | ||
|
|
||
| ## Public ModelScope Models | ||
|
|
||
| If no credentials are provided, KServe uses an anonymous client to download the model from the ModelScope repository. | ||
|
|
||
| ## Private ModelScope Models | ||
|
|
||
| To download a private model, KServe authenticates with the token supplied in `MS_TOKEN`. Create a Kubernetes Secret to store the ModelScope token; values under `data` must be base64-encoded. | ||
|
|
||
| ```yaml title="yaml" | ||
| apiVersion: v1 | ||
| kind: Secret | ||
| metadata: | ||
|   name: storage-config | ||
| type: Opaque | ||
| data: | ||
|   MS_TOKEN: bXN0X1ZOVXdSV0FHQmtJeFpmTEx1a3NlR3lvVVZvbnVOaUR1VU0= | ||
| ``` | ||
|
|
||
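Values under `data` in a Kubernetes Secret are base64-encoded. A minimal sketch of producing such a value follows, assuming a placeholder token `mst_example_token` (not a real credential) and a `base64` utility that accepts `--decode`; the variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Placeholder token for illustration only; substitute your own ModelScope token.
MS_TOKEN_PLAIN="mst_example_token"

# printf avoids the trailing newline that echo would add to the encoded value;
# tr removes any line wrapping that base64 might insert.
MS_TOKEN_B64="$(printf '%s' "$MS_TOKEN_PLAIN" | base64 | tr -d '\n')"
echo "$MS_TOKEN_B64"

# Round-trip check: decoding should give back the original token.
MS_TOKEN_DECODED="$(printf '%s' "$MS_TOKEN_B64" | base64 --decode)"
if [ "$MS_TOKEN_DECODED" = "$MS_TOKEN_PLAIN" ]; then
  echo "round-trip ok"
fi
```

Alternatively, `kubectl create secret generic storage-config --from-literal=MS_TOKEN=<token>` creates the Secret and performs the encoding for you.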
| ## Deploy InferenceService with Models from ModelScope Hub | ||
|
|
||
| ### Option 1: Use Service Account with Secret Ref | ||
| Create a Kubernetes `ServiceAccount` that references the ModelScope token secret by name, then set `serviceAccountName` in the `InferenceService` spec. | ||
|
|
||
| ```yaml | ||
| apiVersion: v1 | ||
| kind: ServiceAccount | ||
| metadata: | ||
|   name: msserviceacc | ||
| secrets: | ||
|   - name: storage-config | ||
| --- | ||
| apiVersion: serving.kserve.io/v1beta1 | ||
| kind: InferenceService | ||
| metadata: | ||
|   name: modelscope-qwen | ||
| spec: | ||
|   predictor: | ||
|     serviceAccountName: msserviceacc # Option 1 for authenticating with MS_TOKEN | ||
|     model: | ||
|       modelFormat: | ||
|         name: huggingface | ||
|       args: | ||
|         - --model_name=qwen | ||
|         - --model_dir=/mnt/models | ||
|       storageUri: ms://qwen/Qwen2-0.5B-Instruct | ||
|       resources: | ||
|         limits: | ||
|           cpu: "6" | ||
|           memory: 24Gi | ||
|           nvidia.com/gpu: "1" | ||
|         requests: | ||
|           cpu: "6" | ||
|           memory: 24Gi | ||
|           nvidia.com/gpu: "1" | ||
| ``` | ||
|
|
||
| ### Option 2: Use Environment Variable with Secret Ref | ||
| Create a Kubernetes secret holding the ModelScope token, then reference it from an `MS_TOKEN` environment variable in the `InferenceService` spec. | ||
|
|
||
| ```yaml | ||
| apiVersion: serving.kserve.io/v1beta1 | ||
| kind: InferenceService | ||
| metadata: | ||
|   name: modelscope-qwen | ||
| spec: | ||
|   predictor: | ||
|     model: | ||
|       modelFormat: | ||
|         name: huggingface | ||
|       args: | ||
|         - --model_name=qwen | ||
|         - --model_dir=/mnt/models | ||
|       storageUri: ms://qwen/Qwen2-0.5B-Instruct | ||
|       resources: | ||
|         limits: | ||
|           cpu: "6" | ||
|           memory: 24Gi | ||
|           nvidia.com/gpu: "1" | ||
|         requests: | ||
|           cpu: "6" | ||
|           memory: 24Gi | ||
|           nvidia.com/gpu: "1" | ||
|       env: | ||
|         - name: MS_TOKEN # Option 2 for authenticating with MS_TOKEN | ||
|           valueFrom: | ||
|             secretKeyRef: | ||
|               name: storage-config | ||
|               key: MS_TOKEN | ||
|               optional: false | ||
| ``` | ||
|
|
||
| ## Check the InferenceService Status | ||
|
|
||
| ```bash | ||
| kubectl get inferenceservices modelscope-qwen | ||
| ``` | ||
|
|
||
| :::tip[Expected Output] | ||
|
|
||
| ```bash | ||
| NAME              URL                                          READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                       AGE | ||
| modelscope-qwen   http://modelscope-qwen.default.example.com   True           100                              modelscope-qwen-predictor-default-47q2g   7d23h | ||
| ``` | ||
|
|
||
| ::: | ||