Skip to content

Commit d59d017

Browse files
author
will.yang
committed
release v1.0.1
1 parent b81deb2 commit d59d017

27 files changed

+1141
-77
lines changed

CHANGELOG.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
# CHANGELOG
2+
## v1.0.1
3+
- Optimize model conversion memory occupation
4+
- Optimize inference memory occupation
5+
- Increase prefill speed
6+
- Reduce initialization time
7+
- Improve quantization accuracy
8+
- Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3
9+
- Add Server invocation
10+
- Add inference interruption interface
11+
- Add logprob and token_id to the return value
12+
13+
## v1.0.0
14+
- Supports the conversion and deployment of LLM models on RK3588/RK3576 platforms
15+
- Compatible with Hugging Face model architectures
16+
- Currently supports the models Llama, Qwen, Qwen2, and Phi-2
17+
- Supports quantization with w8a8 and w4a16 precision

LICENSE

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
Copyright (c) Rockchip Electronics Co., Ltd.
2+
All rights reserved.
3+
4+
// Redistribution and use in source and binary forms, with or without
5+
// modification, are permitted provided that the following conditions are met:
6+
//
7+
// 1. Redistributions of source code must retain the above copyright notice,
8+
// this list of conditions and the following disclaimer.
9+
//
10+
// 2. Redistributions in binary form must reproduce the above copyright notice,
11+
// this list of conditions and the following disclaimer in the documentation
12+
// and/or other materials provided with the distribution.
13+
//
14+
// 3. Neither the name of the copyright holder nor the names of its contributors
15+
// may be used to endorse or promote products derived from this software without
16+
// specific prior written permission.
17+
//
18+
// 4. This Software may contain some Open Source Software. You may not redistribute
19+
// and/or modify such Open Source Software except in compliance with the applicable
20+
// Open Source License.
21+
22+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23+
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24+
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
25+
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
26+
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
27+
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
28+
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
29+
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
30+
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
31+
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
32+
// POSSIBILITY OF SUCH DAMAGE.
33+
34+
The following Open Source Software have been modified by Rockchip Electronics Co., Ltd.
35+
----------------------------------------------------------------------------------------
36+
1. ggml master
37+
Copyright (c) 2023-2024 The ggml authors
38+
All rights reserved.
39+
Licensed under the terms of the MIT License
40+
41+
2. llama.cpp master
42+
Copyright (c) 2023-2024 The ggml authors
43+
All rights reserved.
44+
Licensed under the terms of the MIT License
45+
46+
The terms of the MIT License:
47+
--------------------------------------------------------------------
48+
Permission is hereby granted, free of charge, to any person obtaining a copy
49+
of this software and associated documentation files (the "Software"), to deal
50+
in the Software without restriction, including without limitation the rights
51+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
52+
copies of the Software, and to permit persons to whom the Software is
53+
furnished to do so, subject to the following conditions:
54+
55+
The above copyright notice and this permission notice shall be included in all
56+
copies or substantial portions of the Software.
57+
58+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
59+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
60+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
61+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
62+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
63+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
64+
SOFTWARE.

README.md

Lines changed: 24 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -17,23 +17,35 @@
1717
- RK3588 Series
1818
- RK3576 Series
1919

20+
# Support Models
21+
- [X] [TinyLLAMA 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0/tree/fe8a4ea1ffedaf415f4da2f062534de366a451e6)
22+
- [X] [Qwen 1.8B](https://huggingface.co/Qwen/Qwen-1_8B-Chat/tree/1d0f68de57b88cfde81f3c3e537f24464d889081)
23+
- [X] [Qwen2 0.5B](https://huggingface.co/Qwen/Qwen1.5-0.5B/tree/8f445e3628f3500ee69f24e1303c9f10f5342a39)
24+
- [X] [Phi-2 2.7B](https://hf-mirror.com/microsoft/phi-2/tree/834565c23f9b28b96ccbeabe614dd906b6db551a)
25+
- [X] [Phi-3 3.8B](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/tree/291e9e30e38030c23497afa30f3af1f104837aa6)
26+
- [X] [ChatGLM3 6B](https://huggingface.co/THUDM/chatglm3-6b/tree/103caa40027ebfd8450289ca2f278eac4ff26405)
27+
- [X] [Gemma 2B](https://huggingface.co/google/gemma-2b-it/tree/de144fb2268dee1066f515465df532c05e699d48)
28+
- [X] [InternLM2 1.8B](https://huggingface.co/internlm/internlm2-chat-1_8b/tree/ecccbb5c87079ad84e5788baa55dd6e21a9c614d)
29+
- [X] [MiniCPM 2B](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16/tree/79fbb1db171e6d8bf77cdb0a94076a43003abd9e)
30+
2031
# Download
2132
- You can also download all packages, docker image, examples, docs and platform-tools from [RKLLM_SDK](https://console.zbox.filez.com/l/RJJDmB), fetch code: rkllm
2233

2334
# RKNN Toolkit2
24-
If you want to deploy additional AI model, we have introduced a new SDK called RKNN-Toolkit2. For details, please refer to:
35+
If you want to deploy additional AI model, we have introduced a SDK called RKNN-Toolkit2. For details, please refer to:
2536

2637
https://github.com/airockchip/rknn-toolkit2
2738

28-
# Notes
29-
30-
Due to recent updates to the Phi2 model, the current version of the RKLLM SDK does not yet support these changes.
31-
Please ensure to download a version of the [Phi2](https://hf-mirror.com/microsoft/phi-2/tree/834565c23f9b28b96ccbeabe614dd906b6db551a) model that is supported.
32-
3339
# CHANGELOG
34-
35-
## v1.0.0-beta
36-
- Supports the conversion and deployment of LLM models on RK3588/RK3576 platforms
37-
- Compatible with Hugging Face model architectures
38-
- Currently supports the models LLaMA, Qwen, Qwen2, and Phi-2
39-
- Supports quantization with w8a8 and w4a16 precision
40+
## v1.0.1
41+
- Optimize model conversion memory occupation
42+
- Optimize inference memory occupation
43+
- Increase prefill speed
44+
- Reduce initialization time
45+
- Improve quantization accuracy
46+
- Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3
47+
- Add Server invocation
48+
- Add inference interruption interface
49+
- Add logprob and token_id to the return value
50+
51+
for older version, please refer [CHANGELOG](CHANGELOG.md)

doc/Rockchip_RKLLM_SDK_CN.pdf

457 KB
Binary file not shown.

doc/Rockchip_RKLLM_SDK_EN.pdf

957 KB
Binary file not shown.

rkllm-runtime/example/CMakeLists.txt renamed to rkllm-runtime/examples/rkllm_api_demo/CMakeLists.txt

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,13 +8,14 @@ set(SOURCE_FILES src/main.cpp)
88

99
add_executable(${PROJECT_NAME} ${SOURCE_FILES})
1010

11-
set(RKLLM_API_PATH "${CMAKE_SOURCE_DIR}/../runtime/${CMAKE_SYSTEM_NAME}/librkllm_api")
11+
set(RKLLM_API_PATH "${CMAKE_SOURCE_DIR}/../../runtime/${CMAKE_SYSTEM_NAME}/librkllm_api")
1212
include_directories(${RKLLM_API_PATH}/include)
1313
if(CMAKE_SYSTEM_NAME STREQUAL "Android")
1414
set(RKLLM_RT_LIB ${RKLLM_API_PATH}/${CMAKE_ANDROID_ARCH_ABI}/librkllmrt.so)
15+
target_link_libraries(${PROJECT_NAME} ${RKLLM_RT_LIB} log)
1516
elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux")
1617
set(RKLLM_RT_LIB ${RKLLM_API_PATH}/aarch64/librkllmrt.so)
18+
target_link_libraries(${PROJECT_NAME} ${RKLLM_RT_LIB})
1719
endif()
1820

1921

20-
target_link_libraries(${PROJECT_NAME} ${RKLLM_RT_LIB})

rkllm-runtime/example/Readme.md renamed to rkllm-runtime/examples/rkllm_api_demo/Readme.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ bash build-linux.sh
1313
Push the compiled `llm_demo` file and `librkllmrt.so` file to the device:
1414
```bash
1515
adb push build/build_linux_aarch64_Release/llm_demo /userdata/llm
16-
adb push ../runtime/Linux/librkllm_api/aarch64/librkllmrt.so /userdata/llm/lib
16+
adb push ../../runtime/Linux/librkllm_api/aarch64/librkllmrt.so /userdata/llm/lib
1717
```
1818

1919
## Run
@@ -39,7 +39,7 @@ bash build-android.sh
3939
Push the compiled `llm_demo` file and `librkllmrt.so` file to the device:
4040
```bash
4141
adb push build/build_android_arm64-v8a_Release/llm_demo /userdata/llm
42-
adb push ../runtime/Android/librkllm_api/arm64-v8a/librkllmrt.so /userdata/llm/lib
42+
adb push ../../runtime/Android/librkllm_api/arm64-v8a/librkllmrt.so /userdata/llm/lib
4343
```
4444

4545
## Run

rkllm-runtime/example/build-android.sh renamed to rkllm-runtime/examples/rkllm_api_demo/build-android.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ if [[ -z ${BUILD_TYPE} ]];then
44
BUILD_TYPE=Release
55
fi
66

7-
ANDROID_NDK_PATH=~/android-ndk-r18b
7+
ANDROID_NDK_PATH=~/android-ndk-r21e
88
TARGET_ARCH=arm64-v8a
99

1010
TARGET_PLATFORM=android
File renamed without changes.

rkllm-runtime/example/src/main.cpp renamed to rkllm-runtime/examples/rkllm_api_demo/src/main.cpp

Lines changed: 11 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -41,9 +41,8 @@ void exit_handler(int signal)
4141
}
4242
}
4343

44-
void callback(const char *text, void *userdata, LLMCallState state)
44+
void callback(RKLLMResult *result, void *userdata, LLMCallState state)
4545
{
46-
4746
if (state == LLM_RUN_FINISH)
4847
{
4948
printf("\n");
@@ -52,8 +51,9 @@ void callback(const char *text, void *userdata, LLMCallState state)
5251
{
5352
printf("\\run error\n");
5453
}
55-
else{
56-
printf("%s", text);
54+
else
55+
{
56+
printf("%s", result->text);
5757
}
5858
}
5959

@@ -70,12 +70,14 @@ int main(int argc, char **argv)
7070

7171
//设置参数及初始化
7272
RKLLMParam param = rkllm_createDefaultParam();
73-
param.modelPath = rkllm_model.c_str();
74-
param.target_platform = "rk3588";
73+
param.model_path = rkllm_model.c_str();
7574
param.num_npu_core = 2;
7675
param.top_k = 1;
7776
param.max_new_tokens = 256;
7877
param.max_context_len = 512;
78+
param.logprobs = false;
79+
param.top_logprobs = 5;
80+
param.use_gpu = false;
7981
rkllm_init(&llmHandle, param, callback);
8082
printf("rkllm init success\n");
8183

@@ -113,7 +115,9 @@ int main(int argc, char **argv)
113115
cout << input_str << endl;
114116
}
115117
}
116-
string text = PROMPT_TEXT_PREFIX + input_str + PROMPT_TEXT_POSTFIX;
118+
// string text = PROMPT_TEXT_PREFIX + input_str + PROMPT_TEXT_POSTFIX;
119+
string text = input_str;
120+
117121
printf("robot: ");
118122
rkllm_run(llmHandle, text.c_str(), NULL);
119123
}

0 commit comments

Comments
 (0)