diff --git a/docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx b/docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx
index ee9a99e..3451a9c 100644
--- a/docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx
+++ b/docs/en/model_inference/inference_service/how_to/custom_inference_runtime.mdx
@@ -47,13 +47,15 @@ Before you start, please ensure you meet these conditions:
4. You have **cluster administrator privileges** (needed to create CRD instances).
-## Steps
+## Standard Workflow (Example: Xinference)
+
+Follow these steps to extend the platform. We use **Xinference** as a baseline example to demonstrate the standard process.
### Create Inference Runtime Resources
-You'll need to create the corresponding inference runtime resources based on your target hardware environment (GPU/CPU/NPU).
+You'll need to create the corresponding `ClusterServingRuntime` resources based on your target hardware environment (GPU/CPU/NPU).
1. **Prepare the Runtime YAML Configuration**:
@@ -190,4 +192,432 @@ Once the Xinference inference runtime resource is successfully created, you can
* **Variable Name**: `MODEL_FAMILY`
-* **Variable Value**: `llama` (if you are using a Llama series model, checkout the [docs](https://inference.readthedocs.io/en/v1.2.2/getting_started/using_xinference.html#manage-models) for more detail. Or you can run `xinference registrations -t LLM` to list all supported model families.)
+* **Variable Value**: `llama` (if you are using a Llama series model, check out the [docs](https://inference.readthedocs.io/en/v1.2.2/getting_started/using_xinference.html#manage-models) for more details, or run `xinference registrations -t LLM` to list all supported model families.)
-
\ No newline at end of file
+
+## Specific Runtime Examples
+
+Once you understand the standard workflow, refer to these examples for specific configurations related to other runtimes.
+
+### MLServer
+
+The MLServer runtime is versatile and can be used on both NVIDIA GPUs and CPUs.
+
+```yaml
+kind: ClusterServingRuntime
+apiVersion: serving.kserve.io/v1alpha1
+metadata:
+ annotations:
+ cpaas.io/display-name: mlserver-cuda11.6-x86-arm
+ labels:
+ cpaas.io/accelerator-type: nvidia
+ cpaas.io/cuda-version: "11.6"
+ cpaas.io/runtime-class: mlserver
+ name: aml-mlserver-cuda-11.6
+spec:
+ containers:
+ - command:
+ - /bin/bash
+ - -lc
+ - |
+ if [ "$MODEL_TYPE" = "text-to-image" ]; then
+ MODEL_IMPL="mlserver_diffusers.StableDiffusionRuntime"
+ else
+ MODEL_IMPL="mlserver_huggingface.HuggingFaceRuntime"
+ fi
+
+ MODEL_DIR="${MLSERVER_MODEL_URI}/${MLSERVER_MODEL_NAME}"
+ # a. using git lfs storage initializer, model will be in /mnt/models/
+ # b. using hf storage initializer, model will be in /mnt/models
+ if [ ! -d "${MODEL_DIR}" ]; then
+ MODEL_DIR="${MLSERVER_MODEL_URI}"
+ echo "[WARNING] Model directory ${MODEL_DIR}/${MLSERVER_MODEL_NAME} not found, using ${MODEL_DIR} instead"
+ fi
+
+ export MLSERVER_MODEL_IMPLEMENTATION=${MODEL_IMPL}
+ export MLSERVER_MODEL_EXTRA="{\"task\":\"${MODEL_TYPE}\",\"pretrained_model\":\"${MODEL_DIR}\"}"
+
+      mlserver start "$MLSERVER_MODEL_URI" "$@"
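+    # the literal "bash" below becomes $0 for `bash -lc`, so any extra
+    # arguments passed by the platform reach the script as "$@"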
+ - bash
+ env:
+ - name: MLSERVER_MODEL_URI
+ value: /mnt/models
+ - name: MLSERVER_MODEL_NAME
+ value: '{{ index .Annotations "aml-model-repo" }}'
+ - name: MODEL_TYPE
+ value: '{{ index .Annotations "aml-pipeline-tag" }}'
+ image: alaudadockerhub/seldon-mlserver:1.6.0-cu116-v1.3.1
+ name: kserve-container
+ resources:
+ limits:
+ cpu: 2
+ memory: 6Gi
+ requests:
+ cpu: 2
+ memory: 6Gi
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ privileged: false
+ runAsNonRoot: true
+ runAsUser: 1000
+ startupProbe:
+ failureThreshold: 60
+ httpGet:
+ path: /v2/models/{{ index .Annotations "aml-model-repo" }}/ready
+ port: 8080
+ scheme: HTTP
+ periodSeconds: 10
+ timeoutSeconds: 10
+ labels:
+ modelClass: mlserver_sklearn.SKLearnModel
+ supportedModelFormats:
+ - name: mlflow
+ version: "1"
+ - name: transformers
+ version: "1"
+```
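+
+After applying the manifest, you can confirm the runtime is registered; once an `InferenceService` is deployed on it, the readiness endpoint used by the `startupProbe` above can be probed directly. A minimal sketch, where `mlserver-runtime.yaml`, `<predictor-pod>`, and `my-model` are placeholders:
+
+```bash
+# Register the runtime (requires cluster administrator privileges)
+kubectl apply -f mlserver-runtime.yaml
+kubectl get clusterservingruntime aml-mlserver-cuda-11.6
+
+# Check model readiness over the V2 inference protocol
+# (my-model stands for the value of the aml-model-repo annotation)
+kubectl port-forward pod/<predictor-pod> 8080:8080 &
+curl -s http://localhost:8080/v2/models/my-model/ready
+```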
+
+### MindIE (Ascend NPU 310P)
+
+MindIE is specifically designed for Huawei Ascend hardware. Its configuration differs significantly from the GPU/CPU runtimes in resource management and metadata.
+
+**1. ClusterServingRuntime**
+
+```yaml
+# This is a sample YAML for Ascend NPU runtime
+kind: ClusterServingRuntime
+apiVersion: serving.kserve.io/v1alpha1
+metadata:
+ annotations:
+ cpaas.io/display-name: mindie-2.2RC1
+ labels:
+ cpaas.io/accelerator-type: npu
+    cpaas.io/cann-version: "8.3.0"
+ cpaas.io/runtime-class: mindie
+ name: mindie-2.2rc1-310p
+spec:
+ containers:
+ - command:
+ - bash
+ - -c
+ - |
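+      # RAW_SCRIPT stores every "<" as the placeholder __LT__; restore the real
+      # character before materializing and executing the startup script.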
+ REAL_SCRIPT=$(echo "$RAW_SCRIPT" | sed 's/__LT__/\x3c/g')
+ echo "$REAL_SCRIPT" > /tmp/startup.sh
+ chmod +x /tmp/startup.sh
+
+ CONFIG_FILE="${MODEL_PATH}/config.json"
+ echo "Checking for file: ${CONFIG_FILE}"
+
+      echo "Fixing MODEL_PATH permissions..."
+      ls -ld "${MODEL_PATH}"
+      chmod -R 755 "${MODEL_PATH}"
+      ls -ld "${MODEL_PATH}"
+
+ /tmp/startup.sh --model-name "${MODEL_NAME}" --model-path "${MODEL_PATH}" --ip "${MY_POD_IP}"
+ env:
+ - name: RAW_SCRIPT
+ value: |
+ #!/bin/bash
+ #
+ # Copyright 2024 Huawei Technologies Co., Ltd
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ # http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ # ============================================================================
+ #
+
+ ##
+ # Script Instruction
+ ##
+ ### Name:
+ ### run_mindie.sh - Use to Start MindIE Service given a specific model
+ ###
+ ### Usage:
+ ### bash run_mindie.sh --model-name xxx --model-path /path/to/model
+ ###
+ ### Required:
+        ###     --model-name : A model name used to identify the MindIE Service.
+        ###     --model-path : A model path containing the necessary files (yaml/config.json/tokenizer/vocab, etc.).
+        ### Options:
+        ###     --help : Show this message.
+        ###     --ip : The IP address bound to the MindIE Server business-plane RESTful interface, default value: 127.0.0.1.
+        ###     --port : The port bound to the MindIE Server business-plane RESTful interface, default value: 8080.
+        ###     --management-ip : The IP address bound to the MindIE Server management-plane RESTful interface, default value: 127.0.0.2.
+        ###     --management-port : The port bound to the MindIE Server management-plane RESTful interface, default value: 1026.
+        ###     --metrics-port : The port bound to the performance-metrics monitoring interface, default value: 1027.
+        ###     --max-seq-len : Maximum sequence length, default value: 2560.
+        ###     --max-iter-times : The global maximum output length of the model, default value: 512.
+        ###     --max-input-token-len : The maximum length of the input token ids, default value: 2048.
+        ###     --max-prefill-tokens : The total number of input tokens in the current batch on each prefill, default value: 8192.
+        ###     --truncation : Whether to truncate inputs that exceed the configured limits, default value: false.
+        ###     --template-type : Inference template type, default value: "Standard".
+        ###     --max-preempt-count : The upper limit of preemptible requests in each batch, default value: 0.
+        ###     --support-select-batch : Whether to enable the batch selection strategy, default value: false.
+        ###     --npu-mem-size : Upper limit of the KV cache size allocated in NPU memory, default value: 8.
+        ###     --max-prefill-batch-size : The maximum prefill batch size, default value: 50.
+        ###     --world-size : Number of cards to use for inference.
+        ###         1. If it is not set, the parallel config in the YAML file is used. worldSize = dp*mp*pp.
+        ###         2. If set, the parallel config in the YAML file is overridden with dp:1 mp:worldSize pp:1.
+        ###     --ms-sched-host : MS Scheduler IP address, default value: 127.0.0.1.
+        ###     --ms-sched-port : MS Scheduler port, default value: 8119.
+        ###     For more details about the config description, please check the MindIE homepage: https://www.hiascend.com/document/detail/zh/mindie/10RC3/mindiellm/llmdev/mindie_llm0004.html
+ help() {
+ awk -F'### ' '/^###/ { print $2 }' "$0"
+ }
+
+ if [[ $# == 0 ]] || [[ "$1" == "--help" ]]; then
+ help
+ exit 1
+ fi
+
+ ##
+ # Get device info
+ ##
+ total_count=$(npu-smi info -l | grep "Total Count" | awk -F ':' '{print $2}' | xargs)
+
+ if [[ -z "$total_count" ]]; then
+ echo "Error: Unable to retrieve device info. Please check if npu-smi is available for current user (id 1001), or if you are specifying an occupied device."
+ exit 1
+ fi
+
+ echo "$total_count device(s) detected!"
+
+ ##
+ # Set toolkit envs
+ ##
+ echo "Setting toolkit envs..."
+ if [[ -f "/usr/local/Ascend/ascend-toolkit/set_env.sh" ]];then
+ source /usr/local/Ascend/ascend-toolkit/set_env.sh
+ else
+          echo "The ascend-toolkit package is incomplete, please check it."
+ exit 1
+ fi
+ echo "Toolkit envs set succeeded!"
+
+ ##
+ # Set MindIE envs
+ ##
+ echo "Setting MindIE envs..."
+ if [[ -f "/usr/local/Ascend/mindie/set_env.sh" ]];then
+ source /usr/local/Ascend/mindie/set_env.sh
+ else
+          echo "The mindie package is incomplete, please check it."
+ exit 1
+ fi
+ echo "MindIE envs set succeeded!"
+
+ ##
+ # Default MS envs
+ ##
+
+ # Set PYTHONPATH
+ MF_SCRIPTS_ROOT=$(realpath "$(dirname "$0")")
+ export PYTHONPATH=$MF_SCRIPTS_ROOT/../:$PYTHONPATH
+
+ ##
+ # Receive args and modify config.json
+ ##
+ export MIES_INSTALL_PATH=/usr/local/Ascend/mindie/latest/mindie-service
+ CONFIG_FILE=${MIES_INSTALL_PATH}/conf/config.json
+ echo "MindIE Service config path:$CONFIG_FILE"
+ #default config
+ BACKEND_TYPE="atb"
+ MAX_SEQ_LEN=2560
+ MAX_PREFILL_TOKENS=8192
+ MAX_ITER_TIMES=512
+ MAX_INPUT_TOKEN_LEN=2048
+ TRUNCATION=false
+ HTTPS_ENABLED=false
+ MULTI_NODES_INFER_ENABLED=false
+ NPU_MEM_SIZE=8
+ MAX_PREFILL_BATCH_SIZE=50
+ TEMPLATE_TYPE="Standard"
+ MAX_PREEMPT_COUNT=0
+ SUPPORT_SELECT_BATCH=false
+ IP_ADDRESS="127.0.0.1"
+ PORT=8080
+ MANAGEMENT_IP_ADDRESS="127.0.0.2"
+ MANAGEMENT_PORT=1026
+ METRICS_PORT=1027
+
+ #modify config
+ while [[ "$#" -gt 0 ]]; do
+ case $1 in
+ --model-path) MODEL_WEIGHT_PATH="$2"; shift ;;
+ --model-name) MODEL_NAME="$2"; shift ;;
+ --max-seq-len) MAX_SEQ_LEN="$2"; shift ;;
+ --max-iter-times) MAX_ITER_TIMES="$2"; shift ;;
+ --max-input-token-len) MAX_INPUT_TOKEN_LEN="$2"; shift ;;
+ --max-prefill-tokens) MAX_PREFILL_TOKENS="$2"; shift ;;
+ --truncation) TRUNCATION="$2"; shift ;;
+ --world-size) WORLD_SIZE="$2"; shift ;;
+ --template-type) TEMPLATE_TYPE="$2"; shift ;;
+ --max-preempt-count) MAX_PREEMPT_COUNT="$2"; shift ;;
+ --support-select-batch) SUPPORT_SELECT_BATCH="$2"; shift ;;
+ --npu-mem-size) NPU_MEM_SIZE="$2"; shift ;;
+ --max-prefill-batch-size) MAX_PREFILL_BATCH_SIZE="$2"; shift ;;
+ --ip) IP_ADDRESS="$2"; shift ;;
+ --port) PORT="$2"; shift ;;
+ --management-ip) MANAGEMENT_IP_ADDRESS="$2"; shift ;;
+ --management-port) MANAGEMENT_PORT="$2"; shift ;;
+ --metrics-port) METRICS_PORT="$2"; shift ;;
+ --ms-sched-host) ENV_MS_SCHED_HOST="$2"; shift ;;
+ --ms-sched-port) ENV_MS_SCHED_PORT="$2"; shift ;;
+ *)
+ echo "Unknown parameter: $1"
+ echo "Please check your inputs."
+ exit 1
+ ;;
+ esac
+ shift
+ done
+
+ if [ -z "$MODEL_WEIGHT_PATH" ] || [ -z "$MODEL_NAME" ]; then
+ echo "Error: Both --model-path and --model-name are required."
+ exit 1
+ fi
+ MODEL_NAME=${MODEL_NAME:-$(basename "$MODEL_WEIGHT_PATH")}
+ echo "MODEL_NAME is set to: $MODEL_NAME"
+
+        # Respect --world-size if provided; otherwise use all detected devices
+        WORLD_SIZE=${WORLD_SIZE:-$total_count}
+        NPU_DEVICE_IDS=$(seq -s, 0 $(($WORLD_SIZE - 1)))
+
+ #validate config
+ if [[ "$BACKEND_TYPE" != "atb" ]]; then
+ echo "Error: BACKEND must be 'atb'. Current value: $BACKEND_TYPE"
+ exit 1
+ fi
+
+ if [[ ! "$IP_ADDRESS" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] ||
+ [[ ! "$MANAGEMENT_IP_ADDRESS" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]; then
+ echo "Error: IP_ADDRESS and MANAGEMENT_IP_ADDRESS must be valid IP addresses. Current values: IP_ADDRESS=$IP_ADDRESS, MANAGEMENT_IP_ADDRESS=$MANAGEMENT_IP_ADDRESS"
+ exit 1
+ fi
+
+ if [[ ! "$PORT" =~ ^[0-9]+$ ]] || (( PORT __LT__ 1025 || PORT > 65535 )) ||
+ [[ ! "$MANAGEMENT_PORT" =~ ^[0-9]+$ ]] || (( MANAGEMENT_PORT __LT__ 1025 || MANAGEMENT_PORT > 65535 )); then
+ echo "Error: PORT and MANAGEMENT_PORT must be integers between 1025 and 65535. Current values: PORT=$PORT, MANAGEMENT_PORT=$MANAGEMENT_PORT"
+ exit 1
+ fi
+
+ if [ "$MAX_PREFILL_TOKENS" -lt "$MAX_SEQ_LEN" ]; then
+ MAX_PREFILL_TOKENS=$MAX_SEQ_LEN
+ echo "MAX_PREFILL_TOKENS was less than MAX_SEQ_LEN. Setting MAX_PREFILL_TOKENS to $MAX_SEQ_LEN"
+ fi
+
+ MODEL_CONFIG_FILE="${MODEL_WEIGHT_PATH}/config.json"
+ if [ ! -f "$MODEL_CONFIG_FILE" ]; then
+ echo "Error: config.json file not found in $MODEL_WEIGHT_PATH."
+ exit 1
+ fi
+ chmod 600 "$MODEL_CONFIG_FILE"
+ #update config file
+ chmod u+w ${MIES_INSTALL_PATH}/conf/
+ sed -i "s/\"backendType\"\s*:\s*\"[^\"]*\"/\"backendType\": \"$BACKEND_TYPE\"/" $CONFIG_FILE
+ sed -i "s/\"modelName\"\s*:\s*\"[^\"]*\"/\"modelName\": \"$MODEL_NAME\"/" $CONFIG_FILE
+ sed -i "s|\"modelWeightPath\"\s*:\s*\"[^\"]*\"|\"modelWeightPath\": \"$MODEL_WEIGHT_PATH\"|" $CONFIG_FILE
+ sed -i "s/\"maxSeqLen\"\s*:\s*[0-9]*/\"maxSeqLen\": $MAX_SEQ_LEN/" "$CONFIG_FILE"
+ sed -i "s/\"maxPrefillTokens\"\s*:\s*[0-9]*/\"maxPrefillTokens\": $MAX_PREFILL_TOKENS/" "$CONFIG_FILE"
+ sed -i "s/\"maxIterTimes\"\s*:\s*[0-9]*/\"maxIterTimes\": $MAX_ITER_TIMES/" "$CONFIG_FILE"
+ sed -i "s/\"maxInputTokenLen\"\s*:\s*[0-9]*/\"maxInputTokenLen\": $MAX_INPUT_TOKEN_LEN/" "$CONFIG_FILE"
+ sed -i "s/\"truncation\"\s*:\s*[a-z]*/\"truncation\": $TRUNCATION/" "$CONFIG_FILE"
+ sed -i "s|\(\"npuDeviceIds\"\s*:\s*\[\[\)[^]]*\(]]\)|\1$NPU_DEVICE_IDS\2|" "$CONFIG_FILE"
+ sed -i "s/\"worldSize\"\s*:\s*[0-9]*/\"worldSize\": $WORLD_SIZE/" "$CONFIG_FILE"
+ sed -i "s/\"httpsEnabled\"\s*:\s*[a-z]*/\"httpsEnabled\": $HTTPS_ENABLED/" "$CONFIG_FILE"
+ sed -i "s/\"templateType\"\s*:\s*\"[^\"]*\"/\"templateType\": \"$TEMPLATE_TYPE\"/" $CONFIG_FILE
+ sed -i "s/\"maxPreemptCount\"\s*:\s*[0-9]*/\"maxPreemptCount\": $MAX_PREEMPT_COUNT/" $CONFIG_FILE
+ sed -i "s/\"supportSelectBatch\"\s*:\s*[a-z]*/\"supportSelectBatch\": $SUPPORT_SELECT_BATCH/" $CONFIG_FILE
+ sed -i "s/\"multiNodesInferEnabled\"\s*:\s*[a-z]*/\"multiNodesInferEnabled\": $MULTI_NODES_INFER_ENABLED/" "$CONFIG_FILE"
+ sed -i "s/\"maxPrefillBatchSize\"\s*:\s*[0-9]*/\"maxPrefillBatchSize\": $MAX_PREFILL_BATCH_SIZE/" "$CONFIG_FILE"
+ sed -i "s/\"ipAddress\"\s*:\s*\"[^\"]*\"/\"ipAddress\": \"$IP_ADDRESS\"/" "$CONFIG_FILE"
+ sed -i "s/\"port\"\s*:\s*[0-9]*/\"port\": $PORT/" "$CONFIG_FILE"
+ sed -i "s/\"managementIpAddress\"\s*:\s*\"[^\"]*\"/\"managementIpAddress\": \"$MANAGEMENT_IP_ADDRESS\"/" "$CONFIG_FILE"
+ sed -i "s/\"managementPort\"\s*:\s*[0-9]*/\"managementPort\": $MANAGEMENT_PORT/" "$CONFIG_FILE"
+ sed -i "s/\"metricsPort\"\s*:\s*[0-9]*/\"metricsPort\": $METRICS_PORT/" $CONFIG_FILE
+ sed -i "s/\"npuMemSize\"\s*:\s*-*[0-9]*/\"npuMemSize\": $NPU_MEM_SIZE/" "$CONFIG_FILE"
+
+ ##
+ # Start service
+ ##
+ echo "Current configurations are displayed as follows:"
+ cat $CONFIG_FILE
+ npu-smi info -m > ~/device_info
+
+ ${MIES_INSTALL_PATH}/bin/mindieservice_daemon
+ - name: MODEL_NAME
+ value: '{{ index .Annotations "aml-model-repo" }}'
+ - name: MODEL_PATH
+ value: /mnt/models/{{ index .Annotations "aml-model-repo" }}
+ - name: MY_POD_IP
+ valueFrom:
+ fieldRef:
+ fieldPath: status.podIP
+ image: swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-300I-Duo-py311-openeuler24.03-lts
+ name: kserve-container
+ resources:
+ limits:
+ cpu: 2
+ memory: 6Gi
+ requests:
+ cpu: 2
+ memory: 6Gi
+ volumeMounts:
+ - mountPath: /dev/shm
+ name: dshm
+ startupProbe:
+ failureThreshold: 60
+ httpGet:
+ path: /v1/models
+ port: 8080
+ scheme: HTTP
+ periodSeconds: 10
+ timeoutSeconds: 180
+ supportedModelFormats:
+ - name: transformers
+ version: "1"
+ volumes:
+ - emptyDir:
+ medium: Memory
+ sizeLimit: 8Gi
+ name: dshm
+
+```
+
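+Once the runtime resource is created, a quick sanity check is to confirm it is registered for the NPU runtime class; after an `InferenceService` is running on it, the `startupProbe` path can be queried directly. A minimal sketch, where `mindie-runtime.yaml` and `<pod-ip>` are placeholders:
+
+```bash
+kubectl apply -f mindie-runtime.yaml
+
+# List runtimes registered for the mindie runtime class
+kubectl get clusterservingruntime -l cpaas.io/runtime-class=mindie
+
+# Query the model list endpoint used by the startupProbe
+curl -s http://<pod-ip>:8080/v1/models
+```
+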
+**2. Mandatory Annotations for InferenceService**
+
+Unlike other runtimes, MindIE **must** have the following annotation added to the `InferenceService` metadata during the final publishing step. The startup script above adjusts file permissions under the model path (`chmod`), so the model storage volume must be mounted writable:
+
+| Configuration Key | Value | Purpose |
+| :--- | :--- | :--- |
+| `storage.kserve.io/readonly` | `"false"` | **Enables write access to the model storage volume.** |
+
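+The snippet below shows where this annotation lands on the `InferenceService`. It is a minimal sketch; the service name, storage URI, and model format are placeholders for whatever your publishing step produces:
+
+```yaml
+apiVersion: serving.kserve.io/v1beta1
+kind: InferenceService
+metadata:
+  name: my-mindie-service            # placeholder
+  annotations:
+    storage.kserve.io/readonly: "false"   # the startup script must chmod the model files
+spec:
+  predictor:
+    model:
+      modelFormat:
+        name: transformers
+      runtime: mindie-2.2rc1-310p    # the ClusterServingRuntime defined above
+      storageUri: <your-model-uri>   # placeholder
+```
+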
+**3. User Privileges (Root Access)**
+
+Due to the requirements of the Ascend driver and hardware abstraction layer, the MindIE image **must run as the root user**. Ensure the `securityContext` in your `ClusterServingRuntime` or `InferenceService` does not force a non-root user:
+
+**Note**: The MindIE `ClusterServingRuntime` YAML example above does not specify a `securityContext`, so the container runs with the image's default user (typically root). Unlike MLServer, which explicitly sets `runAsNonRoot: true` and `runAsUser: 1000`, MindIE requires root privileges to access the NPU hardware.
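+
+If your cluster applies a restrictive default (for example via a pod security policy or an admission webhook), you may need to state this explicitly. A minimal sketch of the relevant container-level fields:
+
+```yaml
+securityContext:
+  runAsUser: 0          # root, required by the Ascend driver stack
+  runAsNonRoot: false
+```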
+
+## Comparison of Runtime Configurations
+
+Use the table below as a quick reference for the specific requirements of each runtime covered in this document:
+
+| Runtime | Target Hardware | Supported Frameworks | Special Requirements |
+| :--- | :--- | :--- | :--- |
+| **Xinference** | CPU / NVIDIA GPU | transformers, pytorch | **Must** set the `MODEL_FAMILY` environment variable |
+| **MLServer** | CPU / NVIDIA GPU | mlflow, transformers | Standard configuration |
+| **MindIE** | Huawei Ascend NPU | mindspore, transformers | **Must** set `storage.kserve.io/readonly: "false"` on the `InferenceService`; container runs as root |
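+
+To audit what is already registered on a cluster, the labels used throughout these examples make a convenient filter. A short sketch using the label keys defined above:
+
+```bash
+# Show all runtimes with their class and accelerator type as extra columns
+kubectl get clusterservingruntime -L cpaas.io/runtime-class,cpaas.io/accelerator-type
+
+# Filter to NPU runtimes only
+kubectl get clusterservingruntime -l cpaas.io/accelerator-type=npu
+```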