From 9d4fb2c02554cd67f3360df8c77016627710d437 Mon Sep 17 00:00:00 2001
From: Sam Stoelinga
Date: Mon, 28 Oct 2024 22:59:47 -0700
Subject: [PATCH] Update kubernetes api reference (#290)

---
 docs/reference/kubernetes-api.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/docs/reference/kubernetes-api.md b/docs/reference/kubernetes-api.md
index e9d1602a..98d813a3 100644
--- a/docs/reference/kubernetes-api.md
+++ b/docs/reference/kubernetes-api.md
@@ -59,10 +59,11 @@ _Appears in:_
 
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
-| `url` _string_ | URL of the model to be served.<br />Currently only the following formats are supported:<br />For VLLM & FasterWhisper engines: "hf:///"<br />For OLlama engine: "ollama:// | | |
+| `url` _string_ | URL of the model to be served.<br />Currently only the following formats are supported:<br />For VLLM & FasterWhisper engines: "hf:///"<br />For OLlama engine: "ollama:// | | Required: \{\} <br /> |
 | `features` _[ModelFeature](#modelfeature) array_ | Features that the model supports.<br />Dictates the APIs that are available for the model. | | Enum: [TextGeneration TextEmbedding SpeechToText] <br /> |
-| `engine` _string_ | Engine to be used for the server process. | | Enum: [OLlama VLLM FasterWhisper Infinity] <br /> |
+| `engine` _string_ | Engine to be used for the server process. | | Enum: [OLlama VLLM FasterWhisper Infinity] <br />Required: \{\} <br /> |
 | `resourceProfile` _string_ | ResourceProfile required to serve the model.<br />Use the format ":".<br />Example: "nvidia-gpu-l4:2" - 2x NVIDIA L4 GPUs.<br />Must be a valid ResourceProfile defined in the system config. | | |
+| `cacheProfile` _string_ | CacheProfile to be used for caching model artifacts.<br />Must be a valid CacheProfile defined in the system config. | | |
 | `image` _string_ | Image to be used for the server process.<br />Will be set from ResourceProfile + Engine if not specified. | | |
 | `args` _string array_ | Args to be added to the server process. | | |
 | `env` _object (keys:string, values:string)_ | Env variables to be added to the server process. | | |
@@ -89,6 +90,23 @@ _Appears in:_
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
 | `replicas` _[ModelStatusReplicas](#modelstatusreplicas)_ | | | |
+| `cache` _[ModelStatusCache](#modelstatuscache)_ | | | |
+
+
+#### ModelStatusCache
+
+
+
+
+
+
+
+_Appears in:_
+- [ModelStatus](#modelstatus)
+
+| Field | Description | Default | Validation |
+| --- | --- | --- | --- |
+| `loaded` _boolean_ | | | |
 
 
 #### ModelStatusReplicas