update README (#337)
samos123 authored Dec 7, 2024
1 parent f4a7a16 commit 821f9e3
Showing 1 changed file with 23 additions and 23 deletions: docs/README.md
@@ -1,17 +1,17 @@
# KubeAI: AI Inferencing Operator

Get inferencing running on Kubernetes: LLMs, Embeddings, Speech-to-Text.

✅️ Drop-in replacement for OpenAI with API compatibility
⚖️ Scale from zero, autoscale based on load
🧠 Serve text generation models (LLMs, VLMs, etc.)
💬 Speech-to-Text API
🧮 Embedding/Vector API
🚀 Multi-platform: CPU-only, GPU, TPU
The easiest way to serve ML models in production. Supports LLMs, embeddings, and speech-to-text.

✅️ OpenAI API Compatibility: Drop-in replacement for OpenAI
⚖️ Autoscaling: Scale from zero, autoscale based on load
🧠 Serve text generation models with vLLM or Ollama
🔌 LoRA adapter-aware routing
💬 Speech-to-Text API with FasterWhisper
🧮 Embedding/Vector API with Infinity
🚀 Multi-platform: CPU, GPU, TPU
💾 Model caching with shared filesystems (EFS, Filestore, etc.)
🛠️ Zero dependencies (does not depend on Istio, Knative, etc.)
💬 Chat UI included ([OpenWebUI](https://github.com/open-webui/open-webui))
🤖 Operates OSS model servers (vLLM, Ollama, FasterWhisper, Infinity)
✉ Stream/batch inference via messaging integrations (Kafka, PubSub, etc.)

Quotes from the community:
@@ -24,6 +24,20 @@
KubeAI serves an OpenAI compatible HTTP API. Admins can configure ML models via

<img src="./diagrams/arch.excalidraw.png"></img>
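
To make that flow concrete, here is a hedged sketch of installing a model and then listing it through the OpenAI-compatible API. The manifest fields, model name, engine value, service name, and port below are illustrative assumptions rather than details taken from this page; the actual resource schema is documented on kubeai.org.

```bash
# Hypothetical Model manifest -- field names and values are assumptions; check
# the KubeAI docs for the actual schema.
kubectl apply -f - <<'EOF'
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: gemma2-2b-cpu
spec:
  features: [TextGeneration]
  url: ollama://gemma2:2b
  engine: OLlama
  resourceProfile: cpu:2
  minReplicas: 1
EOF

# The kubeai Service is assumed to expose the OpenAI-compatible API on port 80
# under the /openai/v1 prefix.
kubectl port-forward svc/kubeai 8000:80 &
curl http://localhost:8000/openai/v1/models
```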

## Adopters

List of known adopters:

| Name | Description | Link |
| ---- | ----------- | ---- |
| Telescope | Telescope uses KubeAI for multi-region, large-scale batch LLM inference. | [trytelescope.ai](https://trytelescope.ai) |
| Google Cloud Distributed Edge | KubeAI is included as a reference architecture for inferencing at the edge. | [LinkedIn](https://www.linkedin.com/posts/mikeensor_gcp-solutions-public-retail-edge-available-cluster-traits-activity-7237515920259104769-vBs9?utm_source=share&utm_medium=member_desktop), [GitLab](https://gitlab.com/gcp-solutions-public/retail-edge/available-cluster-traits/kubeai-cluster-trait) |
| Lambda | You can try KubeAI on the Lambda AI Developer Cloud. See Lambda's [tutorial](https://docs.lambdalabs.com/education/large-language-models/kubeai-hermes-3/) and [video](https://youtu.be/HEtPO2Wuiac). | [Lambda](https://lambdalabs.com/) |
| Vultr | KubeAI can be deployed on Vultr Managed Kubernetes using the application marketplace. | [Vultr](https://www.vultr.com) |
| Arcee | Arcee uses KubeAI for multi-region, multi-tenant SLM inference. | [Arcee](https://www.arcee.ai/) |

If you are using KubeAI and would like to be listed as an adopter, please make a PR.

## Local Quickstart
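
A rough sketch of what the local install can look like is below; the chart repository URL, release name, and flags are assumptions rather than steps taken from this page, so follow the quickstart on kubeai.org for the authoritative commands.

```bash
# Assumed Helm repo and release names -- verify against the official quickstart.
helm repo add kubeai https://www.kubeai.org
helm repo update
helm install kubeai kubeai/kubeai --wait --timeout 10m

# Confirm the control plane and the OpenAI-compatible Service are running.
kubectl get pods
kubectl get svc kubeai
```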


@@ -113,20 +127,6 @@
Check out our documentation on [kubeai.org](https://www.kubeai.org) to find info
* Concepts (how the components of KubeAI work).
* How to contribute

## OpenAI API Compatibility

```bash
# The full example is collapsed in this view; a minimal sketch of the kind of
# request this section demonstrates (endpoint path and model name are assumptions):
curl http://localhost:8000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma2-2b-cpu", "messages": [{"role": "user", "content": "Hello!"}]}'
```
