Commit 555c5ad
sphinx + .nojekyll in docs(<-docs/build/html)
1 parent: 5f9a2f0
Showing 45 changed files with 5,756 additions and 0 deletions.

@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: cb535833a47a005a6969e1994cc1d26c
tags: 645f666f9bcd5a90fca523b33c5a78b7

Empty file.
@@ -0,0 +1,46 @@
Agent Interactions
==================

Multi-turn Conversation
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent

   # Create an instance of the Agent
   agent = Agent()

   # First interaction: Analyze an image
   agent('Analyze this image and describe the architectural style:', 'https://images.metmuseum.org/CRDImages/rl/original/DP-19531-075.jpg')

   # Second interaction: Follow-up question
   agent('What historical period does this architecture likely belong to?')

   # End the conversation
   agent.end()

Generative Feedback Loop
------------------------

.. code-block:: python

   # Ask the agent to generate and execute code to create a plot
   agent('Plot a Lissajous Curve.')

   # Ask the agent to modify the generated code and create a new plot
   agent('Modify the code to plot 3:4 frequency')
   agent.end()

External API Tool Use
---------------------

.. code-block:: python

   # Request the agent to generate an image
   agent('Draw "A perfectly red apple, 32k HDR, studio lighting"')
   agent.end()

   # Request the agent to convert text to speech
   agent('Speak "People say nothing is impossible, but I do nothing every day."')
   agent.end()

@@ -0,0 +1,8 @@
Benchmark
=========

.. code-block:: python

   from phi_3_vision_mlx import benchmark

   benchmark()

@@ -0,0 +1,39 @@
Text Generation
===============

Visual Question Answering
-------------------------

.. code-block:: python

   from phi_3_vision_mlx import generate

   generate('What is shown in this image?', 'https://collectionapi.metmuseum.org/api/collection/v1/iiif/344291/725918/main-image')

Batch Text Generation
---------------------

.. code-block:: python

   prompts = [
       "Explain the key concepts of quantum computing and provide a Rust code example demonstrating quantum superposition.",
       "Write a poem about the first snowfall of the year.",
       "Summarize the major events of the French Revolution.",
       "Describe a bustling alien marketplace on a distant planet with unique goods and creatures.",
       "Implement a basic encryption algorithm in Python.",
   ]

   # Generate responses using Phi-3-Vision (multimodal model)
   generate(prompts, max_tokens=100)

   # Generate responses using Phi-3-Mini-128K (language-only model)
   generate(prompts, max_tokens=100, blind_model=True)

Model and Cache Quantization
----------------------------

.. code-block:: python

   # Model quantization
   generate("Describe the water cycle.", quantize_model=True)

   # Cache quantization
   generate("Explain quantum computing.", quantize_cache=True)
@@ -0,0 +1,66 @@
.. phi-3-vision-mlx documentation master file, created by
   sphinx-quickstart on Sun Jul 7 23:52:18 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to phi-3-vision-mlx's documentation!
============================================

Phi-3-MLX is a versatile AI framework that leverages both the Phi-3-Vision multimodal model and the recently updated Phi-3-Mini-128K language model, optimized for Apple Silicon using the MLX framework.

`View the project on GitHub <https://github.com/JosefAlbers/Phi-3-Vision-MLX>`_

.. toctree::
   :maxdepth: 2
   :caption: Contents

   install
   generate
   train
   agent
   toolchain
   benchmark

.. toctree::
   :maxdepth: 1
   :caption: API Reference

   module

Features
--------

- Support for the newly updated Phi-3-Mini-128K (language-only) model
- Integration with Phi-3-Vision (multimodal) model
- Optimized performance on Apple Silicon using MLX
- Batched generation for processing multiple prompts
- Flexible agent system for various AI tasks
- Custom toolchains for specialized workflows
- Model quantization for improved efficiency
- LoRA fine-tuning capabilities
- API integration for extended functionality (e.g., image generation, text-to-speech)
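
Several of these features appear together in the short sketch below; the calls mirror the usage pages linked above, and the prompt strings themselves are only illustrative:

.. code-block:: python

   from phi_3_vision_mlx import generate

   # Batched generation with the language-only Phi-3-Mini-128K model
   generate(["Write a poem about the first snowfall of the year.",
             "Summarize the major events of the French Revolution."],
            max_tokens=100, blind_model=True)

   # Single prompt with model quantization enabled
   generate("Describe the water cycle.", quantize_model=True)
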

Recent Updates: Phi-3 Mini Improvements
---------------------------------------

Microsoft has recently released significant updates to the Phi-3 Mini model, dramatically improving its capabilities:

- Substantially enhanced code understanding in Python, C++, Rust, and TypeScript
- Improved post-training for better-structured output
- Enhanced multi-turn instruction following
- Added support for the ``<|system|>`` tag
- Improved reasoning and long-context understanding

For detailed benchmark results, please refer to the tables in the README.

License
-------

This project is licensed under the MIT License.

Citation
--------

.. image:: https://zenodo.org/badge/806709541.svg
   :target: https://zenodo.org/doi/10.5281/zenodo.11403221
   :alt: DOI

@@ -0,0 +1,15 @@
Installation
============

You can install and launch Phi-3-MLX from the command line:

.. code-block:: bash

   pip install phi-3-vision-mlx
   phi3v

To use the library in a Python script:

.. code-block:: python

   from phi_3_vision_mlx import generate
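
From there, a minimal call could look like the sketch below; the prompt is purely illustrative, and the options described on the Text Generation page (image input, batching, quantization) apply to the same function:

.. code-block:: python

   # Minimal illustrative call, continuing from the import above;
   # see the Text Generation page for more options.
   generate("Describe the water cycle.")
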
@@ -0,0 +1,57 @@
Classes
-------

Agent
^^^^^

.. autoclass:: phi_3_vision_mlx.Agent
   :no-members:

Functions
---------

add_code
^^^^^^^^

.. autofunction:: phi_3_vision_mlx.add_code

benchmark
^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.benchmark

chatui
^^^^^^

.. autofunction:: phi_3_vision_mlx.chatui

execute
^^^^^^^

.. autofunction:: phi_3_vision_mlx.execute

generate
^^^^^^^^

.. autofunction:: phi_3_vision_mlx.generate

get_api
^^^^^^^

.. autofunction:: phi_3_vision_mlx.get_api

load
^^^^

.. autofunction:: phi_3_vision_mlx.load

test_lora
^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.test_lora

train_lora
^^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.train_lora

@@ -0,0 +1,66 @@
Custom Toolchains
=================

Example 1: In-Context Learning Agent
------------------------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent, _load_text

   # Create a custom tool named 'add_text'
   def add_text(prompt):
       prompt, path = prompt.split('@')
       return f'{_load_text(path)}\n<|end|>\n<|user|>{prompt}'

   # Define the toolchain as a string
   toolchain = """
   prompt = add_text(prompt)
   responses = generate(prompt, images)
   """

   # Create an Agent instance with the custom toolchain
   agent = Agent(toolchain, early_stop=100)

   # Run the agent
   agent('How to inspect API endpoints? @https://raw.githubusercontent.com/gradio-app/gradio/main/guides/08_gradio-clients-and-lite/01_getting-started-with-the-python-client.md')

Example 2: Retrieval Augmented Coding Agent
-------------------------------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent, VDB
   import datasets

   # Simulate user input
   user_input = 'Comparison of Sortino Ratio for Bitcoin and Ethereum.'

   # Create a custom RAG tool
   def rag(prompt, repo_id="JosefAlbers/sharegpt_python_mlx", n_topk=1):
       ds = datasets.load_dataset(repo_id, split='train')
       vdb = VDB(ds)
       context = vdb(prompt, n_topk)[0][0]
       return f'{context}\n<|end|>\n<|user|>Plot: {prompt}'

   # Define the toolchain
   toolchain_plot = """
   prompt = rag(prompt)
   responses = generate(prompt, images)
   files = execute(responses, step)
   """

   # Create an Agent instance with the RAG toolchain
   agent = Agent(toolchain_plot, False)

   # Run the agent with the user input
   _, images = agent(user_input)

Example 3: Multi-Agent Interaction
----------------------------------

.. code-block:: python

   # Continued from Example 2 above
   agent_writer = Agent(early_stop=100)
   agent_writer(f'Write a stock analysis report on: {user_input}', images)

@@ -0,0 +1,46 @@
LoRA Fine-tuning
================

Training a LoRA Adapter
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import train_lora

   train_lora(
       lora_layers=5,   # Number of layers to apply LoRA
       lora_rank=16,    # Rank of the LoRA adaptation
       epochs=10,       # Number of training epochs
       lr=1e-4,         # Learning rate
       warmup=0.5,      # Fraction of steps for learning rate warmup
       dataset_path="JosefAlbers/akemiH_MedQA_Reason"
   )

Generating Text with LoRA
-------------------------

.. code-block:: python

   from phi_3_vision_mlx import generate

   generate("Describe the potential applications of CRISPR gene editing in medicine.",
            blind_model=True,
            quantize_model=True,
            use_adapter=True)

Comparing LoRA Adapters
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import test_lora

   # Test the model without a LoRA adapter
   test_lora(adapter_path=None)
   # Output score: 0.6 (6/10)

   # Test the model with the trained LoRA adapter (using the default path)
   test_lora(adapter_path=True)
   # Output score: 0.8 (8/10)

   # Test the model with a specific LoRA adapter path
   test_lora(adapter_path="/path/to/your/lora/adapter")