Commit

sphinx + .nojekyll in docs(<-docs/build/html)
JosefAlbers authored Jul 8, 2024
1 parent 5f9a2f0 commit 555c5ad
Showing 45 changed files with 5,756 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: cb535833a47a005a6969e1994cc1d26c
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added docs/.nojekyll
46 changes: 46 additions & 0 deletions docs/_sources/agent.rst.txt
@@ -0,0 +1,46 @@
Agent Interactions
==================

Multi-turn Conversation
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent

   # Create an instance of the Agent
   agent = Agent()

   # First interaction: Analyze an image
   agent('Analyze this image and describe the architectural style:', 'https://images.metmuseum.org/CRDImages/rl/original/DP-19531-075.jpg')

   # Second interaction: Follow-up question
   agent('What historical period does this architecture likely belong to?')

   # End the conversation
   agent.end()

Generative Feedback Loop
------------------------

.. code-block:: python

   # Ask the agent to generate and execute code to create a plot
   # (continuing with the `agent` instance from the previous example)
   agent('Plot a Lissajous Curve.')

   # Ask the agent to modify the generated code and create a new plot
   agent('Modify the code to plot 3:4 frequency')

   agent.end()

External API Tool Use
---------------------

.. code-block:: python

   # Request the agent to generate an image
   agent('Draw "A perfectly red apple, 32k HDR, studio lighting"')
   agent.end()

   # Request the agent to convert text to speech
   agent('Speak "People say nothing is impossible, but I do nothing every day."')
   agent.end()

8 changes: 8 additions & 0 deletions docs/_sources/benchmark.rst.txt
@@ -0,0 +1,8 @@
Benchmark
=========

.. code-block:: python

   from phi_3_vision_mlx import benchmark

   benchmark()

39 changes: 39 additions & 0 deletions docs/_sources/generate.rst.txt
@@ -0,0 +1,39 @@
Text Generation
===============

Visual Question Answering
-------------------------

.. code-block:: python

   from phi_3_vision_mlx import generate

   generate('What is shown in this image?', 'https://collectionapi.metmuseum.org/api/collection/v1/iiif/344291/725918/main-image')

Batch Text Generation
---------------------

.. code-block:: python

   prompts = [
       "Explain the key concepts of quantum computing and provide a Rust code example demonstrating quantum superposition.",
       "Write a poem about the first snowfall of the year.",
       "Summarize the major events of the French Revolution.",
       "Describe a bustling alien marketplace on a distant planet with unique goods and creatures.",
       "Implement a basic encryption algorithm in Python.",
   ]

   # Generate responses using Phi-3-Vision (multimodal model)
   generate(prompts, max_tokens=100)

   # Generate responses using Phi-3-Mini-128K (language-only model)
   generate(prompts, max_tokens=100, blind_model=True)

Model and Cache Quantization
----------------------------

.. code-block:: python

   # Model quantization
   generate("Describe the water cycle.", quantize_model=True)

   # Cache quantization
   generate("Explain quantum computing.", quantize_cache=True)

66 changes: 66 additions & 0 deletions docs/_sources/index.rst.txt
@@ -0,0 +1,66 @@
.. phi-3-vision-mlx documentation master file, created by
   sphinx-quickstart on Sun Jul 7 23:52:18 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to phi-3-vision-mlx's documentation!
============================================

Phi-3-MLX is a versatile AI framework that leverages both the Phi-3-Vision multimodal model and the recently updated Phi-3-Mini-128K language model, optimized for Apple Silicon using the MLX framework.

`View the project on GitHub <https://github.com/JosefAlbers/Phi-3-Vision-MLX>`_

.. toctree::
   :maxdepth: 2
   :caption: Contents

   install
   generate
   train
   agent
   toolchain
   benchmark

.. toctree::
   :maxdepth: 1
   :caption: API Reference

   module

Features
--------

- Support for the newly updated Phi-3-Mini-128K (language-only) model
- Integration with Phi-3-Vision (multimodal) model
- Optimized performance on Apple Silicon using MLX
- Batched generation for processing multiple prompts
- Flexible agent system for various AI tasks
- Custom toolchains for specialized workflows
- Model quantization for improved efficiency
- LoRA fine-tuning capabilities
- API integration for extended functionality (e.g., image generation, text-to-speech)

Recent Updates: Phi-3 Mini Improvements
---------------------------------------

Microsoft has recently released significant updates to the Phi-3 Mini model, dramatically improving its capabilities:

- Substantially enhanced code understanding in Python, C++, Rust, and TypeScript
- Improved post-training for better-structured output
- Enhanced multi-turn instruction following
- Added support for the ``<|system|>`` tag (see the sketch after this list)
- Improved reasoning and long-context understanding
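
The following is a minimal sketch of how the new ``<|system|>`` tag might be used. It assumes the tag can be embedded directly in the prompt string passed to ``generate``, following the same inline-tag convention (``<|end|>``, ``<|user|>``) used in the custom toolchain examples; the exact template handling is determined by the library.

.. code-block:: python

   from phi_3_vision_mlx import generate

   # Assumption: chat tags embedded inline in the prompt, mirroring the
   # toolchain examples; shown here with the language-only model.
   generate('<|system|>\nRespond in exactly two sentences.<|end|>\n<|user|>Explain what LoRA fine-tuning does.', blind_model=True)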

For detailed benchmark results, please refer to the tables in the README.

License
-------

This project is licensed under the MIT License.

Citation
--------

.. image:: https://zenodo.org/badge/806709541.svg
   :target: https://zenodo.org/doi/10.5281/zenodo.11403221
   :alt: DOI
15 changes: 15 additions & 0 deletions docs/_sources/install.rst.txt
@@ -0,0 +1,15 @@
Installation
============

You can install and launch Phi-3-MLX from the command line:

.. code-block:: bash

   pip install phi-3-vision-mlx
   phi3v

To use the library in a Python script:

.. code-block:: python

   from phi_3_vision_mlx import generate
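
A minimal call might then look like the following (this mirrors the visual question answering example on the Text Generation page):

.. code-block:: python

   generate('What is shown in this image?', 'https://collectionapi.metmuseum.org/api/collection/v1/iiif/344291/725918/main-image')
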
57 changes: 57 additions & 0 deletions docs/_sources/module.rst.txt
@@ -0,0 +1,57 @@

API Reference
=============

Classes
-------

Agent
^^^^^

.. autoclass:: phi_3_vision_mlx.Agent
   :no-members:

Functions
---------

add_code
^^^^^^^^

.. autofunction:: phi_3_vision_mlx.add_code

benchmark
^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.benchmark

chatui
^^^^^^

.. autofunction:: phi_3_vision_mlx.chatui

execute
^^^^^^^

.. autofunction:: phi_3_vision_mlx.execute

generate
^^^^^^^^

.. autofunction:: phi_3_vision_mlx.generate

get_api
^^^^^^^

.. autofunction:: phi_3_vision_mlx.get_api

load
^^^^

.. autofunction:: phi_3_vision_mlx.load

test_lora
^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.test_lora

train_lora
^^^^^^^^^^

.. autofunction:: phi_3_vision_mlx.train_lora
66 changes: 66 additions & 0 deletions docs/_sources/toolchain.rst.txt
@@ -0,0 +1,66 @@
Custom Toolchains
=================

Example 1: In-Context Learning Agent
------------------------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent, _load_text

   # Create a custom tool named 'add_text'
   def add_text(prompt):
       prompt, path = prompt.split('@')
       return f'{_load_text(path)}\n<|end|>\n<|user|>{prompt}'

   # Define the toolchain as a string
   toolchain = """
   prompt = add_text(prompt)
   responses = generate(prompt, images)
   """

   # Create an Agent instance with the custom toolchain
   agent = Agent(toolchain, early_stop=100)

   # Run the agent
   agent('How to inspect API endpoints? @https://raw.githubusercontent.com/gradio-app/gradio/main/guides/08_gradio-clients-and-lite/01_getting-started-with-the-python-client.md')

Example 2: Retrieval Augmented Coding Agent
-------------------------------------------

.. code-block:: python

   from phi_3_vision_mlx import Agent, VDB
   import datasets

   # Simulate user input
   user_input = 'Comparison of Sortino Ratio for Bitcoin and Ethereum.'

   # Create a custom RAG tool
   def rag(prompt, repo_id="JosefAlbers/sharegpt_python_mlx", n_topk=1):
       ds = datasets.load_dataset(repo_id, split='train')
       vdb = VDB(ds)
       context = vdb(prompt, n_topk)[0][0]
       return f'{context}\n<|end|>\n<|user|>Plot: {prompt}'

   # Define the toolchain
   toolchain_plot = """
   prompt = rag(prompt)
   responses = generate(prompt, images)
   files = execute(responses, step)
   """

   # Create an Agent instance with the RAG toolchain
   agent = Agent(toolchain_plot, False)

   # Run the agent with the user input
   _, images = agent(user_input)

Example 3: Multi-Agent Interaction
----------------------------------

.. code-block:: python

   # Continued from Example 2 above
   agent_writer = Agent(early_stop=100)
   agent_writer(f'Write a stock analysis report on: {user_input}', images)

46 changes: 46 additions & 0 deletions docs/_sources/train.rst.txt
@@ -0,0 +1,46 @@
LoRA Fine-tuning
================

Training a LoRA Adapter
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import train_lora

   train_lora(
       lora_layers=5,   # Number of layers to apply LoRA
       lora_rank=16,    # Rank of the LoRA adaptation
       epochs=10,       # Number of training epochs
       lr=1e-4,         # Learning rate
       warmup=0.5,      # Fraction of steps for learning rate warmup
       dataset_path="JosefAlbers/akemiH_MedQA_Reason"
   )

Generating Text with LoRA
-------------------------

.. code-block:: python

   from phi_3_vision_mlx import generate

   generate("Describe the potential applications of CRISPR gene editing in medicine.",
            blind_model=True,
            quantize_model=True,
            use_adapter=True)

Comparing LoRA Adapters
-----------------------

.. code-block:: python

   from phi_3_vision_mlx import test_lora

   # Test model without LoRA adapter
   test_lora(adapter_path=None)
   # Output score: 0.6 (6/10)

   # Test model with the trained LoRA adapter (using default path)
   test_lora(adapter_path=True)
   # Output score: 0.8 (8/10)

   # Test model with a specific LoRA adapter path
   test_lora(adapter_path="/path/to/your/lora/adapter")