The TEXTGEN Fix
This issue was recorded as NOT HELPFUL, and it exists solely so people can see it.

Steps to PATCH

Here are the detailed steps for reproducing the FIX:
(Note: These instructions are for Linux; on Windows you will need to make some adjustments. Windows cannot execute .sh files directly unless you are using WSL2. Outside WSL2, you would need a .bat file instead of a .sh file, which starts with @echo off instead of a #!/bin/bash header.)
Create a pre-launcher for Textgen or type the following command on the command line (use export for Linux or set for Windows):
For Linux:

```bash
export OPENAI_API_KEY="dummy"
```

For Windows (note that cmd.exe keeps quotation marks as part of the value, so leave them off with set):

```bat
set OPENAI_API_KEY=dummy
```
Here, "dummy" is just an example API key. You can use any string as long as it matches the one in the next step. The key is not real security: it simply has to be the identical string on both sides so the server accepts the client's requests. For this example, we will use the default key "dummy" provided by Textgen.
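The matching requirement described above can be sketched as a trivial comparison (a hypothetical check for illustration, not part of Textgen itself):

```shell
#!/bin/bash
# Hypothetical illustration: the server-side and client-side keys just have
# to be identical strings; nothing else about the key's content matters.
SERVER_KEY="dummy"   # value exported before launching Textgen
CLIENT_KEY="dummy"   # value exported before launching the client

if [ "$SERVER_KEY" = "$CLIENT_KEY" ]; then
  echo "keys match: requests will be accepted"
else
  echo "keys differ: requests will be rejected" >&2
fi
```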
Launch the Oobabooga Textgen with the command-line arguments --api and --extensions openai, along with any other setup your specific environment needs. If you prefer the pre-launcher method to set all your GPU settings (here, for two GPUs) together with the API and command-line arguments, you can use the following example (adjust it to your own configuration):
(Note: Only use the --trust-remote-code flag if you know what you are doing. It is typically needed for large-context models that ship custom loading code, for example when offloading transformer layers to fit in less than 28GB VRAM. Running code from untrusted model sources is risky, so avoid it otherwise.)
Here's an example launcher script named launcher.sh:

```bash
#!/bin/bash
export HOST_PORT=7861
export CUDA_VISIBLE_DEVICES=0
export TORCH_USE_CUDA_DSA=1
export TORCH_CUDA_ARCH_LIST=8.6+PTX
export PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.25,max_split_size_mb:128M
export OPENAI_API_KEY="dummy"

python3 ./server.py --model TheBloke_orca_mini_7B-GPTQ --loader exllama --listen --listen-port 7861 --api --gpu-split 8_20 --trust-remote-code --extensions openai
```

Create another pre-launcher or use the command line to export/set the Textgen fake-key security password and the API base:
For Linux:

```bash
export OPENAI_API_KEY="dummy"
export OPENAI_API_BASE=http://0.0.0.0:5001/v1
```

For Windows (again, leave the quotes off with set):

```bat
set OPENAI_API_KEY=dummy
set OPENAI_API_BASE=http://0.0.0.0:5001/v1
```
Again, replace "dummy" with the same key you used in step 1. This step ensures that the client talks to Textgen's OpenAI-compatible endpoint with a matching API key and API base.
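As a quick sanity check (not part of the original steps), you can build the models endpoint from OPENAI_API_BASE; the /v1/models path is the standard listing route on OpenAI-compatible servers:

```shell
#!/bin/bash
# Hypothetical sanity check: confirm both variables are set and derive the
# standard OpenAI-compatible /models endpoint from the base URL.
export OPENAI_API_KEY="dummy"
export OPENAI_API_BASE=http://0.0.0.0:5001/v1

MODELS_URL="${OPENAI_API_BASE%/}/models"
echo "key: ${OPENAI_API_KEY}"
echo "models endpoint: ${MODELS_URL}"

# With the Textgen server running, you could then probe it, e.g.:
#   curl -H "Authorization: Bearer ${OPENAI_API_KEY}" "${MODELS_URL}"
```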
If you prefer the pre-launcher method, here's an example launcher script, also named launcher.sh (give it a different name if it sits alongside the first script):

```bash
#!/bin/bash
export OPENAI_API_KEY="dummy"
export OPENAI_API_BASE=http://0.0.0.0:5001/v1

cd /home/pheonix/Github/gpt-engineer
python3 gpt_engineer/main.py projects/MasterCodeBase/ --temperature 1 --steps benchmark
```
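Putting it together, the order of operations can be sketched as follows (the heavy commands are commented out because they need a full Textgen/gpt-engineer install; the model name, paths, and flags are the ones assumed in the examples above):

```shell
#!/bin/bash
# Hypothetical end-to-end sketch of the steps above.

# Terminal 1: export the shared key, then start Textgen with the
# OpenAI-compatible API enabled.
export OPENAI_API_KEY="dummy"
# python3 ./server.py --model TheBloke_orca_mini_7B-GPTQ --loader exllama \
#     --listen --listen-port 7861 --api --extensions openai

# Terminal 2: export the same key plus the API base, then run the client.
export OPENAI_API_BASE=http://0.0.0.0:5001/v1
echo "client will send requests to ${OPENAI_API_BASE} with key ${OPENAI_API_KEY}"
# python3 gpt_engineer/main.py projects/MasterCodeBase/ --temperature 1 --steps benchmark
```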
That's it! You're done! Have fun with your Textgen setup!
The GPT Engineer Wiki 😄