All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- example env min app_api ver. 3.0.0 (#137) @kyteinsky
- add logging files to Dockerfile (#139) @kyteinsky
- make logs persistent (#140) @kyteinsky
- better handling of embedding server failure (#140) @kyteinsky
- remove root logs dir (#140) @kyteinsky
- revert CI env var in embedding server proc (#140) @kyteinsky
- remove unsupported vectordb configs (#140) @kyteinsky
- update readme: remove beta status (#140) @kyteinsky
- update readme for logs (#140) @kyteinsky
- proper handling of index locks (#133) @kyteinsky
- better logging (#135) @kyteinsky
- remove update_progress fn now that there is no caller (#135) @kyteinsky
- prevent two concurrent requests processing the same source (#130) @kyteinsky
- shorten the wait to 10 mins before failing in load sources when the max allowed requests limit is hit (#130) @kyteinsky
- fix utf-8 encoding issues (#118) @kyteinsky
- decl access update in doc indexing (#125) @kyteinsky
- ignore temp exceptions during task polling (#127) @kyteinsky
- only remove leftovers from previous db (#115) @kyteinsky
- database schema change for no data duplication (#108) @kyteinsky
- remove support for Chroma DB and Weaviate (#108) @kyteinsky
- consider delete successful if doc was not in db (#107) @kyteinsky
- catch exceptions in update_access loop (#109) @kyteinsky
- add reuse compliance (#111) @AndyScherzinger
- selective context args (#105) @kyteinsky
- print the original traceback of the exception (#106) @kyteinsky
- do not import types from llama before symlink fix (#102) @kyteinsky
- change postgres port to 5001 + fixes (#103) @kyteinsky
- import signal package directly (#100) @kyteinsky
- reset the vector db in favour of a new embedding model (#98) @kyteinsky
Documents will be reindexed in this version. They will be reindexed again in the stable 4.0.0 release. This version is not recommended for production use.
- Better error and context handling (#83) @kyteinsky
- Remove null bytes from document texts (#86) @kyteinsky
- Add title to header validation in load docs (#87) @kyteinsky
- Memory leak fixes and marker tests (#89) @marcelklehr
- Download model in background task (#96) @kyteinsky @marcelklehr
- Isolate doc ingestion with a llama http server (#90) @kyteinsky
- Add postgresql vectordb support (#84) @kyteinsky
- Use multilingual embedding model (#81) @marcelklehr
- Install postgresql in the docker container (#95) @kyteinsky
- New minor version to maintain versioning consistency with the companion app
- lock embedding model forward pass (#78) @kyteinsky
- fix remaining lowercase comparisons for COMPUTE_DEVICE @kyteinsky
- use uppercase comparisons for COMPUTE_DEVICE @kyteinsky
- add traceback to caught exception in doc loader @kyteinsky
- make stuff fit in 8GB VRAM and don't lock text2text api calls (#70) @kyteinsky
- fix: detect additional NVIDIA GPUs (#68) @kyteinsky
- update llama-cpp-python package in dockerfile @kyteinsky
- nvidia-cuda/llama.cpp compat issue @kyteinsky
- New major version to maintain versioning consistency with the companion app
- Update readme @kyteinsky
- Use Taskprocessing TextToText provider as LLM (#60) @marcelklehr
- Upgrade base image to cuda 12.2 @kyteinsky
- use COMPUTE_DEVICE env var if present for config @kyteinsky
- add cuda compat lib path back @kyteinsky
- leave room for generated tokens in the context window @kyteinsky
- Dockerfile llama-cpp-python install @kyteinsky
- Version based repair and other changes (#54) @kyteinsky
- .in.txt and use compiled llama-cpp-python @kyteinsky
- correctly log exceptions @kyteinsky
- do not verify docs before delete in Chroma (#53) @kyteinsky
- offload only when instantiated @kyteinsky
- add odfpy back and update deps @kyteinsky
- up context limit to 30 @kyteinsky
- update configs @kyteinsky
- change repairs to be version based @kyteinsky
- upgrade base image to cuda 12.1 and drop cuda dev deps @kyteinsky
- gh: run the prompts without strategy matrix @kyteinsky
- simple queueing of prompts @kyteinsky
- dynamic loader and unloader @kyteinsky
- add GET /enabled for init check @kyteinsky
- Use the user's language (#50) @marcelklehr
- use 8192 as context length
- replace @ with .at. in collection name
- replace pandoc completely due to random memory hogs with other python packages
- types fixes and langchain import updates
- no context generation is now a chat completion
- filter sources before document decode
- set the memory limit for pandoc to 4GB (#29)
- adjustments for changes in AppAPI in last two months (#26)
- pass useContext to the query function
- prune context/query to fit the context window
- pandoc hangs
- accelerator detection on container boot
- repair steps
- increase context length to 16384
- user_id sanitisation for vectordb collection names
- symlink config.yaml in the persistent dir
- use requirements.cpu.txt in CI due to space constraints
- use ubuntu-22.04 to use gh runners
- migrate useful env vars to config(.cpu).yaml
- move config(.cpu)?.yaml to persistent_storage
- modifications to scoped context chat
- fix: location of config.yaml in the dockerfiles
- pre-commit autoupdate
- fix: convert getenv's output to str
- type fixes and numerous other fixes
- fix: metadata search for provider
- move COLLECTION_NAME to vectordb dir
- skip ingestion of .pot files
- Added initial cuda11.8 support (#16)
- Introduce /deleteSourcesByProviderForAllUsers and fixes
- add support for scoped context in query
- add integration test
- use /init and persistent storage
- drop .run/
- revert pytorch to cpu-only package
- add end_separator option in config
- support new content providers
- update app_api auth middleware
- fix: add ctransformers to supported models list
- update readme with new tips and tricks
- update Makefile
- update llama_cpp_python to fix inference on gpu machines
- use normal torch for arm builds
- updated app store description and readme
- the app