Changelog

1.2.14

  • update: labelstudio endpoint

1.2.13

  • fix: evaluation pipelines

1.2.12

  • update: package dependencies for python 3.9

1.2.11

  • update: python version to 3.9 for slu repo conda environments

1.2.10

  • update: python version for slu repo conda environments

1.2.9

  • update: GITLAB token

1.2.8

  • update: AWS credentials and GITHUB token

1.2.7

  • update: DB user

1.2.6

  • replace: Console links with Studio links in pipelines

1.2.5

  • fix: Missing dependency in chrome installation

1.2.4

  • fix: Upgrade Selenium version

1.2.3

  • fix: Chrome and chromedriver issue for image build

1.2.2

  • update: Update Gitlab personal token

1.2.1

  • remove: Remove support for fetch_n_tag_calls and tag_calls

1.1.24

  • add: Support for a new column in conversations table

1.1.23

  • Bugfix: Fix missing metrics folder and s3 upload region

1.1.22

  • add: blocking_disposition is now being extracted as part of call-level tagging jobs

1.1.21

  • add: Pipeline to invalidate situations in DB for LLMs

1.1.20

  • add: New s3_buckets for sandbox regions in India and US

1.1.19

  • add: previous_disposition is now being extracted as part of call-level tagging jobs

1.1.18

  • Bugfix: fix issue with sample conversation display

1.1.17

  • Update skit-labels version

1.1.16

  • Bugfix: Fix db connection issue

1.1.15

  • add: Pipeline to generate conversations and upload it for tagging

1.1.14

  • Update github access token in secrets

1.1.12

  • Bugfix: fix issue with fetch_calls pipeline

1.1.11

  • Bugfix: fix issue with retrain_slu_old pipeline

1.1.10

  • Bugfix: update skit-calls version

1.1.9

  • add: Change node selector in US

1.1.8

  • add: support for displaying a sample of the conversation generated

1.1.7

  • fix: the handling of the format of scenario param in generate sample conversations pipeline

1.1.6

  • add: Pipeline for generating sample conversations

1.1.5

  • fix: No calls found from FSM shouldn't fail data_fetch pipelines

1.1.4

  • fix: Intent list was missing from the comparison file while training or evaluating SLU.

1.1.3

  • Update node selector to match node groups in US

1.1.2

  • add environment variables in CI

1.1.1

  • add: Added re_presign_s3_urls component, which can re-presign public s3 http urls in transcription_pipeline.
  • fix: join on dataframes bug fix at merge_transcription component.

1.1.0

  • add: Added evaluate_slu pipeline in the repo to test the slu
  • add: Added features of comparing repo while testing
  • fix: created multiple components to reduce the redundancy and complexity of the repo.
  • fix: Moved common functions into separate utils files for retrain and evaluate slu.

1.0.7

  • Update credentials for accessing fsm DB
  • Add support for multiple flow ids for fetch call related pipelines

1.0.6

  • Bugfix: update skit-calls version

1.0.4

  • Bugfix: Dvc config updated only for new SLU repos

1.0.2

  • Add support for obtaining comparison confusion matrix while retraining SLU

1.0.0

  • Remove deprecated pipelines

0.2.105

  • Functionality to compare new trained SLU model with live model on test data (#115)

0.2.104

  • Fixes issue with chrome browser and driver mismatch (#116)

0.2.103

  • Add retry for LS uploads in skit-labels (#22)

0.2.102

  • Redact user info using presidio (#113)
  • updated aws creds

0.2.101

  • add: s3 client aws access id and keys secrets

0.2.99

  • alias_bug_fix: fixes issue in downloading yaml file. (#112)
  • Update key related dependencies and add time logger (#110)

0.2.98

  • Add new pipeline to identify and persist ML compliance breaches in calls

0.2.97

  • fix: logs streaming during train and test (#108)

0.2.96

  • add: skit-calls version bump for presigned-url logic for turn audio url in US cluster

0.2.95

  • Refactor retraining pipelines for new SLU architecture (#107)

0.2.94

  • download_yaml: changes in alias file download from gitlab repo. (#106)
  • storage_full_fix: Removing old_data while validating pipelines.

0.2.92

  • fix: template_id not working properly, changed skit-calls query secrets
  • PL-1300: new data format support & extra meta-annotation-columns (#21)

0.2.91

  • Extend GPT support for fetch_n_tag_calls and update prompts (#104)
  • transcription pipeline: add support for mp3 file format while overlaying transcriptions (#103)

0.2.90

  • update: Secrets for assisted annotation pipeline

0.2.89

  • add: Slu customization integration with retrain_slu pipeline (#102)
  • add: support for call level tagging jobs for tag_calls pipeline, along with turn level tagging

0.2.88

  • add: Merge pull request #99 from skit-ai/gpt_intents

0.2.87

  • add: validate setup component for retrain_slu pipeline, tests slu setup builds in cpu node

0.2.86

  • update: skit-calls and skit-labels version update
  • fix: alternatives column postprocessing after downloading from labelstudio, client_id not being truly optional for fetch_calls component

0.2.84

  • update: skit-calls version bump for reftime with timezone

0.2.83

  • update: skit-calls version bump and use_fsm_url as param for all fetch_calls related pipelines

0.2.82

  • add: comma separated list of client_id is supported, template_id to filter calls is supported. Both in fetch_calls component + also modified downstream pipelines using it

0.2.81

  • fix: upstream column names for entity tagged dataset
  • update: version bump for skit-calls which helps to sample turns in a call based on comma separated list of intents

0.2.80

  • [PL-997] adding more call metadata from upstream for CRR tagging

0.2.79

  • add: skit-calls version bump which includes new columns for fetch_calls component returned csv - call_type, disposition, call_end_status, flow_name

0.2.78

  • update: buffer for call_quantity for fetch_calls component

0.2.77

  • update: use_fsm_url flag based on region for deciding turn audio uri paths should be from fsm or s3 bucket directly
  • update: skit-calls version bump for use-fsm-url flag
  • update: org_auth_token component's output is optional since tog deprecated

0.2.76

  • fix: made console URL region specific to fix call & slot tagging job uploads (esp in US)

0.2.75

  • update: removed audio validation from fetch_calls component as min_duration is present now.

0.2.74

  • add: get pipeline run error logs as slackbot message (#93)

0.2.73

  • update: skit-calls version bump - calls-with-cors path removal for audio_url in fetched calls.

0.2.72

  • fix: granular time filters not getting applied for ml fetch calls pipelines - date offset

0.2.71

  • fix: granular time filters not getting applied for ml fetch calls pipelines

0.2.70

  • skit-calls version bump for minute offset to def process_date_filters - in fetch_calls component

0.2.69

  • revert: "cache the poetry install step for faster docker builds and dev experience (#90)" (#92)

0.2.68

  • update: skit-calls version bump to 0.2.27 for sampling call on min_duration

0.2.67

  • fix: Auto-MR creation breaks in SLU training pipeline when classification report / confusion matrix is long (#91)
  • update: use only the poetry version mentioned in the SLU repo in retrain_slu pipeline
  • update: cache the poetry install step for faster docker builds and dev experience (#90)
  • fix: asr tune pull request #89 from skit-ai/asr_tune_hi_fix

0.2.66

  • update: upload same data to multiple labelstudio project ids

0.2.65

  • add: new component file_contents_to_markdown for better gitlab mr description with reports (#88)

0.2.64

  • fix: data_label bug

0.2.63

  • update: set default value of data_label as Live

0.2.62

  • fix: remove_empty_audios not setting to false for fetch_calls

0.2.61

  • add: remove_empty_audios param controllable for fetch_calls_pipeline

0.2.60

  • update: make docs

0.2.59

  • update: remove default option for data_label in all tag_call component involving pipelines

0.2.58

  • add: mandatory data_label field for tag_calls component
  • add: optional data_labels field for fetch_tagged_data_from_labelstore component
  • update: skit-labels version bump to 0.3.31

0.2.57

  • fix: skit-calls secrets query for language - version bump to 0.2.26

0.2.56

  • update: skit-calls version bump to 0.2.25
  • fix: upload2s3 folder upload bug

0.2.55

  • upload2s3: correct bug in upload_as_directory that leads to flattening of path_on_disk contents in the target output_path (#82)
  • change: changed the parser for Hindi from Unified Parser to Character split. This is a breaking change and support for ASR models hi-v4 and below is removed (#83)
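
The upload2s3 fix above concerns directory uploads flattening the contents of path_on_disk under output_path. A minimal sketch of the general idea, not the component's actual code: build each object key from the file's path relative to the directory root instead of from its bare file name. The helper name `plan_directory_upload` is hypothetical, and the sketch only plans the keys without touching S3.

```python
from pathlib import Path, PurePosixPath
from typing import Dict


def plan_directory_upload(path_on_disk: str, output_path: str) -> Dict[str, str]:
    """Map each local file to an object key that keeps its sub-directory.

    Hypothetical helper for illustration; it does not call any S3 API.
    """
    root = Path(path_on_disk)
    plan: Dict[str, str] = {}
    for file_path in sorted(root.rglob("*")):
        if not file_path.is_file():
            continue  # directories themselves are not uploaded
        relative = file_path.relative_to(root)
        # Using only file_path.name here would flatten the tree;
        # joining the relative parts preserves the nesting.
        key = PurePosixPath(output_path).joinpath(*relative.parts)
        plan[str(file_path)] = str(key)
    return plan
```

Each value in the returned mapping could then be passed as the key to whatever upload call the component uses.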

0.2.54

  • add: new pipeline for pushing same data for intent, entities & slot/call tagging (#81)

0.2.53

  • fix: retrain slu pipeline fails when only s3 uri provided

0.2.52

  • update: label store data format and query changes (#80)

0.2.51

  • update: db_host const
  • add: timezone based on cluster region

0.2.50

  • add: fetch_tagged_data_from_labelstore pipeline (#79)
  • fix: intent column placeholder for slu train pipeline

0.2.49

  • fix: upstream changes made by fetch_tagged_dataset component (#78)

0.2.48

  • update: single function handling both tog and labelstudio data fetching - skit-labels version bump
  • update: skit-calls version bump to 0.2.24

0.2.47

  • fix: call_type bug for inbound/outbound only
  • fix: downgrade google-auth-oauthlib to 0.4.6 (#77)

0.2.46

  • update: for pipelines using fetch_calls component, call_type defaults to "inbound" and "outbound" both
  • update: skit-calls version bump to 0.2.23

0.2.45

  • fix: training pipeline bug

0.2.44

  • update: gitlab token secret

0.2.43

  • update: retrain_slu pipeline changes to align with new Deployment CI/CD for SLU (#76)
  • add: raise exception when no calls found/csv empty (#75)

0.2.42

  • update: skit-labels version bump 0.3.29
  • update: discrepancy check for raw.intent and intent column for test dataset in retrain_slu_from_repo component

0.2.40

  • update: force utterances as str in gen_asr_metrics component (#71)
  • add: checks for 0 byte audios before uploading for tagging (#72)
  • add: custom port support while testing pipeline
  • update: Extending retrain_slu pipeline features (#73)
  • fix: entity pipelines supporting labelstudio datasets (#74)
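
The 0 byte audio check listed above (#72) can be pictured as a filter applied before upload. This is a sketch under the assumption that the audios are local files; the function name `drop_empty_audio_files` is hypothetical and the real component may check sizes differently (for example on remote objects).

```python
import os
from typing import Iterable, List


def drop_empty_audio_files(paths: Iterable[str]) -> List[str]:
    """Keep only paths that point to an existing file with at least one byte."""
    return [
        path
        for path in paths
        if os.path.isfile(path) and os.path.getsize(path) > 0
    ]
```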

0.2.39

  • update: Asr tune pipeline enhancements (#70)

0.2.38

  • update: secrets + us ml metrics db env vars
  • update: upload2s3 upload directories, asr_tune uploads directory and process true transcript in asr_eval_pipeline (#69)

0.2.37

  • fix: preprocess step skipped for labelstudio (#68)

0.2.36

  • update: irr fixes for labelstudio datasets (#66)

0.2.35

  • add: slack bots and channels based on cluster region

0.2.34

  • fix: again pipeline_constants not defined error in asr_tune component (#65)

0.2.32

  • fix: regionwise nodeselector for pipelines

0.2.31

  • fix: pipeline_constants not defined error in asr_tune component (#63)

0.2.30

  • fix: empty_possible param for download_file_from_s3 component

0.2.29

  • fix: involving nan + adjusting for interval types - fetch_tagged_entity_dataset pipeline
  • add: lm tuning pipeline (#58)
  • update: download_from_s3 component into downstream components
  • add: auto nodeselector based on pipeline type - cpu/gpu (#60)
  • update: fetch data from multiple job/project ids at once (combined data) (#61)
  • add: Transcription Pipeline V1 (#56)
  • add: retrain_slu pipeline for automated SLU retraining with deployment tracked using gitlab. (#62)

0.2.28

  • fix: force routes for selected clients only if project_id is not present - tag_calls.

0.2.27

  • fix: org_id type being different across components.

0.2.26

  • add: eer pipelines (#55)

0.2.25

  • add: fetch_tagged_entity_dataset pipeline. (#54)

0.2.24

  • update: force route tagging requests to labelstudio. (#53)

0.2.23

  • add: offset for pulling last n days fetch_tagged_dataset op, and thus used in irr_from_tog pipeline (#49)

0.2.22

  • add: Authentication and Authorization of requests using JWT - Oauth2

0.2.21

  • update: eevee yaml component for exhaustive possible intent metrics

0.2.19

  • update: version bump for skit-calls and skit-labels
  • update: new AUDIO_URL_DOMAIN secret for param in fetch_calls component

0.2.18

  • fix: bugs in eval_asr_pipeline

0.2.17

  • fix: slackbot command parser for base64 breaking when period (.) at end of text
  • add: run pipelines from func, without uploading yamls + easier dev workflow
  • add: asr-eval-pipeline phase 1
  • update: irr_from_tog now has mlwr=True arg (which also requires slu_project_name) for pushing calculated eevee metrics on a tog job to ml-metrics db, intent_metrics table

0.2.16

  • update: start_date and end_date optional for fetch_calls_pipeline, fetch_n_tag_calls, fetch_calls_n_push_to_sheets and fetch_calls_n_upload_tog_and_sheet

0.2.15

  • update: skit-labels version bump to 0.3.27 - tag_calls and fetch_n_tag_calls uploads data in batches of batched data with retries + sleep.
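
Batched uploads with retries and a sleep between attempts, as mentioned above, follow a common pattern. The sketch below is one plausible reading of that pattern and is not taken from skit-labels; `upload_in_batches`, its parameters, and the `upload` callable are all hypothetical names.

```python
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(rows: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def upload_in_batches(
    rows: Sequence[T],
    upload: Callable[[Sequence[T]], None],
    batch_size: int = 100,
    max_retries: int = 3,
    sleep_seconds: float = 1.0,
) -> int:
    """Upload rows batch by batch and return the number of rows uploaded."""
    uploaded = 0
    for batch in batched(rows, batch_size):
        for attempt in range(1, max_retries + 1):
            try:
                upload(batch)
                break  # this batch succeeded, move to the next one
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the final attempt
                time.sleep(sleep_seconds)  # wait before retrying this batch
        uploaded += len(batch)
    return uploaded
```

Retrying per batch means a transient failure only repeats the rows in that batch rather than the whole dataset.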

0.2.14

  • fix: call_type default arg from "inbound" to "INBOUND" for fetch_calls_pipeline
  • fix: eval_voicebot_xlmr_pipeline to work even when model is not present

0.2.13

  • update: skit-labels version bump to 0.3.26

0.2.12

  • add: irr_from_tog pipeline, it takes a tog job id/ labelstudio project id and outputs eevee's intent metrics uploaded to s3 & optionally posted on slack, an addition to eval_voicebot_xlmr_pipeline.

0.2.7

  • update: skit-calls version bump to 0.2.21

0.2.6

  • add: use slackbot reminders to run/schedule recurring pipelines
  • update: skit-calls version bump to 0.2.19

0.2.5

  • update: date time offsets for data pipelines.
  • update: tarfile uploads for directories.

0.2.4

  • fix: uploading dirs to s3
  • fix: tagging response defined before use.

0.2.3

  • update: xlmr eval output as csv.

0.2.2

  • fix: evaluation pipeline for intent evaluation on f1-score metric.

0.2.1

  • fix: training pipeline accommodates older/newer datasets.
  • update: tag calls raises exception if neither of tog/labelstudio ids are provided.
  • update: tag calls raises exception if no data was uploaded.

0.2.0

  • fix: slack parser handles urls in code blocks.
  • update: add slack thread id, channel and user id automatically.
  • update: slack notification component expects code_blocks instead of s3_path.
  • fix: slack parser compatible with python 3.8
  • fix: notifications are sent to slack threads if thread id is present.
  • feat: Notify users when a pipeline run is complete / failed.
  • update: tag calls returns multiple outputs.

0.1.99

  • fix: auth-token type.

0.1.98

  • fix: training pre-proc handles utterances from these columns as well: alternatives, data.

0.1.97

  • update: dockerfile uses python 3.8 and cuda 10.2.
  • fix: labelstudio download.
  • update: compatible with python 3.8.

0.1.96

  • update: normalize.comma_sep_str can be used for comma separated numbers and strings.
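
To illustrate what handling both comma separated numbers and strings could look like, here is a minimal sketch. It is not the project's `normalize.comma_sep_str`; the name `split_comma_separated` and its return type are assumptions made for this example only.

```python
from typing import List, Union


def split_comma_separated(value: str) -> List[Union[int, str]]:
    """Split a comma-separated string; purely decimal items become ints."""
    items = [item.strip() for item in value.split(",")]
    # Drop empty items, such as the one produced by a trailing comma.
    items = [item for item in items if item]
    return [int(item) if item.isdecimal() else item for item in items]
```

Under these assumptions, `split_comma_separated("12, 34")` yields integers while `split_comma_separated("en,hi")` yields strings, so one helper serves both id lists and label lists.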

0.1.95

  • fix: TypeError: Object of type JSONDecodeError is not JSON serializable on dataset uploads.
  • fix: invalid literal for int() with base 10: '' on dataset uploads

0.1.94

  • update: skit-labels unpacks tagged dataset.
  • fix: AttributeError: 'NoneType' object has no attribute 'get' on fetching calls with failed predictions.
  • fix: compatibility between skit-labels, skit-calls, skit-auth, kfp, etc.
  • add: s3 url support for tag calls pipeline.

0.1.93

  • fix: Slack hyperlink markup removed.

0.1.92

  • add: Labelstudio integration.

0.1.91

  • update: fetch_calls_n_upload_tog_and_sheet and fetch_calls_n_push_to_sheets pipelines refactor

0.1.90

  • update: replace org-id with reference in s3 upload component.

0.1.89

  • fix: storage options.

0.1.88

  • fix: slack token const.
  • refactor: unused types.

0.1.87

  • fix: db host name.
  • update: default arguments fixed.

0.1.86

  • docs: Better examples, easier to copy paste commands.
  • add: slack notifications to all components.

0.1.85

  • fix: slack urls.
  • add: slack notifications to all components.

0.1.84

  • add: slack signing secret.

0.1.83

  • docs: Pipeline and payload docs added.
  • update: slackbot command parser responds with stacktrace.

0.1.82

  • fix: No module named 'skit_pipelines' caused by installing before copy source in dockerfile.
  • refactor: makefile for building pipelines is leaner.

0.1.80

  • add: slack-bot integration. Invoke pipelines via slackbot.
  • refactor: Automatic generation of pydantic models using kfp signature.

0.1.79

  • add: intent evaluation pipeline.
  • add: crr evaluation pipeline.
  • add: slack-bot integration.

0.1.62

  • update: remove helper function for upload2sheet
  • update: sheet duplication and row logic while pushing calls to google sheet

0.1.53

  • update: move helper functions of upload2sheet into a folder

0.1.52

  • update: skit-calls==0.2.15

0.1.48

  • update: skit-calls==0.2.14

0.1.45

  • fix: make Dockerfile and github actions yaml use google secrets

0.1.44

  • update: component to push CSV to a google sheet. Load google secrets from Github secrets
  • update: Docker file and github actions yaml to use google secrets

0.1.43

  • add: component to push CSV to a google sheet.
  • add: pipeline to fetch calls and push to google sheet.

0.1.42

  • update: XLMR training supports lr parameter.

0.1.41

  • update: skit-calls==0.2.13

0.1.40

  • update: skit-labels==0.3.21
  • update: ping users on slack channels.

0.1.39

  • update: skit-calls==0.2.12
  • update: remove recurrent from fetch-n-tag pipeline and start_date, end_date logic moved to skit-calls instead.

0.1.38

  • update: skit-calls==0.2.11
  • update: skit-labels==0.3.20
  • fix: save labelencoder pickle while running the train xlmr pipeline.

0.1.32

  • update: remove dvc link for pipelines.
  • add: makefile downloads secrets from skit-calls.

0.1.26

  • feat: APIs for pipelines.
  • add: secrets via dvc.

0.1.25

  • add: pipeline to fetch and tag a dataset.
  • fix: support legacy dataframes missing utterance columns.

0.1.24

  • update: XLMR intent classifier training pipeline pushes models to s3.

0.1.23

  • update: trained models upload to s3.

0.1.22

  • add: Conda and Cuda within base docker image.

0.1.21

  • add: Cuda installation within Dockerfile.
  • fix: Intent classifier training component.

0.1.20

0.1.19

  • update: kfp installed within container.

0.1.18

  • update: components isolated from helper functions.

0.1.17

  • add: component to create utterance column utterances.
  • add: component to create true intent column intent_y.
  • add: component to add state and utterances as features for intent classifier (xlmr).
  • update: model training pipeline with train set only.

0.1.16

  • add: torch = "^1.11.0" for cuda 10.2

0.1.15

  • add: preprocessing module for specialized components.

0.1.14

  • update: skit-labels 0.3.17 with higher tolerance for utterance structures.

0.1.13

  • update: skit-labels skit-calls for serialized json fields.

0.1.12

  • update: skit-labels 0.3.13, values for db creds resolved.

0.1.11

  • add: placeholder component to train xlmr intent classifier.

0.1.10

  • add: component that fetches tagged datasets.
  • add: kubeflow pipeline utilizing the above component.

0.1.9

  • update: skit-calls 0.2.3

0.1.8

  • fix: Slack notification component -- Slack token constant.

0.1.7

  • fix: Slack notification component -- Slack token constant.

0.1.6

  • update: link slack component with fetch data pipeline.

0.1.5

  • feat: slack integration.

0.1.4

  • update: skit-pipelines is available within docker image.

0.1.3

  • update: build pipeline yamls via make.

0.1.2

  • refactor: modularize project.

0.1.1

  • add: boto3 for s3 upload/download.

0.1.0

  • add: calls dataset component.
  • add: Upload to s3 component.
  • add: Calls dataset pipeline.
  • add: workflow to automate docker image creation on tag push.