5. LLM Model Evaluations


RAG evaluation packages


For a detailed overview of the RAG evaluation packages, refer to the Medium blog post.

The Eval_Package is a tool for evaluating the performance of an LLM (Large Language Model) on a dataset containing questions, context, and ideal answers. It lets you run evaluations on various datasets and assess how well the model generates answers.


Features

  • Evaluate an LLM on custom datasets: use the Eval_Package to assess the performance of your model on datasets of your choice.
  • Measure model accuracy: the package provides metrics that gauge how closely model-generated answers match the ideal answers (see the sketch after this list).
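
As a minimal sketch of what such an evaluation loop might look like: the actual Eval_Package API is not shown in this README, so `generate_answer` and the dataset fields (`question`, `context`, `ideal_answer`) below are illustrative assumptions that mirror the dataset schema described above, with exact match standing in for whatever accuracy metric the package uses.

```python
# Minimal exact-match evaluation sketch. All names here are illustrative
# assumptions, not the Eval_Package's actual API.

def generate_answer(question: str, context: str) -> str:
    # Stub standing in for the real LLM call; replace with your model.
    return "Retrieval-Augmented Generation"

def evaluate(dataset: list[dict]) -> float:
    """Return exact-match accuracy of generated answers vs. ideal answers."""
    correct = 0
    for row in dataset:
        answer = generate_answer(row["question"], row["context"])
        if answer.strip().lower() == row["ideal_answer"].strip().lower():
            correct += 1
    return correct / len(dataset)

dataset = [
    {
        "question": "What does RAG stand for?",
        "context": "RAG (Retrieval-Augmented Generation) augments an LLM with retrieved documents.",
        "ideal_answer": "Retrieval-Augmented Generation",
    },
]
print(f"Exact-match accuracy: {evaluate(dataset):.2%}")
```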

The MLflow_Package integrates the results from the Eval_Package and tracks and manages experiments. It also enables you to create a leaderboard for comparing evaluations and to visualize metrics through a dashboard.


Features

  • Experiment tracking: use MLflow to record experiments, including the parameters, metrics, and model artifacts generated during evaluations (see the example after this list).
  • Leaderboard creation: build a leaderboard that makes it easy to compare the performance of different models across multiple datasets.
  • Metric visualization: generate charts and graphs through the dashboard to analyze evaluation metrics.
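
To make the tracking step concrete, here is a small example of logging evaluation results with the open-source MLflow Python API. The experiment name, parameters, and metric values are illustrative; this is a sketch of the general pattern, not the MLflow_Package's own code.

```python
import mlflow

# Illustrative values; a real run would take these from Eval_Package output.
model_name = "my-llm"
dataset_name = "qa-eval-set"
accuracy = 0.87

# Group related runs under one experiment so they can be compared later.
mlflow.set_experiment("llm-evaluations")

with mlflow.start_run(run_name=f"{model_name}-{dataset_name}"):
    # Parameters identify what was evaluated...
    mlflow.log_param("model", model_name)
    mlflow.log_param("dataset", dataset_name)
    # ...and metrics record how well it did.
    mlflow.log_metric("accuracy", accuracy)
```

Runs logged this way appear in the MLflow UI (started with `mlflow ui`), where sorting runs by a metric gives the kind of leaderboard described above.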

MLflow dashboard (screenshot)
