amethyst

A low-code recommendation engine generation tool

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

Amethyst is a low-code, easy-to-use, GPU-powered recommender engine generator based on PyTorch. It requires only three inputs to rank or predict the best items for each user, and vice versa:

  • User ID (unique identifier for each user)
  • Item ID (unique identifier for each item)
  • User-Item Ratings (user-item rating/interaction scores)

Because all underlying data operations are handled by Pandas, amethyst works with any data source that can be loaded into a Pandas DataFrame, such as SQL and NoSQL databases or CSV and TSV files.

The resulting recommendation scores are also returned as a Pandas DataFrame, which makes them easy to integrate with your application.
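As a rough illustration of the three inputs listed above, the sketch below parses a tiny ratings table with the standard library. The `userID` and `itemID` column names match the Usage examples; the `rating` column name and the sample values are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical sample data: the `rating` column name is an assumption
raw = """userID,itemID,rating
196,242,3.0
186,302,3.0
196,302,4.0
"""

# Each row pairs one user with one item and an interaction score
rows = list(csv.DictReader(io.StringIO(raw)))
users = {row["userID"] for row in rows}
items = {row["itemID"] for row in rows}
print(len(rows), len(users), len(items))
```

In practice the same table would be loaded with Pandas, as in the Usage section.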

Built With

  • PyTorch

Getting Started

The steps below show how to generate your own collaborative recommendation engine. To get a local copy up and running, follow them in order.

Prerequisites

  • Python>=3.7

Installation

  1. Clone the repo and enter the project directory
    git clone https://github.com/radioactive11/amethyst.git
    cd amethyst
  2. Create and activate virtual environment
    python3 -m venv venv
    source venv/bin/activate
  3. Install the tool
    python3 setup.py install

Usage

A recommendation engine can be generated in 4 easy steps:

  1. Import the data
  2. Select an algorithm
  3. Train the model
  4. Evaluate the model's performance

Data Split ⚗️

import pandas as pd

from amethyst.dataloader import split

df = pd.read_csv("./movielens100k.csv")

# Split the ratings into train and test sets with a ratio of 0.8
df_train, df_test = split.stratified_split(
    df,
    0.8,
    user_col='userID',
    item_col='itemID',
    filter_col='item'
)
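For intuition, a stratified split keeps roughly the same share of each group's ratings in the training set. The sketch below illustrates the idea in plain Python, assuming stratification by user. It is not amethyst's implementation, and `stratified_split_sketch` and the sample rows are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split_sketch(rows, ratio=0.8, seed=42):
    """Split (user, item, rating) rows so each user keeps ~ratio of rows in train."""
    rng = random.Random(seed)

    # Group the rows by user (first element of each tuple)
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)

    train, test = [], []
    for user_rows in by_user.values():
        shuffled = user_rows[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * ratio)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


# Hypothetical data: two users with five ratings each
rows = [(u, i, 1.0) for u in ("u1", "u2") for i in range(5)]
train_rows, test_rows = stratified_split_sketch(rows)
print(len(train_rows), len(test_rows))
```

Splitting per user rather than over the whole table means every user with enough ratings appears in both sets, which the evaluation step relies on.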

Load Data 📥

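import pandas as pd

from amethyst.dataloader import dataset, split

df = pd.read_csv("./movielens100k.csv")

# Same split as in the Data Split step
df_train, df_test = split.stratified_split(
    df, 0.8, user_col='userID', item_col='itemID', filter_col='item'
)

# Build the train and test sets from the DataFrame rows
train = dataset.Dataloader.dataloader(df_train.itertuples(index=False))
test = dataset.Dataloader.dataloader(df_test.itertuples(index=False))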

Train (BiVAECF) ⚙️

from amethyst.models.bivaecf.bivaecf import BiVAECF
import torch


bivae = BiVAECF(
    k=50,
    encoder_structure=[100],
    act_fn=["tanh"],
    likelihood="pois",
    n_epochs=500,
    batch_size=256,
    learning_rate=0.001,
    seed=42,
    use_gpu=torch.cuda.is_available(),
    verbose=True
)

bivae.fit(train, test)
bivae.save("model.pkl")

Train (IBPR) ⚙️

from amethyst.models.ibpr.ibprcf import IBPR


ibpr = IBPR(
    k=20,
    max_iter=100,
    alpha_=0.05,
    lambda_=0.001,
    batch_size=100,
    trainable=True,
    verbose=False,
    init_params=None
)

ibpr.fit(train, test)
ibpr.save("model.pkl")

Predict/Rank 📈

import torch

from amethyst.models.predictions import rank
from amethyst.models.bivaecf.bivaecf import BiVAECF


bivae = BiVAECF(
    k=50,
    encoder_structure=[100],
    act_fn=["tanh"],
    likelihood="pois",
    n_epochs=500,
    batch_size=256,
    learning_rate=0.001,
    seed=42,
    use_gpu=torch.cuda.is_available(),
    verbose=True
)

# Load the model saved in the Train step
bivae.load("model.pkl")

predictions = rank(bivae, test, user_col='userID', item_col='itemID')

# predictions is a Pandas Dataframe
predictions.to_csv("predictions.csv", index=False)
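Because the predictions come back as a table of scores, a common next step is keeping only the best few items per user. Here is a minimal sketch in plain Python, assuming each row reduces to a (user, item, score) triple. The function name and sample rows are hypothetical, and the actual column layout of the DataFrame returned by rank may differ.

```python
import heapq
from collections import defaultdict


def top_n_per_user(scored_rows, n=2):
    """Group (user, item, score) rows by user and keep the n best-scored items."""
    by_user = defaultdict(list)
    for user, item, score in scored_rows:
        by_user[user].append((score, item))
    # nlargest returns the pairs sorted from highest to lowest score
    return {
        user: [item for _, item in heapq.nlargest(n, pairs)]
        for user, pairs in by_user.items()
    }


# Hypothetical scores for two users
scored = [
    ("u1", "a", 0.9), ("u1", "b", 0.2), ("u1", "c", 0.5),
    ("u2", "a", 0.1), ("u2", "c", 0.7),
]
top = top_n_per_user(scored, n=2)
print(top)
```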

Evaluate 📈

import torch

from amethyst.models.bivaecf.bivaecf import BiVAECF
from amethyst.models.predictions import rank
from amethyst.eval.eval_methods import map_at_k, precision_at_k, recall_k



bivae = BiVAECF(
    k=50,
    encoder_structure=[100],
    act_fn=["tanh"],
    likelihood="pois",
    n_epochs=500,
    batch_size=256,
    learning_rate=0.001,
    seed=42,
    use_gpu=torch.cuda.is_available(),
    verbose=True
)

# Load the model saved in the Train step
bivae.load("model.pkl")

predictions = rank(bivae, test, user_col='userID', item_col='itemID')
eval_map = map_at_k(test, predictions, k=10)
pk = precision_at_k(test, predictions, k=10)
rk = recall_k(test, predictions)
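To make the metrics concrete, the sketch below computes precision@k and recall@k for a single user in plain Python. It shows the usual textbook definitions and is not amethyst's implementation; `precision_recall_at_k` and the sample data are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k=10):
    """Return (precision@k, recall@k) for one user's ranked recommendations."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k                      # share of the top k that is relevant
    recall = hits / len(relevant) if relevant else 0.0  # share of relevant items found
    return precision, recall


ranked = ["i3", "i7", "i1", "i9", "i4"]  # hypothetical model output, best first
held_out = {"i1", "i4", "i8"}            # hypothetical items from the user's test set
p, r = precision_recall_at_k(ranked, held_out, k=5)
print(p, r)
```

Dataset-level scores are typically the average of these per-user values.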

Roadmap

  • [ ] Build an API wrapper
  • [ ] Use Elasticsearch to save recommendations
  • [ ] Add more algorithms
  • [ ] Add content-based recommendation generation

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE.txt for more information.

Contact

Arijit Roy - @_radioactive11_ - [email protected]

Project Link: https://github.com/radioactive11/amethyst

Acknowledgments
