Implicit
Fast Python Collaborative Filtering for Implicit Feedback Datasets
All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel among all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. Approximate nearest neighbour libraries such as Annoy, NMSLIB and Faiss can also be used by Implicit to speed up making recommendations.
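As a rough illustration of these options, the sketch below turns on CUDA-backed ALS training with the use_gpu flag and builds an Annoy-accelerated ALS variant; the class and argument names reflect one recent version of the API and may differ between releases.

```python
import implicit
from implicit.approximate_als import AnnoyAlternatingLeastSquares

# ALS trained with the custom CUDA kernels; requires a GPU-enabled build
# and a compatible NVIDIA card (implicit.gpu.HAS_CUDA reports availability)
gpu_model = implicit.als.AlternatingLeastSquares(factors=64, use_gpu=True)

# ALS whose recommend()/similar_items() calls are served from Annoy
# approximate nearest-neighbour indexes instead of exact dot products
ann_model = AnnoyAlternatingLeastSquares(factors=64)
```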
Implicit can be installed from PyPI with:
```
pip install implicit
```
Installing with pip will use prebuilt binary wheels on x86_64 Linux, Windows and OSX. These wheels include GPU support on Linux.
Implicit can also be installed with conda:
```
# CPU only package
conda install -c conda-forge implicit

# CPU+GPU package
conda install -c conda-forge implicit implicit-proc=*=gpu
```
Basic usage:

```python
import implicit

# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)

# train the model on a sparse matrix of user/item/confidence weights
model.fit(user_item_data)

# recommend items for a user
recommendations = model.recommend(userid, user_item_data[userid])

# find related items
related = model.similar_items(itemid)
```
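fit() expects a sparse matrix of confidence weights, with users as rows and items as columns in recent releases. Below is a minimal sketch of building one with SciPy from hypothetical interaction triples.

```python
import numpy as np
from scipy.sparse import csr_matrix

# hypothetical raw interactions: (user index, item index, confidence weight)
user_ids = np.array([0, 0, 1, 2, 2])
item_ids = np.array([3, 7, 7, 1, 3])
confidence = np.array([3.0, 1.0, 5.0, 2.0, 1.0])

# rows are users, columns are items (the layout model.fit() expects)
user_item_data = csr_matrix((confidence, (user_ids, item_ids)))
```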
The examples folder has a program showing how to use this to compute similar artists on the last.fm dataset.
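For instance, the last.fm data can be pulled in through the bundled implicit.datasets.lastfm helper; this is a sketch based on that example program, and the exact return layout may vary between versions.

```python
import implicit
from implicit.datasets.lastfm import get_lastfm

# download and cache the last.fm 360k dataset shipped with implicit;
# the play counts come back as an artist-by-user sparse matrix
artists, users, artist_user_plays = get_lastfm()

# transpose to the user-by-item layout that model.fit() expects
user_plays = artist_user_plays.T.tocsr()

model = implicit.als.AlternatingLeastSquares(factors=64)
model.fit(user_plays)

# look up artists similar to the first artist in the dataset
similar = model.similar_items(0)
```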
For more information see the documentation.
Several blog posts describe the algorithms that power this library, and there are also a number of other articles about using Implicit to build recommendation systems.
This library requires SciPy version 0.16 or later and Python version 3.6 or later.
GPU Support requires at least version 11 of the NVidia CUDA Toolkit.
This library is tested with Python 3.7, 3.8, 3.9, 3.10 and 3.11 on Ubuntu, OSX and Windows.
Simple benchmarks comparing the ALS fitting time versus Spark can be found here.
I'd recommend configuring SciPy to use Intel's MKL matrix libraries. One easy way of doing this is by installing the Anaconda Python distribution.
For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This disables its internal multithreading, which leads to substantial speedups for this package. Likewise for Intel MKL, 'export MKL_NUM_THREADS=1' should also be set.
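Put together, a typical invocation (with a hypothetical training script) looks like:

```
# single-thread the BLAS library so it doesn't contend with implicit's own OpenMP threads
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1
python train_model.py  # hypothetical entry point
```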
Released under the MIT License