Quickstart

Let’s start by installing Rexify:

[1]:
!pip install rexify
Requirement already satisfied: rexify in /home/docs/checkouts/readthedocs.org/user_builds/rexify/envs/stable/lib/python3.10/site-packages (0.1.20)

Next, download some sample event data:

[2]:
!mkdir -p data
!curl -o data/events.csv https://storage.googleapis.com/roostr-ratings-matrices/rexify/completions.csv
[3]:
import pandas as pd
[4]:
events = pd.read_csv('data/events.csv')
events
[4]:
               id  type  account_id  program_id  date
0            2102  psa           51  CPEC-54     2015-08-07
1            2129  psa           51  CPEC-81     2015-08-14
2            2132  psa           51  CPEC-84     2015-08-14
3            2277  psa           50  CPEC-198    2015-08-18
4            3255  psa           49  CPEC-175    2015-11-02
...           ...  ...          ...  ...         ...
1213766  80470238  psa       525487  CPEC-22181  2021-12-25
1213767  80470249  psa       677934  CPEC-9248   2021-12-25
1213768  80470250  psa       677934  CPEC-17386  2021-12-25
1213769  80470277  psa       682006  CPEC-11016  2021-12-25
1213770  80470278  psa       678228  CPEA-17314  2021-12-25

1213771 rows × 5 columns

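Next, we need to specify our schema. It groups the columns into user, item, and context features and assigns each column a feature type: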

[5]:
schema = {
    "user": {
        "account_id": "id",
    },
    "item": {
        "program_id": "id",
    },
    "context": {}
}
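
Before fitting anything, it can help to confirm that every column named in the schema is actually present in the data. The sketch below is plain Python and is not part of Rexify: schema_columns and missing_columns are hypothetical helper names, and the column list is copied from the events table shown earlier.

```python
# Hypothetical helpers (not part of Rexify): check that the schema only
# references columns that exist in the events data.

schema = {
    "user": {"account_id": "id"},
    "item": {"program_id": "id"},
    "context": {},
}

# Column names as shown in the events DataFrame.
event_columns = ["id", "type", "account_id", "program_id", "date"]


def schema_columns(schema):
    """Return every column name referenced by any group of the schema."""
    return [column for group in schema.values() for column in group]


def missing_columns(schema, columns):
    """Return schema columns that are absent from the data, in schema order."""
    available = set(columns)
    return [column for column in schema_columns(schema) if column not in available]


print(schema_columns(schema))                  # ['account_id', 'program_id']
print(missing_columns(schema, event_columns))  # []
```

An empty list from missing_columns means the schema and the data agree on column names; a typo in the schema would show up here rather than later during fitting.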

To preprocess our data, we can use the FeatureExtractor:

[6]:
from rexify.features import FeatureExtractor
ImportError: cannot import name 'FeatureExtractor' from 'rexify.features'

Note: this import failed when this page was built against rexify 0.1.20, so the cells that follow could not run in that build.

We just need to pass it the schema, and it is ready to use:

[7]:
feat = FeatureExtractor(schema=schema)

As a scikit-learn Transformer, it has two main methods: .fit() and .transform(). Calling .fit_transform() is shorthand for calling .fit() followed by .transform() on the same data.

During .fit(), it reads the schema and infers the preprocessing: which transformations the data needs before it can be passed to the model. During .transform(), it applies those transformations and returns a NumPy array with the same number of rows as the input data.

[8]:
features = feat.fit_transform(events)
features
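
To make the fit/transform convention concrete, here is a toy transformer in plain Python. It is only an illustration of the contract described above and is not Rexify's FeatureExtractor: the IdEncoder class is hypothetical, and mapping IDs to integer indices is an assumption about the kind of preprocessing an "id" feature might need.

```python
# Illustration only: a toy transformer that follows the scikit-learn
# fit/transform convention. It is NOT Rexify's FeatureExtractor.

class IdEncoder:
    """Map raw IDs to consecutive integer indices."""

    def fit(self, values):
        # Learn the vocabulary: one index per distinct ID, in first-seen order.
        self.index_ = {}
        for value in values:
            self.index_.setdefault(value, len(self.index_))
        return self  # returning self is what allows .fit(...).transform(...)

    def transform(self, values):
        # Apply the learned mapping; the output has one entry per input row.
        return [self.index_[value] for value in values]

    def fit_transform(self, values):
        # Shorthand for fit followed by transform on the same data.
        return self.fit(values).transform(values)


program_ids = ["CPEC-54", "CPEC-81", "CPEC-84", "CPEC-54"]

chained = IdEncoder().fit(program_ids).transform(program_ids)
combined = IdEncoder().fit_transform(program_ids)

print(chained)              # [0, 1, 2, 0]
print(chained == combined)  # True
```

The output has as many entries as the input, and the chained and combined calls agree, which is the same behaviour the text attributes to .fit_transform().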

The .make_dataset() method converts the NumPy array to a tf.data.Dataset in the format the model expects.

[9]:
dataset = feat.make_dataset(features).batch(512)
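
To get a feel for what .batch(512) does at this scale: tf.data.Dataset.batch keeps a final partial batch by default (drop_remainder=False), so the number of batches is the row count divided by the batch size, rounded up. The arithmetic below uses the row count from the events table and assumes that make_dataset yields one element per row, which this page does not state explicitly.

```python
import math

num_rows = 1_213_771   # rows in the events table
batch_size = 512       # value passed to .batch()

# With drop_remainder=False (the tf.data default), the final partial batch is kept.
num_batches = math.ceil(num_rows / batch_size)
last_batch_rows = num_rows - (num_batches - 1) * batch_size

print(num_batches)      # 2371
print(last_batch_rows)  # 331
```

Under that assumption, one pass over the dataset is 2371 training steps, with the last step seeing 331 rows instead of 512.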

We can now take our Recommender model and instantiate it.

During .fit(), the FeatureExtractor also learns the parameters the model needs, so we do not have to set them by hand. They are stored in its model_params property.

[10]:
from rexify.models import Recommender
[11]:
model = Recommender(**feat.model_params)

Because Recommender is itself a tensorflow.keras.Model, we need to compile it before we can fit it:

[12]:
model.compile()

To fit it, all we need to do is pass our tf.data.Dataset:

[13]:
# model.fit(dataset)