Scenarios

Scenarios combine a series of actions (model training, candidate generation, filling missing recommendations, re-ranking) into a single recommender-like pipeline.

This module contains a Learner class for training RePlay models on bandit datasets. The bandit dataset format should be the same as in OpenBanditPipeline. The Learner class has fit and predict methods, which wrap the corresponding methods of a RePlay model. The optimize method is based on optimizing the CTR estimated by OBP.

Fallback

class replay.scenarios.Fallback(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)

Fill missing recommendations using a fallback model. Behaves like a recommender and has the same interface.

__init__(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)

Create recommendations with main_model and fill the missing ones with fallback_model. Ratings of fallback_model are decreased to keep the main recommendations on top.

Parameters
  • main_model (BaseRecommender) – initialized model

  • fallback_model (BaseRecommender) – initialized model

  • threshold (int) – number of interactions by which queries are divided into cold and hot
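
A minimal usage sketch is shown below; the choice of ALSWrap as the main model and the prepared replay Dataset object are illustrative assumptions, not part of this API description:

    from replay.models import ALSWrap, PopRec
    from replay.scenarios import Fallback

    # The main model serves "hot" queries; PopRec fills in recommendations for cold ones.
    scenario = Fallback(
        main_model=ALSWrap(),     # assumption: any initialized RePlay recommender
        fallback_model=PopRec(),  # popularity-based fallback (the default)
        threshold=5,              # queries with few interactions are treated as cold (assumed semantics)
    )

    scenario.fit(train_dataset)                  # train_dataset: a prepared replay Dataset (assumed)
    recs = scenario.predict(train_dataset, k=10)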

optimize(train_dataset, test_dataset, param_borders=None, criterion=<class 'replay.metrics.ndcg.NDCG'>, k=10, budget=10, new_study=True)

Searches best parameters with optuna.

Parameters
  • train_dataset (Dataset) – train data

  • test_dataset (Dataset) – test data

  • param_borders (Optional[Dict[str, Dict[str, List[Any]]]]) – a dictionary with keys main and fallback containing dictionaries with search grid, where key is the parameter name and value is the range of possible values {param: [low, high]}.

  • criterion (Metric) – metric to use for optimization

  • k (int) – recommendation list length

  • budget (int) – number of points to try

  • new_study (bool) – start a new study (True) or continue searching with the previous one (False)

Return type

Tuple[Dict[str, Any]]

Returns

tuple of dictionaries with best parameters
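
A hedged sketch of a parameter search; the rank grid for the main model is hypothetical, and train_dataset/test_dataset are prepared replay Dataset objects:

    from replay.metrics import NDCG

    best_params = scenario.optimize(
        train_dataset=train_dataset,
        test_dataset=test_dataset,
        param_borders={
            "main": {"rank": [8, 128]},  # hypothetical parameter of the main model
            "fallback": {},              # no search grid for the fallback model
        },
        criterion=NDCG,
        k=10,
        budget=20,
    )
    # best_params is the tuple of dictionaries with best parameters described above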

Two Stage Scenario (Experimental)

class replay.experimental.scenarios.TwoStagesScenario(train_splitter=<replay.splitters.ratio_splitter.RatioSplitter object>, first_level_models=<replay.experimental.models.scala_als.ScalaALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)

train:

  1. take the input log and split it into first_level_train and second_level_train; the default splitter splits each user’s data 50/50

  2. train first_level_models on first_level_train

  3. create negative examples to train second stage model using one of:

    • wrong recommendations from first stage

    • random examples

    use num_negatives to specify number of negatives per user

  4. augment the dataset with features:

    • get first-level recommendations for positive examples from second_level_train and for the generated negative examples

    • add user and item features

    • generate statistical and pair features

  5. train TabularAutoML from LightAutoML

inference:

  1. take log

  2. generate candidates; their number can be specified with num_candidates

  3. add features as in train

  4. get recommendations

__init__(train_splitter=<replay.splitters.ratio_splitter.RatioSplitter object>, first_level_models=<replay.experimental.models.scala_als.ScalaALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)
Parameters
  • train_splitter (Splitter) – splitter to get first_level_train and second_level_train. Default is random 50% split.

  • first_level_models (Union[List[BaseRecommender], BaseRecommender]) – model or a list of models

  • fallback_model (Optional[BaseRecommender]) – model used to fill in missing recommendations from the first-level models

  • use_first_level_models_feat (Union[List[bool], bool]) – flag or a list of flags to use features created by first level models

  • second_model_params (Union[Dict, str, None]) – TabularAutoML parameters

  • second_model_config_path (Optional[str]) – path to config file for TabularAutoML

  • num_negatives (int) – number of negative examples used during training

  • negatives_type (str) – negative example generation strategy: random items or the most relevant examples from the first level (first_level)

  • use_generated_features (bool) – flag to use generated features to train second level

  • user_cat_features_list (Optional[List]) – list of user categorical features

  • item_cat_features_list (Optional[List]) – list of item categorical features

  • custom_features_processor (Optional[HistoryBasedFeaturesProcessor]) – you can pass custom feature processor

  • seed (int) – random seed
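
A construction sketch under stated assumptions: the first-level model list, the RatioSplitter argument name, and the per-model feature flags below are illustrative choices rather than prescribed defaults:

    from replay.experimental.scenarios import TwoStagesScenario
    from replay.experimental.models.scala_als import ScalaALSWrap
    from replay.models import ItemKNN, PopRec
    from replay.splitters import RatioSplitter

    scenario = TwoStagesScenario(
        train_splitter=RatioSplitter(test_size=0.5),  # assumed argument name for a 50/50 split
        first_level_models=[ScalaALSWrap(), ItemKNN()],
        fallback_model=PopRec(),
        use_first_level_models_feat=[True, False],    # one flag per first-level model
        num_negatives=100,
        negatives_type="first_level",                 # or "random"
        seed=123,
    )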

fit(dataset)

Fit a recommendation model

Parameters

dataset (Dataset) – historical interactions with query/item features [user_idx, item_idx, timestamp, rating]

Return type

None

optimize(train, test, user_features=None, item_features=None, param_borders=None, criterion=<class 'replay.metrics.precision.Precision'>, k=10, budget=10, new_study=True)

Optimize first level models with optuna.

Parameters
  • train (Union[DataFrame, DataFrame, DataFrame]) – train DataFrame [user_id, item_id, timestamp, relevance]

  • test (Union[DataFrame, DataFrame, DataFrame]) – test DataFrame [user_id, item_id, timestamp, relevance]

  • user_features (Union[DataFrame, DataFrame, DataFrame, None]) – user features [user_id, timestamp] + feature columns

  • item_features (Union[DataFrame, DataFrame, DataFrame, None]) – item features [item_id] + feature columns

  • param_borders (Optional[List[Dict[str, List[Any]]]]) – list with param grids for first level models and a fallback model. Empty dict skips optimization for that model. Param grid is a dict {param: [low, high]}.

  • criterion (Metric) – metric to optimize

  • k (int) – length of a recommendation list

  • budget (int) – number of points to try for each model

  • new_study (bool) – start a new study (True) or continue searching with the previous one (False)

Return type

Tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]

Returns

list of dicts of parameters
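
A hedged sketch of the call; the rank grid is hypothetical, and the grids are assumed to be listed in the order of the first-level models followed by the fallback model:

    # Empty dicts skip optimization for the corresponding model.
    first_level_params, fallback_params = scenario.optimize(
        train=train_df,   # train_df / test_df: interaction DataFrames as described above
        test=test_df,
        param_borders=[{"rank": [8, 128]}, {}, {}],
        k=10,
        budget=10,
    )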

predict(dataset, k, queries=None, items=None, filter_seen_items=True, recs_file_path=None)

Get recommendations

Parameters
  • dataset (Dataset) – historical interactions with query/item features [user_idx, item_idx, timestamp, rating]

  • k (int) – number of recommendations for each query

  • queries (Union[DataFrame, Iterable, None]) – queries to create recommendations for: a dataframe containing [user_idx] or an array-like; if None, recommendations are created for all queries from the interactions

  • items (Union[DataFrame, Iterable, None]) – candidate items for recommendations: a dataframe containing [item_idx] or an array-like; if None, all items from the interactions are used. If it contains new items, their rating will be 0.

  • filter_seen_items (bool) – flag to remove seen items from recommendations based on interactions.

  • recs_file_path (Optional[str]) – save recommendations at the given absolute path as a parquet file. If None, a cached and materialized recommendations dataframe is returned

Return type

Optional[DataFrame]

Returns

cached recommendation dataframe with columns [user_idx, item_idx, rating], or None if recs_file_path is provided
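
A usage sketch for training and inference, assuming dataset is a prepared replay Dataset with query and item features:

    scenario.fit(dataset)

    recs = scenario.predict(
        dataset,
        k=10,
        filter_seen_items=True,  # drop items already present in the interactions
        recs_file_path=None,     # return the dataframe instead of writing a parquet file
    )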

Offline Policy Learners

class replay.experimental.scenarios.obp_wrapper.OBPOfflinePolicyLearner(n_actions, len_list=1, replay_model=None, log=None, max_usr_id=0, item_features=None, _logger=None)

Off-policy learner which wraps OBP data representation into replay format.

Parameters
  • n_actions (int) – Number of actions.

  • len_list (int) – Length of the action list in a recommendation/ranking interface (slate size). When the Open Bandit Dataset is used, this should be set to 3.

  • replay_model (Optional[BaseRecommender]) – Any model from replay library with fit, predict functions.

  • dataset – Dataset of interactions (user_id, item_id, rating). Constructed inside the fit method and used for prediction with replay_model.
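
A construction sketch; the choice of UCB as the wrapped model and the way n_actions is taken from the bandit log are illustrative assumptions:

    from replay.experimental.scenarios.obp_wrapper import OBPOfflinePolicyLearner
    from replay.models import UCB  # assumption: UCB stands in for any replay recommender with fit/predict

    learner = OBPOfflinePolicyLearner(
        n_actions=bandit_feedback["n_actions"],  # number of distinct actions in the bandit log (assumed key)
        len_list=3,                              # 3 when the Open Bandit Dataset is used
        replay_model=UCB(),
    )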

__init__(n_actions, len_list=1, replay_model=None, log=None, max_usr_id=0, item_features=None, _logger=None)
optimize(bandit_feedback, val_size=0.3, param_borders=None, criterion='ipw', budget=10, new_study=True)

Optimize model parameters using optuna. Optimization is carried out over the IPW/DR/DM scores (IPW by default).

Parameters
  • bandit_feedback (Dict[str, ndarray]) – Bandit log data with fields [action, reward, context, action_context, n_rounds, n_actions, position, pscore] as in OpenBanditPipeline.

  • val_size (float) – Size of validation subset.

  • param_borders (Optional[Dict[str, List[Any]]]) – Dictionary mapping parameter names to a pair of borders for the parameter optimization algorithm.

  • criterion (str) – Score to optimize. Available options are ipw, dr and dm.

  • budget (int) – Number of trials for the optimization algorithm.

  • new_study (bool) – Flag to create new study or not for optuna.

Return type

Optional[Dict[str, Any]]

Returns

Dictionary mapping parameter names to the optimal value of the corresponding parameter.
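
A hedged sketch: bandit_feedback follows the OpenBanditPipeline dictionary layout described above, and the rank grid is a hypothetical parameter of the wrapped replay model:

    best_params = learner.optimize(
        bandit_feedback=bandit_feedback,
        val_size=0.3,
        param_borders={"rank": [8, 128]},
        criterion="ipw",  # or "dr" / "dm"
        budget=10,
    )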

predict(n_rounds=1, context=None)

Predict best actions for new data. Action set predicted by this predict method can contain duplicate items. If a non-repetitive action set is needed, please use the sample_action method.

Parameters
  • context – Context vectors for new data.

Return type

ndarray

Returns

Action choices made by a classifier, which can contain duplicate items. If a non-repetitive action set is needed, please use the sample_action method.
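
A call sketch, assuming context is a NumPy array with one row of context features per round (the shape below is illustrative):

    import numpy as np

    context = np.random.rand(5, 10)  # 5 rounds, 10-dimensional contexts (illustrative)
    actions = learner.predict(n_rounds=context.shape[0], context=context)
    # `actions` may contain duplicate items; use sample_action for a non-repetitive set.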