Utils

DataFrame conversion

sim4rec.utils.pandas_to_spark(df: DataFrame, schema=None, spark_session: SparkSession | None = None) → DataFrame

Converts a pandas DataFrame to a Spark DataFrame.

Parameters:
  • df – DataFrame to convert

  • schema – Schema of the dataframe, defaults to None

  • spark_session – Spark session to use, defaults to None

Returns:

The data converted to a Spark DataFrame
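A minimal usage sketch follows, assuming pandas, PySpark, and sim4rec are installed. The column names, the values in `RATINGS`, the helper name `ratings_to_spark`, and the local session settings are illustrative assumptions and not part of the library. The imports are kept inside the function so that the module itself loads without starting Spark.

```python
# Sample interaction log as plain Python data (illustrative values only).
RATINGS = {
    "user_id": [1, 1, 2],
    "item_id": [10, 11, 10],
    "relevance": [1.0, 0.0, 1.0],
}


def ratings_to_spark():
    """Build a pandas DataFrame from RATINGS and convert it to Spark."""
    # Heavy imports stay local so this module loads without pandas or Spark.
    import pandas as pd
    from pyspark.sql import SparkSession
    from sim4rec.utils import pandas_to_spark

    # A single-core local session is enough for a small example.
    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("pandas_to_spark_example")
        .getOrCreate()
    )
    pdf = pd.DataFrame(RATINGS)
    # schema is left at its default (None); the session is passed explicitly.
    return pandas_to_spark(pdf, spark_session=spark)
```

Calling `ratings_to_spark()` should return a Spark DataFrame with the three rows above. Passing the session explicitly avoids relying on whatever the function does when `spark_session` is None, which this page does not describe.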

Exceptions

class sim4rec.utils.NotFittedError
class sim4rec.utils.EmptyDataFrameError

Transformers

class sim4rec.utils.VectorElementExtractor(inputCol: str = None, outputCol: str = None, index: int = None)

Extracts the element at a given index from an array column.

Parameters:
  • inputCol – Input column containing the array

  • outputCol – Output column name

  • index – Index of the element within the array
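The sketch below shows one way the extractor might be used. It assumes the class follows the usual Spark ML Transformer interface and therefore exposes `transform()`, which is suggested by its placement under "Transformers" but not stated on this page. It also assumes that an array-of-doubles column is accepted as input. The names `ROWS`, `INDEX`, `scores`, `score_1`, and `extract_score` are hypothetical.

```python
# Illustrative rows: a user id and an array of two scores per user.
ROWS = [(1, [0.1, 0.9]), (2, [0.7, 0.3])]
INDEX = 1  # position of the element to pull out of each array


def extract_score():
    """Add a column holding element INDEX of the 'scores' array."""
    # Imports stay local so this module loads without Spark installed.
    from pyspark.sql import SparkSession
    from sim4rec.utils import VectorElementExtractor

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame(ROWS, ["user_id", "scores"])

    extractor = VectorElementExtractor(
        inputCol="scores",
        outputCol="score_1",
        index=INDEX,
    )
    # Assumption: transform() is available, as on any Spark ML Transformer.
    return extractor.transform(df)
```

Under these assumptions, the `score_1` column of the result would hold the second element of each `scores` array.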

File management

sim4rec.utils.save(obj: object, filename: str)

Saves an object to a pickle dump.

Parameters:
  • obj – Instance to save

  • filename – File name of the dump

sim4rec.utils.load(filename: str)

Loads a pickle dump from a file.

Parameters:
  • filename – File name of the dump

Returns:

The instance read from the dump
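A round trip through `save` and `load` can be sketched as follows. The fallback definitions are hypothetical stand-ins used only when sim4rec is not installed. They assume the library functions are thin wrappers around the standard `pickle` module, which matches the descriptions above but is not confirmed here. The `settings` dictionary and the file name are illustrative.

```python
import os
import pickle
import tempfile

try:
    from sim4rec.utils import save, load
except ImportError:
    # Hypothetical stand-ins, used only when sim4rec is unavailable.
    # Assumption: the library functions simply wrap pickle.dump/pickle.load.
    def save(obj: object, filename: str) -> None:
        with open(filename, "wb") as f:
            pickle.dump(obj, f)

    def load(filename: str):
        with open(filename, "rb") as f:
            return pickle.load(f)


# Any picklable object works; a plain dict keeps the example small.
settings = {"seed": 42, "item_ids": [10, 11, 12]}

# Write the dump into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "settings.pkl")
    save(settings, path)
    restored = load(path)
```

After the round trip, `restored` is an equal but separate copy of `settings`. As with any pickle file, only load dumps that come from a trusted source, because unpickling can execute arbitrary code.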