emrpy.ml.encoders

Functions

encode_cats_pandas(train_df, cat_cols[, test_df])

Encode categorical columns using OrdinalEncoder with handling for unknown and missing values.

emrpy.ml.encoders.encode_cats_pandas(train_df, cat_cols, test_df=None)

Encode categorical columns using OrdinalEncoder with handling for unknown and missing values.

Return type:

Tuple[DataFrame, Optional[DataFrame], OrdinalEncoder]

Parameters:

train_df : pandas.DataFrame

Training DataFrame containing the categorical columns to encode.

cat_cols : list

List of categorical column names to encode.

test_df : pandas.DataFrame, optional

Test DataFrame containing the same categorical columns (default=None).

unknown_value : int, optional

Value to use for unknown categories (default=-2).

missing_value : int, optional

Value to use for missing values (default=-1).

Returns:

Tuple[pd.DataFrame, Optional[pd.DataFrame], OrdinalEncoder]

Encoded training DataFrame, encoded test DataFrame (or None), and the fitted OrdinalEncoder instance for future use.