emrpy.ml
- emrpy.ml.encode_cats_pandas(train_df, cat_cols, test_df=None)
Encode categorical columns using OrdinalEncoder with handling for unknown and missing values.
- Return type:
Tuple[DataFrame, Optional[DataFrame], OrdinalEncoder]
Parameters:
- train_df : pandas.DataFrame
Training DataFrame containing the categorical columns to encode.
- cat_cols : list
List of categorical column names to encode.
- test_df : pandas.DataFrame, optional
Test DataFrame containing the same categorical columns (default=None).
- unknown_value : int, optional
Value to use for unknown categories (default=-2).
- missing_value : int, optional
Value to use for missing values (default=-1).
Returns:
Tuple[pd.DataFrame, Optional[pd.DataFrame], OrdinalEncoder]
Encoded training DataFrame, encoded test DataFrame (or None), and the fitted OrdinalEncoder instance for future use.