classifier_pipeline

class deduplipy.classifier_pipeline.ClassifierPipeline(interaction: bool = False)

Bases: sklearn.base.BaseEstimator

Classification pipeline used by ActiveStringMatchLearner. It does not raise an error when the targets contain only one class, which can happen during the first steps of active learning.

Parameters

interaction – Whether to include interaction features (default: False)
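
The source does not say how interaction features are built. A common meaning is the pairwise products of the input columns, and the sketch below illustrates that idea in plain Python. The helper `add_interaction_features` and the sample values are hypothetical and are not part of deduplipy.

```python
from itertools import combinations
from typing import List, Sequence


def add_interaction_features(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append the product of every pair of distinct columns to each row.

    Hypothetical helper for illustration only; not part of deduplipy.
    """
    expanded = []
    for row in rows:
        # One product per unordered pair of column indices (i < j).
        products = [row[i] * row[j] for i, j in combinations(range(len(row)), 2)]
        expanded.append(list(row) + products)
    return expanded


# Two similarity scores per record pair (made-up values).
features = [[0.5, 1.0], [0.25, 0.0]]
with_interactions = add_interaction_features(features)
```

With two input columns, each row gains one extra column holding their product, so a model can weigh the case where both similarity scores are high together.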

fit(X: Union[pandas.core.frame.DataFrame, numpy.ndarray], y: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → deduplipy.classifier_pipeline.classifier_pipeline.ClassifierPipeline

Fit the classification pipeline. Does not raise an error when the targets contain only one class, which can happen during the first steps of active learning.

Parameters
  • X – features

  • y – target

Returns

fitted instance
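
One way to tolerate single-class targets is to skip fitting the underlying model and fall back to a constant prediction until both classes have been labelled. The sketch below shows that pattern in plain Python. `SingleClassTolerantClassifier` and `ThresholdModel` are hypothetical names, and the source does not confirm that ClassifierPipeline works this way internally.

```python
class ThresholdModel:
    """Toy stand-in model (hypothetical): predicts 1 when the first
    feature exceeds the midpoint between the two class means."""

    def fit(self, X, y):
        positives = [row[0] for row, label in zip(X, y) if label == 1]
        negatives = [row[0] for row, label in zip(X, y) if label == 0]
        # Fails with ZeroDivisionError if either class is missing.
        self.threshold_ = (
            sum(positives) / len(positives) + sum(negatives) / len(negatives)
        ) / 2
        return self

    def predict(self, X):
        return [1 if row[0] > self.threshold_ else 0 for row in X]


class SingleClassTolerantClassifier:
    """Wrapper (hypothetical) that does not fail on single-class targets."""

    def __init__(self, model):
        self.model = model
        self.constant_ = None

    def fit(self, X, y):
        classes = set(y)
        if len(classes) == 1:
            # Only one class labelled so far: remember it instead of fitting.
            self.constant_ = next(iter(classes))
        else:
            self.constant_ = None
            self.model.fit(X, y)
        return self  # fitted instance, as in the fit() description above

    def predict(self, X):
        if self.constant_ is not None:
            return [self.constant_ for _ in X]
        return self.model.predict(X)


clf = SingleClassTolerantClassifier(ThresholdModel())

# Early in active learning: every labelled pair so far is a match.
early = clf.fit([[0.9], [0.8]], [1, 1]).predict([[0.1], [0.95]])

# Later: both classes are present, so the underlying model is fitted.
later = clf.fit([[0.9], [0.8], [0.2], [0.1]], [1, 1, 0, 0]).predict([[0.1], [0.95]])
```

Returning a constant is only one possible fallback. Its benefit in an active-learning loop is that the loop can keep requesting labels without special-casing the first few iterations.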

predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray

Predict using the fitted instance.

Parameters

X – features

Returns

predictions

predict_proba(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray

Predict probabilities using the fitted instance.

Parameters

X – features

Returns

predicted probabilities