active_learning

class deduplipy.active_learning.ActiveStringMatchLearner(col_names: List[str], interaction: bool = False, uncertainty_threshold: float = 0.1, verbose: Union[int, bool] = 0, uncertainty_improvement_threshold: float = 0.01, min_nr_entries: int = 10)

Bases: object

fit(X: pandas.core.frame.DataFrame) → deduplipy.active_learning.active_learning.ActiveStringMatchLearner

Fit the ActiveStringMatchLearner instance on pairs of strings.

Args:

X: Pandas dataframe containing pairs of strings
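As an illustration of what a frame of string pairs might look like, the sketch below builds one row per unordered pair of records. The `_1`/`_2` column suffixes and the helper name `make_pair_rows` are assumptions made for this example, not something this page documents. The code uses only the standard library; passing the resulting list to `pandas.DataFrame(pair_rows)` would turn it into a dataframe.

```python
from itertools import combinations


def make_pair_rows(records, col_names):
    """Return one dict per unordered pair of records.

    Hypothetical helper: the `_1` / `_2` suffixes are an assumed naming
    convention for the two sides of a pair.
    """
    rows = []
    for left, right in combinations(records, 2):
        row = {}
        for col in col_names:
            row[f"{col}_1"] = left[col]   # value from the first record
            row[f"{col}_2"] = right[col]  # value from the second record
        rows.append(row)
    return rows


records = [
    {"name": "John Smith"},
    {"name": "Jon Smith"},
    {"name": "Mary Jones"},
]

# Three records yield three unordered pairs.
pair_rows = make_pair_rows(records, ["name"])
```

Enumerating every pair grows quadratically with the number of records, so this form is only practical for small inputs.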

predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray

Predict whether each pair in new data is a match.

Args:

X: Pandas dataframe or NumPy array to predict on

Returns: predictions

predict_proba(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray

Predict, for new data, the probability that each pair is a match.

Args:

X: Pandas dataframe or NumPy array to predict on

Returns: match probabilities
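Match probabilities are useful when a cutoff other than the default decision is wanted. The sketch below turns a flat sequence of match probabilities into 0/1 labels. It assumes one probability per pair. This page does not state the shape of the returned array, so if it has one column per class, select the match column first. The helper name `labels_from_probabilities` and the 0.5 default are assumptions for illustration, not the library's documented cutoff.

```python
def labels_from_probabilities(match_probabilities, threshold=0.5):
    """Label a pair 1 (match) when its probability reaches the threshold, else 0.

    Hypothetical helper; `threshold` is an illustrative default.
    """
    return [1 if p >= threshold else 0 for p in match_probabilities]


probs = [0.92, 0.48, 0.07]

# Default cutoff: only the first pair is labelled a match.
default_labels = labels_from_probabilities(probs)

# A stricter cutoff trades recall for precision.
strict_labels = labels_from_probabilities(probs, threshold=0.95)
```

Raising the threshold reduces false matches at the cost of missing some true ones, and lowering it does the opposite.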