Python API

class genif.GeneralizedIsolationForest
Members

models

__init__(self: genif.GeneralizedIsolationForest, k: int, n_models: int, sample_size: int, kernel: str, kernel_scaling: numpy.ndarray[numpy.float64[m, 1]], sigma: float, worker_count: int = -1, seed: int = -1) → None

Initializes the GeneralizedIsolationForest with the following parameters:

Parameters
  • k (int) – The number of representatives to find for each node of the tree.

  • n_models (int) – The number of trees to fit.

  • sample_size (int) – The number of observations sampled to fit each tree.

  • kernel (str) – Name of the kernel to use (possible values: rbf, matern-d1, matern-d3, matern-d5).

  • kernel_scaling (ndarray) – Vector of scaling values for the kernel to be used (scalar for RBF, d-dimensional vector for Matern kernels).

  • sigma (float) – Threshold on the average pairwise kernel value of the observations in a data sub-region; the exit condition applies once this value is exceeded.

  • worker_count (int) – Number of parallel workers to use (-1 defaults to all available cores).

  • seed (int) – Seed for the random number generator (default: -1).
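The parameter list above can be illustrated with a minimal sketch. The helper names make_kernel_scaling and build_forest are hypothetical, and every hyperparameter value is an arbitrary placeholder rather than a recommended setting. The sketch also assumes that the "scalar for RBF" is passed as a length-1 float64 vector, since the signature types kernel_scaling as a vector.

```python
import numpy as np


def make_kernel_scaling(kernel: str, d: int, value: float = 1.0) -> np.ndarray:
    """Build a float64 scaling vector of the length described for `kernel`."""
    if kernel == "rbf":
        # "Scalar for RBF": assumed here to be passed as a length-1 vector.
        return np.full(1, value, dtype=np.float64)
    if kernel in ("matern-d1", "matern-d3", "matern-d5"):
        # One scaling value per input dimension.
        return np.full(d, value, dtype=np.float64)
    raise ValueError(f"unknown kernel: {kernel!r}")


def build_forest(d: int, kernel: str = "rbf"):
    """Construct a forest; all hyperparameter values are illustrative only."""
    import genif  # imported lazily so the helper above works without genif

    return genif.GeneralizedIsolationForest(
        k=10,
        n_models=100,
        sample_size=256,
        kernel=kernel,
        kernel_scaling=make_kernel_scaling(kernel, d),
        sigma=0.01,
        worker_count=-1,  # -1: use all available cores
        seed=42,
    )
```

Calling build_forest(d=3) would return an unfitted forest for three-dimensional data, provided the genif package is installed.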

fit(self: genif.GeneralizedIsolationForest, X: numpy.ndarray[numpy.float64[m, n]]) → genif.GIFModel_ODR_Learner

Fits the forest using the provided input data matrix.

Parameters

X (ndarray) – Input data matrix with shape [n, d], i.e. n observations with d dimensions each.

Returns

The callee, i.e. the object on which fit was called.

fit_predict(self: genif.GeneralizedIsolationForest, X: numpy.ndarray[numpy.float64[m, n]]) → numpy.ndarray[numpy.float64[m, 1]]

Fits the forest using the given input data matrix and predicts, for every input observation, the probability of being an inlier.

Parameters

X (ndarray) – Input data matrix with shape [n, d].

Returns

Vector of probabilities, represented as ndarray with shape [n, 1].
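As a sketch of how the returned probabilities might be consumed, the hypothetical helpers below turn inlier probabilities into an outlier mask. The 0.5 threshold is an assumption for illustration and is not part of the API. The result is flattened first, so the code works whether the vector arrives with shape (n,) or (n, 1).

```python
import numpy as np


def flag_outliers(inlier_prob, threshold: float = 0.5) -> np.ndarray:
    """Return a mask that is True where the inlier probability is below `threshold`."""
    # Flatten so that both shape (n,) and shape (n, 1) results are handled.
    p = np.asarray(inlier_prob, dtype=np.float64).ravel()
    return p < threshold


def score_and_flag(forest, X, threshold: float = 0.5):
    """Fit `forest` on X, then flag the observations that look like outliers."""
    inlier_prob = forest.fit_predict(X)  # one probability per row of X
    return inlier_prob, flag_outliers(inlier_prob, threshold)
```

Here forest stands for an already constructed GeneralizedIsolationForest; a suitable threshold depends on the data and has to be chosen by the user.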

predict(self: genif.GeneralizedIsolationForest, X: numpy.ndarray[numpy.float64[m, n]]) → numpy.ndarray[numpy.float64[m, 1]]

Predicts, for every observation (row) of the data matrix, the probability of being an inlier. Either fit or fit_predict must be called before predict.

Parameters

X (ndarray) – Input data matrix with shape [n, d].

Returns

Vector of probabilities, represented as ndarray with shape [n, 1].
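The required call order (fit first, then predict) can be sketched as follows. The helpers as_data_matrix and fit_then_predict are hypothetical conveniences, not part of genif; they only use the fit and predict methods documented above and assume that new data must have the same number of columns as the training data.

```python
import numpy as np


def as_data_matrix(X) -> np.ndarray:
    """Convert X to a 2-D float64 array with one observation per row."""
    A = np.asarray(X, dtype=np.float64)
    if A.ndim != 2:
        raise ValueError(f"expected a 2-D data matrix, got {A.ndim} dimension(s)")
    return A


def fit_then_predict(forest, X_train, X_new) -> np.ndarray:
    """Fit on X_train, then return inlier probabilities for X_new."""
    X_train = as_data_matrix(X_train)
    X_new = as_data_matrix(X_new)
    if X_train.shape[1] != X_new.shape[1]:
        raise ValueError("X_train and X_new must have the same number of columns")
    forest.fit(X_train)           # must happen before predict
    return forest.predict(X_new)  # one probability per row of X_new
```

Converting to float64 up front matches the numpy.float64 element type shown in the signatures and surfaces shape mistakes before they reach the library.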