multimodal.boosting package

Submodules

multimodal.boosting.boost module

class multimodal.boosting.boost.UBoosting

Bases: BaseEstimator

Abstract base class. MuComboClassifier and MumboClassifier inherit their shared methods from UBoosting.

multimodal.boosting.combo module

This module contains a Multi-Confusion Matrix Boosting (CoMBo) estimator for classification, implemented in the MuComboClassifier class.

class multimodal.boosting.combo.MuComboClassifier(estimator=None, n_estimators=50, random_state=None)

Bases: ClassifierMixin, UBoosting, BaseEnsemble

A MuCoMBo classifier.

A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:

It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.

This class implements the MuMBo algorithm [1].

Parameters:

estimator : object, optional (default=DecisionTreeClassifier): Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.
n_estimators : integer, optional (default=50): Maximum number of estimators at which boosting is terminated.
random_state : int, RandomState instance or None, optional (default=None): If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

See also

sklearn.ensemble.AdaBoostClassifier
sklearn.ensemble.GradientBoostingClassifier
sklearn.tree.DecisionTreeClassifier

References

[1]

Sokol Koço and Cécile Capponi, “A Boosting Approach to Multiview Classification with Cooperation”, Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Volume Part II, pp. 209–228, Springer-Verlag, 2011, https://link.springer.com/chapter/10.1007/978-3-642-23783-6_1

[2]

Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.

Examples

>>> from multimodal.boosting.combo import MuComboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MuComboClassifier(estimator=estimator, random_state=1)
>>> clf.fit(X, y, views_ind)  
MuComboClassifier(estimator=DecisionTreeClassifier(max_depth=2),
                  random_state=1)
>>> print(clf.predict([[ 5.,  3.,  1.,  1.]]))
[0]

Attributes:

estimators_ : list of classifiers: Collection of fitted sub-estimators.
classes_ : numpy.ndarray, shape = (n_classes,): Class labels.
n_classes_ : int: Number of classes.
n_views_ : int: Number of views.
estimator_weights_ : numpy.ndarray of floats, shape = (len(estimators_),): Weights for each estimator in the boosted ensemble.
estimator_errors_ : array of floats: Empirical loss for each iteration.
best_views_ : numpy.ndarray of integers, shape = (len(estimators_),): Indices of the best view for each estimator in the boosted ensemble.
n_yi_ : numpy.ndarray of int, shape = (n_classes,): Number of training samples for each class.

decision_function(X)

Compute the decision function of X.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

dec_fun : numpy.ndarray, shape = (n_view, n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1; otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.

fit(X, y, views_ind=None)

Build a multimodal boosted classifier from the training set (X, y).

Parameters:

X : dict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)

Training multi-view input samples, given either as a dictionary with all views or as one of the array types above. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

y : array-like, shape = (n_samples,)

Target values (class labels).

views_ind : array-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is a view (in the NumPy sense) of X and no copy of the data is made.

If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of view n, which is then given by X[:, views_ind[n]]. With this convention each view creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.
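The two views_ind conventions can be illustrated with plain NumPy indexing. This is a minimal sketch of the slicing described here, not the library's internal code; the toy array X and the variable names are made up for the illustration.

```python
import numpy as np

# Toy data: 3 samples, 4 features (values are arbitrary).
X = np.arange(12, dtype=float).reshape(3, 4)

# Convention 1: sorted slice limits. View n is X[:, limits[n]:limits[n + 1]].
limits = [0, 2, 4]
slice_views = [X[:, limits[n]:limits[n + 1]] for n in range(len(limits) - 1)]

# Convention 2: explicit column indices. View n is X[:, indices[n]].
indices = [[0, 2], [1, 3]]
index_views = [X[:, cols] for cols in indices]

# Basic slicing returns NumPy views that share memory with X ...
slices_share_memory = all(np.shares_memory(X, v) for v in slice_views)
# ... whereas integer-array (fancy) indexing returns copies.
indexed_share_memory = any(np.shares_memory(X, v) for v in index_views)

print(slices_share_memory, indexed_share_memory)  # True False
```

The second convention can express interleaved columns such as [[0, 2], [1, 3]], which no pair of contiguous slices can, at the cost of copying the selected columns.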

Returns:

self : object: Returns self.

Raises:

ValueError: If the estimator does not support sample_weight.
ValueError: If X and views_ind are not compatible.

predict(X)

Predict classes for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

y : numpy.ndarray, shape = (n_samples,): Predicted classes.

Raises:

ValueError: If X does not have the same total number of features as the X used in fit.

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
y : array-like, shape = (n_samples,): True labels for X.

Returns:

score : float: Mean accuracy of self.predict(X) with respect to y.

set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') → MuComboClassifier

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:

views_ind : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for views_ind parameter in fit.

Returns:

self : object: The updated object.

staged_decision_function(X)

Compute decision function of X for each boosting iteration.

This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

dec_fun : generator of numpy.ndarrays, shape = (n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1; otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.

staged_predict(X)

Return staged predictions for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

y : generator of numpy.ndarrays, shape = (n_samples,): Predicted classes.

staged_score(X, y)

Return staged mean accuracy on the given test data and labels.

This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
y : array-like, shape = (n_samples,): True labels for X.

Returns:

score : generator of floats: Mean accuracy of self.staged_predict(X) with respect to y.
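Because staged_score yields one score per boosting iteration, it can be consumed lazily to find the iteration count that scored best on held-out data. The helper below is a hypothetical sketch (best_iteration is not part of the package) and works on any iterable of floats; the example scores are invented stand-ins for a real generator.

```python
def best_iteration(staged_scores):
    """Return (n_estimators, score) for the highest score in an iterable.

    Ties are resolved in favour of the earliest iteration.
    """
    best_n, best_score = 0, float("-inf")
    for n, score in enumerate(staged_scores, start=1):
        if score > best_score:
            best_n, best_score = n, score
    if best_n == 0:
        raise ValueError("staged_scores is empty")
    return best_n, best_score


# Stand-in for clf.staged_score(X_test, y_test); the values are invented.
scores = iter([0.70, 0.82, 0.90, 0.90, 0.88])
print(best_iteration(scores))  # (3, 0.9)
```

With a fitted classifier, the call would presumably look like best_iteration(clf.staged_score(X_test, y_test)), and the returned count could then be passed as n_estimators when refitting.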

multimodal.boosting.mumbo module

Multimodal Boosting

This module contains a MultiModal Boosting (MuMBo) estimator for classification implemented in the MumboClassifier class.

class multimodal.boosting.mumbo.MumboClassifier(estimator=None, n_estimators=50, random_state=None, best_view_mode='edge')

Bases: ClassifierMixin, UBoosting, BaseEnsemble

A MuMBo classifier.

A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:

It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.

This class implements the MuMBo algorithm [1].

Parameters:

estimator : object, optional (default=DecisionTreeClassifier)

Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.

n_estimators : integer, optional (default=50)

Maximum number of estimators at which boosting is terminated.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

best_view_mode : {"edge", "error"}, optional (default="edge")

Mode used to select the best view at each iteration:

if best_view_mode == "edge", the best view is the view maximizing the edge value (variable δ (delta) in [1]),
if best_view_mode == "error", the best view is the view minimizing the classification error.
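The selection rule for best_view_mode can be sketched as follows. This is an illustration only: select_best_view is a hypothetical helper, and it assumes that a per-view edge and a per-view classification error have already been computed for the current iteration; how the library computes them is not shown here.

```python
def select_best_view(edges, errors, best_view_mode="edge"):
    """Pick a view index from per-view edges and per-view error rates."""
    if best_view_mode == "edge":
        # The view with the largest edge wins.
        return max(range(len(edges)), key=lambda v: edges[v])
    if best_view_mode == "error":
        # The view with the smallest classification error wins.
        return min(range(len(errors)), key=lambda v: errors[v])
    raise ValueError('best_view_mode must be "edge" or "error"')


# Invented values for three views.
edges = [0.10, 0.35, 0.30]
errors = [0.40, 0.25, 0.20]
print(select_best_view(edges, errors, "edge"))   # 1
print(select_best_view(edges, errors, "error"))  # 2
```

The invented numbers show that the two modes need not agree: view 1 has the largest edge while view 2 has the lowest error.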

See also

sklearn.ensemble.AdaBoostClassifier
sklearn.ensemble.GradientBoostingClassifier
sklearn.tree.DecisionTreeClassifier

References

[1]

Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.

Examples

>>> from multimodal.boosting.mumbo import MumboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.0]]))
[1]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.0]]))
[1]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MumboClassifier(estimator=estimator, random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(estimator=DecisionTreeClassifier(max_depth=2),
                random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.]]))
[1]

Attributes:

estimators_ : list of classifiers: Collection of fitted sub-estimators.
classes_ : numpy.ndarray, shape = (n_classes,): Class labels.
n_classes_ : int: Number of classes.
estimator_weights_ : numpy.ndarray of floats, shape = (len(estimators_),): Weights for each estimator in the boosted ensemble.
estimator_errors_ : array of floats: Empirical loss for each iteration.
best_views_ : numpy.ndarray of integers, shape = (len(estimators_),): Indices of the best view for each estimator in the boosted ensemble.

decision_function(X)

Compute the decision function of X.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_views * n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. X may also be a MultiModalData instance.

Returns:

dec_fun : numpy.ndarray, shape = (n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1; otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
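The output convention of decision_function can be made concrete with a small helper that maps decision values back to class labels. This is a sketch under assumptions: labels_from_decision is a hypothetical name, and for the multiclass case it assumes that the column with the largest value determines the class, which is the usual reading of such a decision function but is not stated explicitly here.

```python
import numpy as np


def labels_from_decision(dec_fun, classes):
    """Map decision values of shape (n_samples, k) to class labels."""
    dec_fun = np.asarray(dec_fun, dtype=float)
    classes = np.asarray(classes)
    if dec_fun.shape[1] == 1:
        # Binary case: values <= 0 -> classes[0], values > 0 -> classes[1].
        return classes[(dec_fun[:, 0] > 0).astype(int)]
    # Multiclass case (assumption): column j corresponds to classes[j]
    # and the largest value wins.
    return classes[np.argmax(dec_fun, axis=1)]


binary = labels_from_decision([[-0.3], [0.0], [1.2]], ["neg", "pos"])
multi = labels_from_decision([[0.1, 0.7, 0.2], [0.9, 0.0, 0.1]], [0, 1, 2])
print(binary.tolist(), multi.tolist())  # ['neg', 'neg', 'pos'] [1, 0]
```

Note that a decision value of exactly 0 falls in the first class, matching the "values <= 0" wording.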

fit(X, y, views_ind=None)

Build a multimodal boosted classifier from the training set (X, y).

Parameters:

X : dict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)

Training multi-view input samples, given either as a dictionary with all views or as one of the array types above. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

y : array-like, shape = (n_samples,)

Target values (class labels).

views_ind : array-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is a view (in the NumPy sense) of X and no copy of the data is made.

If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of view n, which is then given by X[:, views_ind[n]]. With this convention each view creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.

Returns:

self : object: Returns self.

predict(X)

Predict classes for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

y : numpy.ndarray, shape = (n_samples,): Predicted classes.
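One common reading of a "weighted mean prediction" in boosting is a weighted vote: each sub-estimator votes for a label with its estimator weight, and the label with the largest total wins. The sketch below illustrates that idea for a single sample; weighted_vote is a hypothetical helper, and whether MumboClassifier computes exactly this internally is an assumption.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight for one sample.

    predictions: one predicted label per sub-estimator.
    weights: one non-negative weight per sub-estimator.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Dividing by sum(weights) would give the weighted mean, but it does
    # not change which label comes out on top.
    return max(totals, key=totals.get)


print(weighted_vote([0, 1, 1, 2], [0.9, 0.3, 0.4, 0.1]))  # 0
```

Here label 1 receives two votes but a total weight of only 0.7, so the single heavily weighted vote for label 0 prevails.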

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
y : array-like, shape = (n_samples,): True labels for X.

Returns:

score : float: Mean accuracy of self.predict(X) with respect to y.

set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') → MumboClassifier

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:

views_ind : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for views_ind parameter in fit.

Returns:

self : object: The updated object.

staged_decision_function(X)

Compute decision function of X for each boosting iteration.

This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. X may also be a MultiModalData instance.

Returns:

dec_fun : generator of numpy.ndarrays, shape = (n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1; otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.

staged_predict(X)

Return staged predictions for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

y : generator of numpy.ndarrays, shape = (n_samples,): Predicted classes.

staged_score(X, y)

Return staged mean accuracy on the given test data and labels.

This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.

Parameters:

X : {array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
y : array-like, shape = (n_samples,): True labels for X.

Returns:

score : generator of floats: Mean accuracy of self.staged_predict(X) with respect to y.

Module contents

class multimodal.boosting.MuComboClassifier(estimator=None, n_estimators=50, random_state=None)

Bases: ClassifierMixin, UBoosting, BaseEnsemble

Same class as multimodal.boosting.combo.MuComboClassifier, re-exported at package level. Its parameters, attributes, examples, and methods are documented in the multimodal.boosting.combo module section above.

class multimodal.boosting.MumboClassifier(estimator=None, n_estimators=50, random_state=None, best_view_mode='edge')

Bases: ClassifierMixin, UBoosting, BaseEnsemble

A MuMBo classifier.

A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:

It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.

This class implements the MuMBo algorithm [1].

Parameters:

estimator : object, optional (default=DecisionTreeClassifier)

Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.

n_estimators : integer, optional (default=50)

Maximum number of estimators at which boosting is terminated.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

best_view_mode : {"edge", "error"}, optional (default="edge")

Mode used to select the best view at each iteration:

if best_view_mode == "edge", the best view is the view maximizing the edge value (variable δ (delta) in [1]),
if best_view_mode == "error", the best view is the view minimizing the classification error.

See also

sklearn.ensemble.AdaBoostClassifier
sklearn.ensemble.GradientBoostingClassifier
sklearn.tree.DecisionTreeClassifier

References

[1]

Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.

Examples

>>> from multimodal.boosting.mumbo import MumboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.0]]))
[1]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.0]]))
[1]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MumboClassifier(estimator=estimator, random_state=0)
>>> clf.fit(X, y, views_ind)  
MumboClassifier(estimator=DecisionTreeClassifier(max_depth=2),
                random_state=0)
>>> print(clf.predict([[ 5.8,  3.,  1.,  1.]]))
[1]

Attributes:

estimators_list of classifiers: Collection of fitted sub-estimators.
classes_numpy.ndarray, shape = (n_classes,): Class labels.
n_classes_int: Number of classes.
estimator_weights_numpy.ndarray of floats, shape = (len(estimators_),): Weights for each estimator in the boosted ensemble.
estimator_errors_array of floats: Empirical loss for each iteration.
best_views_numpy.ndarray of integers, shape = (len(estimators_),): Indices of the best view for each estimator in the boosted ensemble.

decision_function(X)

Compute the decision function of X.

Parameters:

X{array-like, sparse matrix}, shape = (n_samples, n_views * n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. MultimodalData may also be accepted.

Returns:

dec_funnumpy.ndarray, shape = (n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.
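
As an illustration of how such output can be read, the sketch below maps decision values back to class labels. The binary rule follows the description above; the multiclass argmax rule is an assumption borrowed from common boosting practice, and labels_from_decision is a hypothetical helper, not a library function.

```python
import numpy as np


def labels_from_decision(dec_fun, classes):
    """Map decision-function values to class labels (hypothetical helper)."""
    dec_fun = np.asarray(dec_fun, dtype=float)
    classes = np.asarray(classes)
    if dec_fun.ndim == 1 or dec_fun.shape[1] == 1:
        # Binary case (k == 1): values <= 0 -> classes[0], values > 0 -> classes[1].
        return classes[(dec_fun.ravel() > 0).astype(int)]
    # Multiclass case (k == n_classes): assumed rule, take the highest-scoring column.
    return classes[np.argmax(dec_fun, axis=1)]


binary = labels_from_decision([[-0.7], [0.0], [0.4]], classes=["a", "b"])
multi = labels_from_decision([[0.1, 0.8, 0.1], [0.6, 0.3, 0.1]], classes=[0, 1, 2])
```

Note that a value of exactly 0 falls in the first class, matching the <=0 convention stated above.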

fit(X, y, views_ind=None)

Build a multimodal boosted classifier from the training set (X, y).

Parameters:

Xdict of all views, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)

Training multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

yarray-like, shape = (n_samples,)

Target values (class labels).

views_indarray-like (default=[0, n_features//2, n_features])

Parameter specifying how to extract the data views from X:

If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is a view (in the NumPy sense) of X, and no copy of the data is made.
If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of view n, which is then given by X[:, views_ind[n]]. With this convention each view creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.

Returns:

selfobject: Returns self.
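
The two views_ind conventions accepted by fit can be illustrated with plain NumPy indexing. This is a sketch of the slicing semantics only, on a small made-up array; it does not call the classifier and says nothing about how the library stores views internally.

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)   # 3 samples, 4 features

# Convention 1: sorted 1-D limits; view n is X[:, views_ind[n]:views_ind[n+1]].
limits = [0, 2, 4]
slice_views = [X[:, limits[n]:limits[n + 1]] for n in range(len(limits) - 1)]

# Convention 2: one index array per view; view n is X[:, views_ind[n]].
indices = [[0, 2], [1, 3]]
index_views = [X[:, idx] for idx in indices]

# Basic slicing returns NumPy views on X; integer-array indexing returns copies.
slices_share_memory = all(np.shares_memory(v, X) for v in slice_views)
copies_share_memory = any(np.shares_memory(v, X) for v in index_views)
```

Here slices_share_memory is True and copies_share_memory is False, which is the efficiency trade-off described for the views_ind parameter: the second convention can pick non-contiguous columns, at the cost of copying them.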

predict(X)

Predict classes for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

Parameters:

X{array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

ynumpy.ndarray, shape = (n_samples,): Predicted classes.
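
A weighted combination of this kind can be sketched as a weighted vote over the estimators' class predictions. This is an AdaBoost-style illustration under stated assumptions, not the library's implementation; weighted_vote and its inputs are hypothetical.

```python
import numpy as np


def weighted_vote(predictions, weights, classes):
    """Combine per-estimator class predictions using estimator weights.

    predictions: shape (n_estimators, n_samples), entries drawn from `classes`.
    weights: shape (n_estimators,), one non-negative weight per estimator.
    """
    predictions = np.asarray(predictions)
    weights = np.asarray(weights, dtype=float)
    classes = np.asarray(classes)
    # one_hot[t, i, c] is True where estimator t predicts class c for sample i.
    one_hot = predictions[:, :, None] == classes[None, None, :]
    # Weighted mean of the one-hot votes, per sample and class.
    scores = (weights[:, None, None] * one_hot).sum(axis=0) / weights.sum()
    return classes[np.argmax(scores, axis=1)]


preds = [[0, 1, 2],
         [0, 2, 2],
         [1, 2, 0]]
y_pred = weighted_vote(preds, weights=[0.5, 0.3, 0.4], classes=[0, 1, 2])
```

For the second sample, class 1 receives weight 0.5 and class 2 receives 0.3 + 0.4 = 0.7, so the two lower-weighted estimators together outvote the highest-weighted one.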

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters:

X{array-like, sparse matrix} of shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
yarray-like, shape = (n_samples,): True labels for X.

Returns:

scorefloat: Mean accuracy of self.predict(X) wrt. y.

set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') → MumboClassifier

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Parameters:

views_indstr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for views_ind parameter in fit.

Returns:

selfobject: The updated object.

staged_decision_function(X)

Compute decision function of X for each boosting iteration.

This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.

Parameters:

X{array-like, sparse matrix}, shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. MultimodalData may also be accepted.

Returns:

dec_fungenerator of numpy.ndarrays, shape = (n_samples, k): Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <=0 mean classification in the first class in classes_ and values >0 mean classification in the second class in classes_.

staged_predict(X)

Return staged predictions for X.

The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.

This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.

Parameters:

X{array-like, sparse matrix} of shape = (n_samples, n_features): Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.

Returns:

ygenerator of numpy.ndarrays, shape = (n_samples,): Predicted classes.

staged_score(X, y)

Return staged mean accuracy on the given test data and labels.

This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.

Parameters:

X{array-like, sparse matrix} of shape = (n_samples, n_features): Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
yarray-like, shape = (n_samples,): True labels for X.

Returns:

scoregenerator of floats: Mean accuracy of self.staged_predict(X) wrt. y.
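
Staged scores are typically used to pick the number of boosting iterations. The sketch below mimics that pattern with a stand-in generator built on made-up per-estimator scores; staged_accuracy and its inputs are hypothetical, and the running weighted sum is an assumption about how stages accumulate, not the library's code.

```python
import numpy as np


def staged_accuracy(stage_scores, weights, y):
    """Yield the accuracy of the running ensemble after each iteration.

    stage_scores: shape (n_estimators, n_samples, n_classes), hypothetical
    per-estimator class scores (here one-hot votes).
    """
    y = np.asarray(y)
    total = np.zeros(np.asarray(stage_scores[0]).shape, dtype=float)
    for w, s in zip(weights, stage_scores):
        total += w * np.asarray(s, dtype=float)          # accumulate weighted scores
        yield float(np.mean(np.argmax(total, axis=1) == y))


y_test = [0, 1, 1]
stage_scores = [
    [[1, 0], [1, 0], [0, 1]],   # estimator 1 predicts [0, 0, 1]
    [[0, 1], [0, 1], [0, 1]],   # estimator 2 predicts [1, 1, 1]
    [[1, 0], [0, 1], [1, 0]],   # estimator 3 predicts [0, 1, 0]
]
scores = list(staged_accuracy(stage_scores, weights=[1.0, 0.5, 0.8], y=y_test))

# Smallest number of estimators that reaches the best staged score.
best_n_estimators = int(np.argmax(scores)) + 1
```

With a fitted classifier, the same selection would iterate over clf.staged_score(X_test, y_test) instead of the stand-in generator.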