multimodal.boosting package
Submodules
multimodal.boosting.boost module
- class multimodal.boosting.boost.UBoosting
Bases: BaseEstimator
Abstract class. MuComboClassifier and MumboClassifier inherit from UBoosting to share common methods.
multimodal.boosting.combo module
This module contains a Multi Confusion Matrix Boosting (CoMBo)
estimator for classification implemented in the MuComboClassifier class.
- class multimodal.boosting.combo.MuComboClassifier(estimator=None, n_estimators=50, random_state=None)
Bases: ClassifierMixin, UBoosting, BaseEnsemble
A MuCoMBo classifier.
A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:
It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.
This class implements the MuMBo algorithm [1].
- Parameters:
- estimator : object, optional (default=DecisionTreeClassifier)
Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.
- n_estimators : integer, optional (default=50)
Maximum number of estimators at which boosting is terminated.
- random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
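The three accepted forms of random_state can be illustrated with NumPy alone. The helper below, resolve_random_state, is a hypothetical name written for this sketch and is not part of multimodal; it only mirrors the semantics described above (scikit-learn ships sklearn.utils.check_random_state for the same purpose).

```python
import numpy as np


def resolve_random_state(random_state=None):
    """Hypothetical helper: turn the documented inputs into a RandomState."""
    if random_state is None:
        # The global RandomState instance used by np.random.
        return np.random.mtrand._rand
    if isinstance(random_state, (int, np.integer)):
        # An integer is used as the seed of a new generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used as-is.
        return random_state
    raise ValueError("random_state must be None, an int or a RandomState")


# Two generators built from the same seed produce the same draws.
draws_a = resolve_random_state(0).rand(3)
draws_b = resolve_random_state(0).rand(3)
print(np.allclose(draws_a, draws_b))
```

Passing an int is the simplest way to make repeated fits reproducible, which is what the Examples in this class do with random_state=0.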
See also
sklearn.ensemble.AdaBoostClassifier, sklearn.ensemble.GradientBoostingClassifier, sklearn.tree.DecisionTreeClassifier
References
[1] Sokol Koço and Cécile Capponi, “A Boosting Approach to Multiview Classification with Cooperation”, Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Volume Part II, 209–228, Springer-Verlag, 2011, https://link.springer.com/chapter/10.1007/978-3-642-23783-6_1
[2]Sokol Koço, “Tackling the uneven views problem with cooperation based ensemble learning methods”, PhD Thesis, Aix-Marseille Université, 2013, http://www.theses.fr/en/2013AIXM4101.
Examples
>>> from multimodal.boosting.combo import MuComboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5., 3., 1., 1.]]))
[0]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MuComboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MuComboClassifier(random_state=0)
>>> print(clf.predict([[ 5., 3., 1., 1.]]))
[0]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MuComboClassifier(estimator=estimator, random_state=1)
>>> clf.fit(X, y, views_ind)
MuComboClassifier(estimator=DecisionTreeClassifier(max_depth=2), random_state=1)
>>> print(clf.predict([[ 5., 3., 1., 1.]]))
[0]
- Attributes:
- estimators_ : list of classifiers
Collection of fitted sub-estimators.
- classes_ : numpy.ndarray, shape = (n_classes,)
Class labels.
- n_classes_ : int
Number of classes.
- n_views_ : int
Number of views.
- estimator_weights_ : numpy.ndarray of floats, shape = (len(estimators_),)
Weights for each estimator in the boosted ensemble.
- estimator_errors_ : array of floats
Empirical loss for each iteration.
- best_views_ : numpy.ndarray of integers, shape = (len(estimators_),)
Indices of the best view for each estimator in the boosted ensemble.
- n_yi_ : numpy.ndarray of int, shape = (n_classes,)
Number of training samples for each class.
- decision_function(X)
Compute the decision function of X.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- dec_fun : numpy.ndarray, shape = (n_view, n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
- fit(X, y, views_ind=None)
Build a multimodal boosted classifier from the training set (X, y).
- Parameters:
- X : dict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)
Training multi-view input samples, given either as a dictionary containing all views or as one of the array types above. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
Target values (class labels).
- views_ind : array-like (default=[0, n_features//2, n_features])
Parameter specifying how to extract the data views from X:
If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is made.
If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of view n, which is then given by X[:, views_ind[n]]. With this convention each view therefore creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.
- Returns:
- self : object
Returns self.
- Raises:
- ValueError : if the estimator does not support sample_weight.
- ValueError : if X and views_ind are not compatible.
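The difference between the two views_ind conventions can be checked directly with NumPy. The sketch below does not call multimodal; it only applies the indexing expressions quoted in the views_ind description to a small made-up array.

```python
import numpy as np

# Made-up data: 3 samples, 4 features.
X = np.arange(12, dtype=float).reshape(3, 4)

# Convention 1: sorted 1-D limits, each view is a contiguous slice of X.
limits = [0, 2, 4]
slice_views = [X[:, limits[n]:limits[n + 1]] for n in range(len(limits) - 1)]

# Convention 2: one list of column indices per view (fancy indexing).
indices = [[0, 2], [1, 3]]
index_views = [X[:, idx] for idx in indices]

# Slices share memory with X, index lists produce copies.
print([bool(np.shares_memory(v, X)) for v in slice_views])
print([bool(np.shares_memory(v, X)) for v in index_views])
```

The first line printed is [True, True] and the second is [False, False], which is the efficiency trade-off mentioned in the parameter description.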
- predict(X)
Predict classes for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : numpy.ndarray, shape = (n_samples,)
Predicted classes.
- Raises:
- ValueError : if the input matrix X does not have the same total number of features as the X used in fit.
- score(X, y)
Return the mean accuracy on the given test data and labels.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : float
Mean accuracy of self.predict(X) with respect to y.
- set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') -> MuComboClassifier
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- Parameters:
- views_ind : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the views_ind parameter in fit.
- Returns:
- self : object
The updated object.
- staged_decision_function(X)
Compute decision function of X for each boosting iteration.
This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- dec_fun : generator of numpy.ndarrays, shape = (n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
- staged_predict(X)
Return staged predictions for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : generator of numpy.ndarrays, shape = (n_samples,)
Predicted classes.
- staged_score(X, y)
Return staged mean accuracy on the given test data and labels.
This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : generator of floats
Mean accuracy of self.staged_predict(X) with respect to y.
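Because staged_score yields one score per boosting iteration, it can be consumed lazily to find the iteration with the best test accuracy. The sketch below is pure Python: best_iteration is a hypothetical helper, and fake_staged_score is a stand-in generator with made-up values that takes the place of clf.staged_score(X_test, y_test) on a fitted classifier.

```python
def best_iteration(staged_scores):
    """Return (iteration, score) for the first highest score (1-based)."""
    best_it, best_score = 0, float("-inf")
    for it, score in enumerate(staged_scores, start=1):
        if score > best_score:
            best_it, best_score = it, score
    return best_it, best_score


def fake_staged_score():
    """Stand-in for clf.staged_score(X_test, y_test); values are made up."""
    yield from [0.62, 0.71, 0.78, 0.77, 0.78]


print(best_iteration(fake_staged_score()))
```

With the values above the helper reports iteration 3, the first iteration reaching 0.78; ties are resolved in favour of the smaller ensemble.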
multimodal.boosting.mumbo module
Multimodal Boosting
This module contains a MultiModal Boosting (MuMBo)
estimator for classification implemented in the MumboClassifier class.
- class multimodal.boosting.mumbo.MumboClassifier(estimator=None, n_estimators=50, random_state=None, best_view_mode='edge')
Bases: ClassifierMixin, UBoosting, BaseEnsemble
A MuMBo classifier.
A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:
It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.
This class implements the MuMBo algorithm [1].
- Parameters:
- estimator : object, optional (default=DecisionTreeClassifier)
Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.
- n_estimators : integer, optional (default=50)
Maximum number of estimators at which boosting is terminated.
- random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
- best_view_mode : {"edge", "error"}, optional (default="edge")
Mode used to select the best view at each iteration:
if best_view_mode == "edge", the best view is the view maximizing the edge value (variable δ (delta) in [1]),
if best_view_mode == "error", the best view is the view minimizing the classification error.
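The two selection modes amount to an argmax over per-view edges or an argmin over per-view errors. The following sketch assumes those per-view quantities are already available; select_best_view is a hypothetical function, the numbers are made up, and the way the edge δ is computed (defined in [1]) is not reproduced here.

```python
def select_best_view(edges, errors, best_view_mode="edge"):
    """Hypothetical sketch: index of the best view for one iteration."""
    if best_view_mode == "edge":
        # View with the largest edge value.
        return max(range(len(edges)), key=lambda v: edges[v])
    if best_view_mode == "error":
        # View with the smallest classification error.
        return min(range(len(errors)), key=lambda v: errors[v])
    raise ValueError('best_view_mode must be "edge" or "error"')


# Made-up per-view values for three views.
edges = [0.10, 0.35, 0.20]
errors = [0.30, 0.40, 0.25]
print(select_best_view(edges, errors, "edge"))
print(select_best_view(edges, errors, "error"))
```

With these values the two modes disagree (view 1 for "edge", view 2 for "error"), which shows that the choice of mode can change which classifier is retained at an iteration.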
See also
sklearn.ensemble.AdaBoostClassifier, sklearn.ensemble.GradientBoostingClassifier, sklearn.tree.DecisionTreeClassifier
References
Examples
>>> from multimodal.boosting.mumbo import MumboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.0]]))
[1]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.0]]))
[1]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MumboClassifier(estimator=estimator, random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(estimator=DecisionTreeClassifier(max_depth=2), random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.]]))
[1]
- Attributes:
- estimators_ : list of classifiers
Collection of fitted sub-estimators.
- classes_ : numpy.ndarray, shape = (n_classes,)
Class labels.
- n_classes_ : int
Number of classes.
- estimator_weights_ : numpy.ndarray of floats, shape = (len(estimators_),)
Weights for each estimator in the boosted ensemble.
- estimator_errors_ : array of floats
Empirical loss for each iteration.
- best_views_ : numpy.ndarray of integers, shape = (len(estimators_),)
Indices of the best view for each estimator in the boosted ensemble.
- decision_function(X)
Compute the decision function of X.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_views * n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. Possibly also MultiModalData.
- Returns:
- dec_fun : numpy.ndarray, shape = (n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
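The output convention of decision_function can be turned into labels as sketched below. labels_from_decision is a hypothetical helper and the scores are made up. The binary branch follows the rule stated above; taking the arg-max over the k columns in the multi-class branch is an assumption based on the usual boosting convention, not something this page states.

```python
import numpy as np


def labels_from_decision(dec_fun, classes):
    """Hypothetical helper mapping decision values to class labels."""
    dec_fun = np.asarray(dec_fun, dtype=float)   # shape (n_samples, k)
    classes = np.asarray(classes)
    if dec_fun.shape[1] == 1:
        # Binary case: values <= 0 give classes[0], values > 0 give classes[1].
        return classes[(dec_fun[:, 0] > 0).astype(int)]
    # Multi-class case (assumption): class with the largest decision value.
    return classes[np.argmax(dec_fun, axis=1)]


binary = labels_from_decision([[-1.3], [0.0], [0.4]], ["a", "b"])
multi = labels_from_decision([[0.1, 0.7, 0.2], [0.9, 0.0, 0.1]], [0, 1, 2])
print(binary.tolist(), multi.tolist())
```

Note that a decision value of exactly 0 falls in the first class, as the description specifies.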
- fit(X, y, views_ind=None)
Build a multimodal boosted classifier from the training set (X, y).
- Parameters:
- X : dict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)
Training multi-view input samples, given either as a dictionary containing all views or as one of the array types above. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
Target values (class labels).
- views_ind : array-like (default=[0, n_features//2, n_features])
Parameter specifying how to extract the data views from X:
If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is made.
If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of view n, which is then given by X[:, views_ind[n]]. With this convention each view therefore creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.
- Returns:
- self : object
Returns self.
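When views_ind is left to None, the documented default splits the features into two views at the middle column. The sketch below computes that default and the slices it implies; default_views_ind and limits_to_slices are hypothetical helper names used only for illustration.

```python
def default_views_ind(n_features):
    """Documented default: limits of two views split at the middle feature."""
    return [0, n_features // 2, n_features]


def limits_to_slices(views_ind):
    """Turn sorted limits into one column slice per view."""
    return [slice(start, stop) for start, stop in zip(views_ind[:-1], views_ind[1:])]


print(default_views_ind(4))
print(limits_to_slices(default_views_ind(5)))
```

For an odd number of features the second view receives the extra column: with 5 features the views cover columns 0 to 1 and 2 to 4.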
- predict(X)
Predict classes for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : numpy.ndarray, shape = (n_samples,)
Predicted classes.
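A weighted combination of the ensemble members can be pictured as a weighted vote. The sketch below is a simplified illustration under that assumption and is not the library's implementation: weighted_vote is a hypothetical function, and the predictions and weights are made up rather than taken from estimators_ and estimator_weights_.

```python
import numpy as np


def weighted_vote(predictions, weights, classes):
    """Simplified sketch: each estimator votes for a class with its weight."""
    predictions = np.asarray(predictions)        # shape (n_estimators, n_samples)
    weights = np.asarray(weights, dtype=float)   # shape (n_estimators,)
    classes = np.asarray(classes)
    # votes[i, c] accumulates the weights of estimators predicting class c for sample i.
    votes = np.zeros((predictions.shape[1], classes.size))
    for pred, weight in zip(predictions, weights):
        votes += weight * (pred[:, None] == classes[None, :])
    return classes[np.argmax(votes, axis=1)]


preds = [[0, 1, 2],
         [0, 2, 2],
         [1, 2, 0]]
weights = [0.5, 0.3, 0.4]
result = weighted_vote(preds, weights, classes=[0, 1, 2])
print(result.tolist())
```

For the second sample, the first estimator alone (weight 0.5) loses against the two others voting for class 2 (combined weight 0.7), so the vote returns class 2.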
- score(X, y)
Return the mean accuracy on the given test data and labels.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : float
Mean accuracy of self.predict(X) with respect to y.
- set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') -> MumboClassifier
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- Parameters:
- views_ind : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the views_ind parameter in fit.
- Returns:
- self : object
The updated object.
- staged_decision_function(X)
Compute decision function of X for each boosting iteration.
This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. Possibly also MultiModalData.
- Returns:
- dec_fun : generator of numpy.ndarrays, shape = (n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
- staged_predict(X)
Return staged predictions for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : generator of numpy.ndarrays, shape = (n_samples,)
Predicted classes.
- staged_score(X, y)
Return staged mean accuracy on the given test data and labels.
This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : generator of floats
Mean accuracy of self.staged_predict(X) with respect to y.
Module contents
- class multimodal.boosting.MuComboClassifier(estimator=None, n_estimators=50, random_state=None)
Documentation identical to multimodal.boosting.combo.MuComboClassifier above.
- class multimodal.boosting.MumboClassifier(estimator=None, n_estimators=50, random_state=None, best_view_mode='edge')
Bases: ClassifierMixin, UBoosting, BaseEnsemble
A MuMBo classifier.
A MuMBo classifier is a meta-estimator that implements a multimodal (or multi-view) boosting algorithm:
It fits a set of classifiers on the original dataset split into several views and retains the classifier obtained for the best view. It then iterates the process on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.
This class implements the MuMBo algorithm [1].
- Parameters:
- estimator : object, optional (default=DecisionTreeClassifier)
Base estimator from which the boosted ensemble is built. Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes. The default is a DecisionTreeClassifier with parameter max_depth=1.
- n_estimators : integer, optional (default=50)
Maximum number of estimators at which boosting is terminated.
- random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
- best_view_mode : {"edge", "error"}, optional (default="edge")
Mode used to select the best view at each iteration:
if best_view_mode == "edge", the best view is the view maximizing the edge value (variable δ (delta) in [1]),
if best_view_mode == "error", the best view is the view minimizing the classification error.
See also
sklearn.ensemble.AdaBoostClassifier, sklearn.ensemble.GradientBoostingClassifier, sklearn.tree.DecisionTreeClassifier
References
Examples
>>> from multimodal.boosting.mumbo import MumboClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> views_ind = [0, 2, 4]  # view 0: sepal data, view 1: petal data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.0]]))
[1]
>>> views_ind = [[0, 2], [1, 3]]  # view 0: length data, view 1: width data
>>> clf = MumboClassifier(random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.0]]))
[1]

>>> from sklearn.tree import DecisionTreeClassifier
>>> estimator = DecisionTreeClassifier(max_depth=2)
>>> clf = MumboClassifier(estimator=estimator, random_state=0)
>>> clf.fit(X, y, views_ind)
MumboClassifier(estimator=DecisionTreeClassifier(max_depth=2), random_state=0)
>>> print(clf.predict([[ 5.8, 3., 1., 1.]]))
[1]
- Attributes:
- estimators_ : list of classifiers
Collection of fitted sub-estimators.
- classes_ : numpy.ndarray, shape = (n_classes,)
Class labels.
- n_classes_ : int
Number of classes.
- estimator_weights_ : numpy.ndarray of floats, shape = (len(estimators_),)
Weights for each estimator in the boosted ensemble.
- estimator_errors_ : array of floats
Empirical loss for each iteration.
- best_views_ : numpy.ndarray of integers, shape = (len(estimators_),)
Indices of the best view for each estimator in the boosted ensemble.
- decision_function(X)
Compute the decision function of X.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_views * n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. MultimodalData input may also be accepted.
- Returns:
- dec_fun : numpy.ndarray, shape = (n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
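As a minimal sketch of the convention above, decision values can be mapped back to class labels as follows. The binary rule follows the description directly; taking the column with the largest value in the multiclass case is an assumption (the usual convention for boosted ensembles), not something stated here. The helper name `classes_from_decision` is hypothetical.

```python
import numpy as np


def classes_from_decision(dec_fun, classes):
    """Map decision values to labels (hypothetical helper, not library code)."""
    dec_fun = np.asarray(dec_fun, dtype=float)
    classes = np.asarray(classes)
    if dec_fun.ndim == 1 or dec_fun.shape[1] == 1:
        # Binary case (k == 1): values <= 0 give classes[0], values > 0 give classes[1].
        return classes[(dec_fun.ravel() > 0).astype(int)]
    # Multiclass case (k == n_classes): assumed to pick the highest-scoring column.
    return classes[np.argmax(dec_fun, axis=1)]


binary_labels = classes_from_decision([[-0.3], [0.0], [1.2]], ["a", "b"])
multi_labels = classes_from_decision(
    [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]], [0, 1, 2]
)
```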
- fit(X, y, views_ind=None)
Build a multimodal boosted classifier from the training set (X, y).
- Parameters:
- X : dict, MultiModalData, MultiModalArray, MultiModalSparseArray or {array-like, sparse matrix}, shape = (n_samples, n_features)
Training multi-view input samples, given either as a dictionary with all views or as one of the other listed types. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
Target values (class labels).
- views_ind : array-like (default=[0, n_features//2, n_features])
Parameter specifying how to extract the data views from X:
If views_ind is a 1-D array of sorted integers, the entries indicate the limits of the slices used to extract the views, where view n is given by X[:, views_ind[n]:views_ind[n+1]]. With this convention each view is therefore a view (in the NumPy sense) of X and no copy of the data is made.
If views_ind is an array of arrays of integers, then each array of integers views_ind[n] specifies the indices of the view n, which is then given by X[:, views_ind[n]]. With this convention each view therefore creates a partial copy of the data in X. This convention is thus more flexible but less efficient than the previous one.
- Returns:
- self : object
Returns self.
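The two views_ind conventions accepted by fit can be checked with plain NumPy. This sketch only reproduces the indexing expressions quoted in the parameter description on a small illustrative array; it does not call the library, and the variable names are chosen for illustration.

```python
import numpy as np

# Toy data: 3 samples, 4 features (values are illustrative only).
X = np.arange(12, dtype=float).reshape(3, 4)

# Convention 1: sorted 1-D limits, view n is X[:, views_ind[n]:views_ind[n + 1]].
limits = [0, 2, 4]
slice_views = [X[:, limits[n]:limits[n + 1]] for n in range(len(limits) - 1)]

# Convention 2: one list of column indices per view, view n is X[:, views_ind[n]].
indices = [[0, 2], [1, 3]]
index_views = [X[:, idx] for idx in indices]

# Basic slicing returns NumPy views of X; integer-array indexing returns copies.
slices_share = [bool(np.shares_memory(X, v)) for v in slice_views]
indices_share = [bool(np.shares_memory(X, v)) for v in index_views]
```

Here `slices_share` contains only `True` and `indices_share` only `False`, which is the memory trade-off described for the two conventions.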
- predict(X)
Predict classes for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : numpy.ndarray, shape = (n_samples,)
Predicted classes.
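A generic weighted vote gives an idea of what a "weighted mean prediction" over an ensemble looks like. This is a sketch under stated assumptions, not the library's implementation: it assumes each estimator outputs one hard label per sample and has one scalar weight, and it ignores the per-view selection that the classifier performs. The helper name `weighted_vote` is hypothetical.

```python
import numpy as np


def weighted_vote(predictions, weights, classes):
    """Combine hard predictions by weighted voting (generic sketch, not library code)."""
    predictions = np.asarray(predictions)       # shape (n_estimators, n_samples)
    weights = np.asarray(weights, dtype=float)  # shape (n_estimators,)
    classes = np.asarray(classes)               # shape (n_classes,)
    # One-hot encode the votes: shape (n_estimators, n_samples, n_classes).
    votes = (predictions[:, :, None] == classes[None, None, :]).astype(float)
    # Weighted sum over estimators: shape (n_samples, n_classes).
    scores = np.tensordot(weights, votes, axes=1)
    # Each sample gets the class with the largest accumulated weight.
    return classes[np.argmax(scores, axis=1)]


# Three estimators voting on three samples (illustrative values).
predictions = [[0, 1, 1],
               [1, 1, 0],
               [0, 0, 0]]
weights = [0.5, 0.3, 0.4]
labels = weighted_vote(predictions, weights, classes=[0, 1])
```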
- score(X, y)
Return the mean accuracy on the given test data and labels.
- Parameters:
- X : {array-like, sparse matrix} of shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : float
Mean accuracy of self.predict(X) with respect to y.
- set_fit_request(*, views_ind: Union[bool, None, str] = '$UNCHANGED$') -> MumboClassifier
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- Parameters:
- views_ind : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for views_ind parameter in fit.
- Returns:
- self : object
The updated object.
- staged_decision_function(X)
Compute decision function of X for each boosting iteration.
This method allows monitoring (e.g. determining the error on a test set) after each boosting iteration.
- Parameters:
- X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR. MultimodalData input may also be accepted.
- Returns:
- dec_fun : generator of numpy.ndarrays, shape = (n_samples, k)
Decision function of the input samples. The order of outputs is the same as that of the classes_ attribute. Binary classification is a special case with k == 1, otherwise k == n_classes. For binary classification, values <= 0 mean classification in the first class in classes_ and values > 0 mean classification in the second class in classes_.
- staged_predict(X)
Return staged predictions for X.
The predicted class of an input sample is computed as the weighted mean prediction of the classifiers in the ensemble.
This generator method yields the ensemble prediction after each iteration of boosting and therefore allows monitoring, such as to determine the prediction on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix} of shape = (n_samples, n_features)
Multi-view input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- Returns:
- y : generator of numpy.ndarrays, shape = (n_samples,)
Predicted classes.
- staged_score(X, y)
Return staged mean accuracy on the given test data and labels.
This generator method yields the ensemble score after each iteration of boosting and therefore allows monitoring, such as to determine the score on a test set after each boost.
- Parameters:
- X : {array-like, sparse matrix} of shape = (n_samples, n_features)
Multi-view test samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK and LIL are converted to CSR.
- y : array-like, shape = (n_samples,)
True labels for X.
- Returns:
- score : generator of floats
Mean accuracy of self.staged_predict(X) with respect to y.
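One way to use the staged scores for monitoring is to find the boosting iteration with the highest score on held-out data. The sketch below works on any iterable of floats, so it can consume the generator returned by staged_score. The helper name `best_iteration` is hypothetical, and the scores in the example are illustrative values, not results from a fitted model.

```python
def best_iteration(staged_scores):
    """Return (1-based iteration, score) of the first highest staged score."""
    best_iter, best_score = 0, float("-inf")
    for iteration, score in enumerate(staged_scores, start=1):
        # Strict comparison keeps the earliest iteration in case of ties.
        if score > best_score:
            best_iter, best_score = iteration, score
    return best_iter, best_score


# Illustrative scores standing in for the output of staged_score.
example_scores = iter([0.6, 0.8, 0.8, 0.7])
result = best_iteration(example_scores)
```

With a fitted classifier `clf` and held-out data, the call would presumably look like `best_iteration(clf.staged_score(X_test, y_test))`, where `X_test` and `y_test` are names assumed for illustration.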