summit.multiview_platform.utils package

Submodules

summit.multiview_platform.utils.base module

class BaseClassifier

Bases: BaseEstimator

accepts_multi_class(random_state, n_samples=10, dim=2, n_classes=3)

Base function to test if the classifier accepts a multiclass task. It is highly recommended to override it with a simple method that returns True or False in the classifier’s module, as it will speed up the benchmark.
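A minimal sketch of the recommended override, assuming a classifier that is known to handle multiclass tasks. BaseClassifierStub and MyMulticlassClassifier are hypothetical stand-ins, not SuMMIT classes; only the method name and signature are taken from the reference above.

```python
class BaseClassifierStub:
    """Hypothetical stand-in for BaseClassifier (not the SuMMIT class)."""

    def accepts_multi_class(self, random_state, n_samples=10, dim=2,
                            n_classes=3):
        # The documented base method tests the classifier on a small
        # multiclass task; that slower check is not reproduced here.
        raise NotImplementedError("base check not reproduced in this sketch")


class MyMulticlassClassifier(BaseClassifierStub):
    """Hypothetical classifier that is known to support multiclass tasks."""

    def accepts_multi_class(self, random_state, n_samples=10, dim=2,
                            n_classes=3):
        # Constant answer: no test fit is needed, which saves benchmark time.
        return True


clf = MyMulticlassClassifier()
print(clf.accepts_multi_class(random_state=None))
```

A classifier restricted to two classes would return False in the same way.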

gen_best_params(detector)

Returns the best parameters of detector.

Returns:

best param – Dictionary with the parameter names as keys and their best values as values.

Return type:

dict

gen_distribs()
gen_params_from_detector(detector)
get_base_estimator(estimator, estimator_config)
get_config()

Generates a string containing all the information about the classifier’s configuration.

get_interpretation(directory, base_file_name, y_test, feature_ids, multi_class=False)

Base method that returns an empty string if there is no interpretation method in the classifier’s module.

params_to_string()

Formats the parameters of the classifier as a string

to_str(param_name)

Formats a parameter into a string

class ResultAnalyser(classifier, classification_indices, k_folds, hps_method, metrics_dict, n_iter, class_label_names, pred, directory, base_file_name, labels, database_name, nb_cores, duration, feature_ids)

Bases: object

A shared result analysis tool for mono and multiview classifiers. The main utility of this class is to generate a txt file summarizing the results and possible interpretation for the classifier.

analyze()

Main function used in the monoview and multiview classification scripts

Returns:

  • string_analysis (a string that will be stored in the log and in a txt file)

  • image_analysis (a list of images to save)

  • metric_scores (a dictionary of {metric: (train_score, test_score)} used in later analysis)

get_all_metrics_scores()

Get the scores for all the metrics in the list

abstractmethod get_base_string()
get_classifier_config_string()

Formats the information about the classifier and its configuration

Return type:

A string explaining the classifier’s configuration

get_db_config_string()

Generates a string, formatting all the information on the database

Return type:

db_config_string string, formatting all the information on the database

get_metric_score(metric, metric_kwargs)

Get the train and test scores for a specific metric and its arguments

Parameters:
  • metric (name of the metric, must be implemented in metrics)

  • metric_kwargs (the dictionary containing the arguments for the metric.)

Return type:

train_score, test_score
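The lookup-then-score pattern described above can be sketched as follows. This is an illustration only: the METRICS registry, the accuracy helper, and the explicit labels, pred, train_idx and test_idx arguments are assumptions made for the example, whereas the documented method takes only metric and metric_kwargs.

```python
def accuracy(y_true, y_pred, normalize=True):
    """Fraction (or count, if normalize is False) of correct predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if normalize else correct


# Hypothetical registry standing in for the metrics package.
METRICS = {"accuracy": accuracy}


def get_metric_score_sketch(metric, metric_kwargs, labels, pred,
                            train_idx, test_idx):
    """Return (train_score, test_score) for one metric and its arguments."""
    score = METRICS[metric]  # the metric must be implemented in the registry
    train_score = score([labels[i] for i in train_idx],
                        [pred[i] for i in train_idx], **metric_kwargs)
    test_score = score([labels[i] for i in test_idx],
                       [pred[i] for i in test_idx], **metric_kwargs)
    return train_score, test_score


labels = [0, 1, 1, 0, 1, 0]
pred = [0, 1, 0, 0, 1, 1]
train_idx, test_idx = [0, 1, 2, 3], [4, 5]
print(get_metric_score_sketch("accuracy", {}, labels, pred,
                              train_idx, test_idx))
```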

abstractmethod get_view_specific_info()
print_metric_score()

Generates a string, formatting the metrics configuration and scores

Parameters:
  • metric_scores (dictionary of train_score, test_score for each metric)

  • metric_list (list of metrics)

Return type:

metric_score_string string formatting all metric results

get_metric(metrics_dict)

Fetches the metric module in the metrics package

get_names(classed_list)

summit.multiview_platform.utils.configuration module

get_the_args(path_to_config_file='/home/runner/work/summit/summit/config_files/config.yml')

Extracts the arguments from a ‘.yml’ file.

Parameters:

path_to_config_file (str, path to the yml file containing the configuration)

Returns:

yaml_config – The dictionary containing the configuration for the benchmark.

Return type:

dict

pass_default_config(log=True, name=['plausible'], label='_', file_type='.hdf5', views=None, pathf='/home/runner/work/summit/summit/data/', nice=0, random_state=42, nb_cores=1, full=True, debug=False, add_noise=False, noise_std=0.0, res_dir='/home/runner/work/summit/summit/results/', track_tracebacks=True, split=0.49, nb_folds=5, nb_class=None, classes=None, type=['multiview'], algos_monoview=['all'], algos_multiview=['svm_jumbo_fusion'], stats_iter=2, metrics={'accuracy_score': {}, 'f1_score': {}}, metric_princ='accuracy_score', hps_type='Random', hps_iter=1, hps_kwargs={'equivalent_draws': True, 'n_iter': 10}, **kwargs)
save_config(directory, arguments)

Saves the config file in the result directory.

summit.multiview_platform.utils.dataset module

class Dataset

Bases: object

This is the base class for all the types of multiview datasets of SuMMIT.

check_selected_label_names(nb_labels=None, selected_label_names=None, random_state=RandomState(MT19937))
abstractmethod filter(labels, label_names, sample_indices, view_names, path=None)
gen_feat_id()
abstractmethod get_label_names(sample_indices=None)
abstractmethod get_labels(sample_indices=None)
abstractmethod get_nb_samples()
get_shape(view_index=0, sample_indices=None)

Gets the shape of the needed view on the asked samples

Parameters:
  • view_index (int) – The index of the view to extract

  • sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Return type:

Tuple containing the shape

abstractmethod get_v(view_index, sample_indices=None)
init_sample_indices(sample_indices=None)

If no sample indices are provided, selects all the available samples.

Parameters:

sample_indices (np.array,) – An array-like containing the indices of the samples.

select_labels(selected_label_names)
select_views_and_labels(nb_labels=None, selected_label_names=None, random_state=None, view_names=None, path_for_new='../data/')
to_numpy_array(sample_indices=None, view_indices=None)

Concatenates the needed views in one big numpy array while saving the limits of each view in a list, to be able to retrieve them later.

Parameters:
  • sample_indices (array like) – The indices of the samples to extract from the dataset

  • view_indices (array like) – The indices of the view to concatenate in the numpy array

Returns:

  • concat_views (numpy array,) – The numpy array containing all the needed views.

  • view_limits (list of int) – The limits of each slice used to extract the views.
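The concatenation and the role of view_limits can be illustrated with a small sketch. It uses plain nested lists in place of numpy arrays and assumes that the limits are cumulative column offsets starting at 0; both choices are assumptions for illustration, not a description of the SuMMIT implementation.

```python
def to_array_sketch(views, sample_indices=None, view_indices=None):
    """Concatenate views column-wise and record where each view starts/ends.

    views is a list of matrices (lists of rows), one matrix per view.
    """
    if view_indices is None:
        view_indices = range(len(views))
    if sample_indices is None:
        sample_indices = range(len(views[0]))
    sample_indices = list(sample_indices)

    view_limits = [0]
    concat_views = [[] for _ in sample_indices]
    for view_index in view_indices:
        # Append this view's features to every selected sample row.
        for row, sample in zip(concat_views, sample_indices):
            row.extend(views[view_index][sample])
        # The next limit is the previous one plus this view's feature count.
        view_limits.append(view_limits[-1] + len(views[view_index][0]))
    return concat_views, view_limits


views = [
    [[1, 2], [3, 4], [5, 6]],                 # view 0: 3 samples, 2 features
    [[7, 8, 9], [10, 11, 12], [13, 14, 15]],  # view 1: 3 samples, 3 features
]
concat, limits = to_array_sketch(views, sample_indices=[0, 2])

# Retrieve view 1 later from the concatenated rows using the saved limits.
start, stop = limits[1], limits[2]
view_1 = [row[start:stop] for row in concat]
print(concat, limits, view_1)
```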

class HDF5Dataset(views=None, labels=None, are_sparse=False, file_name='dataset.hdf5', view_names=None, path='', hdf5_file=None, labels_names=None, is_temp=False, sample_ids=None, feature_ids=None)

Bases: Dataset

Dataset class

This is used to encapsulate the multiview dataset while keeping it stored on the disk instead of in RAM.

Parameters:
  • views (list of numpy arrays or None) – The list containing each view of the dataset as a numpy array of shape (nb samples, nb features).

  • labels (numpy array or None) – The labels for the multiview dataset, of shape (nb samples, ).

  • are_sparse (list of bool, or None) – The list of boolean telling if each view is sparse or not.

  • file_name (str, or None) – The name of the hdf5 file that will be created to store the multiview dataset.

  • view_names (list of str, or None) – The name of each view.

  • path (str, or None) – The path where the hdf5 dataset file will be stored

  • hdf5_file (h5py.File object, or None) – If not None, the dataset will be imported directly from this file.

  • labels_names (list of str, or None) – The name for each unique value of the labels given in labels.

  • is_temp (bool) – Used if a temporary dataset has to be stored by the benchmark.

dataset

The h5py file object that points to the hdf5 dataset on the disk.

Type:

h5py.File object

nb_view

The number of views in the dataset.

Type:

int

view_dict

The dictionary with the name of each view as the keys and their indices as values.

Type:

dict

add_gaussian_noise(random_state, path, noise_std=0.15)
copy_view(target_dataset=None, source_view_name=None, target_view_index=None, sample_indices=None)
filter(labels, label_names, sample_indices, view_names, path=None)
get_label_names(decode=False, sample_indices=None)

Used to get the list of the label names for the given set of samples

Parameters:
  • decode (bool) – If True, will decode the label names before listing them

  • sample_indices (numpy.ndarray) – The array containing the indices of the needed samples

Return type:

list of the selected labels’ names

get_labels(sample_indices=None)

Gets the label array for the asked samples

Parameters:

sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Return type:

numpy.ndarray containing the labels of the asked samples

get_name()

Gets the name of the dataset hdf5 file

get_nb_class(sample_indices=None)

Gets the number of classes of the dataset for the asked samples

Parameters:

sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Returns:

The number of classes

Return type:

int

get_nb_samples()

Used to get the number of samples available in the dataset

Return type:

int

get_v(view_index, sample_indices=None)

Extracts the view and returns a numpy.ndarray containing the description of the samples specified in sample_indices

Parameters:
  • view_index (int) – The index of the view to extract

  • sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Return type:

A numpy.ndarray containing the view data for the needed samples

get_view_dict()

Returns the dictionary containing view indices as keys and their corresponding names as values

get_view_name(view_idx)

Method to get a view’s name from its index.

Parameters:

view_idx (int) – The index of the view in the dataset

Return type:

The view’s name.

init_attrs()

Used to init the attributes that are modified when self.dataset changes

init_view_names(view_names=None)
rm()

Method used to delete the dataset file on the disk if the dataset is temporary.

update_hdf5_dataset(path)
class RAMDataset(views=None, labels=None, are_sparse=False, view_names=None, labels_names=None, sample_ids=None, name=None, feature_ids=None)

Bases: Dataset

filter(labels, label_names, sample_indices, view_names, path=None)
get_label_names(sample_indices=None, decode=True)
get_labels(sample_indices=None)
get_name()
get_nb_class(sample_indices=None)
get_nb_samples()
get_v(view_index, sample_indices=None)
get_view_dict()
get_view_name(view_idx)
init_attrs()
confirm(resp=True, timeout=15)

Used to process answer

copy_hdf5(pathF, name, nbCores)

Used to copy a HDF5 database in case of multicore computing

datasets_already_exist(pathF, name, nbCores)

Used to check if it’s necessary to copy datasets

delete_HDF5(benchmarkArgumentsDictionaries, nbCores, dataset)

Used to delete temporary copies at the end of the benchmark

extract_subset(matrix, used_indices)

Used to extract a subset of a matrix even if it is sparse (work in progress)

get_samples_views_indices(dataset, samples_indices, view_indices)

This function is used to get all the samples indices and view indices if needed

init_multiple_datasets(path_f, name, nb_cores)

Used to create copies of the dataset if multicore computation is used.

This is a temporary solution to fix the sharing memory issue with HDF5 datasets.

Parameters:
  • path_f (string) – Path to the original dataset directory

  • name (string) – Name of the dataset

  • nb_cores (int) – The number of threads that the benchmark can use

Returns:

datasetFiles – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.

Return type:

None

input_(timeout=15)

Used as a UI to stop if too much HDD space will be used

is_just_number(string)

summit.multiview_platform.utils.execution module

find_dataset_names(path, type, names)

This function browses the dataset directory and extracts all the needed dataset names.

gen_argument_dictionaries(labels_dictionary, directories, splits, hyper_param_search, args, k_folds, stats_iter_random_states, metrics, argument_dictionaries, benchmark, views, views_indices)

Used to generate a dictionary for each benchmark.

One for each label combination (if multiclass), for each statistical iteration, generates a dictionary with all necessary information to perform the benchmark.

Parameters:
  • labels_dictionary (dictionary) – Dictionary mapping labels indices to labels names.

  • directories (list of strings) – List of the paths to the result directories for each statistical iteration.

  • multiclass_labels (list of lists of numpy.ndarray) – For each label couple, for each statistical iteration a triplet of numpy.ndarrays is stored with the indices for the biclass training set, the ones for the biclass testing set and the ones for the multiclass testing set.

  • labels_combinations (list of lists of numpy.ndarray) – Each original couple of different labels.

  • indices_multiclass (list of lists of numpy.ndarray) – For each combination, contains a biclass labels numpy.ndarray with the 0/1 labels of combination.

  • hyper_param_search (string) – Type of hyper parameter optimization method

  • args (parsed args objects) – All the args passed by the user.

  • k_folds (list of list of sklearn.model_selection.StratifiedKFold) – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).

  • stats_iter_random_states (list of numpy.random.RandomState objects) – Multiple random states, one for each statistical iteration of the same benchmark.

  • metrics (list of lists) – Metrics that will be used to evaluate the algorithms’ performance.

  • argument_dictionaries (dictionary) – Dictionary summarizing all the specific arguments for the benchmark, one dictionary for each classifier.

  • benchmark (dictionary) – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.

  • nb_views (int) – The number of views used by the benchmark.

  • views (list of strings) – List of the names of the used views.

  • views_indices (list of ints) – List of indices (according to the dataset) of the used views.

Returns:

benchmarkArgumentDictionaries – All the needed arguments for the benchmarks.

Return type:

list of dicts

gen_direcorties_names(directory, stats_iter)

Used to generate the different directories of each iteration if needed.

Parameters:
  • directory (string) – Path to the results directory.

  • stats_iter (int) – The number of statistical iterations.

Returns:

directories – Paths to the result directory of each statistical iteration.

Return type:

list of strings
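As an illustration of the behaviour described for gen_direcorties_names, here is a minimal sketch. The "iter_N" sub-directory naming and the choice to reuse the main directory when there is a single iteration are assumptions made for the example, not taken from the reference.

```python
import os


def gen_directories_names_sketch(directory, stats_iter):
    """Return one result directory path per statistical iteration."""
    if stats_iter > 1:
        # Assumed naming scheme: <directory>/iter_1, <directory>/iter_2, ...
        return [os.path.join(directory, "iter_" + str(index + 1))
                for index in range(stats_iter)]
    # A single iteration needs no sub-directory in this sketch.
    return [directory]


print(gen_directories_names_sketch("results", 3))
```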

gen_k_folds(stats_iter, nb_folds, stats_iter_random_states)

Used to generate folds indices for cross validation for each statistical iteration.

Parameters:
  • stats_iter (integer) – Number of statistical iterations of the benchmark.

  • nb_folds (integer) – The number of cross-validation folds for the benchmark.

  • stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.

Returns:

folds_list – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).

Return type:

list of list of sklearn.model_selection.StratifiedKFold

gen_splits(labels, split_ratio, stats_iter_random_states)

Used to generate the train/test splits using one or multiple random states.

Parameters:
  • labels (numpy.ndarray) – The labels of the samples to split.

  • split_ratio (float) – The ratio of samples between train and test set.

  • stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.

Returns:

splits – For each statistical iteration a couple of numpy.ndarrays is stored with the indices for the training set and the ones of the testing set.

Return type:

list of lists of numpy.ndarray
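A minimal sketch of producing one train/test split per random state. It makes several assumptions that the reference does not confirm: split_ratio is read as the test fraction, the split is stratified by label, and random.Random objects stand in for numpy.random.RandomState.

```python
import random


def gen_splits_sketch(labels, split_ratio, random_states):
    """Return [train_indices, test_indices] for each random state."""
    # Group the sample indices by label so each class is split separately.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    splits = []
    for random_state in random_states:
        train, test = [], []
        for label in sorted(by_class):
            members = list(by_class[label])
            random_state.shuffle(members)
            # Assumption: split_ratio is the fraction sent to the test set.
            n_test = int(round(len(members) * split_ratio))
            test.extend(members[:n_test])
            train.extend(members[n_test:])
        splits.append([sorted(train), sorted(test)])
    return splits


labels = [0] * 5 + [1] * 5
splits = gen_splits_sketch(labels, 0.4, [random.Random(0), random.Random(1)])
for train, test in splits:
    print(train, test)
```

Passing the same seeds again reproduces the same splits, which is the point of carrying one random state per statistical iteration.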

get_database_function(name, type_var)

Used to get the right database extraction function according to the type of database and its name

Parameters:
  • name (string) – Name of the database.

  • type_var (string) – Type of dataset, hdf5 or csv.

Returns:

getDatabase – The function that will be used to extract the database

Return type:

function

init_log_file(name, views, cl_type, log, debug, label, result_directory, args)

Used to init the directory where the preds will be stored and the log file.

First this function will check if the result directory already exists (only one per minute is allowed).

If the result directory name is available, it is created, and the logfile is initiated.

Parameters:
  • name (string) – Name of the database.

  • views (list of strings) – List of the view names that will be used in the benchmark.

  • cl_type (list of strings) – Type of benchmark that will be made.

  • log (bool) – Whether to show the log file in console or hide it.

  • debug (bool) – for debug option

  • label (str) – The label.

  • result_directory (str) – Name of the result directory.

  • add_noise (bool) – Whether to add noise.

  • noise_std – Level of the noise standard deviation.

Returns:

results_directory – Reference to the main results directory for the benchmark.

Return type:

string

init_random_state(random_state_arg, directory)

Used to init a random state. If no random state is specified, it will generate a ‘random’ seed. If random_state_arg is a string containing only numbers, it will be converted into an int to generate a seed.

If random_state_arg is a string with letters, it must be a path to a pickled random state file that will be loaded. The function will also pickle the new random state in a file to be able to retrieve it later.

Parameters:
  • random_state_arg (None or string) – See function description.

  • directory (string) – Path to the results directory.

Returns:

random_state – This random state will be used throughout the benchmark.

Return type:

numpy.random.RandomState object
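The three cases described above (no argument, numeric string, path to a pickled state) can be sketched as follows. This is not the SuMMIT implementation: random.Random stands in for numpy.random.RandomState, and the file name random_state.pickle is a hypothetical choice.

```python
import os
import pickle
import random
import tempfile


def init_random_state_sketch(random_state_arg, directory):
    """Build a random state from None, a numeric string, or a pickle path."""
    if random_state_arg is None:
        random_state = random.Random()  # seeded from system entropy
    elif random_state_arg.isdigit():
        random_state = random.Random(int(random_state_arg))
    else:
        # Any other string is treated as a path to a pickled random state.
        with open(random_state_arg, "rb") as handle:
            random_state = pickle.load(handle)
    # Save the state so that the run can be reproduced later.
    # "random_state.pickle" is a hypothetical file name.
    with open(os.path.join(directory, "random_state.pickle"), "wb") as handle:
        pickle.dump(random_state, handle)
    return random_state


with tempfile.TemporaryDirectory() as directory:
    first = init_random_state_sketch("42", directory)
    saved_path = os.path.join(directory, "random_state.pickle")
    restored = init_random_state_sketch(saved_path, directory)
    # Both objects start from the same state, so their draws agree.
    same_draw = first.random() == restored.random()
    print(same_draw)
```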

init_stats_iter_random_states(stats_iter, random_state)

Used to initialize multiple random states if needed because of multiple statistical iterations of the same benchmark.

Parameters:
  • stats_iter (int) – Number of statistical iterations of the same benchmark done (with a different random state).

  • random_state (numpy.random.RandomState object) – The random state of the whole experimentation, that will be used to generate the ones for each statistical iteration.

Returns:

stats_iter_random_states – Multiple random states, one for each statistical iteration of the same benchmark.

Return type:

list of numpy.random.RandomState objects
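One way to derive per-iteration random states from the random state of the whole experiment is sketched below. The seeding scheme (drawing a 32-bit seed per iteration) and the reuse of the master state for a single iteration are assumptions, and random.Random stands in for numpy.random.RandomState.

```python
import random


def init_stats_iter_random_states_sketch(stats_iter, random_state):
    """Return one random state per statistical iteration."""
    if stats_iter > 1:
        # Each iteration gets its own generator, seeded by the master state,
        # so the whole set is reproducible from a single seed.
        return [random.Random(random_state.getrandbits(32))
                for _ in range(stats_iter)]
    # With a single iteration the master state is used directly.
    return [random_state]


states = init_stats_iter_random_states_sketch(3, random.Random(42))
print([state.random() for state in states])
```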

init_views(dataset_var, arg_views)

Used to return the view names that will be used by the benchmark, their indices and all the view names.

Parameters:
  • dataset_var (HDF5 dataset file) – The full dataset that will be used by the benchmark.

  • arg_views (list of strings) – The views that will be used by the benchmark (arg).

Returns:

  • views (list of strings) – Names of the views that will be used by the benchmark.

  • view_indices (list of ints) – The list of the indices of the view that will be used in the benchmark (according to the dataset).

  • all_views (list of strings) – Names of all the available views in the dataset.
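The three return values can be illustrated with a small sketch. It takes a plain name-to-index dictionary instead of a dataset object, and it assumes that passing None selects every view; both are assumptions for the example.

```python
def init_views_sketch(view_dict, arg_views):
    """Return (views, view_indices, all_views) from a name -> index mapping."""
    # All available view names, ordered by their index in the dataset.
    all_views = sorted(view_dict, key=view_dict.get)
    if arg_views is None:
        views = list(all_views)  # assumption: None means "use every view"
    else:
        views = [name for name in all_views if name in arg_views]
    view_indices = [view_dict[name] for name in views]
    return views, view_indices, all_views


view_dict = {"sound": 0, "image": 1, "text": 2}
views, view_indices, all_views = init_views_sketch(view_dict,
                                                   ["text", "sound"])
print(views, view_indices, all_views)
```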

parse_the_args(arguments)

Used to parse the args entered by the user

summit.multiview_platform.utils.get_multiview_db module

exception DatasetError(*args, **kwargs)

Bases: Exception

get_classic_db_csv(views, pathF, nameDB, NB_CLASS, askedLabelsNames, random_state, full=False, add_noise=False, noise_std=0.15, delimiter=',', path_for_new='../data/')
get_classic_db_hdf5(views, path_f, name_DB, nb_class, asked_labels_names, random_state, full=False, add_noise=False, noise_std=0.15, path_for_new='../data/')

Used to load a hdf5 database

get_plausible_db_hdf5(features, path, file_name, nb_class=3, label_names=[b'No', b'Yes', b'Maybe'], random_state=None, full=True, add_noise=False, noise_std=0.15, nb_view=3, nb_samples=100, nb_features=10)

Used to generate a plausible dataset to test the algorithms

make_me_noisy(view_data, random_state, percentage=5)

Used to introduce some noise in the generated data

summit.multiview_platform.utils.make_file_config module

class ConfigurationMaker(classifier_dict=None)

Bases: object

Finds the name of the classifier to report from the classifier dict.

summit.multiview_platform.utils.multiclass module

class MonoviewWrapper

Bases: MultiClassWrapper

class MultiClassWrapper

Bases: object

format_params(params, deep=True)
get_config()
get_interpretation(directory, base_file_name, y_test=None)
set_params(**params)

This function lets the OV_Wrappers be transparent in terms of parameters. Without it, the parameters would have to be specified as estimator__param, which is not relevant for the platform.
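The idea of parameter transparency can be sketched with two toy classes. TinyEstimator and TransparentWrapper are hypothetical and do not reproduce the SuMMIT wrappers; they only show a wrapper forwarding plain parameter names to the estimator it wraps, instead of requiring the nested estimator__param form.

```python
class TinyEstimator:
    """Hypothetical estimator with a single hyper-parameter."""

    def __init__(self, depth=1):
        self.depth = depth

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self


class TransparentWrapper:
    """Hypothetical wrapper that hides the nesting from the caller."""

    def __init__(self, estimator):
        self.estimator = estimator

    def set_params(self, **params):
        # Forward the parameters unchanged to the wrapped estimator, so the
        # caller writes depth=5 rather than estimator__depth=5.
        self.estimator.set_params(**params)
        return self


wrapper = TransparentWrapper(TinyEstimator())
wrapper.set_params(depth=5)
print(wrapper.estimator.depth)
```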

class MultiviewOVOWrapper(estimator=None, **args)

Bases: MultiviewWrapper, OneVsOneClassifier

fit(X, y, train_indices=None, view_indices=None)

Fit underlying estimators.

Parameters:
  • X ((sparse) array-like of shape (n_samples, n_features)) – Data.

  • y (array-like of shape (n_samples,)) – Multi-class targets.

Return type:

self

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

get_tags()

Gets the tags of the estimator; see sklearn > 1.6.0, where the _pairwise attribute was removed.

multiview_decision_function(X, sample_indices, view_indices)
predict(X, sample_indices=None, view_indices=None)

Estimate the best class label for each sample in X.

This is implemented as argmax(decision_function(X), axis=1) which will return the label of the class with most votes by estimators predicting the outcome of a decision for each possible class pair.

Parameters:

X ((sparse) array-like of shape (n_samples, n_features)) – Data.

Returns:

y – Predicted multi-class targets.

Return type:

numpy array of shape [n_samples]
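The voting rule described above can be sketched in plain Python. The input format (one list of winning class indices per class pair) is an assumption made for the example, and the sketch counts votes only, so ties simply go to the first class with the highest count.

```python
from itertools import combinations


def ovo_predict_sketch(pair_predictions, classes):
    """Pick, for each sample, the class that wins the most pairwise duels.

    pair_predictions maps a pair of class indices (a, b) to the list of
    winning class indices, one per sample.
    """
    n_samples = len(next(iter(pair_predictions.values())))
    predictions = []
    for sample in range(n_samples):
        votes = [0] * len(classes)
        for pair in combinations(range(len(classes)), 2):
            votes[pair_predictions[pair][sample]] += 1
        # argmax over the vote counts (first maximum wins on a tie)
        best = max(range(len(classes)), key=votes.__getitem__)
        predictions.append(classes[best])
    return predictions


classes = ["No", "Yes", "Maybe"]
pair_predictions = {
    (0, 1): [0, 1],  # "No" vs "Yes"
    (0, 2): [0, 2],  # "No" vs "Maybe"
    (1, 2): [1, 1],  # "Yes" vs "Maybe"
}
print(ovo_predict_sketch(pair_predictions, classes))
```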

set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.

  • view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_indices parameter in predict.

  • view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class MultiviewOVRWrapper(estimator=None, **args)

Bases: MultiviewWrapper, OneVsRestClassifier

fit(X, y, train_indices=None, view_indices=None)

Fit underlying estimators.

Parameters:
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.

  • y ({array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)) – Multi-class targets. An indicator matrix turns on multilabel classification.

  • **fit_params (dict) –

    Parameters passed to the estimator.fit method of each sub-estimator.

    Added in version 1.4: Only available if enable_metadata_routing=True. See Metadata Routing User Guide for more details.

Returns:

self – Instance of fitted estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

predict(X, sample_indices=None, view_indices=None)

Predict multi-class targets using underlying estimators.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.

Returns:

y – Predicted multi-class targets.

Return type:

{array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)

set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.

  • view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_indices parameter in predict.

  • view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class MultiviewWrapper(estimator=None, **args)

Bases: MultiClassWrapper

class OVOWrapper(estimator, *, n_jobs=None)

Bases: MonoviewWrapper, OneVsOneClassifier

decision_function(X)

Decision function for the OneVsOneClassifier.

The decision values for the samples are computed by adding the normalized sum of pair-wise classification confidence levels to the votes in order to disambiguate between the decision values when the votes for all the classes are equal leading to a tie.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data.

Returns:

Y – Result of calling decision_function on the final estimator.

Changed in version 0.19: output shape changed to (n_samples,) to conform to scikit-learn conventions for binary classification.

Return type:

array-like of shape (n_samples, n_classes) or (n_samples,)
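
The tie-breaking idea described for decision_function can be sketched for a single sample as follows. This is a minimal illustration, not the library's code: the function name ovo_decision_values is hypothetical, and the normalization used here (dividing each confidence sum s by 3 * (|s| + 1)) is an assumption chosen so that the correction always stays smaller than a third of a vote.

```python
from itertools import combinations


def ovo_decision_values(pair_predictions, pair_confidences, n_classes):
    """Combine pairwise results for one sample into per-class decision values.

    Pairs are ordered as (0, 1), (0, 2), ..., with i < j.
    pair_predictions[k] is 0 if pair k voted for class i, 1 if it voted for j.
    pair_confidences[k] is a signed confidence; positive values favour class j.
    """
    votes = [0.0] * n_classes
    confidence_sums = [0.0] * n_classes
    for k, (i, j) in enumerate(combinations(range(n_classes), 2)):
        confidence_sums[i] -= pair_confidences[k]
        confidence_sums[j] += pair_confidences[k]
        winner = j if pair_predictions[k] == 1 else i
        votes[winner] += 1
    # Squash each sum into (-1/3, 1/3): enough to break ties between equal
    # vote counts, never enough to overturn a difference of one whole vote.
    return [v + s / (3 * (abs(s) + 1)) for v, s in zip(votes, confidence_sums)]


# Three classes, one vote each (a tie); the confidences decide the ranking.
values = ovo_decision_values([0, 1, 0], [-2.0, 0.5, -0.1], n_classes=3)
best_class = values.index(max(values))
```

In this example every class receives exactly one vote, so the ranking is decided entirely by the normalized confidence sums.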

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') OVOWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') OVOWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class OVRWrapper(estimator, *, n_jobs=None, verbose=0)

Bases: MonoviewWrapper, OneVsRestClassifier

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') OVRWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') OVRWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

get_mc_estim(estimator, random_state, y=None, multiview=False, multiclass=False)

Used to get a multiclass-compatible estimator if the given one does not natively support multiclass. If predict_proba is available in the requested estimator, a One Versus Rest wrapper is returned; otherwise, a One Versus One wrapper is returned.

To handle multiview algorithms, multiview wrappers are implemented separately.

Parameters:
  • estimator (sklearn-like estimator) – The requested estimator

  • y (numpy.array) – The labels of the problem

  • random_state (numpy.random.RandomState object) – The random state, used to generate a fake multiclass problem

  • multiview (bool) – If True, multiview-compatible wrappers are returned.

Returns:

estimator – Either the requested estimator, or a multiclass-compatible wrapper over the requested estimator

Return type:

sklearn-like estimator
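
The selection rule described for get_mc_estim can be sketched as below. This is a simplified illustration under stated assumptions, not the function's actual code: OvRStandIn, OvOStandIn, choose_multiclass_estimator and the natively_multiclass flag are hypothetical names, and the multiview variants of the wrappers are not modelled.

```python
class OvRStandIn:
    """Hypothetical stand-in for a One Versus Rest wrapper."""

    def __init__(self, estimator):
        self.estimator = estimator


class OvOStandIn:
    """Hypothetical stand-in for a One Versus One wrapper."""

    def __init__(self, estimator):
        self.estimator = estimator


def choose_multiclass_estimator(estimator, natively_multiclass):
    """Return the estimator itself or a multiclass wrapper around it."""
    if natively_multiclass:
        # Nothing to do: the estimator already handles multiclass tasks.
        return estimator
    if hasattr(estimator, "predict_proba"):
        # Probabilities are available, so One Versus Rest is used.
        return OvRStandIn(estimator)
    # No probabilities: fall back to One Versus One.
    return OvOStandIn(estimator)


class WithProba:
    """Toy binary estimator exposing predict_proba."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class WithoutProba:
    """Toy binary estimator without predict_proba."""


wrapped_ovr = choose_multiclass_estimator(WithProba(), natively_multiclass=False)
wrapped_ovo = choose_multiclass_estimator(WithoutProba(), natively_multiclass=False)
```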

summit.multiview_platform.utils.multiview_result_analysis module

summit.multiview_platform.utils.organization module

secure_file_path(file_name)

summit.multiview_platform.utils.transformations module

sign_labels(labels)

Returns a label array with (-1,1) as labels. If labels is already made of (-1,1), it is returned unchanged. If labels is made of (0,1), every zero is replaced by -1.

Parameters:

labels – The original label numpy array

Return type:

A np.array with labels made of (-1,1)
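
The mapping described for sign_labels can be illustrated with a dependency-free sketch. The name sign_labels_sketch is hypothetical, and plain Python lists are used here in place of numpy arrays purely to keep the example self-contained; it is not the library's implementation.

```python
def sign_labels_sketch(labels):
    """Map (0,1) labels to (-1,1); leave (-1,1) labels as they are."""
    if set(labels) <= {-1, 1}:
        # Already signed: return a copy without changes.
        return list(labels)
    # Replace every zero by -1 and keep the ones.
    return [-1 if label == 0 else label for label in labels]


signed = sign_labels_sketch([0, 1, 1, 0])
```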

unsign_labels(labels)

The inverse of sign_labels: converts labels made of (-1,1) back to (0,1).

Parameters:

labels
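
As a rough illustration, and assuming that "inverse" here means mapping every -1 back to 0, the reverse conversion could look like the sketch below. The name unsign_labels_sketch is hypothetical, and plain lists stand in for numpy arrays.

```python
def unsign_labels_sketch(labels):
    """Map (-1,1) labels to (0,1); labels without -1 come back unchanged."""
    return [0 if label == -1 else label for label in labels]


unsigned = unsign_labels_sketch([-1, 1, 1, -1])
```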

Module contents