summit.multiview_platform.utils package

Submodules

summit.multiview_platform.utils.base module

class BaseClassifier

Bases: BaseEstimator

accepts_multi_class(random_state, n_samples=10, dim=2, n_classes=3): Base function to test whether the classifier accepts a multiclass task. It is highly recommended to override it in the classifier’s module with a simple method that returns True or False, as this will speed up the benchmark.
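The recommended override can be sketched as follows. This is a minimal illustration only: BaseClassifierSketch and MyMulticlassClassifier are hypothetical stand-ins, not SuMMIT classes, and the body of the generic check is an assumption.

```python
class BaseClassifierSketch:
    """Hypothetical stand-in for BaseClassifier; not the SuMMIT class."""

    def accepts_multi_class(self, random_state, n_samples=10, dim=2, n_classes=3):
        # The documented default tests the classifier on a small multiclass
        # task; this stand-in only marks where that slower check would run.
        raise NotImplementedError("generic check omitted in this sketch")


class MyMulticlassClassifier(BaseClassifierSketch):
    def accepts_multi_class(self, random_state, n_samples=10, dim=2, n_classes=3):
        # Constant answer: no trial run is needed for this classifier.
        return True


clf = MyMulticlassClassifier()
print(clf.accepts_multi_class(random_state=None))
```

Returning a constant keeps the signature of the base method, so the benchmark can call it the same way as the generic test.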

gen_best_params(detector)

Returns the best parameters of the detector.

Parameters:: detector
Returns:: best param – value
Return type:: dictionary with param name as key and best parameters as value

gen_distribs()

gen_params_from_detector(detector)

get_base_estimator(estimator, estimator_config)

get_config(): Generates a string to containing all the information about the classifier’s configuration

get_interpretation(directory, base_file_name, y_test, feature_ids, multi_class=False): Base method that returns an empty string if there is no interpretation method in the classifier’s module

params_to_string(): Formats the parameters of the classifier as a string

to_str(param_name): Formats a parameter into a string

class ResultAnalyser(classifier, classification_indices, k_folds, hps_method, metrics_dict, n_iter, class_label_names, pred, directory, base_file_name, labels, database_name, nb_cores, duration, feature_ids)

Bases: object

A shared result analysis tool for mono and multiview classifiers. The main utility of this class is to generate a txt file summarizing the results and possible interpretation for the classifier.

analyze()

Main function used in the monoview and multiview classification scripts

Returns:

string_analysis (a string that will be stored in the log and in a txt file)
image_analysis (a list of images to save)
metric_scores (a dictionary of {metric: (train_score, test_score)} used in later analysis)

get_all_metrics_scores(): Get the scores for all the metrics in the list

abstractmethod get_base_string()

get_classifier_config_string()

Formats the information about the classifier and its configuration

Return type:: A string explaining the classifier’s configuration

get_db_config_string()

Generates a string, formatting all the information on the database

Return type:: db_config_string string, formatting all the information on the database

get_metric_score(metric, metric_kwargs)

Get the train and test scores for a specific metric and its arguments

Parameters:

metric (name of the metric, must be implemented in metrics)
metric_kwargs (the dictionary containing the arguments for the metric.)

Return type:

train_score, test_score
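The (train_score, test_score) pair and the metric_scores dictionary built from it can be sketched as below. This is an assumption-laden illustration: the METRICS registry, the local accuracy_score function and the extra arguments (labels, pred, train_indices, test_indices) are hypothetical, since the documented method reads that state from the ResultAnalyser instance.

```python
def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Hypothetical registry standing in for the metrics package.
METRICS = {"accuracy_score": accuracy_score}


def get_metric_score(metric, metric_kwargs, labels, pred, train_indices, test_indices):
    """Return (train_score, test_score) for one metric and its arguments."""
    score = METRICS[metric]
    train_score = score([labels[i] for i in train_indices],
                        [pred[i] for i in train_indices], **metric_kwargs)
    test_score = score([labels[i] for i in test_indices],
                       [pred[i] for i in test_indices], **metric_kwargs)
    return train_score, test_score


labels = [0, 1, 1, 0, 1, 0]
pred = [0, 1, 0, 0, 1, 1]
metrics_dict = {"accuracy_score": {}}  # metric name -> keyword arguments

# One (train_score, test_score) tuple per metric, as in metric_scores.
metric_scores = {
    name: get_metric_score(name, kwargs, labels, pred, [0, 1, 2, 3], [4, 5])
    for name, kwargs in metrics_dict.items()
}
print(metric_scores)
```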

abstractmethod get_view_specific_info()

print_metric_score()

Generates a string, formatting the metrics configuration and scores

Parameters:

metric_scores (dictionary of train_score, test_score for each metric)
metric_list (list of metrics)

Return type:

metric_score_string string formatting all metric results

get_metric(metrics_dict): Fetches the metric module in the metrics package

get_names(classed_list)

summit.multiview_platform.utils.configuration module

get_the_args(path_to_config_file='/home/runner/work/summit/summit/config_files/config.yml')

Extracts the args from a ‘.yml’ file.

Parameters:

path_to_config_file (str, path to the yml file containing the configuration)

Returns:

yaml_config (dict, the dictionary containing the configuration for the benchmark)

pass_default_config(log=True, name=['plausible'], label='_', file_type='.hdf5', views=None, pathf='/home/runner/work/summit/summit/data/', nice=0, random_state=42, nb_cores=1, full=True, debug=False, add_noise=False, noise_std=0.0, res_dir='/home/runner/work/summit/summit/results/', track_tracebacks=True, split=0.49, nb_folds=5, nb_class=None, classes=None, type=['multiview'], algos_monoview=['all'], algos_multiview=['svm_jumbo_fusion'], stats_iter=2, metrics={'accuracy_score': {}, 'f1_score': {}}, metric_princ='accuracy_score', hps_type='Random', hps_iter=1, hps_kwargs={'equivalent_draws': True, 'n_iter': 10}, **kwargs)

save_config(directory, arguments): Saves the config file in the result directory.

summit.multiview_platform.utils.dataset module

class Dataset

Bases: object

This is the base class for all the types of multiview datasets of SuMMIT.

check_selected_label_names(nb_labels=None, selected_label_names=None, random_state=RandomState(MT19937) at 0x7F66D59EFD40)

abstractmethod filter(labels, label_names, sample_indices, view_names, path=None)

gen_feat_id()

abstractmethod get_label_names(sample_indices=None)

abstractmethod get_labels(sample_indices=None)

abstractmethod get_nb_samples()

get_shape(view_index=0, sample_indices=None)

Gets the shape of the needed view on the asked samples

Parameters:

view_index (int) – The index of the view to extract
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Return type:

Tuple containing the shape

abstractmethod get_v(view_index, sample_indices=None)

init_sample_indices(sample_indices=None)

If no sample indices are provided, selects all the available samples.

Parameters:: sample_indices (np.array,) – An array-like containing the indices of the samples.
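The default-to-all behaviour described above can be sketched as a standalone function. The nb_samples argument is an assumption made to keep the example self-contained; the documented method takes only sample_indices and presumably asks the dataset for its size.

```python
import numpy as np


def init_sample_indices(nb_samples, sample_indices=None):
    """Return all indices when none are given, else the given ones as an array."""
    if sample_indices is None:
        # No selection: use every available sample.
        return np.arange(nb_samples)
    return np.asarray(sample_indices)


print(init_sample_indices(4))
print(init_sample_indices(4, [2, 0]))
```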

select_labels(selected_label_names)

select_views_and_labels(nb_labels=None, selected_label_names=None, random_state=None, view_names=None, path_for_new='../data/')

to_numpy_array(sample_indices=None, view_indices=None)

Concatenates the needed views in one big numpy array while saving the limits of each view in a list, to be able to retrieve them later.

Parameters:

sample_indices (array like) – The indices of the samples to extract from the dataset
view_indices (array like) – The indices of the view to concatenate in the numpy array

Returns:

concat_views (numpy array,) – The numpy array containing all the needed views.
view_limits (list of int) – The limits of each slice used to extract the views.
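A minimal sketch of the concatenation and of how the limits allow each view to be retrieved later. It assumes the views are passed in as a plain list of 2-D numpy arrays and that view_limits holds cumulative column offsets starting at 0; the actual layout used by SuMMIT may differ.

```python
import numpy as np


def to_numpy_array(views, sample_indices=None, view_indices=None):
    """Concatenate the selected views column-wise and record their limits."""
    if sample_indices is None:
        sample_indices = np.arange(views[0].shape[0])
    if view_indices is None:
        view_indices = range(len(views))
    # Keep only the requested samples (rows) of each requested view.
    blocks = [views[v][sample_indices, :] for v in view_indices]
    view_limits = [0]
    for block in blocks:
        # Each limit is the column where the next view starts.
        view_limits.append(view_limits[-1] + block.shape[1])
    concat_views = np.concatenate(blocks, axis=1)
    return concat_views, view_limits


views = [np.ones((4, 2)), np.zeros((4, 3))]
concat_views, view_limits = to_numpy_array(views, sample_indices=[0, 2])

# The second view is the slice between its two limits.
second_view = concat_views[:, view_limits[1]:view_limits[2]]
print(concat_views.shape, view_limits, second_view.shape)
```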

class HDF5Dataset(views=None, labels=None, are_sparse=False, file_name='dataset.hdf5', view_names=None, path='', hdf5_file=None, labels_names=None, is_temp=False, sample_ids=None, feature_ids=None)

Bases: Dataset

Dataset class

This is used to encapsulate the multiview dataset while keeping it stored on the disk instead of in RAM.

Parameters:

views (list of numpy arrays or None) – The list containing each view of the dataset as a numpy array of shape (nb samples, nb features).
labels (numpy array or None) – The labels for the multiview dataset, of shape (nb samples, ).
are_sparse (list of bool, or None) – The list of boolean telling if each view is sparse or not.
file_name (str, or None) – The name of the hdf5 file that will be created to store the multiview dataset.
view_names (list of str, or None) – The name of each view.
path (str, or None) – The path where the hdf5 dataset file will be stored
hdf5_file (h5py.File object, or None) – If not None, the dataset will be imported directly from this file.
labels_names (list of str, or None) – The name for each unique value of the labels given in labels.
is_temp (bool) – Used if a temporary dataset has to be stored by the benchmark.

dataset

The h5py file object that points to the hdf5 dataset on the disk.

Type:: h5py.File object

nb_view

The number of views in the dataset.

Type:: int

view_dict

The dictionary with the name of each view as the keys and their indices as values

Type:: dict

add_gaussian_noise(random_state, path, noise_std=0.15)

copy_view(target_dataset=None, source_view_name=None, target_view_index=None, sample_indices=None)

filter(labels, label_names, sample_indices, view_names, path=None)

get_label_names(decode=False, sample_indices=None)

Used to get the list of the label names for the given set of samples

Parameters:

decode (bool) – If True, will decode the label names before listing them
sample_indices (numpy.ndarray) – The array containing the indices of the needed samples

Returns:

list – The selected labels’ names

get_labels(sample_indices=None)

Gets the label array for the asked samples

Parameters:: sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
Return type:: numpy.ndarray containing the labels of the asked samples

get_name(): Gets the name of the dataset hdf5 file

get_nb_class(sample_indices=None)

Gets the number of classes of the dataset for the asked samples

Parameters:: sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
Returns:: The number of classes
Return type:: int

get_nb_samples()

Used to get the number of samples available in the dataset

Return type:: int

get_v(view_index, sample_indices=None)

Extracts the view and returns a numpy.ndarray containing the description of the samples specified in sample_indices

Parameters:

view_index (int) – The index of the view to extract
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.

Return type:

A numpy.ndarray containing the view data for the needed samples

get_view_dict(): Returns the dictionary containing view indices as keys and their corresponding names as values

get_view_name(view_idx)

Method to get a view’s name from its index.

Parameters:: view_idx (int) – The index of the view in the dataset
Return type:: The view’s name.

init_attrs(): Used to init the attributes that are modified when self.dataset changes

init_view_names(view_names=None)

rm(): Method used to delete the dataset file on the disk if the dataset is temporary.

update_hdf5_dataset(path)

class RAMDataset(views=None, labels=None, are_sparse=False, view_names=None, labels_names=None, sample_ids=None, name=None, feature_ids=None)

Bases: Dataset

filter(labels, label_names, sample_indices, view_names, path=None)

get_label_names(sample_indices=None, decode=True)

get_labels(sample_indices=None)

get_name()

get_nb_class(sample_indices=None)

get_nb_samples()

get_v(view_index, sample_indices=None)

get_view_dict()

get_view_name(view_idx)

init_attrs()

confirm(resp=True, timeout=15): Used to process answer

copy_hdf5(pathF, name, nbCores): Used to copy a HDF5 database in case of multicore computing

datasets_already_exist(pathF, name, nbCores): Used to check if it’s necessary to copy datasets

delete_HDF5(benchmarkArgumentsDictionaries, nbCores, dataset): Used to delete temporary copies at the end of the benchmark

extract_subset(matrix, used_indices): Used to extract a subset of a matrix even if it’s sparse (WIP)

get_samples_views_indices(dataset, samples_indices, view_indices): This function is used to get all the samples indices and view indices if needed

init_multiple_datasets(path_f, name, nb_cores)

Used to create copies of the dataset if multicore computation is used.

This is a temporary solution to fix the sharing memory issue with HDF5 datasets.

Parameters:

path_f (string) – Path to the original dataset directory
name (string) – Name of the dataset
nb_cores (int) – The number of threads that the benchmark can use

Returns:

datasetFiles – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.

Return type:

None

input_(timeout=15): used as a UI to stop if too much HDD space will be used

is_just_number(string)

summit.multiview_platform.utils.execution module

find_dataset_names(path, type, names): Browses the dataset directory and extracts all the needed dataset names.

gen_argument_dictionaries(labels_dictionary, directories, splits, hyper_param_search, args, k_folds, stats_iter_random_states, metrics, argument_dictionaries, benchmark, views, views_indices)

Used to generate a dictionary for each benchmark.

One for each label combination (if multiclass), for each statistical iteration, generates a dictionary with all the necessary information to perform the benchmark

Parameters:

labels_dictionary (dictionary) – Dictionary mapping labels indices to labels names.
directories (list of strings) – List of the paths to the result directories for each statistical iteration.
multiclass_labels (list of lists of numpy.ndarray) – For each label couple, for each statistical iteration a triplet of numpy.ndarrays is stored with the indices for the biclass training set, the ones for the biclass testing set and the ones for the multiclass testing set.
labels_combinations (list of lists of numpy.ndarray) – Each original couple of different labels.
indices_multiclass (list of lists of numpy.ndarray) – For each combination, contains a biclass labels numpy.ndarray with the 0/1 labels of combination.
hyper_param_search (string) – Type of hyper parameter optimization method
args (parsed args objects) – All the args passed by the user.
k_folds (list of list of sklearn.model_selection.StratifiedKFold) – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).
stats_iter_random_states (list of numpy.random.RandomState objects) – Multiple random states, one for each statistical iteration of the same benchmark.
metrics (list of lists) – Metrics that will be used to evaluate the algorithms’ performance.
argument_dictionaries (dictionary) – Dictionary summarizing all the specific arguments for the benchmark, one dictionary for each classifier.
benchmark (dictionary) – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.
nb_views (int) – The number of views used by the benchmark.
views (list of strings) – List of the names of the used views.
views_indices (list of ints) – List of indices (according to the dataset) of the used views.

Returns:

benchmarkArgumentDictionaries – All the needed arguments for the benchmarks.

Return type:

list of dicts

gen_direcorties_names(directory, stats_iter)

Used to generate the different directories of each iteration if needed.

Parameters:

directory (string) – Path to the results directory.
stats_iter (int) – The number of statistical iterations.

Returns:

directories – Paths to each statistical iteration’s result directory.

Return type:

list of strings
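One way to read "if needed" is that sub-directories are only generated when there is more than one statistical iteration. The sketch below follows that reading; the function name gen_directories_names and the "iter_<k>" naming scheme are hypothetical and not taken from SuMMIT.

```python
import os


def gen_directories_names(directory, stats_iter):
    """Return one result directory path per statistical iteration."""
    if stats_iter > 1:
        # Hypothetical naming scheme: <directory>/iter_1, <directory>/iter_2, ...
        return [os.path.join(directory, "iter_" + str(i + 1))
                for i in range(stats_iter)]
    # A single iteration writes directly into the main result directory.
    return [directory]


print(gen_directories_names("results", 1))
print(gen_directories_names("results", 3))
```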

gen_k_folds(stats_iter, nb_folds, stats_iter_random_states)

Used to generate folds indices for cross validation for each statistical iteration.

Parameters:

stats_iter (integer) – Number of statistical iterations of the benchmark.
nb_folds (integer) – The number of cross-validation folds for the benchmark.
stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.

Returns:

folds_list – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).

Return type:

list of list of sklearn.model_selection.StratifiedKFold

gen_splits(labels, split_ratio, stats_iter_random_states)

Used to generate the train/test splits using one or multiple random states.

Parameters:

labels (numpy.ndarray) – The labels of the dataset.
split_ratio (float) – The ratio of samples between train and test set.
stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.

Returns:

splits – For each statistical iteration a couple of numpy.ndarrays is stored with the indices for the training set and the ones of the testing set.

Return type:

list of lists of numpy.ndarray
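The shape of the returned splits can be illustrated with a simplified sketch. Two assumptions are made here: split_ratio is treated as the fraction of samples placed in the test set, and the samples are shuffled without stratification. The actual function may well stratify by class, so this shows the output structure only.

```python
import numpy as np


def gen_splits(labels, split_ratio, stats_iter_random_states):
    """Return one [train_indices, test_indices] pair per random state."""
    nb_samples = len(labels)
    nb_test = int(round(nb_samples * split_ratio))
    splits = []
    for random_state in stats_iter_random_states:
        # Plain shuffle (not stratified): a simplification for this sketch.
        permutation = random_state.permutation(nb_samples)
        test_indices = np.sort(permutation[:nb_test])
        train_indices = np.sort(permutation[nb_test:])
        splits.append([train_indices, test_indices])
    return splits


labels = np.array([0, 1] * 5)
states = [np.random.RandomState(0), np.random.RandomState(1)]
splits = gen_splits(labels, 0.3, states)
train_indices, test_indices = splits[0]
print(len(splits), len(train_indices), len(test_indices))
```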

get_database_function(name, type_var)

Used to get the right database extraction function according to the type of database and its name

Parameters:

name (string) – Name of the database.
type_var (string) – Type of dataset, hdf5 or csv

Returns:

getDatabase – The function that will be used to extract the database

Return type:

function
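The dispatch can be sketched with stub loaders named after the functions of the get_multiview_db module. The selection rule is an assumption: type_var is taken to be a file extension such as '.hdf5', and only the 'plausible' name is given a dedicated generator, every other name falling back to the classic loaders. The stub bodies are placeholders.

```python
def get_classic_db_hdf5(*args, **kwargs):
    return "classic hdf5"  # placeholder body


def get_classic_db_csv(*args, **kwargs):
    return "classic csv"  # placeholder body


def get_plausible_db_hdf5(*args, **kwargs):
    return "plausible hdf5"  # placeholder body


# Hypothetical registry keyed by (kind, file type without the dot).
LOADERS = {
    ("classic", "hdf5"): get_classic_db_hdf5,
    ("classic", "csv"): get_classic_db_csv,
    ("plausible", "hdf5"): get_plausible_db_hdf5,
}


def get_database_function(name, type_var):
    """Pick the extraction function from the database name and type."""
    kind = "plausible" if name == "plausible" else "classic"
    return LOADERS[(kind, type_var.lstrip("."))]


print(get_database_function("plausible", ".hdf5").__name__)
print(get_database_function("my_dataset", ".csv").__name__)
```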

init_log_file(name, views, cl_type, log, debug, label, result_directory, args)

Used to init the directory where the preds will be stored and the log file.

First this function will check if the result directory already exists (only one per minute is allowed).

If the result directory name is available, it is created, and the log file is initiated.

Parameters:

name (string) – Name of the database.
views (list of strings) – List of the view names that will be used in the benchmark.
cl_type (list of strings) – Type of benchmark that will be made.
log (bool) – Whether to show the log file in console or hide it.
debug (bool) – for debug option
label (str) – The label.
result_directory (str) – Name of the result directory.
add_noise (bool) – Whether to add noise.
noise_std (float) – Standard deviation of the noise.

Returns:

results_directory – Reference to the main results directory for the benchmark.

Return type:

string
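The "only one per minute" constraint is what one would get from a directory name stamped at minute resolution. The sketch below illustrates that idea only; the helper result_directory_name, its path layout and its timestamp format are all hypothetical and not taken from SuMMIT.

```python
import os
from datetime import datetime


def result_directory_name(result_directory, name, cl_type, label, now):
    """Build a result directory path stamped to the minute."""
    # Hypothetical layout: <result_directory>/<name>/<types>/<stamp>_<label>
    stamp = now.strftime("%Y_%m_%d-%H_%M")  # minute resolution, no seconds
    return os.path.join(result_directory, name, "_".join(cl_type),
                        stamp + "_" + label)


first = result_directory_name("results", "plausible", ["multiview"], "_",
                              datetime(2024, 1, 1, 12, 30, 5))
second = result_directory_name("results", "plausible", ["multiview"], "_",
                               datetime(2024, 1, 1, 12, 30, 55))
later = result_directory_name("results", "plausible", ["multiview"], "_",
                              datetime(2024, 1, 1, 12, 31, 0))

# Two runs started within the same minute collide on the same name.
print(first == second, first == later)
```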

init_random_state(random_state_arg, directory)

Used to init a random state. If no random state is specified, it will generate a ‘random’ seed. If random_state_arg is a string containing only numbers, it will be converted into an int to generate a seed.

If random_state_arg is a string with letters, it must be a path to a pickled random state file that will be loaded. The function will also pickle the new random state in a file to be able to retrieve it later.

Parameters:

random_state_arg (None or string) – See function description.
directory (string) – Path to the results directory.

Returns:

random_state – This random state will be used all along the benchmark.

Return type:

numpy.random.RandomState object
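The three cases described above (no argument, numeric string, path to a pickle) can be sketched as follows. The file name random_state.pickle is an assumption made for the example; the name actually written by SuMMIT is not stated in this documentation.

```python
import os
import pickle
import tempfile

import numpy as np


def init_random_state(random_state_arg, directory):
    """Create or load a RandomState, then pickle it in the result directory."""
    if random_state_arg is None:
        # No seed given: let numpy pick one.
        random_state = np.random.RandomState()
    elif random_state_arg.isdigit():
        # String made of digits only: use it as an integer seed.
        random_state = np.random.RandomState(int(random_state_arg))
    else:
        # Anything else is treated as a path to a pickled random state.
        with open(random_state_arg, "rb") as handle:
            random_state = pickle.load(handle)
    # Hypothetical file name, kept so the state can be retrieved later.
    with open(os.path.join(directory, "random_state.pickle"), "wb") as handle:
        pickle.dump(random_state, handle)
    return random_state


with tempfile.TemporaryDirectory() as directory:
    seeded = init_random_state("42", directory)
    saved_path = os.path.join(directory, "random_state.pickle")
    reloaded = init_random_state(saved_path, directory)
    # Both states were captured before any draw, so they agree.
    print(seeded.randint(1000) == reloaded.randint(1000))
```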

init_stats_iter_random_states(stats_iter, random_state)

Used to initialize multiple random states if needed because of multiple statistical iterations of the same benchmark

Parameters:

stats_iter (int) – Number of statistical iterations of the same benchmark done (with a different random state).
random_state (numpy.random.RandomState object) – The random state of the whole experimentation, that will be used to generate the ones for each statistical iteration.

Returns:

stats_iter_random_states – Multiple random states, one for each statistical iteration of the same benchmark.

Return type:

list of numpy.random.RandomState objects
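Deriving the per-iteration states from the experiment-wide one can be sketched as below. The seeding scheme (drawing one integer seed per iteration from the main state) and the single-iteration shortcut are assumptions; they illustrate one reproducible way to do it, not necessarily SuMMIT's.

```python
import numpy as np


def init_stats_iter_random_states(stats_iter, random_state):
    """Return one RandomState per statistical iteration."""
    if stats_iter > 1:
        # Draw one seed per iteration from the main state, so the whole
        # list is reproducible from the main seed alone.
        seeds = random_state.randint(0, 2**31 - 1, size=stats_iter)
        return [np.random.RandomState(int(seed)) for seed in seeds]
    # A single iteration can reuse the main state directly.
    return [random_state]


states = init_stats_iter_random_states(3, np.random.RandomState(42))
again = init_stats_iter_random_states(3, np.random.RandomState(42))
print(len(states))
print([s.randint(100) for s in states] == [s.randint(100) for s in again])
```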

init_views(dataset_var, arg_views)

Used to return the views names that will be used by the benchmark, their indices and all the views names.

Parameters:

dataset_var (HDF5 dataset file) – The full dataset that will be used by the benchmark.
arg_views (list of strings) – The views that will be used by the benchmark (arg).

Returns:

views (list of strings) – Names of the views that will be used by the benchmark.
view_indices (list of ints) – The list of the indices of the view that will be used in the benchmark (according to the dataset).
all_views (list of strings) – Names of all the available views in the dataset.
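The three returned values can be illustrated with a sketch that works on a plain view_dict (view name to index) instead of a dataset object. That substitution, and the choice to fall back to all views when arg_views is None, are assumptions made for the example.

```python
def init_views(view_dict, arg_views):
    """Return the used view names, their indices, and all available names."""
    # All available views, ordered by their index in the dataset.
    all_views = sorted(view_dict, key=view_dict.get)
    if arg_views is None:
        # Assumption: no selection means every view is used.
        views = list(all_views)
    else:
        views = [name for name in arg_views if name in view_dict]
    view_indices = [view_dict[name] for name in views]
    return views, view_indices, all_views


view_dict = {"audio": 0, "image": 1, "text": 2}
print(init_views(view_dict, ["text", "audio"]))
print(init_views(view_dict, None))
```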

parse_the_args(arguments): Used to parse the args entered by the user

summit.multiview_platform.utils.get_multiview_db module

exception DatasetError(*args, **kwargs): Bases: Exception

get_classic_db_csv(views, pathF, nameDB, NB_CLASS, askedLabelsNames, random_state, full=False, add_noise=False, noise_std=0.15, delimiter=',', path_for_new='../data/')

get_classic_db_hdf5(views, path_f, name_DB, nb_class, asked_labels_names, random_state, full=False, add_noise=False, noise_std=0.15, path_for_new='../data/'): Used to load a hdf5 database

get_plausible_db_hdf5(features, path, file_name, nb_class=3, label_names=[b'No', b'Yes', b'Maybe'], random_state=None, full=True, add_noise=False, noise_std=0.15, nb_view=3, nb_samples=100, nb_features=10): Used to generate a plausible dataset to test the algorithms

make_me_noisy(view_data, random_state, percentage=5): used to introduce some noise in the generated data

summit.multiview_platform.utils.hyper_parameter_search module

class CustomDist

Bases: object

multiply(random_number)

class CustomRandint(low=0, high=0, multiplier='')

Bases: CustomDist

Used as a distribution returning an integer between low and high-1. It can be used with a multiplier argument to perform more complex generation, for example 10 e -(randint)

get_nb_possibilities()

rvs(random_state=None)
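The behaviour described for CustomRandint can be sketched as below. CustomRandintSketch is a hypothetical re-implementation: the "e-" multiplier string (meaning 10 to the power of minus the drawn integer) and the high - low count returned by get_nb_possibilities are assumptions based on the description above.

```python
import numpy as np


class CustomRandintSketch:
    """Hypothetical sketch of an integer distribution with a multiplier."""

    def __init__(self, low=0, high=0, multiplier=""):
        self.low = low
        self.high = high
        self.multiplier = multiplier

    def rvs(self, random_state=None):
        rng = random_state if random_state is not None else np.random.RandomState()
        # Integer drawn uniformly in [low, high - 1].
        draw = int(rng.randint(self.low, self.high))
        if self.multiplier == "e-":
            # Assumed convention: return 10 ** -(randint).
            return 10.0 ** (-draw)
        return draw

    def get_nb_possibilities(self):
        # Number of distinct integers that can be drawn.
        return self.high - self.low


rng = np.random.RandomState(0)
print(CustomRandintSketch(1, 5).rvs(rng))
print(CustomRandintSketch(1, 4, multiplier="e-").rvs(rng))
```

Because rvs accepts a random_state, such an object can be passed wherever a scipy-style distribution is expected in a randomized search.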

class CustomUniform(loc=0, state=1, multiplier='')

Bases: CustomDist

Used as a distribution returning a float between loc and loc + scale. It can be used with a multiplier argument to perform more complex generation, for example 10 e -(float)

rvs(random_state=None)

class Grid(estimator, param_grid={}, refit=False, n_jobs=1, scoring=None, cv=None, available_indices=None, view_indices=None, framework='monoview', random_state=None, track_tracebacks=True)

Bases: GridSearchCV, HPSearch

fit(X, y=None, groups=None, **fit_params)

Run fit with all sets of parameters.

Parameters:

X (array-like of shape (n_samples, n_features) or (n_samples, n_samples)) – Training vectors, where n_samples is the number of samples and n_features is the number of features. For precomputed kernel or distance matrix, the expected shape of X is (n_samples, n_samples).
y (array-like of shape (n_samples, n_output) or (n_samples,), default=None) – Target relative to X for classification or regression; None for unsupervised learning.
**params (dict of str -> object) –
Parameters passed to the fit method of the estimator, the scorer, and the CV splitter.

If a fit parameter is an array-like whose length is equal to num_samples then it will be split by cross-validation along with X and y. For example, the sample_weight parameter is split because len(sample_weights) = len(X). However, this behavior does not apply to groups which is passed to the splitter configured via the cv parameter of the constructor. Thus, groups is used to perform the split and determines which samples are assigned to each side of a split.

Returns:

self – Instance of fitted estimator.

Return type:

object

get_candidate_params(X)

set_fit_request(*, groups: bool | None | str = '$UNCHANGED$') → Grid

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: groups (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for groups parameter in fit.
Returns:: self – The updated object.
Return type:: object

class HPSearch

Bases: object

fit_multiview(X, y, groups=None, **fit_params)

gen_report(output_file_name)

get_best_params()

abstractmethod get_candidate_params(X)

get_scoring(metric)

translate_param_distribs(param_distribs)

class Random(estimator, param_distributions=None, n_iter=10, refit=False, n_jobs=1, scoring=None, cv=None, available_indices=None, random_state=None, view_indices=None, framework='monoview', equivalent_draws=True, track_tracebacks=True)

Bases: RandomizedSearchCV, HPSearch

fit(X, y=None, groups=None, **fit_params)

Run fit with all sets of parameters.

Parameters:

X (array-like of shape (n_samples, n_features) or (n_samples, n_samples)) – Training vectors, where n_samples is the number of samples and n_features is the number of features. For precomputed kernel or distance matrix, the expected shape of X is (n_samples, n_samples).
y (array-like of shape (n_samples, n_output) or (n_samples,), default=None) – Target relative to X for classification or regression; None for unsupervised learning.
**params (dict of str -> object) –
Parameters passed to the fit method of the estimator, the scorer, and the CV splitter.

If a fit parameter is an array-like whose length is equal to num_samples then it will be split by cross-validation along with X and y. For example, the sample_weight parameter is split because len(sample_weights) = len(X). However, this behavior does not apply to groups which is passed to the splitter configured via the cv parameter of the constructor. Thus, groups is used to perform the split and determines which samples are assigned to each side of a split.

Returns:

self – Instance of fitted estimator.

Return type:

object

get_candidate_params(X)

get_param_distribs(estimator, user_distribs)

set_fit_request(*, groups: bool | None | str = '$UNCHANGED$') → Random

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: groups (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for groups parameter in fit.
Returns:: self – The updated object.
Return type:: object

translate_randint(args)

translate_uniform(args)

format_params(params, pref='')

summit.multiview_platform.utils.make_file_config module

class ConfigurationMaker(classifier_dict=None)

Bases: object

Finds the name of the classifier to report from the classifier dict

summit.multiview_platform.utils.multiclass module

class MonoviewWrapper: Bases: MultiClassWrapper

class MultiClassWrapper

Bases: object

format_params(params, deep=True)

get_config()

get_interpretation(directory, base_file_name, y_test=None)

set_params(**params): This function is useful in order for the OV_Wrappers to be transparent in terms of parameters. If we remove it the parameters have to be specified as estimator__param. Witch is not relevant for the platform

class MultiviewOVOWrapper(estimator=None, **args)

Bases: MultiviewWrapper, OneVsOneClassifier

fit(X, y, train_indices=None, view_indices=None)

Fit underlying estimators.

Parameters:

X ((sparse) array-like of shape (n_samples, n_features)) – Data.
y (array-like of shape (n_samples,)) – Multi-class targets.

Return type:

self

get_params(deep=True)

Get parameters for this estimator.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params – Parameter names mapped to their values.
Return type:: dict

get_tags(): Get tags of estimateur see sklearn > 1.6.0 _pairwise attribut removed

multiview_decision_function(X, sample_indices, view_indices)

predict(X, sample_indices=None, view_indices=None)

Estimate the best class label for each sample in X.

This is implemented as argmax(decision_function(X), axis=1) which will return the label of the class with most votes by estimators predicting the outcome of a decision for each possible class pair.

Parameters:: X ((sparse) array-like of shape (n_samples, n_features)) – Data.
Returns:: y – Predicted multi-class targets.
Return type:: numpy array of shape [n_samples]
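The voting rule behind this prediction can be sketched for a single sample in plain Python. This is a simplification under stated assumptions: the pairwise winners are given directly as a dictionary, and only votes are counted, without any confidence-based tie-breaking that a real decision function may apply.

```python
from itertools import combinations


def ovo_predict(pairwise_winners, classes):
    """Return the class that wins the most pairwise duels for one sample."""
    votes = {label: 0 for label in classes}
    for pair in combinations(classes, 2):
        # Each pairwise estimator votes for one of its two classes.
        votes[pairwise_winners[pair]] += 1
    # argmax over the vote counts; ties go to the first class in the list.
    return max(classes, key=lambda label: votes[label])


classes = [0, 1, 2]
# Hypothetical outcomes of the three pairwise classifiers for one sample.
pairwise_winners = {(0, 1): 1, (0, 2): 2, (1, 2): 1}
print(ovo_predict(pairwise_winners, classes))
```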

set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') → MultiviewOVOWrapper

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') → MultiviewOVOWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') → MultiviewOVOWrapper

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_indices parameter in predict.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → MultiviewOVOWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class MultiviewOVRWrapper(estimator=None, **args)

Bases: MultiviewWrapper, OneVsRestClassifier

fit(X, y, train_indices=None, view_indices=None)

Fit underlying estimators.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.
y ({array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)) – Multi-class targets. An indicator matrix turns on multilabel classification.
**fit_params (dict) – Parameters passed to the estimator.fit method of each sub-estimator.

Added in version 1.4: Only available if enable_metadata_routing=True. See Metadata Routing User Guide for more details.

Returns:

self – Instance of fitted estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params – Parameter names mapped to their values.
Return type:: dict
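
To illustrate the effect of deep, here is a short sketch using scikit-learn's own OneVsRestClassifier. This is an assumption made for illustration: the summit wrapper may override get_params, so its exact keys can differ.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

ovr = OneVsRestClassifier(LogisticRegression(C=0.5))

# deep=False: only the wrapper's own constructor parameters.
shallow = ovr.get_params(deep=False)

# deep=True: also the sub-estimator's parameters, prefixed with "estimator__".
deep = ovr.get_params(deep=True)
nested_c = deep["estimator__C"]
```

The double-underscore keys returned with deep=True are the same names accepted by set_params and by hyper-parameter search grids.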

predict(X, sample_indices=None, view_indices=None)

Predict multi-class targets using underlying estimators.

Parameters:: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.
Returns:: y – Predicted multi-class targets.
Return type:: {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)

set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') → MultiviewOVRWrapper

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') → MultiviewOVRWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
Returns:: self – The updated object.
Return type:: object

set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') → MultiviewOVRWrapper

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_indices parameter in predict.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → MultiviewOVRWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class MultiviewWrapper(estimator=None, **args): Bases: MultiClassWrapper

class OVOWrapper(estimator, *, n_jobs=None)

Bases: MonoviewWrapper, OneVsOneClassifier

decision_function(X)

Decision function for the OneVsOneClassifier.

The decision value for each sample is its vote count plus a normalized sum of the pairwise classification confidence levels. The added term disambiguates the decision values when all classes receive the same number of votes, which would otherwise produce a tie.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data.

Returns:

Y – Result of calling decision_function on the final estimator.

Changed in version 0.19: output shape changed to (n_samples,) to conform to scikit-learn conventions for binary classification.

Return type:

array-like of shape (n_samples, n_classes) or (n_samples,)
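
The tie-breaking idea described above can be sketched in plain Python. The function name and the scaling constant are assumptions chosen for illustration: dividing by 3 * (|c| + 1) keeps the confidence term strictly inside (-1/3, 1/3), so it can separate tied classes but never outweigh a whole vote.

```python
def ovo_decision_values(votes, confidence_sums):
    """Combine pairwise votes with squashed confidence sums.

    votes[i][k]: number of pairwise duels won by class k for sample i.
    confidence_sums[i][k]: signed sum of pairwise confidences for class k.
    """
    values = []
    for vote_row, conf_row in zip(votes, confidence_sums):
        values.append([
            # The squashed confidence stays in (-1/3, 1/3).
            v + c / (3.0 * (abs(c) + 1.0))
            for v, c in zip(vote_row, conf_row)
        ])
    return values


# All three classes have one vote: the confidence term decides.
tied = ovo_decision_values([[1, 1, 1]], [[0.5, -0.2, 2.0]])[0]
winner_when_tied = tied.index(max(tied))

# Votes differ: even a large opposing confidence cannot flip the order.
clear = ovo_decision_values([[2, 1, 0]], [[-5.0, 5.0, 0.0]])[0]
winner_when_clear = clear.index(max(clear))
```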

get_params(deep=True)

Get parameters for this estimator.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params – Parameter names mapped to their values.
Return type:: dict

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') → OVOWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
Returns:: self – The updated object.
Return type:: object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → OVOWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class OVRWrapper(estimator, *, n_jobs=None, verbose=0)

Bases: MonoviewWrapper, OneVsRestClassifier

get_params(deep=True)

Get parameters for this estimator.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params – Parameter names mapped to their values.
Return type:: dict

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') → OVRWrapper

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
Returns:: self – The updated object.
Return type:: object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → OVRWrapper

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

get_mc_estim(estimator, random_state, y=None, multiview=False, multiclass=False)

Returns a multiclass-compatible estimator if the given one does not natively support multiclass. If predict_proba is available in the requested estimator, a One-Versus-Rest wrapper is returned; otherwise, a One-Versus-One wrapper is returned.

To handle multiview algorithms, multiview wrappers are implemented separately.

Parameters:

estimator (sklearn-like estimator) – The requested estimator.
y (numpy.array) – The labels of the problem.
random_state (numpy.random.RandomState object) – The random state, used to generate a fake multiclass problem.
multiview (bool) – If True, multiview-compatible wrappers are returned.

Returns:

estimator – Either the requested estimator, or a multiclass-compatible wrapper around it.

Return type:

sklearn-like estimator
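
The wrapper selection rule described for get_mc_estim can be sketched with scikit-learn's stock wrappers. This is not summit's implementation: the helper name wrap_for_multiclass is hypothetical, the check for native multiclass support is left out, and the monoview and multiview wrappers are replaced by OneVsRestClassifier and OneVsOneClassifier.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC


def wrap_for_multiclass(estimator):
    """Pick One-Versus-Rest when predict_proba exists, else One-Versus-One."""
    if hasattr(estimator, "predict_proba"):
        return OneVsRestClassifier(estimator)
    return OneVsOneClassifier(estimator)


# LogisticRegression exposes predict_proba, LinearSVC does not.
ovr = wrap_for_multiclass(LogisticRegression())
ovo = wrap_for_multiclass(LinearSVC())
```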

summit.multiview_platform.utils.multiview_result_analysis module

summit.multiview_platform.utils.organization module

secure_file_path(file_name)

summit.multiview_platform.utils.transformations module

sign_labels(labels)

Returns a label array with (-1, 1) as labels. If labels already consists of (-1, 1), it is returned unchanged. If labels consists of (0, 1), every 0 is replaced with -1.

Parameters:: labels (np.array) – The original label array.
Returns:: A np.array with labels made of (-1, 1).

unsign_labels(labels)

The inverse of sign_labels: converts labels made of (-1, 1) back to (0, 1).

Parameters:: labels
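
The two label transformations can be sketched with NumPy as follows. The function names carry a _sketch suffix to mark them as illustrative stand-ins: this shows the documented behaviour under the assumption of binary integer labels, not summit's actual code.

```python
import numpy as np


def sign_labels_sketch(labels):
    """Map 0 to -1; labels already in (-1, 1) pass through unchanged."""
    labels = np.asarray(labels)
    return np.where(labels == 0, -1, labels)


def unsign_labels_sketch(labels):
    """Map -1 back to 0; labels already in (0, 1) pass through unchanged."""
    labels = np.asarray(labels)
    return np.where(labels == -1, 0, labels)


original = np.array([0, 1, 1, 0])
signed = sign_labels_sketch(original)
restored = unsign_labels_sketch(signed)
```

Applying one function after the other returns the original labels, which is what makes them inverses on binary label arrays.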

Module contents