summit.multiview_platform.utils package
Submodules
summit.multiview_platform.utils.base module
- class BaseClassifier
Bases:
BaseEstimator
- accepts_multi_class(random_state, n_samples=10, dim=2, n_classes=3)
Base function to test if the classifier accepts a multiclass task. It is highly recommended to override it with a simple method that returns True or False in the classifier’s module, as it will speed up the benchmark.
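A minimal sketch of such an override, assuming a classifier whose underlying algorithm is known to handle multiclass tasks. BaseClassifierStandIn and MyClassifier are hypothetical names used so the snippet runs without SuMMIT installed; in a real classifier module the class would inherit from BaseClassifier instead.

```python
class BaseClassifierStandIn:
    """Hypothetical stand-in for BaseClassifier, so the sketch runs on its own."""

    def accepts_multi_class(self, random_state, n_samples=10, dim=2, n_classes=3):
        # Placeholder for the generic (slower) base check.
        raise NotImplementedError


class MyClassifier(BaseClassifierStandIn):
    """Hypothetical classifier that is known to support multiclass tasks."""

    def accepts_multi_class(self, random_state, n_samples=10, dim=2, n_classes=3):
        # Answer directly instead of running the generic check.
        return True


clf = MyClassifier()
print(clf.accepts_multi_class(random_state=None))
```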
- gen_best_params(detector)
Returns the best parameters of the detector.
- Returns:
best param – value
- Return type:
dictionary with param name as key and best parameters
- gen_distribs()
- gen_params_from_detector(detector)
- get_base_estimator(estimator, estimator_config)
- get_config()
Generates a string containing all the information about the classifier’s configuration
- get_interpretation(directory, base_file_name, y_test, feature_ids, multi_class=False)
Base method that returns an empty string if there is no interpretation method in the classifier’s module
- params_to_string()
Formats the parameters of the classifier as a string
- to_str(param_name)
Formats a parameter into a string
- class ResultAnalyser(classifier, classification_indices, k_folds, hps_method, metrics_dict, n_iter, class_label_names, pred, directory, base_file_name, labels, database_name, nb_cores, duration, feature_ids)
Bases:
object
A shared result analysis tool for mono and multiview classifiers. The main utility of this class is to generate a txt file summarizing the results and possible interpretation for the classifier.
- analyze()
Main function used in the monoview and multiview classification scripts
- Returns:
string_analysis (a string that will be stored in the log and in a txt file)
image_analysis (a list of images to save)
metric_scores (a dictionary of {metric: (train_score, test_score)} used in later analysis)
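As an illustration of the metric_scores structure returned by analyze(), the sketch below walks a {metric: (train_score, test_score)} dictionary. The score values and the summarize helper are hypothetical; only the dictionary shape comes from the description above.

```python
# Hypothetical values; only the {metric: (train_score, test_score)} shape
# is taken from the documentation.
metric_scores = {
    "accuracy_score": (0.95, 0.81),
    "f1_score": (0.93, 0.78),
}


def summarize(metric_scores):
    """Return one 'metric: train=..., test=...' line per metric."""
    lines = []
    for metric, (train_score, test_score) in metric_scores.items():
        lines.append(f"{metric}: train={train_score:.2f}, test={test_score:.2f}")
    return "\n".join(lines)


summary = summarize(metric_scores)
print(summary)
```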
- get_all_metrics_scores()
Get the scores for all the metrics in the list
- abstractmethod get_base_string()
- get_classifier_config_string()
Formats the information about the classifier and its configuration
- Return type:
A string explaining the classifier’s configuration
- get_db_config_string()
Generates a string, formatting all the information on the database
- Return type:
db_config_string string, formatting all the information on the database
- get_metric_score(metric, metric_kwargs)
Get the train and test scores for a specific metric and its arguments
- Parameters:
metric (name of the metric, must be implemented in metrics)
metric_kwargs (the dictionary containing the arguments for the metric.)
- Return type:
train_score, test_score
- abstractmethod get_view_specific_info()
- print_metric_score()
Generates a string, formatting the metrics configuration and scores
- Parameters:
metric_scores (dictionary of train_score, test_score for each metric)
metric_list (list of metrics)
- Return type:
metric_score_string string formatting all metric results
- get_metric(metrics_dict)
Fetches the metric module in the metrics package
- get_names(classed_list)
summit.multiview_platform.utils.configuration module
- get_the_args(path_to_config_file='/home/runner/work/summit/summit/config_files/config.yml')
Extracts the arguments from a ‘.yml’ file.
- Parameters:
path_to_config_file (str, path to the yml file containing the configuration)
- Returns:
yaml_config (dict, the dictionary containing the configuration for the benchmark)
- pass_default_config(log=True, name=['plausible'], label='_', file_type='.hdf5', views=None, pathf='/home/runner/work/summit/summit/data/', nice=0, random_state=42, nb_cores=1, full=True, debug=False, add_noise=False, noise_std=0.0, res_dir='/home/runner/work/summit/summit/results/', track_tracebacks=True, split=0.49, nb_folds=5, nb_class=None, classes=None, type=['multiview'], algos_monoview=['all'], algos_multiview=['svm_jumbo_fusion'], stats_iter=2, metrics={'accuracy_score': {}, 'f1_score': {}}, metric_princ='accuracy_score', hps_type='Random', hps_iter=1, hps_kwargs={'equivalent_draws': True, 'n_iter': 10}, **kwargs)
- save_config(directory, arguments)
Saves the config file in the result directory.
summit.multiview_platform.utils.dataset module
- class Dataset
Bases:
object
This is the base class for all the types of multiview datasets of SuMMIT.
- check_selected_label_names(nb_labels=None, selected_label_names=None, random_state=RandomState(MT19937) at 0x7F66D59EFD40)
- abstractmethod filter(labels, label_names, sample_indices, view_names, path=None)
- gen_feat_id()
- abstractmethod get_label_names(sample_indices=None)
- abstractmethod get_labels(sample_indices=None)
- abstractmethod get_nb_samples()
- get_shape(view_index=0, sample_indices=None)
Gets the shape of the needed view on the asked samples
- Parameters:
view_index (int) – The index of the view to extract
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
- Return type:
Tuple containing the shape
- abstractmethod get_v(view_index, sample_indices=None)
- init_sample_indices(sample_indices=None)
If no sample indices are provided, selects all the available samples.
- Parameters:
sample_indices (np.array,) – An array-like containing the indices of the samples.
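The default-to-all behaviour can be sketched as follows. This is an assumption-laden illustration: init_sample_indices_sketch is a hypothetical free function that takes the number of samples explicitly and uses plain lists, whereas the real method works on the dataset object.

```python
def init_sample_indices_sketch(nb_samples, sample_indices=None):
    """Return every index when none is given, else the given indices unchanged."""
    if sample_indices is None:
        # No selection requested: use all the available samples.
        return list(range(nb_samples))
    return sample_indices


print(init_sample_indices_sketch(4))
print(init_sample_indices_sketch(4, [1, 3]))
```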
- select_labels(selected_label_names)
- select_views_and_labels(nb_labels=None, selected_label_names=None, random_state=None, view_names=None, path_for_new='../data/')
- to_numpy_array(sample_indices=None, view_indices=None)
Concatenates the needed views in one big numpy array while saving the limits of each view in a list, to be able to retrieve them later.
- Parameters:
sample_indices (array like) – The indices of the samples to extract from the dataset
view_indices (array like) – The indices of the view to concatenate in the numpy array
- Returns:
concat_views (numpy array,) – The numpy array containing all the needed views.
view_limits (list of int) – The limits of each slice used to extract the views.
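The idea of concatenating views while keeping their limits can be sketched with NumPy as below. concat_views_sketch is a hypothetical helper working on a list of in-memory arrays; whether the real view_limits list starts with 0, as assumed here, is not stated in the documentation.

```python
import numpy as np


def concat_views_sketch(views):
    """Concatenate 2D views column-wise and record each view's column limits."""
    view_limits = [0]  # assumption: limits start at column 0
    for view in views:
        view_limits.append(view_limits[-1] + view.shape[1])
    concat_views = np.concatenate(views, axis=1)
    return concat_views, view_limits


# Two hypothetical views describing the same 4 samples.
views = [np.zeros((4, 2)), np.ones((4, 3))]
concat_views, view_limits = concat_views_sketch(views)

# Recover the second view from the concatenated array using its limits.
second_view = concat_views[:, view_limits[1]:view_limits[2]]
print(concat_views.shape, view_limits)
```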
- class HDF5Dataset(views=None, labels=None, are_sparse=False, file_name='dataset.hdf5', view_names=None, path='', hdf5_file=None, labels_names=None, is_temp=False, sample_ids=None, feature_ids=None)
Bases:
Dataset
Dataset class
This is used to encapsulate the multiview dataset while keeping it stored on the disk instead of in RAM.
- Parameters:
views (list of numpy arrays or None) – The list containing each view of the dataset as a numpy array of shape (nb samples, nb features).
labels (numpy array or None) – The labels for the multiview dataset, of shape (nb samples, ).
are_sparse (list of bool, or None) – The list of boolean telling if each view is sparse or not.
file_name (str, or None) – The name of the hdf5 file that will be created to store the multiview dataset.
view_names (list of str, or None) – The name of each view.
path (str, or None) – The path where the hdf5 dataset file will be stored
hdf5_file (h5py.File object, or None) – If not None, the dataset will be imported directly from this file.
labels_names (list of str, or None) – The name for each unique value of the labels given in labels.
is_temp (bool) – Used if a temporary dataset has to be stored by the benchmark.
- dataset
The h5py file object that points to the hdf5 dataset on the disk.
- Type:
h5py.File object
- nb_view
The number of views in the dataset.
- Type:
int
- view_dict
The dictionary with the name of each view as keys and their indices as values
- Type:
dict
- add_gaussian_noise(random_state, path, noise_std=0.15)
- copy_view(target_dataset=None, source_view_name=None, target_view_index=None, sample_indices=None)
- filter(labels, label_names, sample_indices, view_names, path=None)
- get_label_names(decode=False, sample_indices=None)
Used to get the list of the label names for the given set of samples
- Parameters:
decode (bool) – If True, will decode the label names before listing them
sample_indices (numpy.ndarray) – The array containing the indices of the needed samples
- Returns:
list
selected labels’ names
- get_labels(sample_indices=None)
Gets the label array for the asked samples
- Parameters:
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
- Return type:
numpy.ndarray containing the labels of the asked samples
- get_name()
Gets the name of the dataset hdf5 file
- get_nb_class(sample_indices=None)
Gets the number of classes of the dataset for the asked samples
- Parameters:
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
- Returns:
The number of classes
- Return type:
int
- get_nb_samples()
Used to get the number of samples available in the dataset
- Return type:
int
- get_v(view_index, sample_indices=None)
Extracts the view and returns a numpy.ndarray containing the description of the samples specified in sample_indices
- Parameters:
view_index (int) – The index of the view to extract
sample_indices (numpy.ndarray) – The array containing the indices of the samples to extract.
- Return type:
A numpy.ndarray containing the view data for the needed samples
- get_view_dict()
Returns the dictionary containing view indices as keys and their corresponding names as values
- get_view_name(view_idx)
Method to get a view’s name from its index.
- Parameters:
view_idx (int) – The index of the view in the dataset
- Return type:
The view’s name.
- init_attrs()
Used to init the attributes that are modified when self.dataset changes
- init_view_names(view_names=None)
- rm()
Method used to delete the dataset file on the disk if the dataset is temporary.
- update_hdf5_dataset(path)
- class RAMDataset(views=None, labels=None, are_sparse=False, view_names=None, labels_names=None, sample_ids=None, name=None, feature_ids=None)
Bases:
Dataset
- filter(labels, label_names, sample_indices, view_names, path=None)
- get_label_names(sample_indices=None, decode=True)
- get_labels(sample_indices=None)
- get_name()
- get_nb_class(sample_indices=None)
- get_nb_samples()
- get_v(view_index, sample_indices=None)
- get_view_dict()
- get_view_name(view_idx)
- init_attrs()
- confirm(resp=True, timeout=15)
Used to process answer
- copy_hdf5(pathF, name, nbCores)
Used to copy a HDF5 database in case of multicore computing
- datasets_already_exist(pathF, name, nbCores)
Used to check if it’s necessary to copy datasets
- delete_HDF5(benchmarkArgumentsDictionaries, nbCores, dataset)
Used to delete temporary copies at the end of the benchmark
- extract_subset(matrix, used_indices)
Used to extract a subset of a matrix even if it’s sparse (work in progress)
- get_samples_views_indices(dataset, samples_indices, view_indices)
This function is used to get all the samples indices and view indices if needed
- init_multiple_datasets(path_f, name, nb_cores)
Used to create copies of the dataset if multicore computation is used.
This is a temporary solution to fix the sharing memory issue with HDF5 datasets.
- Parameters:
path_f (string) – Path to the original dataset directory
name (string) – Name of the dataset
nb_cores (int) – The number of threads that the benchmark can use
- Returns:
datasetFiles – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.
- Return type:
None
- input_(timeout=15)
used as a UI to stop if too much HDD space will be used
- is_just_number(string)
summit.multiview_platform.utils.execution module
- find_dataset_names(path, type, names)
This function browses the dataset directory and extracts all the needed dataset names.
- gen_argument_dictionaries(labels_dictionary, directories, splits, hyper_param_search, args, k_folds, stats_iter_random_states, metrics, argument_dictionaries, benchmark, views, views_indices)
Used to generate a dictionary for each benchmark.
One for each label combination (if multiclass), for each statistical iteration, generates a dictionary with all necessary information to perform the benchmark
- Parameters:
labels_dictionary (dictionary) – Dictionary mapping labels indices to labels names.
directories (list of strings) – List of the paths to the result directories for each statistical iteration.
multiclass_labels (list of lists of numpy.ndarray) – For each label couple, for each statistical iteration a triplet of numpy.ndarrays is stored with the indices for the biclass training set, the ones for the biclass testing set and the ones for the multiclass testing set.
labels_combinations (list of lists of numpy.ndarray) – Each original couple of different labels.
indices_multiclass (list of lists of numpy.ndarray) – For each combination, contains a biclass labels numpy.ndarray with the 0/1 labels of combination.
hyper_param_search (string) – Type of hyper parameter optimization method
args (parsed args objects) – All the args passed by the user.
k_folds (list of list of sklearn.model_selection.StratifiedKFold) – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).
stats_iter_random_states (list of numpy.random.RandomState objects) – Multiple random states, one for each statistical iteration of the same benchmark.
metrics (list of lists) – metrics that will be used to evaluate the algorithms performance.
argument_dictionaries (dictionary) – Dictionary summarizing all the specific arguments for the benchmark, one dictionary for each classifier.
benchmark (dictionary) – Dictionary summarizing which mono- and multiview algorithms will be used in the benchmark.
nb_views (int) – The number of views used by the benchmark.
views (list of strings) – List of the names of the used views.
views_indices (list of ints) – List of indices (according to the dataset) of the used views.
- Returns:
benchmarkArgumentDictionaries – All the needed arguments for the benchmarks.
- Return type:
list of dicts
- gen_direcorties_names(directory, stats_iter)
Used to generate the different directories of each iteration if needed.
- Parameters:
directory (string) – Path to the results directory.
stats_iter (int) – The number of statistical iterations.
- Returns:
directories – Paths to each statistical iterations result directory.
- Return type:
list of strings
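A hedged sketch of what "if needed" could mean here: one sub-directory per statistical iteration when there are several, and the results directory itself otherwise. The function name gen_directories_sketch and the iter_N naming scheme are assumptions made for illustration, not the platform's actual layout.

```python
import os


def gen_directories_sketch(directory, stats_iter):
    """Return one result path per statistical iteration."""
    if stats_iter > 1:
        # Hypothetical naming scheme: <directory>/iter_1, <directory>/iter_2, ...
        return [os.path.join(directory, f"iter_{i + 1}") for i in range(stats_iter)]
    # A single iteration can write directly into the results directory.
    return [directory]


print(gen_directories_sketch("results", 1))
print(gen_directories_sketch("results", 3))
```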
- gen_k_folds(stats_iter, nb_folds, stats_iter_random_states)
Used to generate folds indices for cross validation for each statistical iteration.
- Parameters:
stats_iter (integer) – Number of statistical iterations of the benchmark.
nb_folds (integer) – The number of cross-validation folds for the benchmark.
stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.
- Returns:
folds_list – For each statistical iteration a Kfold stratified (keeping the ratio between classes in each fold).
- Return type:
list of list of sklearn.model_selection.StratifiedKFold
- gen_splits(labels, split_ratio, stats_iter_random_states)
Used to generate the train/test splits using one or multiple random states.
- Parameters:
labels (numpy.ndarray) – The labels of the dataset.
split_ratio (float) – The ratio of samples between train and test set.
stats_iter_random_states (list of numpy.random.RandomState) – The random states for each statistical iteration.
- Returns:
splits – For each statistical iteration a couple of numpy.ndarrays is stored with the indices for the training set and the ones of the testing set.
- Return type:
list of lists of numpy.ndarray
- get_database_function(name, type_var)
Used to get the right database extraction function according to the type of database and its name
- Parameters:
name (string) – Name of the database.
type_var (string) – type of dataset hdf5 or csv
- Returns:
getDatabase – The function that will be used to extract the database
- Return type:
function
- init_log_file(name, views, cl_type, log, debug, label, result_directory, args)
Used to init the directory where the preds will be stored and the log file.
First this function will check if the result directory already exists (only one per minute is allowed).
If the result directory name is available, it is created, and the logfile is initiated.
- Parameters:
name (string) – Name of the database.
views (list of strings) – List of the view names that will be used in the benchmark.
cl_type (list of strings) – Type of benchmark that will be made.
log (bool) – Whether to show the log file in console or hide it.
debug (bool) – for debug option
label (str for label)
result_directory (str name of the result directory)
add_noise (bool for add noise)
noise_std (level of std noise)
- Returns:
results_directory – Reference to the main results directory for the benchmark.
- Return type:
string
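The "one result directory per minute" rule can be illustrated with a minute-resolution timestamp in the directory name. This is only a sketch: make_result_directory_sketch, the started_ prefix and the timestamp format are hypothetical, and raising an error on a name clash is an assumption about what happens when the name is not available.

```python
import os
import tempfile
from datetime import datetime


def make_result_directory_sketch(base_directory, name, now=None):
    """Create <base>/<name>/started_<timestamp> with minute resolution."""
    now = now or datetime.now()
    stamp = now.strftime("%Y_%m_%d-%H_%M")  # no seconds: one name per minute
    path = os.path.join(base_directory, name, "started_" + stamp)
    # exist_ok=False makes a second call within the same minute fail.
    os.makedirs(path, exist_ok=False)
    return path


with tempfile.TemporaryDirectory() as tmp:
    created = make_result_directory_sketch(tmp, "plausible", datetime(2024, 1, 1, 12, 30))
    print(os.path.basename(created))
```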
- init_random_state(random_state_arg, directory)
Used to init a random state. If no random state is specified, it will generate a ‘random’ seed. If random_state_arg is a string containing only numbers, it will be converted to an int to generate a seed. If random_state_arg is a string with letters, it must be a path to a pickled random state file that will be loaded. The function will also pickle the new random state in a file to be able to retrieve it later.
- Parameters:
random_state_arg (None or string) – See function description.
directory (string) – Path to the results directory.
- Returns:
random_state – This random state will be used throughout the benchmark.
- Return type:
numpy.random.RandomState object
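The three cases described for random_state_arg can be sketched as below. init_random_state_sketch is a hypothetical re-creation, and the pickle file name random_state.pickle is an assumption; only the None / digits / path logic comes from the description.

```python
import os
import pickle
import tempfile

import numpy as np


def init_random_state_sketch(random_state_arg, directory):
    """Build a RandomState from None, a numeric string, or a pickle path."""
    if random_state_arg is None:
        random_state = np.random.RandomState()  # unpredictable seed
    elif str(random_state_arg).isdigit():
        random_state = np.random.RandomState(int(random_state_arg))
    else:
        # Anything else is treated as a path to a pickled random state.
        with open(random_state_arg, "rb") as handle:
            random_state = pickle.load(handle)
    # Hypothetical file name; saved so the state can be retrieved later.
    with open(os.path.join(directory, "random_state.pickle"), "wb") as handle:
        pickle.dump(random_state, handle)
    return random_state


with tempfile.TemporaryDirectory() as tmp:
    seeded = init_random_state_sketch("42", tmp)
    reloaded = init_random_state_sketch(os.path.join(tmp, "random_state.pickle"), tmp)
    # Both objects start from the same state, so they produce the same draws.
    same_draws = seeded.randint(0, 100, 5).tolist() == reloaded.randint(0, 100, 5).tolist()
    print(same_draws)
```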
- init_stats_iter_random_states(stats_iter, random_state)
Used to initialize multiple random states if needed because of multiple statistical iteration of the same benchmark
- Parameters:
stats_iter (int) – Number of statistical iterations of the same benchmark done (with a different random state).
random_state (numpy.random.RandomState object) – The random state of the whole experimentation, that will be used to generate the ones for each statistical iteration.
- Returns:
stats_iter_random_states – Multiple random states, one for each statistical iteration of the same benchmark.
- Return type:
list of numpy.random.RandomState objects
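One way to derive per-iteration random states from the main one is to draw a seed for each iteration, as sketched below. This is an assumption about the mechanism, not the platform's confirmed implementation, and init_stats_iter_random_states_sketch is a hypothetical name.

```python
import numpy as np


def init_stats_iter_random_states_sketch(stats_iter, random_state):
    """Draw one seed per statistical iteration from the main random state."""
    return [
        np.random.RandomState(random_state.randint(0, 2**31 - 1))
        for _ in range(stats_iter)
    ]


states = init_stats_iter_random_states_sketch(3, np.random.RandomState(42))
again = init_stats_iter_random_states_sketch(3, np.random.RandomState(42))
# The same main seed gives the same per-iteration states.
print([s.randint(0, 100) for s in states] == [s.randint(0, 100) for s in again])
```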
- init_views(dataset_var, arg_views)
Used to return the views names that will be used by the benchmark, their indices and all the views names.
- Parameters:
dataset_var (HDF5 dataset file) – The full dataset that will be used by the benchmark.
arg_views (list of strings) – The views that will be used by the benchmark (arg).
- Returns:
views (list of strings) – Names of the views that will be used by the benchmark.
view_indices (list of ints) – The list of the indices of the view that will be used in the benchmark (according to the dataset).
all_views (list of strings) – Names of all the available views in the dataset.
- parse_the_args(arguments)
Used to parse the args entered by the user
summit.multiview_platform.utils.get_multiview_db module
- exception DatasetError(*args, **kwargs)
Bases:
Exception
- get_classic_db_csv(views, pathF, nameDB, NB_CLASS, askedLabelsNames, random_state, full=False, add_noise=False, noise_std=0.15, delimiter=',', path_for_new='../data/')
- get_classic_db_hdf5(views, path_f, name_DB, nb_class, asked_labels_names, random_state, full=False, add_noise=False, noise_std=0.15, path_for_new='../data/')
Used to load a hdf5 database
- get_plausible_db_hdf5(features, path, file_name, nb_class=3, label_names=[b'No', b'Yes', b'Maybe'], random_state=None, full=True, add_noise=False, noise_std=0.15, nb_view=3, nb_samples=100, nb_features=10)
Used to generate a plausible dataset to test the algorithms
- make_me_noisy(view_data, random_state, percentage=5)
Used to introduce some noise in the generated data
summit.multiview_platform.utils.hyper_parameter_search module
- class CustomRandint(low=0, high=0, multiplier='')
Bases:
CustomDist
Used as a distribution returning an integer between low and high-1. It can be used with a multiplier argument to perform more complex generation, for example 10 e -(randint)
- get_nb_possibilities()
- rvs(random_state=None)
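The behaviour described for CustomRandint can be approximated as follows. RandintSketch is a hypothetical re-creation: it uses the standard library random.Random as a stand-in for a NumPy random state, and the "e-" multiplier value (meaning 10 to the power of minus the drawn integer) is an assumed encoding of the "10 e -(randint)" example.

```python
import random


class RandintSketch:
    """Hypothetical re-creation of the idea behind CustomRandint."""

    def __init__(self, low=0, high=0, multiplier=""):
        self.low = low
        self.high = high
        self.multiplier = multiplier

    def rvs(self, random_state=None):
        rng = random.Random() if random_state is None else random_state
        value = rng.randrange(self.low, self.high)  # integer in [low, high - 1]
        if self.multiplier == "e-":  # assumed multiplier encoding
            return 10.0 ** -value
        return value

    def get_nb_possibilities(self):
        # Number of distinct integers that can be drawn.
        return self.high - self.low


rng = random.Random(0)
print(RandintSketch(1, 5).rvs(rng))
print(RandintSketch(1, 5, multiplier="e-").rvs(rng))
```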
- class CustomUniform(loc=0, state=1, multiplier='')
Bases:
CustomDist
Used as a distribution returning a float between loc and loc + scale. It can be used with a multiplier argument to perform more complex generation, for example 10 e -(float)
- rvs(random_state=None)
- class Grid(estimator, param_grid={}, refit=False, n_jobs=1, scoring=None, cv=None, available_indices=None, view_indices=None, framework='monoview', random_state=None, track_tracebacks=True)
Bases:
GridSearchCV, HPSearch
- fit(X, y=None, groups=None, **fit_params)
Run fit with all sets of parameters.
- Parameters:
X (array-like of shape (n_samples, n_features) or (n_samples, n_samples)) – Training vectors, where n_samples is the number of samples and n_features is the number of features. For precomputed kernel or distance matrix, the expected shape of X is (n_samples, n_samples).
y (array-like of shape (n_samples, n_output) or (n_samples,), default=None) – Target relative to X for classification or regression; None for unsupervised learning.
**params (dict of str -> object) – Parameters passed to the fit method of the estimator, the scorer, and the CV splitter. If a fit parameter is an array-like whose length is equal to num_samples then it will be split by cross-validation along with X and y. For example, the sample_weight parameter is split because len(sample_weights) = len(X). However, this behavior does not apply to groups which is passed to the splitter configured via the cv parameter of the constructor. Thus, groups is used to perform the split and determines which samples are assigned to each side of a split.
- Returns:
self – Instance of fitted estimator.
- Return type:
object
- get_candidate_params(X)
- set_fit_request(*, groups: bool | None | str = '$UNCHANGED$') -> Grid
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
groups (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for groups parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- class HPSearch
Bases:
object
- fit_multiview(X, y, groups=None, **fit_params)
- gen_report(output_file_name)
- get_best_params()
- abstractmethod get_candidate_params(X)
- get_scoring(metric)
- translate_param_distribs(param_distribs)
- class Random(estimator, param_distributions=None, n_iter=10, refit=False, n_jobs=1, scoring=None, cv=None, available_indices=None, random_state=None, view_indices=None, framework='monoview', equivalent_draws=True, track_tracebacks=True)
Bases:
RandomizedSearchCV, HPSearch
- fit(X, y=None, groups=None, **fit_params)
Run fit with all sets of parameters.
- Parameters:
X (array-like of shape (n_samples, n_features) or (n_samples, n_samples)) – Training vectors, where n_samples is the number of samples and n_features is the number of features. For precomputed kernel or distance matrix, the expected shape of X is (n_samples, n_samples).
y (array-like of shape (n_samples, n_output) or (n_samples,), default=None) – Target relative to X for classification or regression; None for unsupervised learning.
**params (dict of str -> object) – Parameters passed to the fit method of the estimator, the scorer, and the CV splitter. If a fit parameter is an array-like whose length is equal to num_samples then it will be split by cross-validation along with X and y. For example, the sample_weight parameter is split because len(sample_weights) = len(X). However, this behavior does not apply to groups which is passed to the splitter configured via the cv parameter of the constructor. Thus, groups is used to perform the split and determines which samples are assigned to each side of a split.
- Returns:
self – Instance of fitted estimator.
- Return type:
object
- get_candidate_params(X)
- get_param_distribs(estimator, user_distribs)
- set_fit_request(*, groups: bool | None | str = '$UNCHANGED$') -> Random
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
groups (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for groups parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- translate_randint(args)
- translate_uniform(args)
- format_params(params, pref='')
summit.multiview_platform.utils.make_file_config module
- class ConfigurationMaker(classifier_dict=None)
Bases:
object
Finds the name of the classifier to report from the classifier dict
summit.multiview_platform.utils.multiclass module
- class MonoviewWrapper
Bases:
MultiClassWrapper
- class MultiClassWrapper
Bases:
object
- format_params(params, deep=True)
- get_config()
- get_interpretation(directory, base_file_name, y_test=None)
- set_params(**params)
This function is needed for the OV_Wrappers to be transparent in terms of parameters. If we remove it, the parameters have to be specified as estimator__param, which is not relevant for the platform.
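The transparency idea can be sketched with two small stand-in classes: the wrapper forwards plain parameter names to the estimator it wraps, so callers do not need the estimator__param form. InnerEstimator and WrapperSketch are hypothetical and do not reproduce the actual wrapper code.

```python
class InnerEstimator:
    """Hypothetical wrapped estimator with a single parameter."""

    def __init__(self, depth=1):
        self.depth = depth

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


class WrapperSketch:
    """Hypothetical wrapper that forwards parameters to its estimator."""

    def __init__(self, estimator):
        self.estimator = estimator

    def set_params(self, **params):
        # Plain names go straight to the wrapped estimator, so callers
        # write depth=3 rather than estimator__depth=3.
        self.estimator.set_params(**params)
        return self


wrapper = WrapperSketch(InnerEstimator())
wrapper.set_params(depth=3)
print(wrapper.estimator.depth)
```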
- class MultiviewOVOWrapper(estimator=None, **args)
Bases:
MultiviewWrapper, OneVsOneClassifier
- fit(X, y, train_indices=None, view_indices=None)
Fit underlying estimators.
- Parameters:
X ((sparse) array-like of shape (n_samples, n_features)) – Data.
y (array-like of shape (n_samples,)) – Multi-class targets.
- Return type:
self
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- get_tags()
Gets the tags of the estimator (see sklearn > 1.6.0, where the _pairwise attribute was removed)
- multiview_decision_function(X, sample_indices, view_indices)
- predict(X, sample_indices=None, view_indices=None)
Estimate the best class label for each sample in X.
This is implemented as argmax(decision_function(X), axis=1) which will return the label of the class with most votes by estimators predicting the outcome of a decision for each possible class pair.
- Parameters:
X ((sparse) array-like of shape (n_samples, n_features)) – Data.
- Returns:
y – Predicted multi-class targets.
- Return type:
numpy array of shape [n_samples]
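The argmax step can be illustrated on its own with NumPy. The decision values and class names below are made up for the example; in practice the values come from the fitted pairwise estimators.

```python
import numpy as np

# Hypothetical decision values: one row per sample, one column per class.
decision_values = np.array([
    [0.2, 1.7, 0.1],
    [2.3, 0.4, 0.3],
])
classes = np.array(["No", "Yes", "Maybe"])

# For each sample, keep the class whose column has the highest value.
predicted = classes[np.argmax(decision_values, axis=1)]
print(predicted)
```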
- set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') -> MultiviewOVOWrapper
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') -> MultiviewOVOWrapper
Request metadata passed to the partial_fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
- Returns:
self – The updated object.
- Return type:
object
- set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper
Request metadata passed to the
predict
method.Note that this method is only relevant if
enable_metadata_routing=True
(seesklearn.set_config()
). Please see User Guide on how the routing mechanism works.The options for each parameter are:
True
: metadata is requested, and passed topredict
if provided. The request is ignored if metadata is not provided.False
: metadata is not requested and the meta-estimator will not pass it topredict
.None
: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str
: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED
) retains the existing request. This allows you to change the request for some parameters and not others.Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a
Pipeline
. Otherwise it has no effect.- Parameters:
sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
sample_indices
parameter inpredict
.view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
view_indices
parameter inpredict
.
- Returns:
self – The updated object.
- Return type:
object
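The four request options listed above can be illustrated with a small toy model. This is only a sketch of the documented semantics, not scikit-learn's routing implementation: the function name resolve_request, its arguments, and the use of ValueError for the None case are assumptions made for illustration.

```python
def resolve_request(request, name, provided):
    """Toy model of the four request options for one metadata parameter.

    request  -- True, False, None, or a str alias
    name     -- the parameter name the consuming method expects
    provided -- dict of metadata the caller handed to the meta-estimator
    Returns the keyword arguments that would be forwarded.
    """
    if request is True:
        # Requested: forward it when present, silently skip it when absent.
        return {name: provided[name]} if name in provided else {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and supplying it is treated as an error.
        if name in provided:
            raise ValueError(f"{name!r} was provided but not requested")
        return {}
    if isinstance(request, str):
        # Alias: look the value up under the alias, pass it on under `name`.
        return {name: provided[request]} if request in provided else {}
    raise TypeError(f"unsupported request value: {request!r}")


# An alias "views" supplied by the caller is forwarded as view_indices.
forwarded = resolve_request("views", "view_indices", {"views": [0, 2]})
```

In this model, the string option only renames the key under which the caller supplies the value; the consuming method still receives it under its original parameter name.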
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') MultiviewOVOWrapper
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
- class MultiviewOVRWrapper(estimator=None, **args)
Bases:
MultiviewWrapper, OneVsRestClassifier
- fit(X, y, train_indices=None, view_indices=None)
Fit underlying estimators.
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.
y ({array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)) – Multi-class targets. An indicator matrix turns on multilabel classification.
**fit_params (dict) – Parameters passed to the estimator.fit method of each sub-estimator.
Added in version 1.4: Only available if enable_metadata_routing=True. See Metadata Routing User Guide for more details.
- Returns:
self – Instance of fitted estimator.
- Return type:
object
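As a rough illustration of what fitting the underlying estimators means in a one-vs-rest scheme, the sketch below builds one binary target vector per class; each would then be used to train a separate binary sub-estimator. The helper name one_vs_rest_targets is hypothetical and plain Python lists are used for brevity; this is not the wrapper's actual code.

```python
def one_vs_rest_targets(y):
    """Return {class_label: binary target list}, one entry per distinct class."""
    classes = sorted(set(y))
    # For class c, samples labelled c become 1 and all other samples become 0.
    return {c: [1 if label == c else 0 for label in y] for c in classes}


targets = one_vs_rest_targets([0, 2, 1, 2])
```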
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- predict(X, sample_indices=None, view_indices=None)
Predict multi-class targets using underlying estimators.
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Data.
- Returns:
y – Predicted multi-class targets.
- Return type:
{array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_classes)
- set_fit_request(*, train_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
train_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for train_indices parameter in fit.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper
Request metadata passed to the partial_fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
- Returns:
self – The updated object.
- Return type:
object
- set_predict_request(*, sample_indices: bool | None | str = '$UNCHANGED$', view_indices: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper
Request metadata passed to the predict method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_indices parameter in predict.
view_indices (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for view_indices parameter in predict.
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') MultiviewOVRWrapper
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
- class MultiviewWrapper(estimator=None, **args)
Bases:
MultiClassWrapper
- class OVOWrapper(estimator, *, n_jobs=None)
Bases:
MonoviewWrapper, OneVsOneClassifier
- decision_function(X)
Decision function for the OneVsOneClassifier.
The decision values for the samples are computed by adding the normalized sum of pair-wise classification confidence levels to the votes in order to disambiguate between the decision values when the votes for all the classes are equal leading to a tie.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data.
- Returns:
Y – Result of calling decision_function on the final estimator.
Changed in version 0.19: output shape changed to (n_samples,) to conform to scikit-learn conventions for binary classification.
- Return type:
array-like of shape (n_samples, n_classes) or (n_samples,)
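The vote-plus-confidence combination described above can be sketched for a single sample as follows. The function name ovo_decision_values, the list-based inputs, and the specific scaling s / (3 * (|s| + 1)) are assumptions made for illustration; the sketch shows one way a normalized confidence sum can break ties without overriding the votes, and is not taken from this package's source.

```python
from itertools import combinations


def ovo_decision_values(predictions, confidences, n_classes):
    """Combine pairwise results for a single sample.

    predictions[k] is 0 if the k-th pairwise classifier chose the first class
    of its pair (i), and 1 if it chose the second (j). confidences[k] is its
    signed confidence, positive values favouring j. Pairs are ordered as
    (0, 1), (0, 2), ..., (1, 2), ...
    """
    votes = [0.0] * n_classes
    conf_sum = [0.0] * n_classes
    for k, (i, j) in enumerate(combinations(range(n_classes), 2)):
        conf_sum[i] -= confidences[k]
        conf_sum[j] += confidences[k]
        votes[j if predictions[k] == 1 else i] += 1
    # Squash each summed confidence into (-1/3, 1/3) so that it can break
    # ties between classes but never outweigh a whole vote.
    return [v + s / (3 * (abs(s) + 1)) for v, s in zip(votes, conf_sum)]


# Three classes, one vote each (a tie); the confidences decide the winner.
values = ovo_decision_values([0, 1, 0], [-2.0, 0.5, -1.0], n_classes=3)
winner = values.index(max(values))
```

Because every correction term stays strictly between -1/3 and 1/3, two classes that differ by a full vote keep their order; only exact ties in the vote count are affected.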
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') OVOWrapper
Request metadata passed to the partial_fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') OVOWrapper
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
- class OVRWrapper(estimator, *, n_jobs=None, verbose=0)
Bases:
MonoviewWrapper, OneVsRestClassifier
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$') OVRWrapper
Request metadata passed to the partial_fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') OVRWrapper
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
- get_mc_estim(estimator, random_state, y=None, multiview=False, multiclass=False)
Returns a multiclass-compatible estimator if the given one does not natively support multiclass classification. If predict_proba is available in the requested estimator, a One-Versus-Rest wrapper is returned; otherwise, a One-Versus-One wrapper is returned.
To handle multiview algorithms, multiview wrappers are implemented separately.
- Parameters:
estimator (sklearn-like estimator) – The requested estimator.
y (numpy.array) – The labels of the problem.
random_state (numpy.random.RandomState object) – The random state, used to generate a fake multiclass problem.
multiview (bool) – If True, multiview-compatible wrappers are returned.
- Returns:
estimator – Either the requested estimator, or a multiclass-compatible wrapper over it.
- Return type:
sklearn-like estimator
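The wrapper choice described for get_mc_estim can be sketched as below. This follows only the prose above: it assumes the estimator has already been found not to support multiclass natively, and it returns the wrapper class name as a string instead of building a wrapper. The function choose_wrapper and the two stand-in estimator classes are hypothetical.

```python
def choose_wrapper(estimator, multiview=False):
    """Return the name of the wrapper class the description would select."""
    # predict_proba available -> one-vs-rest, otherwise one-vs-one.
    strategy = "OVR" if hasattr(estimator, "predict_proba") else "OVO"
    # Multiview algorithms get the separately implemented multiview wrappers.
    return ("Multiview" if multiview else "") + strategy + "Wrapper"


class WithProba:
    """Hypothetical estimator exposing predict_proba."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class WithoutProba:
    """Hypothetical estimator without predict_proba."""

    def predict(self, X):
        return [0 for _ in X]


monoview_choice = choose_wrapper(WithProba())
multiview_choice = choose_wrapper(WithoutProba(), multiview=True)
```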
summit.multiview_platform.utils.multiview_result_analysis module
summit.multiview_platform.utils.organization module
- secure_file_path(file_name)
summit.multiview_platform.utils.transformations module
- sign_labels(labels)
Returns a label array with (-1, 1) as labels. If labels is already made of (-1, 1), it is returned unchanged. If labels is made of (0, 1), it is returned with every 0 replaced by -1.
- Parameters:
labels (numpy array) – The original labels.
- Return type:
A np.array with labels made of (-1,1)
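The mapping described for sign_labels, together with its inverse, can be sketched as follows. The sketch uses plain Python lists to stay dependency-free, whereas the documented functions operate on NumPy arrays; the names sign_labels_sketch and unsign_labels_sketch are hypothetical, and the code only reflects the behaviour stated in the description.

```python
def sign_labels_sketch(labels):
    """Map 0 -> -1 and leave every other value (including -1 and 1) untouched."""
    return [-1 if label == 0 else label for label in labels]


def unsign_labels_sketch(labels):
    """Map -1 -> 0 and leave every other value untouched."""
    return [0 if label == -1 else label for label in labels]


signed = sign_labels_sketch([0, 1, 1, 0])
restored = unsign_labels_sketch(signed)
```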
- unsign_labels(labels)
The inverse function of sign_labels.
- Parameters:
labels