Example 3: Understanding the statistical iterations
Context
In the previous example, we saw that, in order to output meaningful results, the platform splits the input dataset into a training set and a testing set.
However, even if the split is done at random, one can draw a lucky (or unlucky) split and have great (or poor) performance on this specific split.
To settle this issue, the platform can run on multiple splits and return the mean scores.
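The idea of averaging over several random splits can be sketched in a few lines of plain Python. This is only an illustration, not SuMMIT's implementation: `toy_score` is a hypothetical stand-in for "train a classifier and score it on the test set", and the dataset, split ratio and seed are made up.

```python
import random
import statistics


def toy_score(train, test):
    """Hypothetical stand-in for training and scoring a classifier.

    Predicts the majority label of the training set for every test
    sample and returns the resulting accuracy.
    """
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return sum(label == majority for _, label in test) / len(test)


def mean_score_over_splits(samples, n_splits, test_ratio=0.25, seed=42):
    """Draw n_splits random train/test splits and average the scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_ratio)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(toy_score(train, test))
    return statistics.mean(scores), scores


# Made-up dataset: 40 samples, label 1 for every third sample.
samples = [(i, int(i % 3 == 0)) for i in range(40)]
mean, scores = mean_score_over_splits(samples, n_splits=5)
print(scores, mean)
```

The individual scores typically differ from one split to the next, which is exactly the lucky or unlucky effect described above; the mean is less sensitive to any single draw.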
How to use it
This feature is controlled by a single argument, stats_iter,
in the config file.
Setting stats_iter to a value greater than one
slightly modifies the result directory’s structure.
Indeed, as the platform benchmarks multiple train/test splits, the result directory grows in order to keep all the individual results.
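As a rough sketch of what this means in practice, the helpers below edit the stats_iter entry of a config text and list the per-split sub-directories one would then expect. Both helpers are hypothetical and are not part of SuMMIT. They assume that stats_iter is a top-level key of the YAML config file, that the sub-directories follow the iter_<n> naming visible in the listing further down, and that no such sub-directory is created for a single iteration.

```python
import re


def set_stats_iter(config_text, n_iterations):
    """Return config_text with its top-level `stats_iter:` line set.

    Assumption: stats_iter is a top-level key; the line is appended
    when it is missing.
    """
    new_line = f"stats_iter: {n_iterations}"
    updated, n_subs = re.subn(r"(?m)^stats_iter:.*$", new_line, config_text)
    if n_subs == 0:
        updated = config_text.rstrip("\n") + "\n" + new_line + "\n"
    return updated


def expected_iteration_dirs(n_iterations):
    """Names of the per-split sub-directories (iter_1, iter_2, ...)."""
    if n_iterations <= 1:
        # Assumption: a single iteration creates no iter_ sub-directory.
        return []
    return [f"iter_{i}" for i in range(1, n_iterations + 1)]


config = "# hypothetical config excerpt\nstats_iter: 1\n"
updated = set_stats_iter(config, 5)
dirs = expected_iteration_dirs(5)
print(updated)
print(dirs)
```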
To run SuMMIT on several train/test splits, run:
>>> from summit.execute import execute
>>> execute("example 3")
While SuMMIT computes, let us explore the new pseudo-code:
for each statistical iteration:
┌
|for each monoview classifier:
|    for each view:
|        for each draw:
|            for each fold:
|                learn the classifier on all folds but one and test it on the remaining one
|            get the mean performance
|        get the best hyper-parameter set
|        learn on the whole training set
|and
|for each multiview classifier:
|    for each draw:
|        for each fold:
|            learn the classifier on all folds but one and test it on the remaining one
|        get the mean performance
|    get the best hyper-parameter set
|    learn on the whole training set
└
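The body of this pseudo-code, for one classifier on one view inside a single statistical iteration, can be sketched as follows. This is a toy illustration under stated assumptions, not SuMMIT's code: the classifier is a hand-written 1-D k-nearest-neighbours rule, its only hyper-parameter is k, and the names `knn_predict`, `score` and `tune_and_fit` as well as the data are invented for the example.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > len(nearest))


def score(train, test, k):
    """Accuracy on test of the k-NN rule learned on train."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def tune_and_fit(train_set, candidate_ks, n_draws, n_folds, rng):
    folds = [train_set[i::n_folds] for i in range(n_folds)]
    best_k, best_mean = None, -1.0
    for _ in range(n_draws):                       # for each draw
        k = rng.choice(candidate_ks)
        fold_scores = []
        for held_out in range(n_folds):            # for each fold
            # learn on all folds but one, test on the remaining one
            learn = [p for j, fold in enumerate(folds)
                     if j != held_out for p in fold]
            fold_scores.append(score(learn, folds[held_out], k))
        mean_score = statistics.mean(fold_scores)  # get the mean performance
        if mean_score > best_mean:                 # keep the best hyper-parameter
            best_k, best_mean = k, mean_score
    # "Learning" k-NN on the whole training set amounts to storing it.
    return best_k, train_set


rng = random.Random(0)
# Made-up 1-D dataset: the label is 1 when x > 0.5.
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(80))]
train_set, test_set = data[:60], data[60:]
best_k, model = tune_and_fit(train_set, [1, 3, 5, 7], n_draws=3, n_folds=5, rng=rng)
test_score = score(model, test_set, best_k)
print(best_k, test_score)
```

With several statistical iterations, this whole procedure is repeated once per train/test split, for every monoview classifier on every view and for every multiview classifier.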
The result directory will be structured as:
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_1-feature_importances_dataframe_stds.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances_dataframe_stds.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances_dataframe_stds.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances_dataframe_stds.csv
- iter_1
- adaboost
- generated_view_1
- adaboost-doc_summit-generated_view_1-confusion_matrix.csv
- adaboost-doc_summit-generated_view_1-feature_importances.png
- adaboost-doc_summit-generated_view_1-full_pred.csv
- adaboost-doc_summit-generated_view_1-summary.txt
- adaboost-doc_summit-generated_view_1-test_labels.csv
- adaboost-doc_summit-generated_view_1-test_metrics.csv
- adaboost-doc_summit-generated_view_1-test_metrics.png
- adaboost-doc_summit-generated_view_1-times.csv
- adaboost-doc_summit-generated_view_1-train_labels.csv
- adaboost-doc_summit-generated_view_1-train_metrics.csv
- adaboost-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- adaboost-doc_summit-generated_view_2-confusion_matrix.csv
- adaboost-doc_summit-generated_view_2-feature_importances.png
- adaboost-doc_summit-generated_view_2-full_pred.csv
- adaboost-doc_summit-generated_view_2-summary.txt
- adaboost-doc_summit-generated_view_2-test_labels.csv
- adaboost-doc_summit-generated_view_2-test_metrics.csv
- adaboost-doc_summit-generated_view_2-test_metrics.png
- adaboost-doc_summit-generated_view_2-times.csv
- adaboost-doc_summit-generated_view_2-train_labels.csv
- adaboost-doc_summit-generated_view_2-train_metrics.csv
- adaboost-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- adaboost-doc_summit-generated_view_3-confusion_matrix.csv
- adaboost-doc_summit-generated_view_3-feature_importances.png
- adaboost-doc_summit-generated_view_3-full_pred.csv
- adaboost-doc_summit-generated_view_3-summary.txt
- adaboost-doc_summit-generated_view_3-test_labels.csv
- adaboost-doc_summit-generated_view_3-test_metrics.csv
- adaboost-doc_summit-generated_view_3-test_metrics.png
- adaboost-doc_summit-generated_view_3-times.csv
- adaboost-doc_summit-generated_view_3-train_labels.csv
- adaboost-doc_summit-generated_view_3-train_metrics.csv
- adaboost-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- adaboost-doc_summit-generated_view_4-confusion_matrix.csv
- adaboost-doc_summit-generated_view_4-feature_importances.png
- adaboost-doc_summit-generated_view_4-full_pred.csv
- adaboost-doc_summit-generated_view_4-summary.txt
- adaboost-doc_summit-generated_view_4-test_labels.csv
- adaboost-doc_summit-generated_view_4-test_metrics.csv
- adaboost-doc_summit-generated_view_4-test_metrics.png
- adaboost-doc_summit-generated_view_4-times.csv
- adaboost-doc_summit-generated_view_4-train_labels.csv
- adaboost-doc_summit-generated_view_4-train_metrics.csv
- adaboost-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- decision_tree
- generated_view_1
- decision_tree-doc_summit-generated_view_1-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_1-feature_importances.png
- decision_tree-doc_summit-generated_view_1-full_pred.csv
- decision_tree-doc_summit-generated_view_1-summary.txt
- decision_tree-doc_summit-generated_view_1-test_labels.csv
- decision_tree-doc_summit-generated_view_1-train_labels.csv
- decision_tree-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- decision_tree-doc_summit-generated_view_2-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_2-feature_importances.png
- decision_tree-doc_summit-generated_view_2-full_pred.csv
- decision_tree-doc_summit-generated_view_2-summary.txt
- decision_tree-doc_summit-generated_view_2-test_labels.csv
- decision_tree-doc_summit-generated_view_2-train_labels.csv
- decision_tree-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- decision_tree-doc_summit-generated_view_3-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_3-feature_importances.png
- decision_tree-doc_summit-generated_view_3-full_pred.csv
- decision_tree-doc_summit-generated_view_3-summary.txt
- decision_tree-doc_summit-generated_view_3-test_labels.csv
- decision_tree-doc_summit-generated_view_3-train_labels.csv
- decision_tree-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- decision_tree-doc_summit-generated_view_4-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_4-feature_importances.png
- decision_tree-doc_summit-generated_view_4-full_pred.csv
- decision_tree-doc_summit-generated_view_4-summary.txt
- decision_tree-doc_summit-generated_view_4-test_labels.csv
- decision_tree-doc_summit-generated_view_4-train_labels.csv
- decision_tree-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- folds
- test_labels_fold_0.csv
- test_labels_fold_1.csv
- test_labels_fold_2.csv
- test_labels_fold_3.csv
- test_labels_fold_4.csv
- weighted_linear_late_fusion
- weighted_linear_late_fusion-doc_summit-confusion_matrix.csv
- weighted_linear_late_fusion-doc_summit-summary.txt
- doc_summit-2D_plot_data.csv
- doc_summit-accuracy_score*-class.html
- doc_summit-accuracy_score*.csv
- doc_summit-accuracy_score*.html
- doc_summit-accuracy_score*.png
- doc_summit-bar_plot_data.csv
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-error_analysis_2D.html
- doc_summit-error_analysis_2D.png
- doc_summit-error_analysis_bar.html
- doc_summit-error_analysis_bar.png
- doc_summit-f1_score-class.html
- doc_summit-f1_score.csv
- doc_summit-f1_score.html
- doc_summit-f1_score.png
- train_indices.csv
- train_labels.csv
- adaboost
- iter_2
- adaboost
- generated_view_1
- adaboost-doc_summit-generated_view_1-confusion_matrix.csv
- adaboost-doc_summit-generated_view_1-feature_importances.png
- adaboost-doc_summit-generated_view_1-full_pred.csv
- adaboost-doc_summit-generated_view_1-summary.txt
- adaboost-doc_summit-generated_view_1-test_labels.csv
- adaboost-doc_summit-generated_view_1-test_metrics.csv
- adaboost-doc_summit-generated_view_1-test_metrics.png
- adaboost-doc_summit-generated_view_1-times.csv
- adaboost-doc_summit-generated_view_1-train_labels.csv
- adaboost-doc_summit-generated_view_1-train_metrics.csv
- adaboost-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- adaboost-doc_summit-generated_view_2-confusion_matrix.csv
- adaboost-doc_summit-generated_view_2-feature_importances.png
- adaboost-doc_summit-generated_view_2-full_pred.csv
- adaboost-doc_summit-generated_view_2-summary.txt
- adaboost-doc_summit-generated_view_2-test_labels.csv
- adaboost-doc_summit-generated_view_2-test_metrics.csv
- adaboost-doc_summit-generated_view_2-test_metrics.png
- adaboost-doc_summit-generated_view_2-times.csv
- adaboost-doc_summit-generated_view_2-train_labels.csv
- adaboost-doc_summit-generated_view_2-train_metrics.csv
- adaboost-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- adaboost-doc_summit-generated_view_3-confusion_matrix.csv
- adaboost-doc_summit-generated_view_3-feature_importances.png
- adaboost-doc_summit-generated_view_3-full_pred.csv
- adaboost-doc_summit-generated_view_3-summary.txt
- adaboost-doc_summit-generated_view_3-test_labels.csv
- adaboost-doc_summit-generated_view_3-test_metrics.csv
- adaboost-doc_summit-generated_view_3-test_metrics.png
- adaboost-doc_summit-generated_view_3-times.csv
- adaboost-doc_summit-generated_view_3-train_labels.csv
- adaboost-doc_summit-generated_view_3-train_metrics.csv
- adaboost-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- adaboost-doc_summit-generated_view_4-confusion_matrix.csv
- adaboost-doc_summit-generated_view_4-feature_importances.png
- adaboost-doc_summit-generated_view_4-full_pred.csv
- adaboost-doc_summit-generated_view_4-summary.txt
- adaboost-doc_summit-generated_view_4-test_labels.csv
- adaboost-doc_summit-generated_view_4-test_metrics.csv
- adaboost-doc_summit-generated_view_4-test_metrics.png
- adaboost-doc_summit-generated_view_4-times.csv
- adaboost-doc_summit-generated_view_4-train_labels.csv
- adaboost-doc_summit-generated_view_4-train_metrics.csv
- adaboost-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- decision_tree
- generated_view_1
- decision_tree-doc_summit-generated_view_1-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_1-feature_importances.png
- decision_tree-doc_summit-generated_view_1-full_pred.csv
- decision_tree-doc_summit-generated_view_1-summary.txt
- decision_tree-doc_summit-generated_view_1-test_labels.csv
- decision_tree-doc_summit-generated_view_1-train_labels.csv
- decision_tree-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- decision_tree-doc_summit-generated_view_2-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_2-feature_importances.png
- decision_tree-doc_summit-generated_view_2-full_pred.csv
- decision_tree-doc_summit-generated_view_2-summary.txt
- decision_tree-doc_summit-generated_view_2-test_labels.csv
- decision_tree-doc_summit-generated_view_2-train_labels.csv
- decision_tree-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- decision_tree-doc_summit-generated_view_3-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_3-feature_importances.png
- decision_tree-doc_summit-generated_view_3-full_pred.csv
- decision_tree-doc_summit-generated_view_3-summary.txt
- decision_tree-doc_summit-generated_view_3-test_labels.csv
- decision_tree-doc_summit-generated_view_3-train_labels.csv
- decision_tree-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- decision_tree-doc_summit-generated_view_4-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_4-feature_importances.png
- decision_tree-doc_summit-generated_view_4-full_pred.csv
- decision_tree-doc_summit-generated_view_4-summary.txt
- decision_tree-doc_summit-generated_view_4-test_labels.csv
- decision_tree-doc_summit-generated_view_4-train_labels.csv
- decision_tree-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- folds
- test_labels_fold_0.csv
- test_labels_fold_1.csv
- test_labels_fold_2.csv
- test_labels_fold_3.csv
- test_labels_fold_4.csv
- weighted_linear_late_fusion
- weighted_linear_late_fusion-doc_summit-confusion_matrix.csv
- weighted_linear_late_fusion-doc_summit-summary.txt
- doc_summit-2D_plot_data.csv
- doc_summit-accuracy_score*-class.html
- doc_summit-accuracy_score*.csv
- doc_summit-accuracy_score*.html
- doc_summit-accuracy_score*.png
- doc_summit-bar_plot_data.csv
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-error_analysis_2D.html
- doc_summit-error_analysis_2D.png
- doc_summit-error_analysis_bar.html
- doc_summit-error_analysis_bar.png
- doc_summit-f1_score-class.html
- doc_summit-f1_score.csv
- doc_summit-f1_score.html
- doc_summit-f1_score.png
- train_indices.csv
- train_labels.csv
- adaboost
- iter_3
- adaboost
- generated_view_1
- adaboost-doc_summit-generated_view_1-confusion_matrix.csv
- adaboost-doc_summit-generated_view_1-feature_importances.png
- adaboost-doc_summit-generated_view_1-full_pred.csv
- adaboost-doc_summit-generated_view_1-summary.txt
- adaboost-doc_summit-generated_view_1-test_labels.csv
- adaboost-doc_summit-generated_view_1-test_metrics.csv
- adaboost-doc_summit-generated_view_1-test_metrics.png
- adaboost-doc_summit-generated_view_1-times.csv
- adaboost-doc_summit-generated_view_1-train_labels.csv
- adaboost-doc_summit-generated_view_1-train_metrics.csv
- adaboost-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- adaboost-doc_summit-generated_view_2-confusion_matrix.csv
- adaboost-doc_summit-generated_view_2-feature_importances.png
- adaboost-doc_summit-generated_view_2-full_pred.csv
- adaboost-doc_summit-generated_view_2-summary.txt
- adaboost-doc_summit-generated_view_2-test_labels.csv
- adaboost-doc_summit-generated_view_2-test_metrics.csv
- adaboost-doc_summit-generated_view_2-test_metrics.png
- adaboost-doc_summit-generated_view_2-times.csv
- adaboost-doc_summit-generated_view_2-train_labels.csv
- adaboost-doc_summit-generated_view_2-train_metrics.csv
- adaboost-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- adaboost-doc_summit-generated_view_3-confusion_matrix.csv
- adaboost-doc_summit-generated_view_3-feature_importances.png
- adaboost-doc_summit-generated_view_3-full_pred.csv
- adaboost-doc_summit-generated_view_3-summary.txt
- adaboost-doc_summit-generated_view_3-test_labels.csv
- adaboost-doc_summit-generated_view_3-test_metrics.csv
- adaboost-doc_summit-generated_view_3-test_metrics.png
- adaboost-doc_summit-generated_view_3-times.csv
- adaboost-doc_summit-generated_view_3-train_labels.csv
- adaboost-doc_summit-generated_view_3-train_metrics.csv
- adaboost-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- adaboost-doc_summit-generated_view_4-confusion_matrix.csv
- adaboost-doc_summit-generated_view_4-feature_importances.png
- adaboost-doc_summit-generated_view_4-full_pred.csv
- adaboost-doc_summit-generated_view_4-summary.txt
- adaboost-doc_summit-generated_view_4-test_labels.csv
- adaboost-doc_summit-generated_view_4-test_metrics.csv
- adaboost-doc_summit-generated_view_4-test_metrics.png
- adaboost-doc_summit-generated_view_4-times.csv
- adaboost-doc_summit-generated_view_4-train_labels.csv
- adaboost-doc_summit-generated_view_4-train_metrics.csv
- adaboost-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- decision_tree
- generated_view_1
- decision_tree-doc_summit-generated_view_1-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_1-feature_importances.png
- decision_tree-doc_summit-generated_view_1-full_pred.csv
- decision_tree-doc_summit-generated_view_1-summary.txt
- decision_tree-doc_summit-generated_view_1-test_labels.csv
- decision_tree-doc_summit-generated_view_1-train_labels.csv
- decision_tree-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- decision_tree-doc_summit-generated_view_2-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_2-feature_importances.png
- decision_tree-doc_summit-generated_view_2-full_pred.csv
- decision_tree-doc_summit-generated_view_2-summary.txt
- decision_tree-doc_summit-generated_view_2-test_labels.csv
- decision_tree-doc_summit-generated_view_2-train_labels.csv
- decision_tree-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- decision_tree-doc_summit-generated_view_3-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_3-feature_importances.png
- decision_tree-doc_summit-generated_view_3-full_pred.csv
- decision_tree-doc_summit-generated_view_3-summary.txt
- decision_tree-doc_summit-generated_view_3-test_labels.csv
- decision_tree-doc_summit-generated_view_3-train_labels.csv
- decision_tree-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- decision_tree-doc_summit-generated_view_4-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_4-feature_importances.png
- decision_tree-doc_summit-generated_view_4-full_pred.csv
- decision_tree-doc_summit-generated_view_4-summary.txt
- decision_tree-doc_summit-generated_view_4-test_labels.csv
- decision_tree-doc_summit-generated_view_4-train_labels.csv
- decision_tree-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- folds
- test_labels_fold_0.csv
- test_labels_fold_1.csv
- test_labels_fold_2.csv
- test_labels_fold_3.csv
- test_labels_fold_4.csv
- weighted_linear_late_fusion
- weighted_linear_late_fusion-doc_summit-confusion_matrix.csv
- weighted_linear_late_fusion-doc_summit-summary.txt
- doc_summit-2D_plot_data.csv
- doc_summit-accuracy_score*-class.html
- doc_summit-accuracy_score*.csv
- doc_summit-accuracy_score*.html
- doc_summit-accuracy_score*.png
- doc_summit-bar_plot_data.csv
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-error_analysis_2D.html
- doc_summit-error_analysis_2D.png
- doc_summit-error_analysis_bar.html
- doc_summit-error_analysis_bar.png
- doc_summit-f1_score-class.html
- doc_summit-f1_score.csv
- doc_summit-f1_score.html
- doc_summit-f1_score.png
- train_indices.csv
- train_labels.csv
- adaboost
- iter_4
- adaboost
- generated_view_1
- adaboost-doc_summit-generated_view_1-confusion_matrix.csv
- adaboost-doc_summit-generated_view_1-feature_importances.png
- adaboost-doc_summit-generated_view_1-full_pred.csv
- adaboost-doc_summit-generated_view_1-summary.txt
- adaboost-doc_summit-generated_view_1-test_labels.csv
- adaboost-doc_summit-generated_view_1-test_metrics.csv
- adaboost-doc_summit-generated_view_1-test_metrics.png
- adaboost-doc_summit-generated_view_1-times.csv
- adaboost-doc_summit-generated_view_1-train_labels.csv
- adaboost-doc_summit-generated_view_1-train_metrics.csv
- adaboost-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- adaboost-doc_summit-generated_view_2-confusion_matrix.csv
- adaboost-doc_summit-generated_view_2-feature_importances.png
- adaboost-doc_summit-generated_view_2-full_pred.csv
- adaboost-doc_summit-generated_view_2-summary.txt
- adaboost-doc_summit-generated_view_2-test_labels.csv
- adaboost-doc_summit-generated_view_2-test_metrics.csv
- adaboost-doc_summit-generated_view_2-test_metrics.png
- adaboost-doc_summit-generated_view_2-times.csv
- adaboost-doc_summit-generated_view_2-train_labels.csv
- adaboost-doc_summit-generated_view_2-train_metrics.csv
- adaboost-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- adaboost-doc_summit-generated_view_3-confusion_matrix.csv
- adaboost-doc_summit-generated_view_3-feature_importances.png
- adaboost-doc_summit-generated_view_3-full_pred.csv
- adaboost-doc_summit-generated_view_3-summary.txt
- adaboost-doc_summit-generated_view_3-test_labels.csv
- adaboost-doc_summit-generated_view_3-test_metrics.csv
- adaboost-doc_summit-generated_view_3-test_metrics.png
- adaboost-doc_summit-generated_view_3-times.csv
- adaboost-doc_summit-generated_view_3-train_labels.csv
- adaboost-doc_summit-generated_view_3-train_metrics.csv
- adaboost-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- adaboost-doc_summit-generated_view_4-confusion_matrix.csv
- adaboost-doc_summit-generated_view_4-feature_importances.png
- adaboost-doc_summit-generated_view_4-full_pred.csv
- adaboost-doc_summit-generated_view_4-summary.txt
- adaboost-doc_summit-generated_view_4-test_labels.csv
- adaboost-doc_summit-generated_view_4-test_metrics.csv
- adaboost-doc_summit-generated_view_4-test_metrics.png
- adaboost-doc_summit-generated_view_4-times.csv
- adaboost-doc_summit-generated_view_4-train_labels.csv
- adaboost-doc_summit-generated_view_4-train_metrics.csv
- adaboost-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- decision_tree
- generated_view_1
- decision_tree-doc_summit-generated_view_1-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_1-feature_importances.png
- decision_tree-doc_summit-generated_view_1-full_pred.csv
- decision_tree-doc_summit-generated_view_1-summary.txt
- decision_tree-doc_summit-generated_view_1-test_labels.csv
- decision_tree-doc_summit-generated_view_1-train_labels.csv
- decision_tree-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- decision_tree-doc_summit-generated_view_2-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_2-feature_importances.png
- decision_tree-doc_summit-generated_view_2-full_pred.csv
- decision_tree-doc_summit-generated_view_2-summary.txt
- decision_tree-doc_summit-generated_view_2-test_labels.csv
- decision_tree-doc_summit-generated_view_2-train_labels.csv
- decision_tree-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- decision_tree-doc_summit-generated_view_3-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_3-feature_importances.png
- decision_tree-doc_summit-generated_view_3-full_pred.csv
- decision_tree-doc_summit-generated_view_3-summary.txt
- decision_tree-doc_summit-generated_view_3-test_labels.csv
- decision_tree-doc_summit-generated_view_3-train_labels.csv
- decision_tree-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- decision_tree-doc_summit-generated_view_4-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_4-feature_importances.png
- decision_tree-doc_summit-generated_view_4-full_pred.csv
- decision_tree-doc_summit-generated_view_4-summary.txt
- decision_tree-doc_summit-generated_view_4-test_labels.csv
- decision_tree-doc_summit-generated_view_4-train_labels.csv
- decision_tree-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- folds
- test_labels_fold_0.csv
- test_labels_fold_1.csv
- test_labels_fold_2.csv
- test_labels_fold_3.csv
- test_labels_fold_4.csv
- weighted_linear_late_fusion
- weighted_linear_late_fusion-doc_summit-confusion_matrix.csv
- weighted_linear_late_fusion-doc_summit-summary.txt
- doc_summit-2D_plot_data.csv
- doc_summit-accuracy_score*-class.html
- doc_summit-accuracy_score*.csv
- doc_summit-accuracy_score*.html
- doc_summit-accuracy_score*.png
- doc_summit-bar_plot_data.csv
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-error_analysis_2D.html
- doc_summit-error_analysis_2D.png
- doc_summit-error_analysis_bar.html
- doc_summit-error_analysis_bar.png
- doc_summit-f1_score-class.html
- doc_summit-f1_score.csv
- doc_summit-f1_score.html
- doc_summit-f1_score.png
- train_indices.csv
- train_labels.csv
- adaboost
- iter_5
- adaboost
- generated_view_1
- adaboost-doc_summit-generated_view_1-confusion_matrix.csv
- adaboost-doc_summit-generated_view_1-feature_importances.png
- adaboost-doc_summit-generated_view_1-full_pred.csv
- adaboost-doc_summit-generated_view_1-summary.txt
- adaboost-doc_summit-generated_view_1-test_labels.csv
- adaboost-doc_summit-generated_view_1-test_metrics.csv
- adaboost-doc_summit-generated_view_1-test_metrics.png
- adaboost-doc_summit-generated_view_1-times.csv
- adaboost-doc_summit-generated_view_1-train_labels.csv
- adaboost-doc_summit-generated_view_1-train_metrics.csv
- adaboost-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- adaboost-doc_summit-generated_view_2-confusion_matrix.csv
- adaboost-doc_summit-generated_view_2-feature_importances.png
- adaboost-doc_summit-generated_view_2-full_pred.csv
- adaboost-doc_summit-generated_view_2-summary.txt
- adaboost-doc_summit-generated_view_2-test_labels.csv
- adaboost-doc_summit-generated_view_2-test_metrics.csv
- adaboost-doc_summit-generated_view_2-test_metrics.png
- adaboost-doc_summit-generated_view_2-times.csv
- adaboost-doc_summit-generated_view_2-train_labels.csv
- adaboost-doc_summit-generated_view_2-train_metrics.csv
- adaboost-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- adaboost-doc_summit-generated_view_3-confusion_matrix.csv
- adaboost-doc_summit-generated_view_3-feature_importances.png
- adaboost-doc_summit-generated_view_3-full_pred.csv
- adaboost-doc_summit-generated_view_3-summary.txt
- adaboost-doc_summit-generated_view_3-test_labels.csv
- adaboost-doc_summit-generated_view_3-test_metrics.csv
- adaboost-doc_summit-generated_view_3-test_metrics.png
- adaboost-doc_summit-generated_view_3-times.csv
- adaboost-doc_summit-generated_view_3-train_labels.csv
- adaboost-doc_summit-generated_view_3-train_metrics.csv
- adaboost-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- adaboost-doc_summit-generated_view_4-confusion_matrix.csv
- adaboost-doc_summit-generated_view_4-feature_importances.png
- adaboost-doc_summit-generated_view_4-full_pred.csv
- adaboost-doc_summit-generated_view_4-summary.txt
- adaboost-doc_summit-generated_view_4-test_labels.csv
- adaboost-doc_summit-generated_view_4-test_metrics.csv
- adaboost-doc_summit-generated_view_4-test_metrics.png
- adaboost-doc_summit-generated_view_4-times.csv
- adaboost-doc_summit-generated_view_4-train_labels.csv
- adaboost-doc_summit-generated_view_4-train_metrics.csv
- adaboost-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- decision_tree
- generated_view_1
- decision_tree-doc_summit-generated_view_1-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_1-feature_importances.png
- decision_tree-doc_summit-generated_view_1-full_pred.csv
- decision_tree-doc_summit-generated_view_1-summary.txt
- decision_tree-doc_summit-generated_view_1-test_labels.csv
- decision_tree-doc_summit-generated_view_1-train_labels.csv
- decision_tree-doc_summit-generated_view_1-train_pred.csv
- generated_view_2
- decision_tree-doc_summit-generated_view_2-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_2-feature_importances.png
- decision_tree-doc_summit-generated_view_2-full_pred.csv
- decision_tree-doc_summit-generated_view_2-summary.txt
- decision_tree-doc_summit-generated_view_2-test_labels.csv
- decision_tree-doc_summit-generated_view_2-train_labels.csv
- decision_tree-doc_summit-generated_view_2-train_pred.csv
- generated_view_3
- decision_tree-doc_summit-generated_view_3-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_3-feature_importances.png
- decision_tree-doc_summit-generated_view_3-full_pred.csv
- decision_tree-doc_summit-generated_view_3-summary.txt
- decision_tree-doc_summit-generated_view_3-test_labels.csv
- decision_tree-doc_summit-generated_view_3-train_labels.csv
- decision_tree-doc_summit-generated_view_3-train_pred.csv
- generated_view_4
- decision_tree-doc_summit-generated_view_4-confusion_matrix.csv
- decision_tree-doc_summit-generated_view_4-feature_importances.png
- decision_tree-doc_summit-generated_view_4-full_pred.csv
- decision_tree-doc_summit-generated_view_4-summary.txt
- decision_tree-doc_summit-generated_view_4-test_labels.csv
- decision_tree-doc_summit-generated_view_4-train_labels.csv
- decision_tree-doc_summit-generated_view_4-train_pred.csv
- generated_view_1feature_importances.pickle
- generated_view_2feature_importances.pickle
- generated_view_3feature_importances.pickle
- generated_view_4feature_importances.pickle
- generated_view_1
- feature_importances
- doc_summit-generated_view_1-feature_importances.html
- doc_summit-generated_view_1-feature_importances_dataframe.csv
- doc_summit-generated_view_2-feature_importances.html
- doc_summit-generated_view_2-feature_importances_dataframe.csv
- doc_summit-generated_view_3-feature_importances.html
- doc_summit-generated_view_3-feature_importances_dataframe.csv
- doc_summit-generated_view_4-feature_importances.html
- doc_summit-generated_view_4-feature_importances_dataframe.csv
- folds
- test_labels_fold_0.csv
- test_labels_fold_1.csv
- test_labels_fold_2.csv
- test_labels_fold_3.csv
- test_labels_fold_4.csv
- weighted_linear_late_fusion
- weighted_linear_late_fusion-doc_summit-confusion_matrix.csv
- weighted_linear_late_fusion-doc_summit-summary.txt
- doc_summit-2D_plot_data.csv
- doc_summit-accuracy_score*-class.html
- doc_summit-accuracy_score*.csv
- doc_summit-accuracy_score*.html
- doc_summit-accuracy_score*.png
- doc_summit-bar_plot_data.csv
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-error_analysis_2D.html
- doc_summit-error_analysis_2D.png
- doc_summit-error_analysis_bar.html
- doc_summit-error_analysis_bar.png
- doc_summit-f1_score-class.html
- doc_summit-f1_score.csv
- doc_summit-f1_score.html
- doc_summit-f1_score.png
- train_indices.csv
- train_labels.csv
- adaboost
- 2020_04_02-14_12-.hdf5--doc_summit-LOG.log
- clf_errors.csv
- config_file.yml
- doc_summit-durations.html
- doc_summit-durations_dataframe.csv
- doc_summit-durations_stds_dataframe.csv
- doc_summit-mean_on_5_iter-accuracy_score*-class.html
- doc_summit-mean_on_5_iter-accuracy_score*.csv
- doc_summit-mean_on_5_iter-accuracy_score*.html
- doc_summit-mean_on_5_iter-accuracy_score*.png
- doc_summit-mean_on_5_iter-f1_score-class.html
- doc_summit-mean_on_5_iter-f1_score.csv
- doc_summit-mean_on_5_iter-f1_score.html
- doc_summit-mean_on_5_iter-f1_score.png
- error_analysis_2D.html
- error_analysis_2D.png
- error_analysis_bar.html
- error_analysis_bar.png
- example_errors.csv
- random_state.pickle
If you look closely, nearly all the files from Example 1 are present in each iter_
directory, and some new files have appeared, in which the main figures are saved.
Indeed, the files stored in started_1560_12_25-15_42/
are the ones that show the mean results over all the statistical iterations.
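To compare the individual iterations with these mean results, one can walk the result directory. The sketch below relies only on Python's standard pathlib and tempfile modules. The function `list_iteration_files` is a hypothetical helper, and the mock directory assumes that the score files sit directly inside each iter_ directory, which the flattened listing above does not guarantee.

```python
import tempfile
from pathlib import Path


def list_iteration_files(result_dir, suffix):
    """Map each iter_<n> directory to its files whose name ends with suffix."""
    found = {}
    for iter_dir in sorted(Path(result_dir).glob("iter_*")):
        if iter_dir.is_dir():
            found[iter_dir.name] = sorted(
                path.name
                for path in iter_dir.iterdir()
                if path.is_file() and path.name.endswith(suffix)
            )
    return found


# Build a tiny mock result directory to try the helper on.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for i in (1, 2):
        (root / f"iter_{i}").mkdir()
        (root / f"iter_{i}" / "doc_summit-f1_score.csv").write_text("mock")
    # File with the mean results, stored at the top level.
    (root / "doc_summit-mean_on_2_iter-f1_score.csv").write_text("mock")
    per_iteration = list_iteration_files(root, "f1_score.csv")

print(per_iteration)
```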
The mean accuracy figure, started_1560_12_25-15_42/*-accuracy_score.html,
looks like:
Similarly for the f1-score:
The main difference between this plot and the one from Example 1 is that here, the scores are means over all the statistical iterations, and the standard deviations are plotted as vertical lines on top of the bars and printed after each score under the bars as “± <std>”.
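The “± <std>” label can be reproduced by hand from the scores of the individual iterations. The snippet below is a minimal sketch with made-up scores. It assumes the population standard deviation (`statistics.pstdev`); whether SuMMIT uses the population or the sample estimate is not stated here.

```python
import statistics


def format_score(scores, digits=2):
    """Format per-iteration scores as '<mean> ± <std>'."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)  # assumption: population std
    return f"{mean:.{digits}f} ± {std:.{digits}f}"


# Made-up accuracy scores of one classifier over 5 statistical iterations.
label = format_score([0.8, 0.9, 0.85, 0.95, 0.75])
print(label)  # 0.85 ± 0.07
```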
This also affects the display of the error analysis: it now shows multiple shades of gray, depending on the number of iterations that succeeded or failed on each sample:
If we zoom in, we can distinguish them better:
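The shade of each cell can be thought of as a per-sample success count. The sketch below illustrates that idea with invented predictions. It assumes, for simplicity, that one classifier predicts every sample in every iteration, and the 0-to-1 gray scale is a hypothetical convention rather than the one used in SuMMIT's figures.

```python
def success_counts(predictions_per_iteration, labels):
    """Count, for every sample, the iterations whose prediction was right."""
    counts = [0] * len(labels)
    for predictions in predictions_per_iteration:
        for i, (predicted, label) in enumerate(zip(predictions, labels)):
            counts[i] += int(predicted == label)
    return counts


def gray_level(count, n_iterations):
    """0.0 = failed in every iteration, 1.0 = succeeded in every iteration."""
    return count / n_iterations


labels = [1, 0, 1, 0]
# Invented predictions of one classifier over 3 statistical iterations.
predictions = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
counts = success_counts(predictions, labels)
levels = [gray_level(count, len(predictions)) for count in counts]
print(counts)  # [3, 2, 1, 3]
```

Samples that are always well classified or always misclassified end up at the two ends of the scale, while the intermediate shades point to samples whose outcome depends on the split.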

Duration
Increasing the number of statistical iterations can be costly in terms of computational resources: the computation time is multiplied almost exactly by the number of iterations.
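As a back-of-the-envelope estimate, the total duration can be approximated by multiplying the duration of one iteration by stats_iter. The helper below is hypothetical and the timing is made up; it ignores any fixed overhead such as loading the dataset.

```python
def estimated_duration(single_iteration_seconds, stats_iter):
    """Rough total run time, assuming iterations run one after another."""
    return single_iteration_seconds * stats_iter


# Made-up figure: one iteration taking 120 seconds, repeated 5 times.
total = estimated_duration(120, 5)
print(total)  # 600
```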
Note
Parallelizing SuMMIT’s statistical iterations could improve its efficiency when using multiple iterations; this is currently a work in progress.