Welcome to Supervised MultiModal Integration Tool’s documentation
This package has been designed as an easy-to-use platform to estimate the performance of different mono- and multi-view classifiers on a multiview dataset.
The main advantage of the platform is that it allows you to add and remove classifiers without modifying its core code (the procedure is described thoroughly in this documentation).
This documentation consists of a short readme with instructions to install and get started with SuMMIT, followed by several use cases that walk through the features, and the documented sources.
Note
The documentation, the platform and the tests are constantly being updated. All content labelled WIP is a work in progress.
- Available monoview classifiers
- Available multiview classifiers
- bayesian_inference_fusion
- difficulty_fusion
- disagree_fusion
- double_fault_fusion
- early_fusion_adaboost
- early_fusion_decision_tree
- early_fusion_gradient_boosting
- early_fusion_lasso
- early_fusion_random_forest
- early_fusion_sgd
- early_fusion_svm_rbf
- entropy_fusion
- majority_voting_fusion
- svm_jumbo_fusion
- weighted_linear_early_fusion
- weighted_linear_late_fusion
Read me
Supervised MultiModal Integration Tool’s Readme
This project aims to be an easy-to-use solution to run a prior benchmark on a dataset and evaluate the capacity of mono- and multi-view algorithms to classify it correctly.
Getting Started
SuMMIT is designed for, and continuously tested on, Linux platforms (Ubuntu 18.04), but we try to keep it as compatible as possible with Mac and Windows.
Platform | Last positive test
---|---
Linux | (continuous integration)
Mac | 1st of May, 2020
Windows | 1st of May, 2020
Prerequisites
To be able to use this project, you'll need:
The following Python modules will be installed automatically:
- matplotlib - Used to plot results,
- sklearn - Used for the monoview classifiers,
- joblib - Used to compute on multiple threads,
- h5py - Used to generate HDF5 datasets on hard drive and use them to spare RAM,
- pickle - Used to store some results,
- pandas - Used to manipulate data efficiently,
- six -
- m2r - Used to generate documentation from the readme,
- docutils - Used to generate documentation,
- pyyaml - Used to read the config files,
- plotly - Used to generate interactive HTML visuals,
- tabulate - Used to generate the confusion matrix,
- pyscm-ml -
Installing
Once you have cloned the project from the gitlab repository, run the following in the summit directory to install SuMMIT and its dependencies:
cd path/to/summit/
pip install -e .
Running the tests
To run SuMMIT's test suite, run:
cd path/to/summit
pip install -e .[dev]
pytest
The coverage report is automatically generated and stored in the htmlcov/ directory.
Building the documentation
To build the documentation locally, run:
cd path/to/summit
pip install -e .[doc]
python setup.py build_sphinx
The built HTML files will be stored in path/to/summit/build/sphinx/html.
Running on simulated data
For your first go with SuMMIT, you can run it on simulated data with:
python
>>> from summit.execute import execute
>>> execute("example 1")
This will run the benchmark described in the documentation's Example 1.
For more information about the examples, see the documentation.
Results will, by default, be stored in the results directory of the installation path: path/to/summit/multiview_platform/examples/results.
The documentation provides a detailed interpretation of the results and of SuMMIT's arguments through six tutorials.
Dataset compatibility
In order to start a benchmark on your own dataset, you need to format it so SuMMIT can use it. To do so, a Python script is provided.
For more information, see Example 5.
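As a rough illustration, assuming a layout with one HDF5 dataset per view plus a label array (the exact dataset, group, and attribute names SuMMIT expects are covered in Example 5 and handled by the provided script), a two-view dataset could be packed with h5py as follows:
# Hypothetical sketch: packing a two-view dataset into an HDF5 file with h5py.
# The dataset names used here (View0, View1, Labels) are assumptions; see
# Example 5 and the provided formatting script for the exact layout.
import h5py
import numpy as np

n_samples = 100
view_0 = np.random.rand(n_samples, 20)        # first view: 20 features per sample
view_1 = np.random.rand(n_samples, 40)        # second view: 40 features per sample
labels = np.random.randint(0, 2, n_samples)   # binary labels

with h5py.File("path/to/your/dataset/your_file_name.hdf5", "w") as hdf_file:
    hdf_file.create_dataset("View0", data=view_0)
    hdf_file.create_dataset("View1", data=view_1)
    hdf_file.create_dataset("Labels", data=labels)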
Running on your dataset
Once you have formatted your dataset, you can run SuMMIT on it by modifying the config file as follows:
name: ["your_file_name"]
pathf: "path/to/your/dataset"
It is, however, highly recommended to follow the documentation's tutorials to learn how to use each parameter.
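As a hedged alternative to editing the file by hand, the same two keys can be set programmatically with pyyaml (listed above as the module used to read the config files). The config file path used below is an assumption; adapt it to your installation:
# Sketch: updating the dataset-related keys of a SuMMIT config file with pyyaml.
# The config file location is assumed; point it at the config file you actually use.
import yaml

config_path = "path/to/summit/config_files/config.yml"  # assumed location
with open(config_path) as config_file:
    config = yaml.safe_load(config_file)

config["name"] = ["your_file_name"]        # HDF5 file name(s), as in the readme
config["pathf"] = "path/to/your/dataset"   # directory containing the dataset

with open(config_path, "w") as config_file:
    yaml.safe_dump(config, config_file)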