Welcome to the rubicon-ml Docs!
rubicon-ml is a data science tool that captures and stores model training and
execution information, like parameters and outcomes, in a repeatable and
searchable way. Its git integration associates these inputs and outputs
directly with the model code that produced them to ensure full auditability and
reproducibility for developers and stakeholders alike. While experimenting,
the dashboard makes it easy to explore, filter, visualize, and share
recorded work.
Visit the glossary to explore rubicon-ml’s terminology or get started with the first example in our quick look!
Components
rubicon-ml’s core functionality is broken down into three parts…
Logging: organize, store, and retrieve model inputs and outputs with various backend storage options - powered by fsspec
Sharing: share a selected subset of logged data with collaborators or reviewers - powered by intake
Visualizing: explore and compare logged model metadata with the dashboard and other widgets - powered by dash
Workflow
Use rubicon_ml to capture model inputs and outputs over time.
It easily integrates into existing Python models or pipelines and supports both
concurrent logging (so multiple experiments can be logged in parallel) and
asynchronous communication with S3 (so network reads and writes won’t block).
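As a rough illustration of the logging step described above, the sketch below records one parameter and one metric for a single experiment. It assumes the Rubicon entry point with in-memory persistence. The project name, experiment name, and values are made up for illustration, and the import is guarded so the snippet still runs where rubicon-ml is not installed.

```python
# Guarded import: the logging code is skipped if rubicon-ml is missing.
try:
    from rubicon_ml import Rubicon
except ImportError:
    Rubicon = None


def log_example_run():
    """Log one hypothetical experiment and return what was recorded."""
    if Rubicon is None:
        return None

    # In-memory persistence keeps the example from writing to disk or S3.
    rubicon = Rubicon(persistence="memory")
    project = rubicon.get_or_create_project("Example Project")
    experiment = project.log_experiment(name="baseline")

    # Hypothetical model input and output, chosen only for illustration.
    experiment.log_parameter(name="n_estimators", value=100)
    experiment.log_metric(name="accuracy", value=0.93)

    # Read the logged values back to confirm they can be retrieved.
    parameters = {p.name: p.value for p in experiment.parameters()}
    metrics = {m.name: m.value for m in experiment.metrics()}
    return parameters, metrics


result = log_example_run()
print(result if result is not None else "rubicon-ml is not installed")
```

Switching the persistence argument to a filesystem or S3 location should be the main change needed to keep the logged data between runs. Check the logging documentation for the exact options.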
Meanwhile, periodically review the logged data within the dashboard to steer the model tweaking process in the right direction. The dashboard lets you quickly spot trends by exploring and filtering your logged results and visualizes how the model inputs impacted the model outputs.
When the model is ready for review, rubicon-ml makes it easy to share specific subsets of the data with model reviewers and stakeholders, giving them the context necessary for a complete model review and approval.
Install
rubicon-ml is available to install via conda and pip. When using conda, make sure to set the channel to conda-forge. You should only need to do this once:
conda config --add channels conda-forge
then…
conda install rubicon-ml
Alternatively:
pip install rubicon-ml
Warning
rubicon-ml version 0.3.0+ requires Python version 3.8+
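To confirm that an environment meets this requirement before installing, a small check along these lines may help. This is only a sketch using the standard library. The helper name python_is_supported is hypothetical and not part of rubicon-ml.

```python
import sys

# Minimum interpreter version stated in the warning above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter meets the minimum."""
    # Compare only the major and minor components.
    version = tuple(version_info or sys.version_info)[:2]
    return version >= MIN_PYTHON


print(python_is_supported())
```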
Extras
rubicon-ml has a few optional extras if you’re installing with pip (these extras are all installed by default when using conda).
The s3 extra installs s3fs to enable logging to Amazon S3.
pip install rubicon-ml[s3]
The viz extra installs the requirements necessary for using the visualization tools.
For a preview, take a look at the Visualizations section of the docs.
pip install rubicon-ml[viz]
The prefect extra installs the requirements necessary for using the Prefect tasks in the rubicon_ml.workflow module. Take a look at the Prefect integration to see the library integrated into a simple Prefect flow.
pip install rubicon-ml[prefect]
To install all extra modules, use the all extra.
pip install rubicon-ml[all]
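After installing, one way to see which extras are usable in the current environment is to check whether a representative package for each can be imported. The sketch below is an assumption-laden illustration: only s3fs is named in the text above, while dash and prefect are picked as stand-ins for the viz and prefect extras, and the helper available_extras is hypothetical.

```python
from importlib.util import find_spec

# Representative module for each extra. Only s3fs is named in the text above;
# "dash" and "prefect" are assumptions based on the tools each extra supports.
EXTRA_MODULES = {"s3": "s3fs", "viz": "dash", "prefect": "prefect"}


def available_extras():
    """Map each extra name to whether its representative module is importable."""
    return {
        extra: find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


for extra, installed in available_extras().items():
    print(f"{extra}: {'installed' if installed else 'missing'}")
```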