
Log with Multiple Backends

rubicon-ml allows users to instantiate Rubicon objects with multiple backends to write to and read from at once. These backends include local, memory, and S3 repositories. Here’s a walkthrough of how one might instantiate and use a Rubicon object with multiple backends.

from rubicon_ml import Rubicon

Let’s say we want to log to two separate locations on our local filesystem. This example is a bit contrived, but you could imagine writing to both a local filesystem for quick, ad-hoc exploration and an S3 bucket for persistent storage.

rubicon_composite = Rubicon(composite_config=[
    {"persistence": "filesystem", "root_dir": "./rubicon-root/root_a"},
    {"persistence": "filesystem", "root_dir": "./rubicon-root/root_b"},
])
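
For the local-plus-S3 scenario mentioned above, the configuration might look like the sketch below. It assumes that, as with a single-backend Rubicon, an S3 location is given as an `s3://` URL under `"filesystem"` persistence. The bucket name `my-bucket` and its prefix are hypothetical placeholders.

```python
# A sketch of a composite configuration pairing a local root with an S3 root.
# "my-bucket" and the "rubicon-root" prefix are hypothetical; substitute your own.
composite_config = [
    {"persistence": "filesystem", "root_dir": "./rubicon-root"},
    {"persistence": "filesystem", "root_dir": "s3://my-bucket/rubicon-root"},
]

# With rubicon-ml installed and AWS credentials configured, this list would be
# passed along as: Rubicon(composite_config=composite_config)
for backend in composite_config:
    print(backend["persistence"], "->", backend["root_dir"])
```

Each entry takes the same keys used in the two-directory example, so swapping a local root for a remote one should only require changing `root_dir`.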


All of rubicon-ml’s logging functions will now log to both locations in the filesystem with a single function call.

import pandas as pd

project_composite = rubicon_composite.create_project(name="multiple backends")
experiment_composite = project_composite.log_experiment()

feature = experiment_composite.log_feature(name="year")
metric = experiment_composite.log_metric(name="accuracy", value=1.0)
parameter = experiment_composite.log_parameter(name="n_estimators", value=100)
artifact = experiment_composite.log_artifact(
    data_bytes=b"bytes", name="example artifact"
)
dataframe = experiment_composite.log_dataframe(
    pd.DataFrame([[5, 0, 0], [0, 5, 1], [0, 0, 4]], columns=["x", "y", "z"]),
    name="example dataframe",
)

Let’s verify both of our backends have been written to by retrieving the data one location at a time.

rubicon_a = Rubicon(persistence="filesystem", root_dir="./rubicon-root/root_a")
project_a = rubicon_a.get_project(name="multiple backends")


The experiments’ IDs match across both locations, confirming they are the same experiment.

rubicon_b = Rubicon(persistence="filesystem", root_dir="./rubicon-root/root_b")
project_b = rubicon_b.get_project(name="multiple backends")
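
One way to make the ID comparison explicit is a small helper like the one sketched below. The names `experiment_ids` and `backends_match` are hypothetical, not part of rubicon-ml. The helper only assumes that a project exposes `experiments()` and that each experiment has an `id`, as in the cells above. Stand-in objects are used so the sketch runs on its own.

```python
from types import SimpleNamespace


def experiment_ids(project):
    """Return the set of experiment IDs logged to a project."""
    return {experiment.id for experiment in project.experiments()}


def backends_match(*projects):
    """Return True if every project holds exactly the same experiment IDs."""
    id_sets = [experiment_ids(project) for project in projects]
    return all(ids == id_sets[0] for ids in id_sets)


def fake_project(ids):
    """Stand-in for a rubicon-ml project so the sketch runs without the library."""
    return SimpleNamespace(experiments=lambda: [SimpleNamespace(id=i) for i in ids])


same = backends_match(fake_project(["abc"]), fake_project(["abc"]))
different = backends_match(fake_project(["abc"]), fake_project(["xyz"]))
print(same, different)
```

With the real objects, the call would be `backends_match(project_a, project_b)`.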



rubicon-ml’s reading functions iterate over all backend repositories and return from the first one they are able to read from. A RubiconException is raised if none of the backend repositories can provide the requested item(s).

project_read = rubicon_composite.get_project(name="multiple backends")
<rubicon_ml.client.project.Project at 0x16aeb83e0>
for experiment in project_read.experiments():
    print(f"features: {[f.name for f in experiment.features()]}")
    print(f"metrics: {[m.name for m in experiment.metrics()]}")
    print(f"parameters: {[p.name for p in experiment.parameters()]}")
    print(f"artifact data: {experiment.artifact(name='example artifact').get_data()}")
    print(f"dataframe data:\n{experiment.dataframe(name='example dataframe').get_data()}")
features: ['year']
metrics: ['accuracy']
parameters: ['n_estimators']
artifact data: b'bytes'
dataframe data:
   x  y  z
0  5  0  0
1  0  5  1
2  0  0  4
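
The behavior described above, where writes go to every backend and reads return from the first backend that succeeds, can be illustrated with a minimal, library-independent sketch. The classes `DictBackend`, `CompositeBackend`, and `ReadError` are hypothetical stand-ins and do not reflect rubicon-ml’s internal implementation.

```python
class ReadError(Exception):
    """Stand-in for the RubiconException raised when no backend can be read."""


class DictBackend:
    """A toy backend that stores items in a dictionary."""

    def __init__(self):
        self._items = {}

    def write(self, key, value):
        self._items[key] = value

    def read(self, key):
        if key not in self._items:
            raise ReadError(f"{key!r} not found")
        return self._items[key]


class CompositeBackend:
    """Fan writes out to every backend; read from the first that succeeds."""

    def __init__(self, backends):
        self.backends = list(backends)

    def write(self, key, value):
        for backend in self.backends:
            backend.write(key, value)

    def read(self, key):
        for backend in self.backends:
            try:
                return backend.read(key)
            except ReadError:
                continue  # try the next backend
        raise ReadError(f"no backend could provide {key!r}")


primary, secondary = DictBackend(), DictBackend()
composite = CompositeBackend([primary, secondary])

composite.write("accuracy", 1.0)           # lands in both backends
secondary.write("only_in_secondary", 42)   # present in one backend only

value_both = composite.read("accuracy")
value_fallback = composite.read("only_in_secondary")  # falls through to secondary
```

Reading in configuration order means the first backend listed acts as the preferred source, so in a setup like this it makes sense to list the fastest location first.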