Logging Concurrently

Let’s see how a few other popular scikit-learn models perform with the wine dataset. rubicon_ml’s filesystem logging is safe for concurrent use across threads and processes, so we can test a number of model configurations at once.

Note: rubicon_ml’s in-memory logging is inherently not safe for multiprocessing, as each process has its own memory space that the others can’t see.
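
For contrast, an in-memory client is constructed like this (a minimal sketch; it’s fine for single-process exploration, but each worker process in the examples below would only see its own empty copy of the data):

from rubicon_ml import Rubicon

# in-memory persistence: fast, but the logged data lives only in this
# process's memory, so parallel worker processes can't share or combine it
rubicon_memory = Rubicon(persistence="memory")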

[1]:
import os

from rubicon_ml import Rubicon


root_dir = os.environ.get("RUBICON_ROOT", "rubicon-root")
root_path = f"{os.path.dirname(os.getcwd())}/{root_dir}"

rubicon = Rubicon(persistence="filesystem", root_dir=root_path)
project = rubicon.get_or_create_project(
    "Concurrent Experiments",
    description="training multiple models in parallel",
)

project
[1]:
<rubicon_ml.client.project.Project at 0x155660d00>

For a recap of the contents of the wine dataset, check out wine.DESCR and wine.data. We’ll put together a training dataset using a subset of the data.
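
For example, printing the dataset’s description gives a quick summary of its features and classes (a small sketch):

from sklearn.datasets import load_wine

# prints the full description of the dataset's 13 numeric features
# and 3 target classes
print(load_wine().DESCR)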

[2]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split


wine = load_wine()
wine_feature_names = wine.feature_names
wine_datasets = train_test_split(
    wine["data"],
    wine["target"],
    test_size=0.25,
)
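
If you’d like to sanity-check the split, here’s a quick sketch (train_test_split returns the train and test features followed by the train and test targets):

X_train, X_test, y_train, y_test = wine_datasets

# a 75/25 split of the wine dataset's 178 samples across 13 features
print(X_train.shape, X_test.shape)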

We’ll use this run_experiment function to log a new experiment to the provided project, then train, run, and log a model of type classifier_cls using the training and testing data in wine_datasets.

[3]:
from collections import namedtuple


SklearnTrainingMetadata = namedtuple("SklearnTrainingMetadata", "module_name method")

def run_experiment(project, classifier_cls, wine_datasets, feature_names, **kwargs):
    X_train, X_test, y_train, y_test = wine_datasets

    # log a new experiment to the project, recording where the training
    # data came from and which model we're about to run
    experiment = project.log_experiment(
        training_metadata=[
            SklearnTrainingMetadata("sklearn.datasets", "load_wine"),
        ],
        model_name=classifier_cls.__name__,
        tags=[classifier_cls.__name__],
    )

    # log the hyperparameters and the features used to train the model
    for key, value in kwargs.items():
        experiment.log_parameter(key, value)

    for name in feature_names:
        experiment.log_feature(name)

    # train the model and score it against the hold-out set
    classifier = classifier_cls(**kwargs)
    classifier.fit(X_train, y_train)
    classifier.predict(X_test)

    accuracy = classifier.score(X_test, y_test)

    experiment.log_metric("accuracy", accuracy)

    # tag the experiment so we can filter on the outcome later
    if accuracy >= 0.95:
        experiment.add_tags(["success"])
    else:
        experiment.add_tags(["failure"])

If you’ve ever tried to use the multiprocessing library from within IPython or a Jupyter notebook, you’ll know the two don’t play along well: the worker processes locate the target function by importing it, and functions defined interactively in the notebook’s __main__ can’t be found that way. In order for the multiprocessing library to locate the run_experiment function, we’ll need to import it from a separate file. ./logging_concurrently.py defines the same run_experiment function as above.
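
If you’re recreating this notebook from scratch, one way to produce that file without leaving Jupyter is the %%writefile cell magic. This is just a sketch; the repository already ships logging_concurrently.py:

%%writefile logging_concurrently.py
# hypothetical recreation of ./logging_concurrently.py - it only needs
# to define run_experiment exactly as in the previous cell
from collections import namedtuple

SklearnTrainingMetadata = namedtuple("SklearnTrainingMetadata", "module_name method")

def run_experiment(project, classifier_cls, wine_datasets, feature_names, **kwargs):
    ...  # same body as the run_experiment defined above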

[4]:
from logging_concurrently import run_experiment

This time we’ll take a look at three classifiers - RandomForestClassifier, DecisionTreeClassifier, and KNeighborsClassifier - to see which performs best. Each classifier will be run across four sets of parameters (provided as kwargs to run_experiment), for a total of 12 experiments. Here, we’ll build up a list of processes that will run each experiment in parallel.

[5]:
import multiprocessing

from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


processes = []

for n_estimators in [10, 20, 30, 40]:
    processes.append(multiprocessing.Process(
        target=run_experiment,
        args=[project, RandomForestClassifier, wine_datasets, wine_feature_names],
        kwargs={"n_estimators": n_estimators},
    ))

for n_neighbors in [5, 10, 15, 20]:
    processes.append(multiprocessing.Process(
        target=run_experiment,
        args=[project, KNeighborsClassifier, wine_datasets, wine_feature_names],
        kwargs={"n_neighbors": n_neighbors},
    ))

for criterion in ["gini", "entropy"]:
    for splitter in ["best", "random"]:
        processes.append(multiprocessing.Process(
            target=run_experiment,
            args=[project, DecisionTreeClassifier, wine_datasets, wine_feature_names],
            kwargs={
                "criterion": criterion,
                "splitter": splitter,
            },
        ))

Let’s run all our experiments in parallel!

[6]:
for process in processes:
    process.start()

for process in processes:
    process.join()
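
If you’d rather not manage Process objects by hand, a multiprocessing.Pool can do the bookkeeping for you. Here’s a sketch of just the RandomForestClassifier runs rewritten with apply_async (the keyword arguments are passed via its kwds argument):

import multiprocessing

with multiprocessing.Pool(processes=4) as pool:
    results = [
        pool.apply_async(
            run_experiment,
            (project, RandomForestClassifier, wine_datasets, wine_feature_names),
            {"n_estimators": n_estimators},
        )
        for n_estimators in [10, 20, 30, 40]
    ]

    for result in results:
        result.get()  # blocks until done and re-raises any worker exception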

Now we can validate that we successfully logged all 12 experiments to our project.

[7]:
len(project.experiments())
[7]:
12

Let’s see which experiments we tagged as successful and what type of model they used.

[8]:
for e in project.experiments(tags=["success"]):
    print(f"experiment {e.id[:8]} was successful using a {e.model_name}")
experiment 3a144831 was successful using a RandomForestClassifier
experiment 67961ac9 was successful using a RandomForestClassifier
experiment 712564c0 was successful using a RandomForestClassifier
experiment f143ea43 was successful using a RandomForestClassifier

We can also take a deeper look at any of our experiments.

[9]:
first_experiment = project.experiments()[0]

training_metadata = SklearnTrainingMetadata(*first_experiment.training_metadata)
tags = first_experiment.tags

parameters = [(p.name, p.value) for p in first_experiment.parameters()]
metrics = [(m.name, m.value) for m in first_experiment.metrics()]

print(
    f"experiment {first_experiment.id}\n"
    f"training metadata: {training_metadata}\ntags: {tags}\n"
    f"parameters: {parameters}\nmetrics: {metrics}"
)
experiment 0b6b8114-fe72-49af-813b-23e1667c1486
training metadata: SklearnTrainingMetadata(module_name='sklearn.datasets', method='load_wine')
tags: ['KNeighborsClassifier', 'failure']
parameters: [('n_neighbors', 10)]
metrics: [('accuracy', 0.7111111111111111)]

Or we could grab the project’s data as a dataframe!

[10]:
ddf = rubicon.get_project_as_df("Concurrent Experiments", df_type="dask")
ddf.compute()
[10]:
id name description model_name commit_hash tags created_at splitter criterion n_estimators n_neighbors accuracy
0 aa4faa9b-3051-454d-915d-54b5c16f6e34 None None DecisionTreeClassifier None [DecisionTreeClassifier, failure] 2021-04-28 12:45:27.759817 best gini NaN NaN 0.933333
1 7aec1e7b-c7dd-43e7-b20c-a400b2141d23 None None DecisionTreeClassifier None [DecisionTreeClassifier, failure] 2021-04-28 12:45:27.752157 best entropy NaN NaN 0.933333
2 3a144831-b093-4dc3-85a6-71bc7a4e26a4 None None RandomForestClassifier None [RandomForestClassifier, success] 2021-04-28 12:45:27.715676 NaN NaN 20.0 NaN 1.000000
3 7503bcf0-2daa-4c66-9b45-a75863d446a5 None None DecisionTreeClassifier None [DecisionTreeClassifier, failure] 2021-04-28 12:45:27.714952 random gini NaN NaN 0.911111
4 f143ea43-d780-4639-a082-5c81b850f1e0 None None RandomForestClassifier None [RandomForestClassifier, success] 2021-04-28 12:45:27.636078 NaN NaN 10.0 NaN 0.977778
5 67961ac9-4bd5-4916-925c-fd38844d78c6 None None RandomForestClassifier None [RandomForestClassifier, success] 2021-04-28 12:45:27.635968 NaN NaN 30.0 NaN 0.977778
6 8b6b9b9a-1e06-4e0a-af2e-9649d79872e3 None None KNeighborsClassifier None [KNeighborsClassifier, failure] 2021-04-28 12:45:27.619764 NaN NaN NaN 15.0 0.733333
7 3c679288-1402-4ffb-bbb1-23fcfd0b1415 None None DecisionTreeClassifier None [DecisionTreeClassifier, failure] 2021-04-28 12:45:27.615204 random entropy NaN NaN 0.888889
8 38aa6e7b-efcc-4ee5-9950-5449ef8681ff None None KNeighborsClassifier None [KNeighborsClassifier, failure] 2021-04-28 12:45:27.585590 NaN NaN NaN 20.0 0.711111
9 0b6b8114-fe72-49af-813b-23e1667c1486 None None KNeighborsClassifier None [KNeighborsClassifier, failure] 2021-04-28 12:45:27.583647 NaN NaN NaN 10.0 0.711111
10 a0b7e295-6047-45da-81e6-08dc8288a2eb None None KNeighborsClassifier None [KNeighborsClassifier, failure] 2021-04-28 12:45:27.564990 NaN NaN NaN 5.0 0.733333
11 712564c0-e113-42ea-b59c-90a5ce0cf37b None None RandomForestClassifier None [RandomForestClassifier, success] 2021-04-28 12:45:27.511807 NaN NaN 40.0 NaN 0.955556
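
From here, all the usual dataframe operations apply. For example, a quick sketch ranking the experiments by accuracy:

df = ddf.compute()  # materialize the dask dataframe as a pandas dataframe

# highest-scoring experiments first - given the tagging above, these
# should all be the RandomForestClassifier runs
df.sort_values("accuracy", ascending=False)[["model_name", "accuracy"]].head()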