View this notebook on GitHub or run it yourself on Binder!
Logging Experiments
rubicon_ml’s core functionality is centered around logging experiments to explain and explore various model runs throughout the model development lifecycle. This example will take a quick look at how we can log model metadata to rubicon_ml in the context of a simple classification project.
We’ll leverage the palmerpenguins dataset collected by Dr. Kristen Gorman as our training/testing data. More information on the dataset can be found here.
Our goal is to create a simple classification model to differentiate the species of penguins present in the dataset. We’ll leverage rubicon_ml logging to make it easy to compare runs of our model as well as to preserve important information for reproducibility later.
[1]:
! pip install palmerpenguins
Requirement already satisfied: palmerpenguins in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (0.1.4)
Requirement already satisfied: numpy in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (from palmerpenguins) (1.21.6)
Requirement already satisfied: pandas in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (from palmerpenguins) (1.4.2)
Requirement already satisfied: python-dateutil>=2.8.1 in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (from pandas->palmerpenguins) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (from pandas->palmerpenguins) (2022.1)
Requirement already satisfied: six>=1.5 in /Users/nvd215/opt/miniconda3/envs/rubicon-ml/lib/python3.10/site-packages (from python-dateutil>=2.8.1->pandas->palmerpenguins) (1.16.0)
First, we’ll load the dataset and perform some basic data preparation. In many scenarios, this step happens upstream, before training/testing data is loaded and experimentation begins.
[2]:
from palmerpenguins import load_penguins
penguins_df = load_penguins()
target_values = penguins_df['species'].unique()
print(f"target classes (species): {target_values}")
penguins_df.head()
target classes (species): ['Adelie' 'Gentoo' 'Chinstrap']
[2]:
|   | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---------|--------|----------------|---------------|-------------------|-------------|-----|------|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
Let’s encode the string variables in our dataset as integer categories so our KNN can work with the data. Note that LabelEncoder assigns integers to classes alphabetically, so Adelie maps to 0, Chinstrap to 1, and Gentoo to 2.
[3]:
from sklearn.preprocessing import LabelEncoder
for column in ["species", "island", "sex"]:
penguins_df[column] = LabelEncoder().fit_transform(penguins_df[column])
print(f"target classes (species): {penguins_df['species'].unique()}")
penguins_df.head()
target classes (species): [0 2 1]
[3]:
|   | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---------|--------|----------------|---------------|-------------------|-------------|-----|------|
| 0 | 0 | 2 | 39.1 | 18.7 | 181.0 | 3750.0 | 1 | 2007 |
| 1 | 0 | 2 | 39.5 | 17.4 | 186.0 | 3800.0 | 0 | 2007 |
| 2 | 0 | 2 | 40.3 | 18.0 | 195.0 | 3250.0 | 0 | 2007 |
| 3 | 0 | 2 | NaN | NaN | NaN | NaN | 2 | 2007 |
| 4 | 0 | 2 | 36.7 | 19.3 | 193.0 | 3450.0 | 0 | 2007 |
Finally, we’ll split the preprocessed data into a train and test set.
[4]:
from sklearn.model_selection import train_test_split
train_penguins_df, test_penguins_df = train_test_split(penguins_df, test_size=.30)
target_name = "species"
feature_names = [c for c in train_penguins_df.columns if c != target_name]
X_train, y_train = train_penguins_df[feature_names], train_penguins_df[target_name]
X_test, y_test = test_penguins_df[feature_names], test_penguins_df[target_name]
X_train.shape, y_train.shape, X_test.shape, y_test.shape
[4]:
((240, 7), (240,), (104, 7), (104,))
Now we can create and train a simple Scikit-learn pipeline to organize our model training code. We’ll use a SimpleImputer to fill in missing values, followed by a KNeighborsClassifier to classify the penguins.
[5]:
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
imputer_strategy = "mean"
classifier_n_neighbors = 5
steps = [
("si", SimpleImputer(strategy=imputer_strategy)),
("kn", KNeighborsClassifier(n_neighbors=classifier_n_neighbors)),
]
penguin_pipeline = Pipeline(steps=steps)
penguin_pipeline.fit(X_train, y_train)
score = penguin_pipeline.score(X_test, y_test)
score
[5]:
0.7307692307692307
We’ve completed a training run, so let’s finally log our results to rubicon_ml! We’ll create an entry point to the local filesystem and create a project called “classifying penguins” to store our results. rubicon_ml’s log_* methods can be placed throughout your model code to log any important information along the way. Entities available for logging via the log_* methods can be found in our glossary.
[6]:
from rubicon_ml import Rubicon
rubicon = Rubicon(
persistence="filesystem",
root_dir="./rubicon-root",
auto_git_enabled=True,
)
project = rubicon.get_or_create_project(name="classifying penguins")
experiment = project.log_experiment()
for feature_name in feature_names:
experiment.log_feature(name=feature_name)
_ = experiment.log_parameter(name="strategy", value=imputer_strategy)
_ = experiment.log_parameter(name="n_neighbors", value=classifier_n_neighbors)
_ = experiment.log_metric(name="accuracy", value=score)
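As an aside, the _ = assignments above are only there to suppress cell output - each log_* method returns the entity it logs. A minimal sketch showing the returned object (the extra random_state parameter here is purely hypothetical):

parameter = experiment.log_parameter(name="random_state", value=42)  # hypothetical extra parameter
print(parameter.name, parameter.value)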
After logging, we can inspect the various attributes of our logged entities. All available attributes can be found in our API reference.
[7]:
print(experiment)
print()
print(f"git info:")
print(f"\tbranch name: {experiment.branch_name}\n\tcommit hash: {experiment.commit_hash}")
print(f"features: {[f.name for f in experiment.features()]}")
print(f"parameters: {[(p.name, p.value) for p in experiment.parameters()]}")
print(f"metrics: {[(m.name, m.value) for m in experiment.metrics()]}")
Experiment(project_name='classifying penguins', id='c484caf8-bdc1-429f-b012-7a4e02dbc83a', name=None, description=None, model_name=None, branch_name='210-new-quick-look', commit_hash='490e8af895f2cd0636c72295c2762b21cd6c8102', training_metadata=None, tags=[], created_at=datetime.datetime(2022, 6, 30, 13, 51, 4, 958916))
git info:
branch name: 210-new-quick-look
commit hash: 490e8af895f2cd0636c72295c2762b21cd6c8102
features: ['island', 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex', 'year']
parameters: [('strategy', 'mean'), ('n_neighbors', 5)]
metrics: [('accuracy', 0.7307692307692307)]
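Tags aren’t limited to logging time, either. If we decide after the fact that this first experiment is our baseline, we can tag it retroactively and filter on that tag later - a minimal sketch, assuming the add_tags method provided by rubicon_ml’s tagging API:

experiment.add_tags(["baseline"])
print([e.id for e in project.experiments(tags=["baseline"])])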
Tracking the results of a single model fit is nice, but rubicon_ml can really shine when we’re iterating over numerous model fits - like a hyperparameter search. The code below performs a very basic hyperparameter search for a strategy for the SimpleImputer and an n_neighbors for the KNeighborsClassifier while logging the results of each model fit to a new rubicon_ml experiment.
[8]:
from sklearn.base import clone
for imputer_strategy in ["mean", "median", "most_frequent"]:
for classifier_n_neighbors in [5, 10, 15, 20]:
pipeline = clone(penguin_pipeline)
pipeline.set_params(
si__strategy=imputer_strategy,
kn__n_neighbors=classifier_n_neighbors,
)
pipeline.fit(X_train, y_train)
score = pipeline.score(X_test, y_test)
experiment = project.log_experiment(tags=["parameter search"])
for feature_name in feature_names:
experiment.log_feature(name=feature_name)
experiment.log_parameter(name="strategy", value=imputer_strategy)
experiment.log_parameter(name="n_neighbors", value=classifier_n_neighbors)
experiment.log_metric(name="accuracy", value=score)
Now we can take a look at a few experiments and compare our results. Notice that we’re still pulling experiments from the same project we logged the first one to, but we’re only retrieving the experiments from the search above by filtering on the “parameter search” tag that each of them was logged with.
[9]:
print("experiments:")
for experiment in project.experiments(tags=["parameter search"]):
print(
f"\tid: {experiment.id}, "
f"parameters: {[(p.name, p.value) for p in experiment.parameters()]}, "
f"metrics: {[(m.name, m.value) for m in experiment.metrics()]}"
)
experiments:
id: a75b1258-2276-4eb1-beb5-caf83e9aacf3, parameters: [('strategy', 'mean'), ('n_neighbors', 5)], metrics: [('accuracy', 0.7307692307692307)]
id: 02a89318-b8d9-49a5-9337-7e4368cc54da, parameters: [('strategy', 'mean'), ('n_neighbors', 10)], metrics: [('accuracy', 0.75)]
id: ce24eeef-4686-4fc7-8c0a-e73d6c9cdb71, parameters: [('strategy', 'mean'), ('n_neighbors', 15)], metrics: [('accuracy', 0.7596153846153846)]
id: 093a9d02-89f7-4e48-82b1-f9ade435ef03, parameters: [('strategy', 'mean'), ('n_neighbors', 20)], metrics: [('accuracy', 0.7211538461538461)]
id: bc4d0503-32d1-4a11-8222-4151dae893cf, parameters: [('strategy', 'median'), ('n_neighbors', 5)], metrics: [('accuracy', 0.7211538461538461)]
id: c1b6cb3a-0ad1-4932-914d-ba53a054891b, parameters: [('strategy', 'median'), ('n_neighbors', 10)], metrics: [('accuracy', 0.7403846153846154)]
id: 9d6ffe67-088d-483f-9d3f-8f0fb34c22e8, parameters: [('strategy', 'median'), ('n_neighbors', 15)], metrics: [('accuracy', 0.7596153846153846)]
id: f497245a-6149-4604-9ceb-da74ae9855d4, parameters: [('strategy', 'median'), ('n_neighbors', 20)], metrics: [('accuracy', 0.7211538461538461)]
id: b2cd8067-ad4c-4ed5-87f7-2cd4536b2c73, parameters: [('strategy', 'most_frequent'), ('n_neighbors', 5)], metrics: [('accuracy', 0.7211538461538461)]
id: c4277327-381a-4885-aba4-a07c050463a5, parameters: [('strategy', 'most_frequent'), ('n_neighbors', 10)], metrics: [('accuracy', 0.75)]
id: d4ea2fe7-061e-4f5e-8958-e6ac29025708, parameters: [('strategy', 'most_frequent'), ('n_neighbors', 15)], metrics: [('accuracy', 0.7596153846153846)]
id: d9fe2005-824c-4e23-9809-e0459e57d78a, parameters: [('strategy', 'most_frequent'), ('n_neighbors', 20)], metrics: [('accuracy', 0.7211538461538461)]
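Since project.experiments returns plain Python objects, we can also pick the best run programmatically. A minimal sketch, using only the accessors shown above, that selects the experiment with the highest logged accuracy:

def accuracy_of(experiment):
    return [m.value for m in experiment.metrics() if m.name == "accuracy"][0]

best_experiment = max(project.experiments(tags=["parameter search"]), key=accuracy_of)
print(f"best accuracy: {accuracy_of(best_experiment)} (experiment {best_experiment.id})")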
rubicon_ml can log more complex data as well. Below we’ll log our trained model as an artifact (generic binary) and a confusion matrix explaining the results as a dataframe (accepts both pandas and dask dataframes natively).
[10]:
import pandas as pd
from sklearn.metrics import confusion_matrix
experiment = project.experiments(tags=["parameter search"])[-1]
trained_model = pipeline[-1]  # the fitted classifier from the last pipeline step
experiment.log_artifact(data_object=trained_model, name="trained model")
y_pred = pipeline.predict(X_test)
# LabelEncoder assigned integer labels alphabetically, so sort the original
# species names to match the label order used by confusion_matrix
class_names = sorted(target_values)

confusion_matrix_df = pd.DataFrame(
    confusion_matrix(y_test, y_pred),
    columns=class_names,
    index=class_names,
)
experiment.log_dataframe(confusion_matrix_df, name="confusion matrix")
print(experiment.artifact(name="trained model").get_data(unpickle=True))
experiment.dataframe(name="confusion matrix").get_data()
KNeighborsClassifier(n_neighbors=20)
[10]:
|           | Adelie | Chinstrap | Gentoo |
|-----------|--------|-----------|--------|
| Adelie    | 37     | 0         | 3      |
| Chinstrap | 19     | 0         | 1      |
| Gentoo    | 6      | 0         | 38     |
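Because everything above was persisted to the local filesystem, a later session can reconnect to the same root directory and read the logged entities back - a minimal sketch reusing only the calls demonstrated in this example:

from rubicon_ml import Rubicon

rubicon = Rubicon(persistence="filesystem", root_dir="./rubicon-root")
project = rubicon.get_or_create_project(name="classifying penguins")

experiment = project.experiments(tags=["parameter search"])[-1]
model = experiment.artifact(name="trained model").get_data(unpickle=True)
confusion_matrix_df = experiment.dataframe(name="confusion matrix").get_data()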