Tagging

Sometimes we might want to tag experiments and other objects with distinct values so that we can organize and filter them later. For example, tags could differentiate between the type of model or classifier used during an experiment (e.g., linear regression or random forest). Besides experiments, rubicon_ml can tag artifacts, dataframes, features, metrics, and parameters.

Below, we’ll see examples of tagging functionality.

Adding tags when logging

Pass the tags parameter when logging an object:

[1]:
from rubicon_ml import Rubicon
import pandas as pd

rubicon = Rubicon(persistence="memory")
project = rubicon.get_or_create_project("Tagging")

# Log experiments with tags.
experiment1 = project.log_experiment(name="experiment1", tags=["odd_num_exp"])
experiment2 = project.log_experiment(name="experiment2", tags=["even_num_exp"])

# Log an artifact, dataframe, feature, metric, and parameter with tags.
first_artifact = experiment1.log_artifact(data_bytes=b"bytes", name="data_path", tags=["data"])

confusion_matrix = pd.DataFrame([[5, 0, 0], [0, 5, 1], [0, 0, 4]], columns=["x", "y", "z"])
first_dataframe = experiment1.log_dataframe(confusion_matrix, tags=["three_column"])

first_feature = experiment1.log_feature("year", tags=["time"])

first_metric = experiment1.log_metric("accuracy", 0.8, tags=["scalar"])

# Multiple tags can be added at logging time (works for all objects).
first_parameter = experiment1.log_parameter("n_estimators", tags=["tag1", "tag2"])

Viewing tags

Use the .tags attribute to view tags associated with an object:

[2]:
print(experiment1.tags)
print(experiment2.tags)
print(first_artifact.tags)
print(first_dataframe.tags)
print(first_feature.tags)
print(first_metric.tags)
print(first_parameter.tags)
['odd_num_exp']
['even_num_exp']
['data']
['three_column']
['time']
['scalar']
['tag1', 'tag2']

Adding tags to existing objects

Use an object’s add_tags() method, which works the same way for all taggable objects:

[3]:
experiment1.add_tags(["linear regression"])
experiment2.add_tags(["random forest"])
first_artifact.add_tags(["added_tag"])
first_dataframe.add_tags(["added_tag"])
first_feature.add_tags(["added_tag"])
first_metric.add_tags(["added_tag"])

# Multiple tags can be added in one call (works for all objects).
first_parameter.add_tags(["added_tag1", "added_tag2"])

print(experiment1.tags)
print(experiment2.tags)
print(first_artifact.tags)
print(first_dataframe.tags)
print(first_feature.tags)
print(first_metric.tags)
print(first_parameter.tags)
['linear regression', 'odd_num_exp']
['even_num_exp', 'random forest']
['data', 'added_tag']
['added_tag', 'three_column']
['time', 'added_tag']
['added_tag', 'scalar']
['added_tag2', 'tag1', 'tag2', 'added_tag1']

Removing tags from existing objects

Use an object’s remove_tags() method, which works the same way for all taggable objects:

[4]:
experiment1.remove_tags(["linear regression"])
experiment2.remove_tags(["random forest"])
first_artifact.remove_tags(["added_tag"])
first_dataframe.remove_tags(["added_tag"])
first_feature.remove_tags(["added_tag"])
first_metric.remove_tags(["added_tag"])

# Multiple tags can be removed in one call (works for all objects).
first_parameter.remove_tags(["added_tag2", "added_tag1"])

print(experiment1.tags)
print(experiment2.tags)
print(first_artifact.tags)
print(first_dataframe.tags)
print(first_feature.tags)
print(first_metric.tags)
print(first_parameter.tags)
['odd_num_exp']
['even_num_exp']
['data']
['three_column']
['time']
['scalar']
['tag2', 'tag1']
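
The printed lists above do not follow insertion order (for example, ['tag2', 'tag1']), so code that checks tags is probably safer comparing them as sets. The following is a minimal sketch in plain Python; the `observed_tags` list is a hypothetical stand-in for a value read from an object's .tags attribute, not output produced by rubicon_ml here.

```python
# Hypothetical stand-in for first_parameter.tags as printed above.
# The order of the returned list should not be relied on.
observed_tags = ["tag2", "tag1"]

# Comparing lists is order-sensitive, so this check is fragile.
list_match = observed_tags == ["tag1", "tag2"]

# Comparing sets ignores order.
set_match = set(observed_tags) == {"tag1", "tag2"}

# Membership checks work on the list directly.
has_tag1 = "tag1" in observed_tags

print(list_match, set_match, has_tag1)
```

With the stand-in list shown, only the list comparison fails; the set comparison and the membership check both succeed.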

Retrieving objects by their tags

After logging objects, we can pass tags as a parameter to the retrieval methods to filter the results. By default, the search type is “or”, so an object matches if it has any of the given tags; setting the qtype parameter to “and” requires an object to have all of them. Only experiments are shown here, but the same filtering works for any taggable object retrieved through its parent, following the pattern parentObject.retrievalObjects():

[5]:
experiment1.add_tags(["old_exp"])
experiment2.add_tags(["old_exp"])
experiment3 = project.log_experiment(name="experiment3", tags=["odd_num_exp", "new_exp"])

# Retrieve only the old experiments.
old_experiments = project.experiments(tags=["old_exp"])

# Retrieve only the new experiments.
new_experiments = project.experiments(tags=["new_exp"])

# Retrieve only the odd-numbered experiments.
odd_experiments = project.experiments(tags=["odd_num_exp"])

# With the default qtype="or", an experiment matches if it has either tag.
# The result is the same as above because experiment3 is already tagged "odd_num_exp".
same_experiments = project.experiments(tags=["odd_num_exp", "new_exp"])

# With qtype="and", an experiment must have both tags, so only experiment3 matches.
expected_experiment = project.experiments(tags=["odd_num_exp", "new_exp"], qtype="and")


# Both old experiments: 1 and 2.
print(f"old experiments: {old_experiments[0].name}, {old_experiments[1].name}\n")

# Only the new experiment: 3.
print(f"new experiments: {new_experiments[0].name}\n")

# Both odd experiments: 1 and 3.
print(f"odd experiments: {odd_experiments[0].name}, {odd_experiments[1].name}\n")

# Again experiments 1 and 3.
print(f"same experiments: {same_experiments[0].name}, {same_experiments[1].name}\n")

# Only experiment 3.
print(f"expected experiment: {expected_experiment[0].name}\n")
old experiments: experiment1, experiment2

new experiments: experiment3

odd experiments: experiment1, experiment3

same experiments: experiment1, experiment3

expected experiment: experiment3
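
The difference between the two qtype values can be pictured with plain Python sets. The sketch below is not rubicon_ml's implementation: the `matches` and `filter_names` helpers and the `experiment_tags` dictionary are hypothetical stand-ins that mirror the tags on the three experiments logged above.

```python
def matches(object_tags, query_tags, qtype="or"):
    """Hypothetical helper: decide whether an object's tags satisfy a query."""
    object_tags, query_tags = set(object_tags), set(query_tags)
    if qtype == "or":
        # At least one queried tag is present on the object.
        return bool(object_tags & query_tags)
    if qtype == "and":
        # Every queried tag is present on the object.
        return query_tags <= object_tags
    raise ValueError(f"unknown qtype: {qtype!r}")


# Stand-in for the tags on the experiments logged above.
experiment_tags = {
    "experiment1": ["odd_num_exp", "old_exp"],
    "experiment2": ["even_num_exp", "old_exp"],
    "experiment3": ["odd_num_exp", "new_exp"],
}


def filter_names(query_tags, qtype="or"):
    """Return the names of the experiments whose tags satisfy the query."""
    return [
        name
        for name, tags in experiment_tags.items()
        if matches(tags, query_tags, qtype)
    ]


print(filter_names(["odd_num_exp", "new_exp"]))               # "or" search
print(filter_names(["odd_num_exp", "new_exp"], qtype="and"))  # "and" search
```

Under these assumptions, the “or” search selects experiment1 and experiment3, while the “and” search selects only experiment3, which agrees with the output above.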