Contribute a schema
Consider the following schema that was created in the “Register a custom schema” section:
extended_schema = {
    "name": "sklearn__RandomForestClassifier__ext",
    "extends": "sklearn__RandomForestClassifier",
    "parameters": [
        {"name": "runtime_environment", "value_env": "RUNTIME_ENV"},
    ],
}
To contribute “sklearn__RandomForestClassifier__ext” to the rubicon_ml.schema registry, first write the dictionary out to a YAML file.
import yaml

schema_filename = "sklearn__RandomForestClassifier__ext.yaml"

# Serialize the schema dictionary and write it to the YAML file.
with open(schema_filename, "w") as file:
    file.write(yaml.dump(extended_schema))
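Before copying the file into the repository, it can be worth confirming that the YAML loads back into the same dictionary. The following is a minimal sketch of such a round-trip check, assuming PyYAML is installed; the temporary directory is used here only so the example leaves no files behind and is not part of the contribution workflow.

```python
import os
import tempfile

import yaml  # PyYAML

extended_schema = {
    "name": "sklearn__RandomForestClassifier__ext",
    "extends": "sklearn__RandomForestClassifier",
    "parameters": [
        {"name": "runtime_environment", "value_env": "RUNTIME_ENV"},
    ],
}

# Write the schema to a throwaway location, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    schema_path = os.path.join(tmp_dir, "sklearn__RandomForestClassifier__ext.yaml")

    with open(schema_path, "w") as file:
        yaml.dump(extended_schema, file)

    with open(schema_path) as file:
        reloaded = yaml.safe_load(file)

# The reloaded dictionary should match the original exactly.
assert reloaded == extended_schema
```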
Once “sklearn__RandomForestClassifier__ext.yaml” is created, follow the “Developer instructions” to fork the rubicon-ml GitHub repository and prepare to make a contribution.
From the root of the forked repository, copy the new schema into the library’s schema directory:
cp [PATH_TO]/sklearn__RandomForestClassifier__ext.yaml rubicon_ml/schema/schema/
Then update rubicon_ml/schema/registry.py, adding the new schema to the RUBICON_SCHEMA_REGISTRY:
RUBICON_SCHEMA_REGISTRY = {
    # other schema entries...
    "sklearn__RandomForestClassifier__ext": lambda: _load_schema(
        os.path.join("schema", "sklearn__RandomForestClassifier__ext.yaml")
    ),
}
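Each registry value is a zero-argument lambda, so the loader runs only when an entry is called, not when the dictionary is built. The sketch below illustrates that pattern in isolation. `_load_schema_stub` and `SKETCH_REGISTRY` are hypothetical stand-ins invented for this example. They are not the library's `_load_schema` or `RUBICON_SCHEMA_REGISTRY`, and the stub reads nothing from disk.

```python
import os

loaded = []  # records every path the stub loader is asked to load


def _load_schema_stub(path):
    """Hypothetical stand-in for a schema loader; it does not touch the disk."""
    loaded.append(path)
    return {"name": os.path.splitext(os.path.basename(path))[0]}


# Hypothetical registry that mirrors the shape of the entry shown above.
SKETCH_REGISTRY = {
    "sklearn__RandomForestClassifier__ext": lambda: _load_schema_stub(
        os.path.join("schema", "sklearn__RandomForestClassifier__ext.yaml")
    ),
}

# Building the dictionary has not called the loader yet.
calls_before_lookup = len(loaded)

# Calling the stored lambda is what triggers the load.
schema = SKETCH_REGISTRY["sklearn__RandomForestClassifier__ext"]()
```

Wrapping each entry in a lambda keeps the cost of registering a schema independent of the size or number of schema files.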
Finally, refer back to the “Contribute” section of the “Developer instructions” to push your changes to GitHub and open a pull request. Once the pull request is merged, “sklearn__RandomForestClassifier__ext” will be available in the next release of rubicon_ml.
Schema naming conventions
When naming a schema that extends a schema already made available by rubicon_ml.schema, append a double underscore and a unique identifier. The “sklearn__RandomForestClassifier__ext” schema above follows this convention.
When naming a schema that represents an object that is not yet present in schema, use the registry.get_schema_name function to generate a name. For example, if you are making a schema for an object my_obj of class Model from a module my_model, registry.get_schema_name(my_obj) will return the name “my_model__Model”.
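The naming pattern can be illustrated without importing rubicon_ml. The following is a rough sketch of the convention only and is not the library's implementation: `sketch_schema_name` is a hypothetical helper, and its use of the top-level package name is an assumption inferred from “sklearn__RandomForestClassifier”, whose class is defined in a submodule of sklearn.

```python
def sketch_schema_name(obj):
    """Hypothetical helper: '<top-level module>__<class name>' for an object."""
    cls = obj.__class__
    # Assumption: only the top-level package contributes to the name.
    top_level_module = cls.__module__.split(".")[0]
    return f"{top_level_module}__{cls.__name__}"


# Build a class named Model that reports its module as my_model,
# standing in for a class imported from a real my_model module.
Model = type("Model", (), {"__module__": "my_model"})
my_obj = Model()

name = sketch_schema_name(my_obj)
print(name)  # my_model__Model
```

When preparing an actual contribution, prefer registry.get_schema_name itself so the name matches whatever the library looks up.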