API Reference¶
Complete API documentation for the Slingshot SDK.
Client¶
- class slingshot.client.SlingshotClient(api_key: str | None = None, api_url: str | None = None)[source]¶
Bases: object
SlingshotClient is a client for interacting with the Slingshot API.
Get an API key from: https://slingshot.capitalone.com/configurations/api-keys
- __init__(api_key: str | None = None, api_url: str | None = None)[source]¶
Initialize the Slingshot client.
- Parameters:
api_key (str) – The API key for authentication. If not provided, it will look for the environment variable SLINGSHOT_API_KEY.
api_url (str) – The base URL for the Slingshot API. If not provided, it will look for the environment variable SLINGSHOT_API_URL; if that is also not set, it defaults to “https://slingshot.capitalone.com/prod/api/gradient”.
- Raises:
ValueError – If the API key is not provided and not found in the environment.
Example
>>> from slingshot.client import SlingshotClient
>>> # Or:
>>> # from slingshot import SlingshotClient
>>> client = SlingshotClient(api_key="your_api_key")
- property projects: ProjectAPI¶
Get the projects API client.
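The configuration lookup documented above (explicit argument first, then environment variable, then the default URL) can be illustrated with a stand-alone sketch. The resolve_config helper below is hypothetical, written only to mirror the documented resolution order; it is not part of the SDK:

```python
import os

DEFAULT_API_URL = "https://slingshot.capitalone.com/prod/api/gradient"

def resolve_config(api_key=None, api_url=None):
    """Mimic the documented resolution order for SlingshotClient settings."""
    key = api_key or os.environ.get("SLINGSHOT_API_KEY")
    if not key:
        raise ValueError("API key not provided and SLINGSHOT_API_KEY is not set")
    url = api_url or os.environ.get("SLINGSHOT_API_URL") or DEFAULT_API_URL
    return key, url

# Demo: an explicit argument wins over the environment variable.
os.environ["SLINGSHOT_API_KEY"] = "env_key"
os.environ.pop("SLINGSHOT_API_URL", None)  # keep the demo environment clean
key, url = resolve_config(api_key="arg_key")
```

Here `key` is `"arg_key"` (the explicit argument) and `url` falls back to the default, since SLINGSHOT_API_URL is unset.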
API Modules¶
Projects API¶
- class slingshot.api.projects.ProjectAPI(client: SlingshotClient)[source]¶
Bases: object
API for managing projects in Slingshot.
- __init__(client: SlingshotClient)[source]¶
Initialize the ProjectAPI.
- create(name: str, workspace_id: str, description: str | None = Ellipsis, app_id: str | None = Ellipsis, job_id: str | None = Ellipsis, cluster_path: str | None = Ellipsis, settings: AssignSettingsSchema | None = Ellipsis) ProjectSchema[source]¶
Create a new Slingshot project for optimizing a Databricks job cluster.
- Parameters:
name (str) – The name of the Slingshot project.
workspace_id (str) – The Databricks workspace ID where the job runs.
description (Optional[str], optional) – A description for the Slingshot project. Defaults to None.
app_id (Optional[str], optional) – The application ID, which must be unique across all active (not deleted) projects belonging to a Slingshot subscriber. This field can be used to search for a project with the get_projects() and iterate_projects() methods. The app_id is immutable once the project is created. Defaults to None.
job_id (Optional[str], optional) – The Databricks job ID that will be associated with this Slingshot project. Defaults to None.
cluster_path (Optional[str], optional) –
The name of the Databricks job cluster to be optimized by this Slingshot project, prefixed with “job_clusters/” for a job cluster that is available to any task in the job; or the task name prefixed with “tasks/” for a task-specific cluster not available to other tasks in the job. For example, “job_clusters/my-cluster” or “tasks/task_1”. This field is required if the job has multiple compute clusters. If the job has only one compute cluster, this field is optional. Defaults to None.
Each Slingshot project is linked to a single compute cluster in Databricks. If the cluster_path is not provided for a job that has multiple compute clusters, the Slingshot project will not be able to retrieve information about the job runs nor generate recommendations for optimizing the compute cluster.
You can find the cluster name in the Databricks UI when viewing the configuration for a job cluster as the “Cluster name” field, or using the Databricks API, where it is called “job_cluster_key”.
The task name is shown in the Databricks UI as the “Task name” field after selecting the task in the job configuration. In the Databricks API, it is called “task_key”.
With the Databricks Python SDK, you can retrieve the cluster_path using the job_cluster_key or task_key from the job or task settings. For example, to get the Job object and extract the job_cluster_key or task_key, you can use the following code:
>>> from databricks.sdk import WorkspaceClient
>>> workspace_client = WorkspaceClient()
>>> job = workspace_client.jobs.get(job_id=1234567890)
If the job cluster is defined for the job and potentially shared across tasks in the job (which is the case for jobs created in the Databricks UI), you can retrieve the job_cluster_key like this:
>>> cluster_name = job.settings.job_clusters[0].job_cluster_key
>>> print(f'cluster_path="job_clusters/{cluster_name}"')
Or, if the job cluster definition is tied to a specific task rather than shared across the entire job, you can first check whether the task is using a shared cluster, and if not, use the task_key as the cluster_path. When jobs are created with the Databricks API or SDK, tasks can be configured to use a new_cluster that is not shared with other tasks, in which case the job_cluster_key will not be set, and you should use the task_key instead:
>>> if (cluster_name := job.settings.tasks[0].job_cluster_key):
...     print(f'cluster_path="job_clusters/{cluster_name}"')
... else:
...     task_name = job.settings.tasks[0].task_key
...     print(f'cluster_path="tasks/{task_name}"')
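The branching above can be condensed into a small helper that builds the cluster_path string from whichever key is set. derive_cluster_path is a hypothetical convenience written for this guide, not an SDK or Databricks function:

```python
def derive_cluster_path(job_cluster_key=None, task_key=None):
    """Build a Slingshot cluster_path from Databricks job settings.

    Prefers a shared job cluster ("job_clusters/<name>") and falls back to a
    task-specific cluster ("tasks/<name>"), mirroring the rules above.
    """
    if job_cluster_key:
        return f"job_clusters/{job_cluster_key}"
    if task_key:
        return f"tasks/{task_key}"
    raise ValueError("Job settings define neither a job_cluster_key nor a task_key")

print(derive_cluster_path(job_cluster_key="my-cluster"))  # job_clusters/my-cluster
print(derive_cluster_path(task_key="task_1"))             # tasks/task_1
```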
settings (AssignSettingsSchema, optional) –
A dictionary that sets Slingshot project options. Defaults to None.
- sla_minutes (Optional[int], optional): The acceptable time (in minutes) for the job to complete.
The SLA (Service Level Agreement) is the maximum time the job should take to complete. Slingshot uses this value as an expected upper bound when optimizing the job for lowest cost. Defaults to None.
- auto_apply_recs (Optional[bool], optional): Automatically apply recommendations.
Defaults to False.
- Returns:
The details of the newly created project.
- Return type:
ProjectSchema
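Assuming AssignSettingsSchema behaves like a plain dictionary with the two option keys documented above (sla_minutes and auto_apply_recs), a settings value for create() might look like this:

```python
# Hypothetical settings payload; the keys follow the options documented above.
settings = {
    "sla_minutes": 60,        # the job should finish within an hour
    "auto_apply_recs": True,  # apply new recommendations automatically
}
```

It would then be passed as the settings argument when creating the project.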
- update(project_id: str, name: str | None = Ellipsis, workspace_id: str | None = Ellipsis, description: str | None = Ellipsis, job_id: str | None = Ellipsis, cluster_path: str | None = Ellipsis, settings: AssignSettingsSchema | None = Ellipsis) ProjectSchema[source]¶
Update the attributes of an existing Slingshot project.
Only the attributes that are provided will be updated. Passing None for an attribute explicitly overwrites that attribute's value with None.
- Parameters:
project_id (str) – The ID of the Slingshot project to update.
name (Optional[str], optional) – The new name for the Slingshot project.
workspace_id (Optional[str], optional) –
The new Databricks workspace ID where the job runs.
Note: If you are changing the Databricks workspace associated with the Slingshot project, you probably also want to reset the project using the reset() method. This will remove all previous job run data from the project, allowing Slingshot to re-optimize the job without the influence of previous runs.
description (Optional[str], optional) – The new description for the Slingshot project.
job_id (Optional[str], optional) –
The new Databricks job ID that will be associated with this Slingshot project.
Note: If you are changing the Databricks job associated with the Slingshot project, you probably also want to reset the project using the reset() method. This will remove all previous job run data from the project, allowing Slingshot to re-optimize the job without the influence of previous runs.
cluster_path (Optional[str], optional) –
The name of the Databricks job cluster to be optimized by this Slingshot project, prefixed with “job_clusters/” for a job cluster that is available to any task in the job; or the task name prefixed with “tasks/” for a task-specific cluster not available to other tasks in the job. For example, “job_clusters/my-cluster” or “tasks/task_1”. This field is required if the job has multiple compute clusters. If the job has only one compute cluster, this field is optional.
Each Slingshot project is linked to a single compute cluster in Databricks. If the cluster_path is not provided for a job that has multiple compute clusters, the Slingshot project will not be able to retrieve information about the job runs nor generate recommendations for optimizing the compute cluster.
You can find the cluster name in the Databricks UI when viewing the configuration for a job cluster as the “Cluster name” field, or using the Databricks API, where it is called “job_cluster_key”.
The task name is shown in the Databricks UI as the “Task name” field after selecting the task in the job configuration. In the Databricks API, it is called “task_key”.
With the Databricks Python SDK, you can retrieve the cluster_path using the job_cluster_key or task_key from the job or task settings. For example, to get the Job object and extract the job_cluster_key or task_key, you can use the following code:
>>> from databricks.sdk import WorkspaceClient
>>> workspace_client = WorkspaceClient()
>>> job = workspace_client.jobs.get(job_id=1234567890)
If the job cluster is defined for the job and potentially shared across tasks in the job (which is the case for jobs created in the Databricks UI), you can retrieve the job_cluster_key like this:
>>> cluster_name = job.settings.job_clusters[0].job_cluster_key
>>> print(f'cluster_path="job_clusters/{cluster_name}"')
Or, if the job cluster definition is tied to a specific task rather than shared across the entire job, you can first check whether the task is using a shared cluster, and if not, use the task_key as the cluster_path. When jobs are created with the Databricks API or SDK, tasks can be configured to use a new_cluster that is not shared with other tasks, in which case the job_cluster_key will not be set, and you should use the task_key instead:
>>> if (cluster_name := job.settings.tasks[0].job_cluster_key):
...     print(f'cluster_path="job_clusters/{cluster_name}"')
... else:
...     task_name = job.settings.tasks[0].task_key
...     print(f'cluster_path="tasks/{task_name}"')
settings (AssignSettingsSchema, optional) –
A dictionary with updates to the options for the Slingshot project. The options are:
- sla_minutes (Optional[int], optional): The acceptable time (in minutes) for the job to complete.
The SLA (Service Level Agreement) is the maximum time the job should take to complete. Slingshot uses this value as an expected upper bound when optimizing the job for lowest cost.
- auto_apply_recs (Optional[bool], optional): Automatically apply recommendations.
- Returns:
The details of the updated project.
- Return type:
ProjectSchema
- delete(project_id: str) None[source]¶
Delete a Slingshot project by its ID.
This method removes the Slingshot project but does not affect the Databricks job that was associated with the project.
- Parameters:
project_id (str) – The ID of the Slingshot project to delete.
- Returns:
None
- reset(project_id: str) None[source]¶
Reset a Slingshot project by its ID, removing all previous job run data from the project.
Use this method to clear all previous job run data and start fresh with the same project. It is useful when a job changes significantly and you want to re-optimize it without the influence of previous runs, since Slingshot uses historical run data to optimize the job.
This does not affect the Databricks job associated with the project; run history will still be accessible from the Databricks platform.
- Parameters:
project_id (str) – The ID of the Slingshot project to reset.
- Returns:
None
- get_projects(include: list[str] | None = None, creator_id: str | None = None, app_id: str | None = None, job_id: str | None = None, page: int = 1, size: int = 50) Page[ProjectSchema][source]¶
Retrieve a paginated list of projects based on filter criteria.
- Parameters:
include (Optional[list[str]]) – Attributes within ProjectSchema to include in the response. If not provided, all available attributes are included. Defaults to None.
creator_id (Optional[str], optional) – The ID of the project creator to filter projects by. Defaults to None.
app_id (Optional[str], optional) – The application ID to filter projects by. This is an identifier that is unique across all projects for a Slingshot subscriber and is set at the time a project is created. Defaults to None.
job_id (Optional[str], optional) – The Databricks job ID to filter projects by. Defaults to None.
page (int, optional) – The page number to retrieve. Defaults to 1.
size (int, optional) – The number of projects to retrieve per page. Defaults to 50.
- Returns:
A list of project details for the requested page.
- Return type:
Page[ProjectSchema]
- iterate_projects(include: list[str] | None = None, creator_id: str | None = None, app_id: str | None = None, job_id: str | None = None, size: int = 50, max_pages: int = 1000) Iterator[ProjectSchema][source]¶
Fetch all projects page by page using a memory-efficient generator.
- Parameters:
include (Optional[list[str]]) – Attributes within ProjectSchema to include in the response. If not provided, all available attributes are included. Defaults to None.
creator_id (Optional[str], optional) – The ID of the project creator to filter projects by. Defaults to None.
app_id (Optional[str], optional) – The application ID to filter projects by. This is an identifier that is unique across all projects for a Slingshot subscriber and is set at the time a project is created. Defaults to None.
job_id (Optional[str], optional) – The Databricks job ID to filter projects by. Defaults to None.
size (int, optional) – The number of projects to retrieve per page. Defaults to 50.
max_pages (int, optional) – The maximum number of pages allowed to traverse. Defaults to 1000.
- Yields:
Iterator[ProjectSchema] – A project object, one at a time.
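The page-by-page behavior described above can be sketched generically. The fetch_page callable below is a stand-in for the underlying API request, not the SDK's real internals; the stopping rules (empty page or short page, capped at max_pages) are assumptions about typical pagination, illustrated here rather than taken from the SDK source:

```python
from typing import Callable, Iterator

def iterate_items(fetch_page: Callable[[int, int], list], size: int = 50,
                  max_pages: int = 1000) -> Iterator[dict]:
    """Yield items one at a time, requesting pages lazily like iterate_projects."""
    for page in range(1, max_pages + 1):
        items = fetch_page(page, size)
        if not items:
            break          # an empty page means the results are exhausted
        yield from items
        if len(items) < size:
            break          # a short page is the last page

# Simulate an API holding 5 items, served in pages of 2:
data = [{"id": i} for i in range(5)]
fake_fetch = lambda page, size: data[(page - 1) * size : page * size]
ids = [item["id"] for item in iterate_items(fake_fetch, size=2)]
# ids == [0, 1, 2, 3, 4]
```

Because the generator requests each page only when its items are consumed, memory use stays bounded by one page regardless of the total number of projects.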
- get_project(project_id: str, include: list[str] | None = None) ProjectSchema[source]¶
Fetch a project by its ID.
- Parameters:
project_id (str) – The ID of the project to fetch.
include (Optional[list[str]]) – Attributes within ProjectSchema to include in the response. If not provided, all available attributes are included. Defaults to None.
- Returns:
The project details.
- Return type:
ProjectSchema
- create_recommendation(project_id: str) RecommendationDetailsSchema[source]¶
Create a new recommendation for a Slingshot project.
Recommendations are suggested changes to Databricks job cluster configurations meant to minimize costs while keeping job run time within required SLAs. They are generated based on the previous job runs associated with the Slingshot project.
A recommendation can be created for a project once Slingshot has received details about a successful job run associated with that project. Slingshot will begin checking for job runs after a project is linked to a Databricks job (or a cluster within that job).
The recommendation will be in a “PENDING” state immediately after creation, meaning it is still being processed. It can be applied if its state is “PENDING”, “UPLOADING”, or “SUCCESS” (but not “FAILURE”).
Note
The returned value, a dictionary with info about the recommendation, lacks the full details of the recommendation because the state is still “PENDING” immediately after the recommendation is created. Use the get_recommendation() method to retrieve the full details, like this:
>>> from slingshot import SlingshotClient
>>> client = SlingshotClient()
>>> project_id = "your_project_id"
>>> # Create a recommendation
>>> recommendation = client.projects.create_recommendation(project_id)
>>> # Get the recommendation details
>>> recommendation_details = client.projects.get_recommendation(
...     project_id=project_id, recommendation_id=recommendation["id"]
... )
- Parameters:
project_id (str) – The ID of the project to create a recommendation for.
- Returns:
A dictionary with details about the recommendation that was created. The recommendation will have a “PENDING” state, meaning it is still being processed. To get the full details of the recommendation, use the get_recommendation() method with the recommendation ID returned in the response.
- Return type:
RecommendationDetailsSchema
- get_recommendation(project_id: str, recommendation_id: str) RecommendationDetailsSchema[source]¶
Fetch a specific recommendation for a Slingshot project.
Recommendations are suggested changes to Databricks job cluster configurations meant to minimize costs while keeping job run time within required SLAs. They are generated based on the previous job runs associated with the Slingshot project.
- Parameters:
project_id (str) – The ID of the Slingshot project.
recommendation_id (str) – The ID of the recommendation to fetch.
- Returns:
A dictionary with details of the recommendation.
- Return type:
RecommendationDetailsSchema
- apply_recommendation(project_id: str, recommendation_id: str) RecommendationDetailsSchema[source]¶
Apply a recommendation to the Slingshot project.
The recommendation is applied to the Databricks job cluster associated with the Slingshot project.
Recommendations are suggested changes to Databricks job cluster configurations meant to minimize costs while keeping job run time within required SLAs. They are generated based on the previous job runs linked to the Slingshot project.
A recommendation can be applied if its state is “SUCCESS”, “PENDING”, or “UPLOADING”. If the recommendation is in a “FAILURE” state, applying it will raise an error.
- Parameters:
project_id (str) – The ID of the Slingshot project.
recommendation_id (str) – The ID of the recommendation to apply.
- Returns:
A dictionary with details of the recommendation that was applied.
- Return type:
RecommendationDetailsSchema
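The state rules above (apply is allowed in “PENDING”, “UPLOADING”, or “SUCCESS”, but not “FAILURE”) can be captured in a small guard. can_apply is a hypothetical helper, and the assumption that the returned dictionary exposes its state under a "state" key is mine, not taken from the schema definition:

```python
APPLIABLE_STATES = {"PENDING", "UPLOADING", "SUCCESS"}

def can_apply(recommendation: dict) -> bool:
    """Return True if the recommendation's state permits apply_recommendation.

    Assumes the recommendation dict carries its state under a "state" key.
    """
    return recommendation.get("state") in APPLIABLE_STATES

print(can_apply({"state": "PENDING"}))  # True
print(can_apply({"state": "FAILURE"}))  # False
```

Checking this guard before calling apply_recommendation avoids the error raised when a “FAILURE”-state recommendation is applied.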
API Schema Types¶
These are the data types returned by the API methods that match the Slingshot API schema.
- class slingshot.types.ProjectSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for a project in Slingshot.
- settings: ProjectSettingsSchema | None¶
- metrics: ProjectMetricsSchema | None¶
- creator: ProjectCreatorSchema | None¶
- class slingshot.types.ProjectSettingsSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for retrieving additional project settings in Slingshot.
- class slingshot.types.AssignSettingsSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for assigning additional project settings in Slingshot.
- class slingshot.types.ProjectMetricsSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for retrieving the project metrics in Slingshot.
- class slingshot.types.ProjectCreatorSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for retrieving the project creator in Slingshot.
- class slingshot.types.RecommendationDetailsSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for retrieving the details of a recommendation for a project in Slingshot.
- recommendation: RecommendationSchema | None¶
- class slingshot.types.RecommendationSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the recommendation of a project in Slingshot.
- metrics: MetricsSchema | None¶
- configuration: ConfigurationSchema | None¶
- class slingshot.types.MetricsSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the recommended metrics in Slingshot.
- class slingshot.types.ConfigurationSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the recommended configuration in Slingshot.
- autoscale: AutoscaleSchema | None¶
- aws_attributes: AwsAttributesSchema | None¶
- azure_attributes: AzureAttributesSchema | None¶
- cluster_log_conf: ClusterLogConfSchema | None¶
- spec: SpecSchema | None¶
- class slingshot.types.AutoscaleSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the autoscale configuration in a recommendation.
- class slingshot.types.AwsAttributesSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the AWS attributes in a recommendation.
- class slingshot.types.AzureAttributesSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the Azure attributes in a recommendation.
- class slingshot.types.ClusterLogConfSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for the cluster log configuration in a recommendation.
- dbfs: DbfsLogConfSchema | None¶
- s3: S3LogConfSchema | None¶
- volumes: VolumesLogConfSchema | None¶
- class slingshot.types.DbfsLogConfSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for DBFS log configuration in a cluster log configuration.
- class slingshot.types.S3LogConfSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for S3 log configuration in a cluster log configuration.
- class slingshot.types.VolumesLogConfSchema(_typename, _fields=None, /, **kwargs)[source]¶
Bases: dict
Schema for volume log configuration in a cluster log configuration.