{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualize Logged Dataframes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "``rubicon_ml`` makes it easy to log dataframes and comes with default visualization support, using\n", "[Dash and Plotly](https://plotly.com/dash/) under the hood. Specifically, ``rubicon_ml`` leverages\n", "[Plotly express](https://plotly.com/python/plotly-express/) to provide many robust visualization\n", "options." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "Let's create a project and log a sample dataframe to it so we can visualize it." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os\n", "\n", "from rubicon_ml import Rubicon\n", "\n", "\n", "rubicon = Rubicon(persistence=\"memory\")\n", "project = rubicon.get_or_create_project(\"Plotting Example\")\n", "\n", "project" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "\n", "df = pd.DataFrame.from_records(\n", " [\n", " [\"Walmart\", 514405],\n", " [\"Exxon Mobil\", 290212],\n", " [\"Apple\", 265595],\n", " [\"Berkshire Hathaway\", 247837],\n", " [\"Amazon.com\", 232887]\n", " ],\n", " columns=[\"Company\", \"Revenue (in millions)\"]\n", ")\n", "\n", "experiment = project.log_experiment()\n", "dataframe = experiment.log_dataframe(df, name=\"sample revenue df\")\n", "\n", "dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing Dataframes\n", "\n", "``rubicon_ml.Dataframe.plot`` exposes plotting functionality to create simple plots like line, bar,\n", "or scatter plots and tables. 
There are even more complex plots available in ``plotly.express`` - any\n", "of them can be passed into ``rubicon_ml`` for visualization!\n", "\n", "> https://plotly.com/python/plotly-express/\n", "\n", "Here we'll visualize the dataframe with the default line chart (``px.line``). Any keyword arguments\n", "will be passed directly to the underlying Plotly express function.\n", "\n", "> **Note**: If you're viewing this notebook from the ``rubicon_ml`` docs, the plots and widget below\n", "are only images since we can't run a Dash server on GitHub Pages. Running locally will produce\n", "fully interactive plots!" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": {}, "metadata": {}, "output_type": "display_data" } ], "source": [ "revenue_line = dataframe.plot(x=\"Company\", y=\"Revenue (in millions)\")\n", "\n", "revenue_line" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![dataframe line](visualizing-logged-dataframes-line.png \"dataframe-line\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This time we'll provide ``px.scatter`` as the ``plotting_func`` argument to generate a\n", "scatter plot." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": {}, "metadata": {}, "output_type": "display_data" } ], "source": [ "import plotly.express as px\n", "\n", "revenue_scatter = dataframe.plot(\n", " plotting_func=px.scatter,\n", " x=\"Company\",\n", " y=\"Revenue (in millions)\",\n", ")\n", "\n", "revenue_scatter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![dataframe scatter](visualizing-logged-dataframes-scatter.png \"dataframe-scatter\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dataframes can also be visualized with the dataframe plot widget. 
For more info on the dataframe plot widget,\n", "check out [this example](https://capitalone.github.io/rubicon-ml/visualizations/dataframe-plot.html).\n", "For more info on ``rubicon_ml``'s visualization widgets in general, check out\n", "[the docs](https://capitalone.github.io/rubicon-ml/visualizations.html)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dash is running on http://127.0.0.1:8050/\n", "\n", " * Serving Flask app 'rubicon_ml.viz.base' (lazy loading)\n", " * Environment: production\n", "\u001b[31m WARNING: This is a development server. Do not use it in a production deployment.\u001b[0m\n", "\u001b[2m Use a production WSGI server instead.\u001b[0m\n", " * Debug mode: off\n" ] } ], "source": [ "from rubicon_ml.viz import DataframePlot\n", "\n", "DataframePlot(\n", " experiments=[experiment],\n", " dataframe_name=\"sample revenue df\",\n", ").show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![dataframe widget](visualizing-logged-dataframes-widget.png \"dataframe-widget\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 4 }