Adding Databricks tool demo notebooks for qualification and profiling (#249)

* Adding Databricks tool demo notebooks for basic qualification and profiling
* Adding README updates for the databricks tool notebooks
* Fixing typo in OUTPUT_DIR in qual notebook

Signed-off-by: Matt Ahrens <[email protected]>
1 parent 41f25c7 · commit 5f77070
Showing 4 changed files with 15 additions and 0 deletions.
@@ -0,0 +1,12 @@
# Databricks Tools Demo Notebooks

The RAPIDS Accelerator for Apache Spark includes two key tools for understanding the benefits of
GPU acceleration and for analyzing GPU Spark jobs. For customers on Databricks, the demo
notebooks offer a simple interface for running the tools against a set of Spark event logs from
CPU (qualification) or GPU (profiling) application runs.

To use a demo notebook, import it in the Databricks notebook UI via *File > Import Notebook*.

Once the demo notebook is imported, attach it to an available compute cluster. Once the notebook
is attached, enter the log path location in the text widget at the top of the notebook. After
that, select *Run all* to execute the tools for the specific logs in the log path.
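Both notebooks follow the same pattern internally: read the log path from a widget, build a `java` command line for the tools JAR, run it with `subprocess`, and load the resulting CSVs with pandas. A minimal sketch of the run-and-check step (the `run_tool` helper name is ours, not part of the notebooks, and the notebooks call `dbutils.notebook.exit` on failure rather than raising):

```python
import shlex
import subprocess


def run_tool(command_string: str) -> str:
    """Run a command line and return its stdout, raising on failure.

    Mirrors the pattern in the demo notebooks: split the command with
    shlex, capture output as text, and check the return code.
    """
    args = shlex.split(command_string)
    result = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if result.returncode != 0:
        raise RuntimeError("tool failed with stderr: " + result.stderr)
    return result.stdout
```

In the notebooks the command string is the tool invocation, e.g. `java -Xmx10g -cp /tmp/rapids-4-spark-tools.jar:/databricks/jars/* com.nvidia.spark.rapids.tool.qualification.QualificationMain -o <output dir> <event logs>`.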
1 change: 1 addition & 0 deletions
...s/databricks/[RAPIDS Accelerator for Apache Spark] Profiling Tool Notebook Template.ipynb
@@ -0,0 +1 @@
{"cells":[{"cell_type":"markdown","source":["# Welcome to the Profiling Tool for the RAPIDS Accelerator for Apache Spark\nTo run the tool, you need to enter a log path that represents the DBFS location for your Spark GPU event logs. Then you can select \"Run all\" to execute the notebook. After the notebook completes, you will see various output tables show up below.\n\n## GPU Job Tuning Recommendations\nThis has general suggestions for tuning your applications to run optimally on GPUs.\n\n## Per-Job Profile\nThe profiler output includes information about the application, data sources, executors, SQL stages, Spark properties, and key application metrics at the job and stage levels."],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"5156a76c-7af7-465d-aff4-41a2e54e3595","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["import json\nimport requests\nimport base64\nimport shlex\nimport subprocess\nimport pandas as pd\n\nTOOL_JAR_URL = 'https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.10.0/rapids-4-spark-tools_2.12-22.10.0.jar'\nTOOL_JAR_LOCAL_PATH = '/tmp/rapids-4-spark-tools.jar'\n\n# Profiling tool output directory.\nOUTPUT_DIR = '/tmp' \n\nresponse = requests.get(TOOL_JAR_URL)\nopen(TOOL_JAR_LOCAL_PATH, \"wb\").write(response.content)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"53b4d770-9db6-4bd7-9b93-d036d375eac5","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"code","source":["dbutils.widgets.text(\"log_path\", \"\")"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"f0e4371a-d2d9-4449-81ed-8f6c61ae8f80","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"code","source":["eventlog_string=dbutils.widgets.get(\"log_path\") \n\nq_command_string=\"java -Xmx10g -cp /tmp/rapids-4-spark-tools.jar:/databricks/jars/* com.nvidia.spark.rapids.tool.profiling.ProfileMain --csv --auto-tuner -o {} \".format(OUTPUT_DIR) + eventlog_string\nargs = shlex.split(q_command_string)\ncmd_out = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)\n\nif cmd_out.returncode != 0:\n dbutils.notebook.exit(\"Profiling Tool failed with stderr:\" + cmd_out.stderr)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"e9e7cecf-c2dc-4a0f-aea1-61a323e4ccc4","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"code","source":["import os\n\napp_df = pd.DataFrame(columns = ['appId', 'appName'])\n\nfor x in os.scandir(OUTPUT_DIR + \"/rapids_4_spark_profile/\"):\n tmp_df = pd.read_csv(x.path + \"/application_information.csv\")\n app_df = app_df.append(tmp_df[['appId', 'appName']])"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"be0a2da7-1ee3-475e-96f9-303779edfd85","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"markdown","source":["## GPU Job Tuning Recommendations"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"a1e326ec-5701-4b08-ae0f-7df0c8440038","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["app_list = app_df[\"appId\"].tolist()\napp_recommendations = pd.DataFrame(columns=['app', 'recommendations'])\n\nfor app in app_list:\n app_file = open(OUTPUT_DIR + \"/rapids_4_spark_profile/\" + app + \"/profile.log\")\n recommendations_start = 0\n recommendations_str = \"\"\n for line in app_file:\n if recommendations_start == 1:\n recommendations_str = recommendations_str + line\n if \"### D. Recommended Configuration ###\" in line:\n recommendations_start = 1\n app_recommendations = app_recommendations.append({'app': app, 'recommendations': recommendations_str}, ignore_index=True)\n \ndisplay(app_recommendations)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"4979f78c-44a0-4e54-b803-e5e194b71104","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"markdown","source":["## Per-App Profile"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"1d4f9927-e9d8-4897-b604-f7832dc634aa","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["for x in os.scandir(OUTPUT_DIR + \"/rapids_4_spark_profile/\"):\n print(\"APPLICATION ID = \" + str(x))\n log = open(x.path + \"/profile.log\")\n print(log.read())"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"9a8f1a58-e86f-4bd0-a245-878186feb8b9","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0}],"metadata":{"application/vnd.databricks.v1+notebook":{"notebookName":"[RAPIDS Accelerator for Apache Spark] Profiling Tool Notebook Template","dashboards":[{"elements":[{"elementNUID":"be0a2da7-1ee3-475e-96f9-303779edfd85","dashboardResultIndex":0,"guid":"05eef9d3-7c55-4e26-8d1f-fa80338359e6","resultIndex":null,"options":null,"position":{"x":0,"y":0,"height":6,"width":24,"z":null},"elementType":"command"}],"guid":"a9ea7799-040a-484e-a59d-c3cdf5072953","layoutOption":{"stack":true,"grid":true},"version":"DashboardViewV1","nuid":"91c1bfb2-695a-4e5c-8a25-848a433108dc","origId":2690941040041430,"title":"Executive View","width":1600,"globalVars":{}},{"elements":[],"guid":"0896a45f-af1b-4849-b6c2-2b6abcb8b97b","layoutOption":{"stack":true,"grid":true},"version":"DashboardViewV1","nuid":"62243296-4562-4f06-90ac-d7a609f19c16","origId":2690941040041431,"title":"App View","width":1920,"globalVars":{}}],"notebookMetadata":{"pythonIndentUnit":2,"widgetLayout":[{"name":"log_path","width":576,"breakBefore":false},{"name":"Apps","width":494,"breakBefore":false}]},"language":"python","widgets":{"log_path":{"nuid":"c7ce3870-db19-4813-b1cb-cead3f4c36f1","currentValue":"/dbfs/","widgetInfo":{"widgetType":"text","name":"log_path","defaultValue":"","label":null,"options":{"widgetType":"text","validationRegex":null}}}},"notebookOrigID":2690941040041407}},"nbformat":4,"nbformat_minor":0}
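A note on the per-app aggregation cell in the notebook above: it builds `app_df` with `DataFrame.append`, which was deprecated in pandas 1.4 and removed in 2.0. On newer pandas, an equivalent sketch (the `collect_app_info` name is ours; the directory layout is assumed to match the tool's `rapids_4_spark_profile/<appId>/application_information.csv` output):

```python
import os

import pandas as pd


def collect_app_info(profile_dir: str) -> pd.DataFrame:
    """Gather (appId, appName) rows from each per-application folder.

    Same result as the notebook's append loop, but using pd.concat,
    which replaces the removed DataFrame.append.
    """
    frames = []
    for entry in os.scandir(profile_dir):
        csv_path = os.path.join(entry.path, "application_information.csv")
        if os.path.isfile(csv_path):
            frames.append(pd.read_csv(csv_path)[["appId", "appName"]])
    if not frames:
        return pd.DataFrame(columns=["appId", "appName"])
    return pd.concat(frames, ignore_index=True)
```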
1 change: 1 addition & 0 deletions
...tabricks/[RAPIDS Accelerator for Apache Spark] Qualification Tool Notebook Template.ipynb
@@ -0,0 +1 @@
{"cells":[{"cell_type":"markdown","source":["# Welcome to the Qualification Tool for the RAPIDS Accelerator for Apache Spark\nTo run the tool, you need to enter a log path that represents the DBFS location for your Spark CPU event logs. Then you can select \"Run all\" to execute the notebook. After the notebook completes, you will see various output tables show up below.\n\n## Summary Output\nThe report represents the entire app execution, including unsupported operators and non-SQL operations. By default, the applications and queries are sorted in descending order by the following fields:\n- Recommendation;\n- Estimated GPU Speed-up;\n- Estimated GPU Time Saved; and\n- End Time.\n\n## Stages Output\nFor each stage used in SQL operations, the Qualification tool generates the following information:\n1. App ID\n1. Stage ID\n1. Average Speedup Factor: the average estimated speed-up of all the operators in the given stage.\n1. Stage Task Duration: amount of time spent in tasks of SQL Dataframe operations for the given stage.\n1. Unsupported Task Duration: sum of task durations for the unsupported operators. For more details, see Supported Operators.\n1. Stage Estimated: True or False indicates if we had to estimate the stage duration.\n\n## Execs Output\nThe Qualification tool generates a report of the “Exec” in the “SparkPlan” or “Executor Nodes” along with the estimated acceleration on the GPU. Please refer to the Supported Operators guide for more details on limitations on UDFs and unsupported operators.\n1. App ID\n1. SQL ID\n1. Exec Name: example Filter, HashAggregate\n1. Expression Name\n1. Task Speedup Factor: it is simply the average acceleration of the operators based on the original CPU duration of the operator divided by the GPU duration. The tool uses historical queries and benchmarks to estimate a speed-up at an individual operator level to calculate how much a specific operator would accelerate on GPU.\n1. Exec Duration: wall-Clock time measured since the operator starts till it is completed.\n1. SQL Node Id\n1. Exec Is Supported: whether the Exec is supported by RAPIDS or not. Please refer to the Supported Operators section.\n1. Exec Stages: an array of stage IDs\n1. Exec Children\n1. Exec Children Node Ids\n1. Exec Should Remove: whether the Op is removed from the migrated plan."],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"df33c614-2ecc-47a0-8600-bc891681997f","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["import json\nimport requests\nimport base64\nimport shlex\nimport subprocess\nimport pandas as pd\n\nTOOL_JAR_URL = 'https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.10.0/rapids-4-spark-tools_2.12-22.10.0.jar'\nTOOL_JAR_LOCAL_PATH = '/tmp/rapids-4-spark-tools.jar'\n\n# Qualification tool output directory.\nOUTPUT_DIR = '/tmp/'\n\nresponse = requests.get(TOOL_JAR_URL)\nopen(TOOL_JAR_LOCAL_PATH, \"wb\").write(response.content)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"53b4d770-9db6-4bd7-9b93-d036d375eac5","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"code","source":["dbutils.widgets.text(\"log_path\", \"\")\neventlog_string=dbutils.widgets.get(\"log_path\")\n\nq_command_string=\"java -Xmx10g -cp /tmp/rapids-4-spark-tools.jar:/databricks/jars/* com.nvidia.spark.rapids.tool.qualification.QualificationMain -o {} \".format(OUTPUT_DIR) + eventlog_string\nargs = shlex.split(q_command_string)\ncmd_out = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)\n\n\nif cmd_out.returncode != 0:\n dbutils.notebook.exit(\"Qualification Tool failed with stderr:\" + cmd_out.stderr)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"e9e7cecf-c2dc-4a0f-aea1-61a323e4ccc4","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"markdown","source":["## Summary Output"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"bbe50fde-0bd6-4281-95fd-6a1ec6f17ab2","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["summary_output=pd.read_csv(OUTPUT_DIR + \"rapids_4_spark_qualification_output/rapids_4_spark_qualification_output.csv\")\ndisplay(summary_output)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"fb8edb26-e173-47ff-92a1-463baec7c06b","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"markdown","source":["## Stages Output"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"6756159b-30ca-407a-ab6b-9c29ced01ea6","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["stages_output=pd.read_csv(OUTPUT_DIR + \"rapids_4_spark_qualification_output/rapids_4_spark_qualification_output_stages.csv\")\ndisplay(stages_output)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"cdde6177-db5f-434a-995b-776678a64a3a","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0},{"cell_type":"markdown","source":["## Execs Output"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"4d7ce219-ae75-4a0c-a78c-4e7f25b8cd6f","inputWidgets":{},"title":""}}},{"cell_type":"code","source":["execs_output=pd.read_csv(OUTPUT_DIR + \"rapids_4_spark_qualification_output/rapids_4_spark_qualification_output_execs.csv\")\ndisplay(execs_output)"],"metadata":{"application/vnd.databricks.v1+cell":{"showTitle":false,"cellMetadata":{},"nuid":"998b0c51-0cb6-408e-a01a-d1f5b1a61e1f","inputWidgets":{},"title":""}},"outputs":[],"execution_count":0}],"metadata":{"application/vnd.databricks.v1+notebook":{"notebookName":"[RAPIDS Accelerator for Apache Spark] Qualification Tool Notebook Template","dashboards":[{"elements":[],"guid":"0ed3c80b-b2f6-4c89-9a92-1af2f168d5ea","layoutOption":{"stack":true,"grid":true},"version":"DashboardViewV1","nuid":"91c1bfb2-695a-4e5c-8a25-848a433108dc","origId":2721260844584915,"title":"Executive View","width":1600,"globalVars":{}},{"elements":[],"guid":"ab4cecf9-0471-4fee-aa33-8927bb7e1bb1","layoutOption":{"stack":true,"grid":true},"version":"DashboardViewV1","nuid":"62243296-4562-4f06-90ac-d7a609f19c16","origId":2721260844584916,"title":"App View","width":1920,"globalVars":{}}],"notebookMetadata":{"pythonIndentUnit":2,"widgetLayout":[{"name":"log_path","width":1152,"breakBefore":false}]},"language":"python","widgets":{"log_path":{"nuid":"88986aa6-6e67-4d09-aeeb-7c96ea1ea8f1","currentValue":"/dbfs/","widgetInfo":{"widgetType":"text","name":"log_path","defaultValue":"","label":null,"options":{"widgetType":"text","validationRegex":null}}}},"notebookOrigID":2721260844584890}},"nbformat":4,"nbformat_minor":0}
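The qualification notebook's Summary Output section states that applications are sorted in descending order by recommendation, estimated speed-up, and related fields. A toy sketch of that ordering with pandas (the column names here are illustrative stand-ins, not the exact headers of the real qualification CSV):

```python
import pandas as pd

# Toy stand-in for the qualification summary; the real column names in
# rapids_4_spark_qualification_output.csv may differ.
summary = pd.DataFrame(
    {
        "App ID": ["app-1", "app-2", "app-3"],
        "Estimated GPU Speedup": [1.2, 3.4, 2.1],
    }
)

# Default report ordering: highest estimated speed-up first.
ranked = summary.sort_values("Estimated GPU Speedup", ascending=False)
```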