new recipe notebook for sentimentextractor function
Nupur Lal committed Feb 13, 2025
1 parent bcb15e4 commit 3ad6f75
Showing 1 changed file with 298 additions and 0 deletions.
298 changes: 298 additions & 0 deletions Recipes/ClearScape_Functions/SentimentExtractor.ipynb
@@ -0,0 +1,298 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bc549e6c-0cc4-4188-94a3-a9bdd3ae3dfa",
"metadata": {},
"source": [
"<header>\n",
" <p style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>\n",
" SentimentExtractor function in Vantage\n",
" <br>\n",
" <img id=\"teradata-logo\" src=\"https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg\" alt=\"Teradata\" style=\"width: 125px; height: auto; margin-top: 20pt;\">\n",
" </p>\n",
"</header>"
]
},
{
"cell_type": "markdown",
"id": "7ae7611a-0795-4168-b716-01fee6880cbd",
"metadata": {},
"source": [
"<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Introduction</b></p>\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Sentiment extraction is the process of analyzing large volumes of text to determine whether it expresses a positive, negative, or neutral sentiment.<br> ClearScape Analytics SentimentExtractor uses a dictionary model to extract the sentiment (positive, negative, or neutral) of each input document or sentence. The dictionary model consists of WordNet (a lexical database of the English language). The function handles negated sentiments as follows:<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>\n",
" <li>-1 if the sentiment is negated. For example, I am not happy.</li>\n",
" <li>-1 if one word separates the sentiment and a negation word. For example, I am not very happy.</li>\n",
" <li>+1 if two or more words separate the sentiment and a negation word. For example, I am not saying I am happy.</li></ul>\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this notebook we will see how we can use the SentimentExtractor function available in Vantage.</p>"
]
},
{
"cell_type": "markdown",
"id": "6b3a00b4-6661-4c91-9b2d-cb7b0b403140",
"metadata": {},
"source": [
"<hr style=\"height:2px;border:none;background-color:#00233C;\">\n",
"<b style = 'font-size:20px;font-family:Arial;color:#00233C'>1. Initiate a connection to Vantage</b>"
]
},
{
"cell_type": "markdown",
"id": "2346857f-e0d3-488a-8a3f-ac6dff752c2b",
"metadata": {},
"source": [
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this section, we import the required libraries and set environment variables and environment paths (if required).</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5af5af3-29d5-4f6a-8334-9df6924e7787",
"metadata": {},
"outputs": [],
"source": [
"from teradataml import *\n",
"\n",
"# Modify the following to match the specific client environment settings\n",
"display.max_rows = 5"
]
},
{
"cell_type": "markdown",
"id": "ad3dd7b4-831c-4fb3-ab71-719c8c99a71c",
"metadata": {},
"source": [
"<hr style=\"height:1px;border:none;background-color:#00233C;\">\n",
"<p style = 'font-size:18px;font-family:Arial;color:#00233c'><b>1.1 Connect to Vantage</b></p>\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2742444c-4349-4b0f-b4e5-b068a8785cd9",
"metadata": {},
"outputs": [],
"source": [
"%run -i ../../UseCases/startup.ipynb\n",
"eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n",
"print(eng)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e14915b0-7932-4e03-94ba-20f0599c3707",
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"execute_sql('''SET query_band='DEMO=PP_SentimentExtractor_Python.ipynb;' UPDATE FOR SESSION; ''')"
]
},
{
"cell_type": "markdown",
"id": "efe2fd2d-63ff-4278-9157-8b9110d682e8",
"metadata": {},
"source": [
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Begin running steps with Shift + Enter keys. </p>"
]
},
{
"cell_type": "markdown",
"id": "f003f332-7489-4bdd-a740-4af2a0a22280",
"metadata": {},
"source": [
"<hr style='height:1px;border:none;background-color:#00233C;'>\n",
"\n",
"<p style = 'font-size:18px;font-family:Arial;color:#00233c'><b>1.2 Getting Data for This Demo</b></p>\n",
"\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We have provided data for this demo on cloud storage. You can either run the demo using foreign tables, which access the data without using any storage in your environment, or download the data to local storage, which may yield faster execution but consumes available storage. The following cell contains two statements, one of which is commented out. To switch modes, move the comment marker to the other statement.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45c86176-734c-4b1c-ace0-d0c88657b4f8",
"metadata": {},
"outputs": [],
"source": [
"%run -i ../../UseCases/run_procedure.py \"call get_data('DEMO_Retail_local');\" # Takes 30 seconds\n",
"#%run -i ../../UseCases/run_procedure.py \"call get_data('DEMO_Retail_cloud');\" "
]
},
{
"cell_type": "markdown",
"id": "2401d6d3-4fcd-46fc-8a94-7cafcd1258b0",
"metadata": {},
"source": [
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Next is an optional step – if you want to see the status of databases/tables created and space used.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87429200-db02-450d-9472-4d1e2030124d",
"metadata": {},
"outputs": [],
"source": [
"%run -i ../../UseCases/run_procedure.py \"call space_report();\" # Takes 10 seconds"
]
},
{
"cell_type": "markdown",
"id": "2a3762ac-ba27-4fa3-adba-d577262a4290",
"metadata": {},
"source": [
"<hr style=\"height:2px;border:none;background-color:#00233C;\">\n",
"<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Data Exploration</b>\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Create a \"Virtual DataFrame\" that points to the data set in Vantage. Check the shape of the DataFrame as well as the data types of its columns.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3d936fab-7ca7-4e94-ba64-95c1da08b74f",
"metadata": {},
"outputs": [],
"source": [
"tdf = DataFrame(in_schema(\"DEMO_Retail\", \"Web_Comment\"))\n",
"print(\"Shape of the data: \", tdf.shape)\n",
"tdf"
]
},
{
"cell_type": "markdown",
"id": "620cc5be-cb9c-4516-9a86-50c25935ae75",
"metadata": {},
"source": [
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Let us check the sentiment of the comments made by a particular customer. Detailed help is available by passing the function name to the built-in help function.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d0a3a5a-b237-4aaa-8fa8-ad1176a16ceb",
"metadata": {},
"outputs": [],
"source": [
"help(SentimentExtractor)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "99b0f458-6c1e-4340-b897-6a9812581d08",
"metadata": {},
"outputs": [],
"source": [
"tdf_cust = tdf[tdf.customer_id == 872]\n",
"tdf_cust.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "665e3ca0-1323-4fff-a3d4-d617b25afeb3",
"metadata": {},
"outputs": [],
"source": [
"sentimentextractor_out = SentimentExtractor(text_column=\"comment_text\",\n",
" data=tdf_cust,\n",
" accumulate=['comment_id','customer_id', 'comment_text']\n",
" )\n",
"\n",
"senti = sentimentextractor_out.result\n",
"senti"
]
},
{
"cell_type": "markdown",
"id": "151d5db4-29a9-49d9-8a61-d53f9627a294",
"metadata": {},
"source": [
"<hr style=\"height:2px;border:none;background-color:#00233C;\">\n",
"<b style = 'font-size:20px;font-family:Arial;color:#00233C'>3. Cleanup</b>"
]
},
{
"cell_type": "markdown",
"id": "a562f058-fb24-4966-a25d-f2960e6ddfb8",
"metadata": {},
"source": [
"<hr style=\"height:1px;border:none;background-color:#00233C;\">\n",
"<p style = 'font-size:18px;font-family:Arial;color:#00233C'> <b>Databases and Tables </b></p>\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following code will clean up tables and databases created above.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6b3935b-47c2-4a96-bec2-68106d172116",
"metadata": {},
"outputs": [],
"source": [
"%run -i ../../UseCases/run_procedure.py \"call remove_data('DEMO_Retail');\" # Takes 10 seconds"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "157fe3d4-4e0e-4d92-b343-9f758f3bf690",
"metadata": {},
"outputs": [],
"source": [
"remove_context()"
]
},
{
"cell_type": "markdown",
"id": "4317a6cf-1479-4aa8-b30a-ee0a3b5231a8",
"metadata": {},
"source": [
"<hr style=\"height:1px;border:none;background-color:#00233C;\">\n",
"<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>Links:</b></p>\n",
"<ul style = 'font-size:16px;font-family:Arial'>\n",
" <li>Teradataml Python reference: <a href = 'https://docs.teradata.com/search/all?query=Python+Package+User+Guide&content-lang=en-US'>here</a></li>\n",
" <li>SentimentExtractor function reference: <a href = 'https://docs.teradata.com/search/all?query=SentimentExtractor&content-lang=en-US'>here</a></li>\n",
"</ul>"
]
},
{
"cell_type": "markdown",
"id": "b2dcca28-5de5-44d7-88cb-45a12153b3f8",
"metadata": {},
"source": [
"<footer style=\"padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C\">\n",
" <div style=\"float:left;margin-top:14px\">ClearScape Analytics™</div>\n",
" <div style=\"float:right;\">\n",
" <div style=\"float:left; margin-top:14px\">\n",
" Copyright © Teradata Corporation - 2025. All Rights Reserved\n",
" </div>\n",
" </div>\n",
"</footer>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
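
The negation handling described in the notebook's Introduction cell can be illustrated with a small plain-Python sketch. This is not how Vantage implements SentimentExtractor; it is a toy model under stated assumptions. The `NEGATIONS` set, the `SENTIMENT` dictionary, and the `score_sentence` helper are all hypothetical names introduced here, and the tokenizer is deliberately naive.

```python
# Toy illustration of the negation rule from the Introduction cell.
# NEGATIONS, SENTIMENT and score_sentence are hypothetical; the real
# function uses a WordNet-based dictionary inside Vantage.
NEGATIONS = {"not", "no", "never"}
SENTIMENT = {"happy": 1, "sad": -1}  # assumed word -> polarity mapping


def score_sentence(sentence):
    """Sum word polarities, flipping a word's polarity when a negation
    word appears directly before it or one word before it."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    total = 0
    for i, tok in enumerate(tokens):
        if tok not in SENTIMENT:
            continue
        polarity = SENTIMENT[tok]
        # The two tokens preceding the sentiment word: a negation here is
        # separated from it by zero or one word, so the sentiment flips.
        window = tokens[max(0, i - 2):i]
        if any(w in NEGATIONS for w in window):
            polarity = -polarity
        total += polarity
    return total


for text in ("I am not happy.",
             "I am not very happy.",
             "I am not saying I am happy."):
    print(text, "->", score_sentence(text))
```

The three sentences are the examples given in the Introduction cell. With two or more words between the negation and the sentiment word, the negation falls outside the two-token window and the polarity is left unchanged.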
