Log in to IBM Watson Studio. You first need to activate Watson services on your account to access the download links below. After downloading the Airbnb dataset, upload it to a COS bucket. The dataset consists of review information (review text, reviewer info, coordinates) from Airbnb listings in the following cities:
- Amsterdam
- Antwerp Belgium
- Athens Europe
- Austin
- Barcelona
- Berlin
- Boston
- Brussels
- Chicago
- Dublin
- London
- Los Angeles
- Madrid
- Palma Mallorca Spain
- Melbourne
- Montreal
- Nashville
- New Orleans
- New York City
- Oakland
- Paris
- Portland
- San Diego
- City of San Francisco
- Santa Cruz
- Seattle
- Sydney
- Toronto
- Trento
- Vancouver
- Venice Italy
- Vienna Austria
- Washington D.C.
First, you need to install and import the required packages.

```python
import io
import base64
import time
import shutil
import csv
import lithops
import regex
import re
import matplotlib.pyplot as plt
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
```
Then, set the BUCKET variable to the name of the bucket to which you uploaded the dataset.

```python
BUCKET = ['<YOUR_BUCKET_NAME>']
```
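Lithops addresses COS objects with `cos://<bucket>/<key>` URIs. As an illustration, a hypothetical helper (the bucket and key names below are placeholders, not the dataset's actual layout) could build an input list for a map job:

```python
# Hypothetical helper: build cos:// URIs pointing at dataset objects so they
# can be handed to a Lithops map job. Bucket and key names are illustrative.
def dataset_uris(bucket, keys):
    return ['cos://{}/{}'.format(bucket, key) for key in keys]

uris = dataset_uris('my-airbnb-bucket', ['amsterdam.csv', 'berlin.csv'])
print(uris)  # → ['cos://my-airbnb-bucket/amsterdam.csv', 'cos://my-airbnb-bucket/berlin.csv']
```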
There are two major functions in this example:

- `analyze_comments`: the map function in the MapReduce paradigm. It parses the dataset, scores each review with NLTK's polarity scores, and groups the reviews into positive, negative, and neutral.
- `create_map`: the reduce function in this scenario. It aggregates the intermediate data grouped by sentiment and draws a map displaying the results in corresponding colors.
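The grouping step in `analyze_comments` can be sketched with VADER's conventional compound-score thresholds (>= 0.05 positive, <= -0.05 negative, otherwise neutral); the helpers below are a simplified stand-in, not the notebook's exact code:

```python
# Simplified sketch of the map-side grouping: classify each pre-scored review
# by VADER's conventional compound-score thresholds and bucket its coordinates.
def classify(compound):
    if compound >= 0.05:
        return 'positive'
    if compound <= -0.05:
        return 'negative'
    return 'neutral'

def group_by_sentiment(scored_reviews):
    """scored_reviews: iterable of (compound_score, (lat, lon)) pairs."""
    groups = {'positive': [], 'negative': [], 'neutral': []}
    for score, coords in scored_reviews:
        groups[classify(score)].append(coords)
    return groups

sample = [(0.8, (52.37, 4.90)), (-0.6, (52.36, 4.88)), (0.0, (52.35, 4.91))]
counts = {k: len(v) for k, v in group_by_sentiment(sample).items()}
print(counts)  # → {'positive': 1, 'negative': 1, 'neutral': 1}
```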
This is the original code of the experiments presented in *Serverless Data Analytics in the IBM Cloud*. You need to configure Lithops with your own IBM account keys. You can also find more configuration options here.
```python
config = {'lithops': {'backend': 'ibm_cf', 'storage': 'ibm_cos'},
          'ibm': {'iam_api_key': '<IAM_API_KEY>'},  # If your namespace is IAM-based (to reach the Cloud Functions API without a CF API key)
          'ibm_cf': {'endpoint': '<CLOUD_FUNCTIONS_ENDPOINT>',
                     'namespace': '<NAME_OF_YOUR_NAMESPACE>',
                     'namespace_id': '<GUID_OF_YOUR_NAMESPACE>'  # If your namespace is IAM-based
                     # 'api_key': '<YOUR_API_KEY>'  # If your namespace is Cloud Foundry-based
                     },
          'ibm_cos': {'storage_bucket': '<YOUR_COS_BUCKET_NAME>',
                      'region': '<BUCKET_REGION>',
                      'api_key': '<YOUR_API_KEY>'}}
```
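Since every credential above is a placeholder, a quick stdlib-only sanity check (a hypothetical helper, not part of the notebook) can confirm nothing was left unfilled before the config is passed to `lithops.FunctionExecutor(config=config)`:

```python
# Hypothetical helper: walk the config dict and report any string values that
# still look like '<PLACEHOLDER>' template markers.
def unfilled_placeholders(node, path=''):
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found += unfilled_placeholders(value, path + '.' + key if path else key)
    elif isinstance(node, str) and node.startswith('<') and node.endswith('>'):
        found.append(path)
    return found

cfg = {'lithops': {'backend': 'ibm_cf'}, 'ibm': {'iam_api_key': '<IAM_API_KEY>'}}
print(unfilled_placeholders(cfg))  # → ['ibm.iam_api_key']
```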
We also provide a small execution demo in AWS. The input data is publicly available in the lithops-applications-data bucket; you should upload it to your own bucket with import_dataset_aws.sh (note that this incurs "Requester Pays" billing).
To run the AWS example, simply run the map_sentiment_analysis.ipynb notebook, following the AWS-specific steps.