This repository contains the data and source code for the paper "Product Review Summarization by Exploiting Phrase Properties".
In the experiment, we use the following list of aspect keywords, grouped into 17 aspects (English glosses in parentheses):
- a1 (appearance): 外观 外形 设计 外型 外壳 外表
- a2 (build quality): 质量 材质 手感 质感 作工 做工
- a3 (screen): 屏幕 触摸屏 显示屏 分辨率 led 触摸板 液晶屏 电阻屏 显示 触屏
- a4 (price): 性价比 价位 价钱 价格 售价
- a5 (system/performance): 系统 稳定性 性能 速度 操作系统 兼容性
- a6 (software): 软件 导航 wifi
- a7 (operability): 操控 操控性 操作性 操作 触控
- a8 (battery): 电池 待机 电量 续航 耗电
- a9 (keyboard/buttons): 键盘 按键 功能键 按钮
- a10 (signal/network): 信号 网络 蓝牙 通话 天线 通信 通讯
- a11 (messaging): 短信 彩信
- a12 (interface): 界面 画面 画质 ui
- a13 (input method): 输入法 手写 输入
- a14 (form factor): 机型 机身 款式 样式
- a15 (camera): 照相 摄像 照像 相机 拍照 镜头 像素 闪光灯 摄像头 照相机 录音
- a16 (audio): 音效 音色 音质 话筒 听筒 扬声器 喇叭 话音 音响 语音 立体声
- a17 (storage/memory): 存储 内存 内存卡 存储卡 储存卡 扩展卡
The aspect keyword list can also be retrieved from `summarizer.model.Aspect`.
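For convenience, the keyword lists can be loaded programmatically. The sketch below assumes a hypothetical accessor `Aspect.getKeywords(int aspectId)`; the actual `summarizer.model.Aspect` class may expose the lists differently (e.g., as public static fields), so check the source.

```java
import java.util.List;
import summarizer.model.Aspect;

public class AspectDemo {
    public static void main(String[] args) {
        // Hypothetical accessor; the real Aspect class may expose the
        // keyword lists under a different name or as static fields.
        for (int aspectId = 1; aspectId <= 17; aspectId++) {
            List<String> keywords = Aspect.getKeywords(aspectId);
            System.out.println("a" + aspectId + ": " + String.join(" ", keywords));
        }
    }
}
```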
The original review data is available at `data/all_reviews/`. Each file corresponds to a cell phone.
The phrases with sentiment polarities are available at `data/phrases_new/`. Each file corresponds to a cell phone.
The summaries generated by our system and by the 3 baselines are available at `data/summary/`. `xxxx_reviewSum_summary.txt` is generated by our system; `xxxx_lexrank_summary.txt`, `xxxx_opinosis_summary.txt`, and `xxxx_basicSum_summary.txt` are generated by the 3 baseline systems described in our paper.
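For example, the four summaries of a single product can be collected by matching the shared file-name prefix. The snippet below assumes the `xxxx` prefix is a per-product identifier; adjust the value to the actual naming in `data/summary/`.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class ListSummaries {
    public static void main(String[] args) throws IOException {
        String productPrefix = "0001"; // hypothetical product identifier ("xxxx")
        // List the four summary files (ours + 3 baselines) for one product.
        try (Stream<Path> files = Files.list(Paths.get("data/summary"))) {
            files.map(p -> p.getFileName().toString())
                 .filter(name -> name.startsWith(productPrefix + "_"))
                 .sorted()
                 .forEach(System.out::println);
        }
    }
}
```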
Task 1 is a pairwise user-preference evaluation and Task 2 is a user-scoring evaluation. In Task 1, we run all 6 pairwise comparisons (C(4,2) = 6) among the 4 summaries generated by our system and the 3 baselines. In Task 2, we ask annotators to evaluate 4 aspects of each summary.
We asked 20 annotators to perform the evaluation: 10 were assigned to Task 1 and 10 to Task 2. All annotators are native Chinese speakers with experience writing product reviews. We constructed the evaluation dataset from customer reviews of 10 cell phones. Each annotator annotated at least 5 products, and for each product in each task at least 5 annotations were performed.
The annotation data is available at `data/evaluation_data/`. The `task1` subfolder contains the annotation data for Task 1, and the `task2` subfolder contains the annotation data for Task 2. Since the name of the summarization algorithm is hidden from annotators, each task item is assigned a UUID; `evaluation.log` stores the mapping between task items and UUIDs.
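The exact format of `evaluation.log` is not documented here. A minimal parsing sketch, assuming one tab-separated UUID/item pair per line and that the log sits under `data/evaluation_data/` (both assumptions; the real format and location may differ):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class EvaluationLog {
    // Assumes each line is "UUID<TAB>taskItem"; the real log format may differ.
    public static Map<String, String> loadUuidMap(Path logFile) throws IOException {
        Map<String, String> uuidToItem = new HashMap<>();
        for (String line : Files.readAllLines(logFile)) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                uuidToItem.put(parts[0].trim(), parts[1].trim());
            }
        }
        return uuidToItem;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the log file.
        Map<String, String> map = loadUuidMap(Paths.get("data/evaluation_data/evaluation.log"));
        map.forEach((uuid, item) -> System.out.println(uuid + " -> " + item));
    }
}
```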
`summarizer.summarizer.ReviewSummarizer`
- `public String getSummary()`
  Returns the summary generated by our system.
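A minimal usage sketch, assuming `ReviewSummarizer` can be constructed from a product ID (the actual constructor signature may differ):

```java
import summarizer.summarizer.ReviewSummarizer;

public class SummaryDemo {
    public static void main(String[] args) {
        // Hypothetical constructor argument: the actual class may take a
        // product ID, a review file path, or other parameters.
        ReviewSummarizer summarizer = new ReviewSummarizer(1);
        System.out.println(summarizer.getSummary());
    }
}
```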
`summarizer.evaluation.EvaluationDataGen` (see the combined usage sketch after this list)
- `public static List<Pair> evaluationPairGenerator(int productID)`
  Generates the evaluation file for Task 1, where `productID` denotes the ID of the product in the original review data.
- `public static Map<String, String> evaluationGenTask2(int productID)`
  Generates the evaluation file for Task 2, where `productID` denotes the ID of the product in the original review data.
- `public void printTask1Statics(String task1ResultDir)`
  Prints the evaluation results of Task 1 to the console, where `task1ResultDir` denotes the directory containing the annotation files of Task 1, e.g., `data/evaluation_data/task1/`.
- `public void printTask2Statics(String task2ResultDir)`
  Prints the evaluation results of Task 2 to the console, where `task2ResultDir` denotes the directory containing the annotation files of Task 2, e.g., `data/evaluation_data/task2/`.
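Putting the evaluation utilities together, a sketch of generating the evaluation files and printing the results might look as follows. The import location of `Pair` and the no-argument `EvaluationDataGen` constructor are assumptions; check the source for the actual package layout.

```java
import java.util.List;
import java.util.Map;
import summarizer.evaluation.EvaluationDataGen;
import summarizer.model.Pair; // assumed location of Pair

public class EvaluationDemo {
    public static void main(String[] args) {
        int productID = 1; // ID of the product in the original review data

        // Generate evaluation files for both tasks (static methods).
        List<Pair> task1Pairs = EvaluationDataGen.evaluationPairGenerator(productID);
        Map<String, String> task2Items = EvaluationDataGen.evaluationGenTask2(productID);
        System.out.println(task1Pairs.size() + " Task 1 pairs, "
                + task2Items.size() + " Task 2 items generated.");

        // Print annotation statistics (instance methods; a no-argument
        // constructor is assumed here).
        EvaluationDataGen gen = new EvaluationDataGen();
        gen.printTask1Statics("data/evaluation_data/task1/");
        gen.printTask2Statics("data/evaluation_data/task2/");
    }
}
```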