# Evaluation of Runs of the Different Models

## 1. Evaluation of Argument-Model

### 1.1 Evaluation of NeuralNet-Argument Model with datasplit per image

#### 1.1.1 Evaluation over all labeled topics

| Run | P_Strong | P_Both |
|---|---|---|
| Run 1 | 0.5625 | 0.8325 |
| Run 2 | 0.5375 | 0.8525 |
| Run 3 | 0.5475 | 0.845 |
| Run 4 | 0.57 | 0.8425 |
| Run 5 | 0.585 | 0.8525 |
| Run 6 | 0.56 | 0.8575 |
| Run 7 | 0.5775 | 0.8575 |
| Run 8 | 0.48 | 0.8175 |
| Run 9 | 0.5725 | 0.85 |
| Run 10 | 0.58 | 0.865 |
| Average | 0.5572 | 0.8472 |
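
The Average row appears to be the arithmetic mean over the ten runs. The following is a minimal sketch for re-checking it, assuming that is how the row was computed; the variable names are illustrative and are not taken from the project's code.

```python
from statistics import mean

# Per-run values copied from the table directly above.
p_strong = [0.5625, 0.5375, 0.5475, 0.57, 0.585,
            0.56, 0.5775, 0.48, 0.5725, 0.58]
p_both = [0.8325, 0.8525, 0.845, 0.8425, 0.8525,
          0.8575, 0.8575, 0.8175, 0.85, 0.865]

# Assumption: "Average" is the plain arithmetic mean of the ten runs.
avg_p_strong = mean(p_strong)
avg_p_both = mean(p_both)

# Printed to four decimals, the precision used in the table.
print(f"P_Strong: {avg_p_strong:.4f}  P_Both: {avg_p_both:.4f}")
```

Under that assumption, both means agree with the reported Average row to within rounding at the fourth decimal place. The same check can be applied to any other table in this file by swapping in its per-run values.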

#### 1.1.2 Evaluation over all labeled valid topics

| Run | P_Strong | P_Both |
|---|---|---|
| Run 1 | 0.5944 | 0.8639 |
| Run 2 | 0.5583 | 0.8806 |
| Run 3 | 0.575 | 0.875 |
| Run 4 | 0.5944 | 0.8694 |
| Run 5 | 0.6194 | 0.8833 |
| Run 6 | 0.5944 | 0.8917 |
| Run 7 | 0.6083 | 0.8889 |
| Run 8 | 0.4972 | 0.8389 |
| Run 9 | 0.6 | 0.8778 |
| Run 10 | 0.6083 | 0.8944 |
| Average | 0.585 | 0.8764 |

### 1.2 Evaluation of NeuralNet-Argument Model with datasplit per topic

#### 1.2.1 Evaluation over all labeled topics

| Run | P_Strong | P_Both |
|---|---|---|
| Run 1 | 0.55 | 0.8575 |
| Run 2 | 0.545 | 0.8425 |
| Run 3 | 0.5425 | 0.8575 |
| Run 4 | 0.5475 | 0.86 |
| Run 5 | 0.555 | 0.8475 |
| Run 6 | 0.5625 | 0.8925 |
| Run 7 | 0.5575 | 0.86 |
| Run 8 | 0.525 | 0.855 |
| Run 9 | 0.5725 | 0.88 |
| Run 10 | 0.4625 | 0.815 |
| Average | 0.542 | 0.8568 |

#### 1.2.2 Evaluation over all labeled valid topics

| Run | P_Strong | P_Both |
|---|---|---|
| Run 1 | 0.5861 | 0.8917 |
| Run 2 | 0.575 | 0.8722 |
| Run 3 | 0.5722 | 0.8806 |
| Run 4 | 0.5778 | 0.8861 |
| Run 5 | 0.575 | 0.8694 |
| Run 6 | 0.5917 | 0.9111 |
| Run 7 | 0.5806 | 0.8889 |
| Run 8 | 0.5444 | 0.8778 |
| Run 9 | 0.6028 | 0.9083 |
| Run 10 | 0.4861 | 0.8361 |
| Average | 0.5692 | 0.8822 |

#### 1.2.3 Evaluation over all labeled test topics

| Run | P_Strong | P_Both |
|---|---|---|
| Run 1 | 0.575 | 0.775 |
| Run 2 | 0.675 | 0.9625 |
| Run 3 | 0.4625 | 0.9 |
| Run 4 | 0.5167 | 0.8833 |
| Run 5 | 0.6 | 0.8833 |
| Run 6 | 0.4875 | 0.9125 |
| Run 7 | 0.5875 | 0.9 |
| Run 8 | 0.6125 | 0.8625 |
| Run 9 | 0.5 | 0.9333 |
| Run 10 | 0.5333 | 0.7167 |
| Average | 0.555 | 0.8729 |

## 2. Evaluation of Stance-Model

### 2.1 Evaluation of NeuralNet-Stance Model with datasplit per image

#### 2.1.1 Evaluation over all labeled topics

| Run | Accuracy |
|---|---|
| Run 1 | 0.4442 |
| Run 2 | 0.4797 |
| Run 3 | 0.485 |
| Run 4 | 0.4861 |
| Run 5 | 0.4272 |
| Run 6 | 0.4821 |
| Run 7 | 0.4855 |
| Run 8 | 0.4817 |
| Run 9 | 0.4841 |
| Run 10 | 0.467 |
| Average | 0.4723 |

#### 2.1.2 Evaluation over all labeled valid topics

| Run | Accuracy |
|---|---|
| Run 1 | 0.4645 |
| Run 2 | 0.508 |
| Run 3 | 0.5245 |
| Run 4 | 0.5118 |
| Run 5 | 0.4396 |
| Run 6 | 0.501 |
| Run 7 | 0.5174 |
| Run 8 | 0.493 |
| Run 9 | 0.5132 |
| Run 10 | 0.4879 |
| Average | 0.4961 |

### 2.2 Evaluation of NeuralNet-Stance Model with datasplit per topic

#### 2.2.1 Evaluation over all labeled topics

| Run | Accuracy |
|---|---|
| Run 1 | 0.5167 |
| Run 2 | 0.4637 |
| Run 3 | 0.4834 |
| Run 4 | 0.4835 |
| Run 5 | 0.4562 |
| Run 6 | 0.5117 |
| Run 7 | 0.4632 |
| Run 8 | 0.4708 |
| Run 9 | 0.4698 |
| Run 10 | 0.4926 |
| Average | 0.4812 |

#### 2.2.2 Evaluation over all labeled valid topics

| Run | Accuracy |
|---|---|
| Run 1 | 0.5209 |
| Run 2 | 0.4631 |
| Run 3 | 0.4783 |
| Run 4 | 0.4816 |
| Run 5 | 0.4501 |
| Run 6 | 0.5055 |
| Run 7 | 0.453 |
| Run 8 | 0.4693 |
| Run 9 | 0.4763 |
| Run 10 | 0.481 |
| Average | 0.4779 |

#### 2.2.3 Evaluation over all labeled test topics

| Run | Accuracy |
|---|---|
| Run 1 | 0.3915 |
| Run 2 | 0.4126 |
| Run 3 | 0.4165 |
| Run 4 | 0.4749 |
| Run 5 | 0.4245 |
| Run 6 | 0.4076 |
| Run 7 | 0.3927 |
| Run 8 | 0.3655 |
| Run 9 | 0.4639 |
| Run 10 | 0.465 |
| Average | 0.4215 |