diff --git a/freebase/ComplexWebQuestions.md b/freebase/ComplexWebQuestions.md index 5752d67f..400d27b5 100644 --- a/freebase/ComplexWebQuestions.md +++ b/freebase/ComplexWebQuestions.md @@ -5,14 +5,14 @@ | Model / System | Year | F1 | Hits@1 | Accuracy | Language | Reported by | |:------------------------:|:----:|:----:|:----------:|----------:|:--------:|:---------------------------------------------------------------------------:| -| GMT-KBQA | 2022 | 77.0 | - | 72.2 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | -| CBR-KBQA | 2022 | 70.0 | - | 67.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | -| chatGPT | 2023 | - | - | 64.02 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| GPT-3.5v3 | 2023 | - | - | 57.54 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| GPT-3.5v2 | 2023 | - | - | 53.96 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| GPT-3 | 2023 | - | - | 51.77 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| FLAN-T5 | 2023 | - | - | 46.69 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| BART-large | 2022 | 68.2 | - | - | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| GMT-KBQA | 2022 | 77.0 | - | 72.2 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| CBR-KBQA | 2022 | 70.0 | - | 67.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| chatGPT | 2023 | - | - | 64.02 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| GPT-3.5v3 | 2023 | - | - | 57.54 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| GPT-3.5v2 | 2023 | - | - | 53.96 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| GPT-3 | 2023 | - | - | 51.77 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| FLAN-T5 | 2023 | - | - | 46.69 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| BART-large | 2022 | 68.2 | - | - | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | | DECAF (BM25 + FiD-3B) | 2022 | - | 70.4 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | | CBR-KBQA | 2022 | - | 70.4 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | | DECAF (BM25 + FiD-large) | 2022 | - | 68.1 ± 0.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | @@ -22,15 +22,15 @@ | ProgramTransfer-o | 2022 | 55.8 | 54.7 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | | ProgramTransfer-pa | 2022 | 54.5 | 54.3 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | | NSM+h | 2022 | - | 53.9 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| REAREV | 2022 | - | 52.9 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| REAREV | 2022 | - | 52.9 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | | QNRKGQA+h | 2022 | - | 51.5 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | | SR+NSM | 2022 | - | 50.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | | QNRKGQA | 2022 | - | 50.5 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | -| NSM-distill | 2022 | - | 48.8 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| Rigel | 2022 | - | 48.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| TransferNet | 2022 | - | 48.6 | - | EN | [Costas Mavromatis and George 
Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| PullNet | 2022 | - | 45.9 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| MRP-QA-marginal_prob | 2022 | 49.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| NSM-distill | 2022 | - | 48.8 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| Rigel | 2022 | - | 48.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| TransferNet | 2022 | - | 48.6 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| PullNet | 2022 | - | 45.9 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| MRP-QA-marginal_prob | 2022 | 49.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | | ProgramTransfer-f | 2022 | 45.9 | 45.2 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | | KBIGER | 2022 | 45.5 | 50.2 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | | TERP | 2022 | - | 49.2 | - | EN | [Qiao et al.](https://aclanthology.org/2022.coling-1.156.pdf) | @@ -42,15 +42,25 @@ | NSM | 2022 | - | 47.6 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | | PullNet | 2019 | - | 47.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | | PullNet | 2022 | - | 45.9 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | -| EmbedKGQA | 2022 | - | 44.7 | - | EN | [Qiao et al.](https://aclanthology.org/2022.coling-1.156.pdf) | +| EmbedKGQA | 2022 | - | 44.7 | - | EN | [Qiao et al.](https://aclanthology.org/2022.coling-1.156.pdf) | | QGG | 2020 | 40.4 | 44.1 | - | EN | [Lan and Jiang et al.](https://aclanthology.org/2020.acl-main.91.pdf) | | Topic Units | 2019 | 36.5 | - | - | EN | [Lan et al.](https://www.ijcai.org/proceedings/2019/0701.pdf) | -| KBQA-GST | 2022 | 36.5 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| KBQA-GST | 2022 | 36.5 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | | TextRay | 2019 | 33.9 | 40.8 | - | EN | [Bhutani et al.](https://dl.acm.org/doi/10.1145/3357384.3358033) | -| HR-BiLSTM | 2022 | 31.2 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| HR-BiLSTM | 2022 | 31.2 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | | KGT5 | 2019 | - | 36.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | | GraphNet | 2022 | - | 32.8 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | | KV-Mem | 2022 | - | 21.1 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | | UHop | 2019 | 29.8 | - | - | EN | [Chen et al.](https://arxiv.org/pdf/1904.01246.pdf) | -| GRAFT-Net | 2022 | 26.0 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| GRAFT-Net | 2022 | 26.0 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | | ProgramTransfer-p | 2022 | 2.3 | 2.1 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | +| GoG w/GPT-3.5 | 2024 | - | 55.7 | - | EN | [Xu et al.](https://arxiv.org/pdf/2404.14741) | +| GoG w/GPT-4 | 2024 | - | 75.2 | - | EN | [Xu et al.](https://arxiv.org/pdf/2404.14741) | +| IO prompt w/ChatGPT | 2024 | - | 37.6 | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) | +| CoT prompt 
w/ChatGPT | 2024 | - | 38.8 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| SC prompt w/ChatGPT | 2024 | - | 45.4 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior FT SOTA | 2024 | - | 70.4 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/ChatGPT | 2024 | - | 52.1 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/Deepseek-V2 | 2024 | - | 61.7 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/GPT-4 | 2024 | - | 69.5 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior tight-coupling SOTA | 2024 | - | 72.5 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
diff --git a/freebase/WebQuestionsSP.md b/freebase/WebQuestionsSP.md
index 0b99b378..eb56183b 100644
--- a/freebase/WebQuestionsSP.md
+++ b/freebase/WebQuestionsSP.md
@@ -3,99 +3,110 @@
datasetUrl: https://www.microsoft.com/en-us/download/details.aspx?id=52763
---
-| Model / System | Year | F1 | Hits@1 | Accuracy | Language | Reported by |
-| :---------------------------------: | :--: | :--------: | :--------: | :------: | :------: | :-----------------------------------------------------------------------------------: |
-| chatGPT | 2023 | - | - | 83.70 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) |
-| TIARA | 2022 | 78.9 | 75.2 | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) |
-| DECAF (DPR + FiD-3B) | 2022 | 78.8 | 82.1 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| GPT-3.5v3 | 2023 | - | - | 79.60 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) |
-| DECAF (DPR + FiD-large) | 2022 | 77.1 ± 0.2 | 80.7 ± 0.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| UniK-QA | 2022 | - | 79.1 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| TERP | 2022 | - | 76.8 | - | EN | [Qiao et al.](https://aclanthology.org/2022.coling-1.156.pdf) |
-| SQALER+GNN | 2022 | - | 76.1 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) |
-| EmQL | 2020 | - | 75.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| GMT-KBQA | 2022 | 76.6 | - | 73.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) |
-| GPT-3.5v2 | 2023 | - | - | 72.34 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) |
-| Program Transfer | 2022 | 76.5 | 74.6 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| RnG-KBQA (T5-large) | 2022 | 76.2 ± 0.2 | 80.7 ± 0.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| RnG-KBQA | 2022 | 75.6 | - | 71.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) |
-| ArcaneQA | 2022 | 75.3 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| QNRKGQA+h | 2022 | - | 75.7 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) |
-| DECAF (BM25 + FiD-large) | 2022 | 74.9 ± 0.3 | 79.0 ± 0.4 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| MRP-QA-marginal_prob | 2022 | 74.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) |
-| QNRKGQA | 2022 | - | 74.9 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) |
-| ReTrack | 2022 | 74.7 | - | - | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) |
-| ReTrack | 2021 | 74.6 | 74.7 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
-| BART-large | 2022 | 74.6 | - | - | EN | [Hu et 
al.](https://aclanthology.org/2022.coling-1.145.pdf) | -| Subgraph Retrieval | 2022 | 74.5 | 83.2 | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | -| QGG | 2022 | 74.0 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| CBR-KBQA | 2021 | 72.8 | - | 69.9 | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| GPT-3 | 2023 | - | - | 67.78 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| KGQA-RR(Roberta) | 2023 | - | - | 64.59 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(Luke) | 2023 | - | - | 64.52 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(Kepler) | 2023 | - | - | 64.46 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(Bert) | 2023 | - | - | 64.11 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(Albert) | 2023 | - | - | 63.89 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(XLnet) | 2023 | - | - | 63.87 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(DistilBert) | 2023 | - | - | 63.59 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-RR(DistilRoberta) | 2023 | - | - | 62.57 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(Roberta) | 2023 | - | - | 62.32 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(Luke) | 2023 | - | - | 62.31 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(Kepler) | 2023 | - | - | 62.02 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(Bert) | 2023 | - | - | 61.76 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(DistilBert) | 2023 | - | - | 61.49 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(Albert) | 2023 | - | - | 61.47 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(XLnet) | 2023 | - | - | 61.46 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(DistilRoberta) | 2023 | - | - | 61.05 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| KGQA-CL(GPT2) | 2023 | - | - | 60.49 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | -| W. Han et al. 
| 2023 | - | 75.2 | - | EN | [Han et al.](https://link.springer.com/chapter/10.1007/978-3-031-30672-3_39) | -| NSM | 2021 | - | 74.30 | - | EN | [He et al.](https://arxiv.org/pdf/2101.03737.pdf) | -| Rigel | 2022 | - | 73.3 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| SGM | 2022 | 72.36 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | -| CBR-SUBG | 2022 | 72.1 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| NPI | 2022 | - | 72.6 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | -| TextRay | 2022 | - | 72.2 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | -| CBR-SUBG | 2022 | - | 72.10 | - | EN | [Das et al.](https://arxiv.org/pdf/2202.10610.pdf) | -| KGQA Based on Query Path Generation | 2022 | - | 71.7 | - | EN | [Yang et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_12) | -| STAGG_SP | 2022 | 71.7 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | -| SSKGQA | 2022 | - | 71.4 | - | EN | [Mingchen Li and Jonathan Shihao Ji](https://arxiv.org/pdf/2204.10194.pdf) | -| TransferNet | 2022 | - | 71.4 | - | EN | [Shi et al.](https://arxiv.org/pdf/2104.07302.pdf) | -| SeqM | 2020 | 71.83 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | -| ReTraCK | 2021 | 71.0 | 71.6 | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | -| REAREV | 2022 | 70.9 | 76.4 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| HGNet | 2021 | 70.3 | 70.6 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| GrailQA Ranking | 2021 | 70.0 | - | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | -| SQALER | 2022 | - | 70.6 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| STAGG | 2015 | 69.00 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | -| UHop | 2019 | 68.5 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | -| KBIGER | 2022 | 68.4 | 75.3 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | -| NSM | 2022 | - | 69.0 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | -| GraftNet-EF+LF | 2018 | - | 68.7 | - | EN | [Sun et al.](https://aclanthology.org/D18-1455.pdf) | -| PullNet | 2019 | - | 68.1 | - | EN | [Sun et al.](https://arxiv.org/pdf/1904.09537.pdf) | -| KBQA-GST | 2022 | 67.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | -| Topic Units | 2019 | 67.9 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | -| NSM | 2022 | 67.4 | 74.3 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | -| Relation Learning | 2021 | 64.5 | 72.9 | - | EN | [Shu et. 
al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | -| SR+NSM | 2022 | 64.1 | 69.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| NSM | 2022 | 62.8 | 68.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| ARN_ConvE | 2023 | - | 68.0 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | -| GraftNet | 2022 | 62.8 | 67.8 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | -| PullNet | 2019 | 62.8 | 67.8 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| DCRN | 2021 | - | 67.8 | - | EN | [Cai et al.](https://aclanthology.org/2021.findings-acl.19.pdf) | -| ARN_TuckER | 2023 | - | 67.5 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | -| NRQA | 2022 | - | 67.1 | - | EN | [Guo et al.](https://link.springer.com/content/pdf/10.1007/s10489-022-03927-0.pdf) | -| GraftNet | 2022 | - | 66.4 | - | EN | [Mingchen Li and Jonathan Shihao Ji](https://arxiv.org/pdf/2204.10194.pdf) | -| EmbedKGQA | 2020 | - | 66.6 | - | EN | [Saxena et al.](https://aclanthology.org/2020.acl-main.412.pdf) | -| GraftNet | 2022 | 62.4 | 66.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| HR-BiLSTM | 2022 | 62.3 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | -| GraftNet-EF+LF | 2018 | 62.30 | - | - | EN | [Sun et al.](https://aclanthology.org/D18-1455.pdf) | -| TextRay | 2019 | 60.3 | - | - | EN | [Bhutani et al.](https://dl.acm.org/doi/pdf/10.1145/3357384.3358033) | -| SGReader | 2022 | 57.3 | 67.2 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | -| ARN_ComplEx | 2023 | - | 65.3 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | -| ARN_DistMult | 2023 | - | 61.7 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | -| FLAN-T5 | 2023 | - | - | 59.87 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | -| KGT5 | 2022 | 56.1 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| FILM | 2022 | 54.7 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | -| ReifKB | 2020 | - | 52.7 | - | EN | [Cohen et al.](https://arxiv.org/pdf/2002.06115.pdf) | -| KV-Mem | 2022 | 38.6 | 46.7 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | -| KGQA-RR(GPT2) | 2023 | - | - | 18.11 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| Model / System | Year | F1 | Hits@1 | Accuracy | Language | Reported by | +|:--------------------------------------:|:----:|:----------:|:----------:|:--------:| :------: |:-------------------------------------------------------------------------------------:| +| chatGPT | 2023 | - | - | 83.70 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| TIARA | 2022 | 78.9 | 75.2 | - | EN | [Shu et. 
al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | +| DECAF (DPR + FiD-3B) | 2022 | 78.8 | 82.1 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| GPT-3.5v3 | 2023 | - | - | 79.60 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| DECAF (DPR + FiD-large) | 2022 | 77.1 ± 0.2 | 80.7 ± 0.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| UniK-QA | 2022 | - | 79.1 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| TERP | 2022 | - | 76.8 | - | EN | [Qiao et al.](https://aclanthology.org/2022.coling-1.156.pdf) | +| SQALER+GNN | 2022 | - | 76.1 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| EmQL | 2020 | - | 75.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| GMT-KBQA | 2022 | 76.6 | - | 73.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| GPT-3.5v2 | 2023 | - | - | 72.34 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| Program Transfer | 2022 | 76.5 | 74.6 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| RnG-KBQA (T5-large) | 2022 | 76.2 ± 0.2 | 80.7 ± 0.2 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| RnG-KBQA | 2022 | 75.6 | - | 71.1 | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| ArcaneQA | 2022 | 75.3 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| QNRKGQA+h | 2022 | - | 75.7 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | +| DECAF (BM25 + FiD-large) | 2022 | 74.9 ± 0.3 | 79.0 ± 0.4 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| MRP-QA-marginal_prob | 2022 | 74.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| QNRKGQA | 2022 | - | 74.9 | - | EN | [Ma et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_11) | +| ReTrack | 2022 | 74.7 | - | - | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| ReTrack | 2021 | 74.6 | 74.7 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| BART-large | 2022 | 74.6 | - | - | EN | [Hu et al.](https://aclanthology.org/2022.coling-1.145.pdf) | +| Subgraph Retrieval | 2022 | 74.5 | 83.2 | - | EN | [Shu et. 
al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | +| QGG | 2022 | 74.0 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| CBR-KBQA | 2021 | 72.8 | - | 69.9 | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| GPT-3 | 2023 | - | - | 67.78 | EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) | +| KGQA-RR(Roberta) | 2023 | - | - | 64.59 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(Luke) | 2023 | - | - | 64.52 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(Kepler) | 2023 | - | - | 64.46 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(Bert) | 2023 | - | - | 64.11 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(Albert) | 2023 | - | - | 63.89 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(XLnet) | 2023 | - | - | 63.87 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(DistilBert) | 2023 | - | - | 63.59 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-RR(DistilRoberta) | 2023 | - | - | 62.57 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(Roberta) | 2023 | - | - | 62.32 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(Luke) | 2023 | - | - | 62.31 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(Kepler) | 2023 | - | - | 62.02 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(Bert) | 2023 | - | - | 61.76 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(DistilBert) | 2023 | - | - | 61.49 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(Albert) | 2023 | - | - | 61.47 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(XLnet) | 2023 | - | - | 61.46 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(DistilRoberta) | 2023 | - | - | 61.05 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| KGQA-CL(GPT2) | 2023 | - | - | 60.49 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) | +| W. Han et al. | 2023 | - | 75.2 | - | EN | [Han et al.](https://link.springer.com/chapter/10.1007/978-3-031-30672-3_39) | +| NSM | 2021 | - | 74.30 | - | EN | [He et al.](https://arxiv.org/pdf/2101.03737.pdf) | +| Rigel | 2022 | - | 73.3 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| SGM | 2022 | 72.36 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | +| CBR-SUBG | 2022 | 72.1 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| NPI | 2022 | - | 72.6 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | +| TextRay | 2022 | - | 72.2 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | +| CBR-SUBG | 2022 | - | 72.10 | - | EN | [Das et al.](https://arxiv.org/pdf/2202.10610.pdf) | +| KGQA Based on Query Path Generation | 2022 | - | 71.7 | - | EN | [Yang et al.](https://link.springer.com/chapter/10.1007/978-3-031-10983-6_12) | +| STAGG_SP | 2022 | 71.7 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| SSKGQA | 2022 | - | 71.4 | - | EN | [Mingchen Li and Jonathan Shihao Ji](https://arxiv.org/pdf/2204.10194.pdf) | +| TransferNet | 2022 | - | 71.4 | - | EN | [Shi et al.](https://arxiv.org/pdf/2104.07302.pdf) | +| SeqM | 2020 | 71.83 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | +| ReTraCK | 2021 | 71.0 | 71.6 | - | EN | [Shu et. 
al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | +| REAREV | 2022 | 70.9 | 76.4 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| HGNet | 2021 | 70.3 | 70.6 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| GrailQA Ranking | 2021 | 70.0 | - | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | +| SQALER | 2022 | - | 70.6 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| STAGG | 2015 | 69.00 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | +| UHop | 2019 | 68.5 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | +| KBIGER | 2022 | 68.4 | 75.3 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | +| NSM | 2022 | - | 69.0 | - | EN | [Cao et al.](https://aclanthology.org/2022.acl-long.559.pdf) | +| GraftNet-EF+LF | 2018 | - | 68.7 | - | EN | [Sun et al.](https://aclanthology.org/D18-1455.pdf) | +| PullNet | 2019 | - | 68.1 | - | EN | [Sun et al.](https://arxiv.org/pdf/1904.09537.pdf) | +| KBQA-GST | 2022 | 67.9 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| Topic Units | 2019 | 67.9 | - | - | EN | [Ma L et al.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9747229) | +| NSM | 2022 | 67.4 | 74.3 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | +| Relation Learning | 2021 | 64.5 | 72.9 | - | EN | [Shu et. al.](https://aclanthology.org/2022.emnlp-main.555.pdf) | +| SR+NSM | 2022 | 64.1 | 69.5 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| NSM | 2022 | 62.8 | 68.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| ARN_ConvE | 2023 | - | 68.0 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | +| GraftNet | 2022 | 62.8 | 67.8 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) | +| PullNet | 2019 | 62.8 | 67.8 | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) | +| DCRN | 2021 | - | 67.8 | - | EN | [Cai et al.](https://aclanthology.org/2021.findings-acl.19.pdf) | +| ARN_TuckER | 2023 | - | 67.5 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | +| NRQA | 2022 | - | 67.1 | - | EN | [Guo et al.](https://link.springer.com/content/pdf/10.1007/s10489-022-03927-0.pdf) | +| GraftNet | 2022 | - | 66.4 | - | EN | [Mingchen Li and Jonathan Shihao Ji](https://arxiv.org/pdf/2204.10194.pdf) | +| EmbedKGQA | 2020 | - | 66.6 | - | EN | [Saxena et al.](https://aclanthology.org/2020.acl-main.412.pdf) | +| GraftNet | 2022 | 62.4 | 66.7 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| HR-BiLSTM | 2022 | 62.3 | - | - | EN | [Wang et al.](https://aclanthology.org/2022.naacl-main.294.pdf) | +| GraftNet-EF+LF | 2018 | 62.30 | - | - | EN | [Sun et al.](https://aclanthology.org/D18-1455.pdf) | +| TextRay | 2019 | 60.3 | - | - | EN | [Bhutani et al.](https://dl.acm.org/doi/pdf/10.1145/3357384.3358033) | +| SGReader | 2022 | 57.3 | 67.2 | - | EN | [Costas Mavromatis and George Karypis](https://arxiv.org/pdf/2210.13650.pdf) | +| ARN_ComplEx | 2023 | - | 65.3 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | +| ARN_DistMult | 2023 | - | 61.7 | - | EN | [Cui et al.](https://www.sciencedirect.com/science/article/abs/pii/S0020025522013317) | +| FLAN-T5 | 2023 | - | - | 59.87 
| EN | [Tan et al.](https://arxiv.org/pdf/2303.07992.pdf) |
+| KGT5 | 2022 | 56.1 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
+| FILM | 2022 | 54.7 | - | - | EN | [Yu et al.](https://arxiv.org/pdf/2210.00063.pdf) |
+| ReifKB | 2020 | - | 52.7 | - | EN | [Cohen et al.](https://arxiv.org/pdf/2002.06115.pdf) |
+| KV-Mem | 2022 | 38.6 | 46.7 | - | EN | [Du et al.](https://arxiv.org/pdf/2209.03005.pdf) |
+| KGQA-RR(GPT2) | 2023 | - | - | 18.11 | EN | [Hu et al.](https://arxiv.org/pdf/2303.10368.pdf) |
+| GoG w/GPT-3.5 | 2024 | - | 78.7 | - | EN | [Xu et al.](https://arxiv.org/pdf/2404.14741) |
+| GoG w/GPT-4 | 2024 | - | 84.4 | - | EN | [Xu et al.](https://arxiv.org/pdf/2404.14741) |
+| IO prompt w/ChatGPT | 2024 | - | 63.3 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| CoT prompt w/ChatGPT | 2024 | - | 62.2 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| SC prompt w/ChatGPT | 2024 | - | 61.1 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior FT SOTA | 2024 | - | 82.1 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior Prompting SOTA | 2024 | - | 74.4 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/ChatGPT | 2024 | - | 65.2 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/Deepseek-V2 | 2024 | - | 67.4 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/GPT-4 | 2024 | - | 82.9 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior tight-coupling SOTA | 2024 | - | 82.6 | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
diff --git a/systems.md b/systems.md
index b5dbc723..fe7f96f4 100644
--- a/systems.md
+++ b/systems.md
@@ -1,5 +1,5 @@
| System Name | Reported by | Reported in the paper | Demo/Repo/API available | Link to Demo/Repo/API | Original paper | System description | Reference |
-| ------------------------------------ | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | 
+|--------------------------------------|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Alexandria | Lopez et al. | [Link](https://www.sciencedirect.com/science/article/pii/S157082681300022X?casa_token=NBVj-I48uxAAAAAA:izoYV-LubTYApUYRCtnZFPuvdACyWHHNnwVBjo1S1K24AiXYmMde9vdEBsCxdpAvlfNvPswrzr8#br000150) | not working | [Link](http://alexandria.neofonie.de/) | [Link](https://link.springer.com/chapter/10.1007/978-3-662-46641-4_8) | Alexandria is a German question answering system over a domain ontology that was built primarily with data from Freebase, Authors propose a new formal query building approach that consists of two stages. 
In the first stage, they predict the query structure of the question and leverage the structure to constrain the generation of the candidate queries and propose a novel graph generation framework to handle the structure prediction task and design an encoder-decoder model to predict the argument of the predetermined operation in each generative step. In the second stage, they follow the previous methods to rank the candidate queries. predict the query structure of the question and leverage the structure to constrain the generation of the candidate queries and propose a novel graph generation framework to handle the structure prediction task and design an encoder-decoder model to predict the argument of the predetermined operation in each generative step. In the second stage, they follow the previous methods to rank the candidate queries. | Chen et al. only one triple, provide a modular, easy-toextend QA pipeline and evaluate it on the SimpleQuestionsWikidata benchmark. Ranking is learned from the training set. | | Aqqu Wikidata (rules) | Thomas Goette | [Link](https://ad-publications.cs.uni-freiburg.de/theses/Master_Thomas_Götte_2021.pdf) | no | - | [Link](https://ad-publications.cs.uni-freiburg.de/theses/Master_Thomas_Götte_2021.pdf) | Author focus on simple questions which means that the corresponding SPARQL query contains only one triple, provide a modular, easy-toextend QA pipeline and evaluate it on the SimpleQuestionsWikidata benchmark. Ranking with a set of weighted features. | Thomas Goette | | AskNow | Diefenbach et al. | [Link](http://www.semantic-web-journal.net/system/files/swj2038.pdf) | yes | [Link](https://github.com/AskNowQA) | [Link](https://www.springerprofessional.de/en/asknow-a-framework-for-natural-language-query-formalization-in-s/10191942) | Authors propose a framework, called AskNow, where users can pose queries in English to a target RDF knowledge base (e.g. DBpedia), which are first normalized into an intermediary canonical syntactic form, called Normalized Query Structure (NQS), and then translated into SPARQL queries. NQS facilitates the identification of the desire (or expected output information) and the user-provided input information, and establishing their mutual semantic relationship. At the same time, it is sufficiently adaptive to query paraphrasing. We have empirically evaluated the framework with respect to the syntactic robustness of NQS and semantic accuracy of the SPARQL translator on standard benchmark datasets. | Dubey et al. | @@ -19,7 +19,7 @@ | gGCN | Wu et al. | [Link](https://arxiv.org/pdf/2101.01510.pdf) | no | - | [Link](https://arxiv.org/pdf/2101.01510.pdf) | Authors present a relational graph convolutional network (RGCN)-based model gRGCN for semantic parsing in KBQA. gRGCN extracts the global semantics of questions and their corresponding query graphs, including structure semantics via RGCN and relational semantics (label representation of relations between entities) via a hierarchical relation attention mechanism.The gGCN model is obtained from gRGCN by replacing RGCN with Graph Convolutional Network (GCN) | Wu et al. | | GGNN | Sorokin and Gurevych | [Link](https://aclanthology.org/C18-1280.pdf) | yes | [Link](https://github.com/UKPLab/coling2018-graph-neural-networks-question-answering) | [Link](https://aclanthology.org/C18-1280.pdf) | Authors address the problem of learning vector representations for complex semantic parses that consist of multiple entities and relations. 
For each input question, they construct an explicit structural semantic parse (semantic graph). Semantic parses can be deterministically converted to a query to extract the answers from the KB. To investigate ways to encode the structure of a semantic parse and to improve the performance for more complex questions, authors adapt Gated Graph Neural Networks (GGNNs), described in Li et al. (2016), to process and score semantic parses. | Sorokin and Gurevych | | GRAFT-Net | Y Feng et al. | [Link](https://arxiv.org/pdf/2112.06109.pdf) | yes | [Link](https://github.com/haitian-sun/GraftNet) | [Link](https://arxiv.org/abs/1809.00782) | Authors propose a novel graph convolution based neural network, called GRAFT-Net (Graphs of Relations Among Facts and Text Networks), specifically designed to operate over heterogeneous graphs of KB facts and text sentences. First, they propose heterogeneous update rulesthat handle KB nodes differently from the textnodes: for instance, LSTM-based updates are usedto propagate information into and out of text nodes. Second, authors introduce a directed propagation method, inspired by personalized Pagerankin IR (Haveliwala, 2002). | Sun et al. | -| GRAFT-Net + Clocq | Christmann P. et al. | [Link](https://arxiv.org/pdf/2108.08597.pdf) | yes | [Link](https://github.com/PhilippChr/CLOCQ) (demo is available for further work on CLOCQ) | [Link](https://arxiv.org/pdf/2108.08597.pdf) | This work presents CLOCQ, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. CLOCQ uses a top-𝑘 query processor over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. | Christmann P. et al . | +| GRAFT-Net + Clocq | Christmann P. et al. | [Link](https://arxiv.org/pdf/2108.08597.pdf) | yes | [Link](https://github.com/PhilippChr/CLOCQ) (demo is available for further work on CLOCQ) | [Link](https://arxiv.org/pdf/2108.08597.pdf) | This work presents CLOCQ, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. CLOCQ uses a top-𝑘 query processor over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. | Christmann P. et al . | | gRGCN | Wu et al. | [Link](https://arxiv.org/pdf/2101.01510.pdf) | no | - | [Link](https://arxiv.org/pdf/2101.01510.pdf) | Authors present a relational graph convolutional network (RGCN)-based model gRGCN for semantic parsing in KBQA. gRGCN extracts the global semantics of questions and their corresponding query graphs, including structure semantics via RGCN and relational semantics (label representation of relations between entities) via a hierarchical relation attention mechanism. | Wu et al. | | Hakimov | Diefenbach et al. | [Link](http://www.semantic-web-journal.net/system/files/swj2038.pdf) | no | - | [Link](https://www.semanticscholar.org/paper/Applying-Semantic-Parsing-to-Question-Answering-the-Hakimov-Unger/126ee532d48302b31f899ab392c51ad982ee5cad) | Authors investigate how much lexical knowledge would need to be added so that a semantic parsing approach can perform well on unseen data. We manually add a set of lexical entries on the basis of analyzing the test portion of the QALD-4 dataset. Further, we analyze if a state-of-the-art tool for inducing ontology lexica from corpora can derive these lexical entries automatically. | Hakimov et al. 
| | HGNet | Chen et al. | [Link](https://arxiv.org/pdf/2111.00732.pdf) | yes | [Link](https://github.com/Bahuia/HGNet) | [Link](https://arxiv.org/pdf/2111.00732.pdf) | Hierarchical Graph Generation Network (HGNet) focues on generating search query by proposing a new unified query graph grammar to adapt to SPARQL's syntax. FIrstly, the autho ranks top k entity, relation, and value by ralation ranking and pattern matching. Secondly, HGNet is used to encde and decode the natural questions to generate query graph. The project is open-sourced on github. | Chen et al. | @@ -91,7 +91,7 @@ | WDAqua-core1 | Omar et al. | [Link](http://ceur-ws.org/Vol-2980/paper312.pdf) | not working | [Link](https://github.com/WDAqua), demo: [Link](http://wdaqua.eu/qa) | [Link](https://dl.acm.org/doi/pdf/10.1145/3184558.3191541) | WDQqua-core1 is one of the few QA system which are running as web-services. The model is a pipeline model which aims to convert question to SPARQL query. The model includes following sessions: Query Expansion, Query Con- struction, Query Ranking and Answer Decision. The service suppoprts multilingual question, natural question as well as key word questions and is integrated into Qanary framwork. | Diefenbach et al. | | WolframAlpha | Walter et al. | [Link](https://download.hrz.tu-darmstadt.de/pub/FB20/Dekanat/Publikationen/UKP/76500354.pdf) | yes | [Link](https://www.wolframalpha.com/) - only tested on QALD-2 | [Link](https://www.wolframalpha.com/) | WolframAlpha is an engine for computing answers and providing knowledge. It combines curated knowledge, linguistic analysis and dynamic computation. | Only website available m | | Xser | Diefenbach et al. | [Link](http://www.semantic-web-journal.net/system/files/swj2038.pdf) | no | only tested on QALD 4 & 5 | [Link](http://ceur-ws.org/Vol-1180/CLEF2014wn-QA-XuEt2014.pdf) | Authors present a question answering system (Xser) over Linked Data(DBpedia), converting users’ natural language questions into structured queries. There are two challenges involved: recognizing users’ query intention and mapping the involved semantic items against a given knowledge base (KB), which will be in turn assembled into a structured query. In this paper, we propose an efficient pipeline framework to model a user’s query intention as a phrase level dependency DAG which is then instantiated according to a given KB to construct the final structured query. | Xu et al. | -| YodaQA | Diefenbach et al. | [Link](http://www.semantic-web-journal.net/system/files/swj2038.pdf) | not working | [Link](https://github.com/brmson/yodaqa), webservice [Link](http://live.ailao.eu/) not working - only tested on QALD-5 | [Link](https://pasky.or.cz/dev/brmson/yodaqa-clef2015-QALD.pdf) | YodaQA is an pipeline factoid question answering system that can produce answer both from knowledge base and unstructured text. The model answers question in following steps: Question Analysis, Answer Production, Answer Analysis and Answer Merging and Scoring. In the answer production part, the model has different strategy on knolwegde base and on corpora. | Petr Baudiˇs and Jan Sˇedivy ́ | +| YodaQA | Diefenbach et al. 
| [Link](http://www.semantic-web-journal.net/system/files/swj2038.pdf) | not working | [Link](https://github.com/brmson/yodaqa), webservice [Link](http://live.ailao.eu/) not working - only tested on QALD-5 | [Link](https://pasky.or.cz/dev/brmson/yodaqa-clef2015-QALD.pdf) | YodaQA is an pipeline factoid question answering system that can produce answer both from knowledge base and unstructured text. The model answers question in following steps: Question Analysis, Answer Production, Answer Analysis and Answer Merging and Scoring. In the answer production part, the model has different strategy on knolwegde base and on corpora. | Petr Baudiˇs and Jan Sˇedivy ́ | | Yu et al. | Wu et al. | [Link](https://arxiv.org/pdf/2101.01510.pdf) | no | - | [Link](https://arxiv.org/pdf/2101.01510.pdf) | This model is semantic parsing based KBQA system, where the parsing section is a relational graph convolutional network (RGCN) called gRGCN. gRGCN combines RGCN and hierarchical relation attention mechanism. Therefore, it is capable of exracting both structure semantics and relational semantics from the question and generate corresponsing seach queries. | Wu et al. | | Zhang et. al. | Zhang et. al. | [Link](https://ojs.aaai.org/index.php/AAAI/article/view/10381) | no | - | [Link](https://ojs.aaai.org/index.php/AAAI/article/view/10381) | This paper presents a joint model based on integer linear programming (ILP), uniting alignment construction and query construction into a uniform framework. As a result, the model is able to outperform pipeline model with train the two sectors seperately and be robot to noise propogation. | Zhang et. al. | | Zhu et al. | Zhu et al. | [Link](https://arxiv.org/abs/1510.04780) | no | - | [Link](https://arxiv.org/abs/1510.04780) | Focusing on solving the non-aggregation questions, in this paper, authors construct a subgraph of the knowledge base from the detected entities and propose a graph traversal method to solve both the semantic item mapping problem and the disambiguation problem in a joint way. Compared with existing work, they simplify the process of query intention understanding and pay more attention to the answer path ranking. | Zhu et al. | @@ -141,4 +141,7 @@ | KGQAcl/rr | Hu et al. | [Link](https://arxiv.org/pdf/2303.10368.pdf) | yes | [Link](https://github.com/HuuuNan/PLMs-in-Practical-KBQA) | [Link](https://arxiv.org/pdf/2303.10368.pdf) | KGQA-CL and KGQA-RR are tow frameworks proposed to evaluate the PLM's performance in comparison to their efficiency. Both architectures are composed of mention detection, entity disambiguiation, relation detection and answer query building. The difference lies on the relation detection module. KGQA-CL aims to map question intent to KG relations. While KGQA-RR ranks the related relations to retrieve the answer entity. Both frameworks are tested on common PLM, distilled PLMs and knowledge-enhanced PLMs and achieve high performance on three benchmarks. | Hu et al. | | W. Han et al. | Han et al. | [Link](https://link.springer.com/chapter/10.1007/978-3-031-30672-3_39) | no | - | [Link](https://link.springer.com/chapter/10.1007/978-3-031-30672-3_39) | This model is based on machine reading comprehension. To transform a subgraph of the KG centered on the topic entity into text, the subgraph is sketched through a carefully designed schema tree, which facilitates the retrieval of multiple semantically-equivalent answer entities. Then, the promising paragraphs containing answers are picked by a contrastive learning module. 
Finally, the answer entities are delivered based on the answer span that is detected by the machine reading comprehension module. | Han et al. |
| GAIN | Shu et al. | [Link](https://arxiv.org/pdf/2309.08345.pdf) | no | - | [Link](https://arxiv.org/pdf/2309.08345.pdf) | GAIN is not a KGQA system, but a data augmentation method named Graph seArch and questIon generatioN (GAIN). GAIN applies to KBQA corresponding to logical forms or triples, and scales data volume and distribution through four steps: 1) Graph search: Sampling logical forms or triples from arbitrary domains in the KB, without being restricted to any particular KBQA dataset. 2) Training question generator on existing KBQA datasets, i.e., learning to convert logical forms or triples into natural language questions. 3) Verbalization: Using the question generator from step 2 to verbalize sampled logical forms or triples from step 1, thus creating synthetic questions. 4) Training data expansion: Before fine-tuning any neural models on KBQA datasets, GAIN-synthetic data can be used to train these models or to expand the corpus of in-context samples for LLMs. That is, as a data augmentation method, GAIN is not a KBQA model, but it is used to augment a base KBQA model. | Shu et al. |
-| JarvisQALcs | Jaradeh et al. | [Link](https://arxiv.org/pdf/2006.01527) | no | | | same as reporting paper | JarvisQA a BERT based system to answer questions on tabular views of scholarly knowledge graphs. | Jaradeh et al. |
\ No newline at end of file
+| JarvisQALcs | Jaradeh et al. | [Link](https://arxiv.org/pdf/2006.01527) | no | - | same as reporting paper | JarvisQA is a BERT-based system to answer questions on tabular views of scholarly knowledge graphs. | Jaradeh et al. |
+| GoG | Xu et al. | [Link](https://arxiv.org/pdf/2404.14741) | no | - | [Link](https://arxiv.org/pdf/2404.14741) | Generate-on-Graph (GoG) treats the LLM as both agent and KG for question answering over incomplete knowledge graphs, integrating the external knowledge from the KG with the inherent knowledge of the LLM by generating new factual triples when the KG alone cannot answer the question. | [Link](https://arxiv.org/pdf/2404.14741) |
+| EffiQA | Dong et al. | [Link](https://arxiv.org/pdf/2406.01238) | no | - | [Link](https://arxiv.org/pdf/2406.01238) | EffiQA is an integration paradigm that couples LLMs with KGs for efficient multi-step reasoning, iterating over global planning, efficient knowledge graph exploration, and self-reflection. | [Link](https://arxiv.org/pdf/2406.01238) |
+| TACQA | Wang et al. | [Link](https://www.sciencedirect.com/science/article/abs/pii/S0925231224004806) | no | - | [Link](https://www.sciencedirect.com/science/article/abs/pii/S0925231224004806) | TACQA is a triple-alignment-enhanced complex question answering method that incorporates global token alignment, function alignment, and argument alignment. 
| [Link](https://www.sciencedirect.com/science/article/abs/pii/S0925231224004806) | \ No newline at end of file diff --git a/wikidata/KQA Pro.md b/wikidata/KQA Pro.md index bf7c396b..c6356c1e 100644 --- a/wikidata/KQA Pro.md +++ b/wikidata/KQA Pro.md @@ -3,3 +3,17 @@ datasetUrl: http://thukeg.gitee.io/kqa-pro/ --- +| Model / System | Year | Overall | Multi-hop | Qualifier | Comparison | Logical | Count | Verify | Zero-shot | Language | Reported by | +|:--------------:|:----:|:-------:|:---------:|:---------:|:----------:|:-------:|:-----:|:------:|:---------:|:----------------------------------------:|:-----------------------------------------------------------------------------------:| +| KVMemNet | 2024 | 16.61 | 16.50 | 18.47 | 1.17 | 14.99 | 27.31 | 54.70 | 0.06 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| SRN | 2024 | - | 12.33 | - | - | - | - | - | - | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| EmbedKGQA | 2024 | 28.36 | 26.41 | 25.20 | 11.93 | 23.95 | 32.88 | 61.05 | 0.06 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| RGCN | 2024 | 35.07 | 34.00 | 27.61 | 30.03 | 35.85 | 41.91 | 65.88 | 0.00 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| RNN SPARQL | 2024 | 41.98 | 36.01 | 19.04 | 66.98 | 37.74 | 50.26 | 58.84 | 26.08 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| RNN KoPL | 2024 | 43.85 | 37.71 | 22.19 | 65.90 | 47.45 | 50.04 | 42.13 | 34.96 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| BART SPARQL | 2024 | 89.68 | 88.49 | 83.09 | 96.12 | 88.67 | 85.78 | 92.33 | 87.88 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| BART KoPL | 2024 | 90.55 | 89.46 | 84.76 | 95.51 | 89.30 | 86.65 | 93.30 | 89.59 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| GraphQ IR | 2024 | 91.70 | 90.38 | 84.90 | 97.15 | 92.64 | 89.39 | 94.20 | 94.20 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| SAE&SAA | 2024 | 91.72 | - | - | - | - | - | - | - | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| PER-KBQA | 2024 | 92.57 | 91.35 | 88.01 | 97.06 | 92.27 | 88.12 | 94.17 | 91.46 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | +| TACQA | 2024 | 92.82 | 91.57 | 88.44 | 97.38 | 91.70 | 86.68 | 93.91 | 92.58 | EN | [Wang et al.](https://www.sciencedirect.com/science/article/pii/S0925231224004806) | \ No newline at end of file diff --git a/wikidata/QALD-10.md b/wikidata/QALD-10.md index feabfe53..f0977450 100644 --- a/wikidata/QALD-10.md +++ b/wikidata/QALD-10.md @@ -3,10 +3,18 @@ datasetUrl: https://github.com/KGQA/QALD-10 --- -| Model / System | Year | Precision | Recall | F1 | Language | Reported by | -|:--------------------------:|:------------:|:-------------:|:-------------:|:-------------:|:---------------------------------------------------------------:|:----------------------------------------------------------------------:| -| Borroto et al. 
(SPARQL-QA) | 2022 | 45.38 (Macro) | 45.74 (Macro) | 59.47 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205200035) |
-| QAnswer | 2022 | 50.68 (Macro) | 52.38 (Macro) | 57.76 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205120000) |
-| Steinmetz et al. | 2022 | 32.06 (Macro) | 33.12 (Macro) | 49.09 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205260012) |
-| Baramiia et al. | 2022 | 42.89 (Macro) | 42.72 (Macro) | 42.81 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205210032) |
-| Gavrilev et al. | 2022 | 14.21 (Macro) | 14.00 (Macro) | 19.48 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205210032) |
+| Model / System | Year | Hits@1 | Precision | Recall | F1 | Language | Reported by |
+|:--------------------------:|:----:|:------:|:-------------:|:-------------:|:-------------:|:---------------------------------------------------------------:|:----------------------------------------------------------------------:|
+| Borroto et al. (SPARQL-QA) | 2022 | - | 45.38 (Macro) | 45.74 (Macro) | 59.47 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205200035) |
+| QAnswer | 2022 | - | 50.68 (Macro) | 52.38 (Macro) | 57.76 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205120000) |
+| Steinmetz et al. | 2022 | - | 32.06 (Macro) | 33.12 (Macro) | 49.09 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205260012) |
+| Baramiia et al. | 2022 | - | 42.89 (Macro) | 42.72 (Macro) | 42.81 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205210032) |
+| Gavrilev et al. | 2022 | - | 14.21 (Macro) | 14.00 (Macro) | 19.48 (Macro) | EN | [GERBIL](https://gerbil-qa.aksw.org/gerbil/experiment?id=202205210032) |
+| IO prompt w/ChatGPT | 2024 | 42.0 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| CoT prompt w/ChatGPT | 2024 | 42.9 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| SC prompt w/ChatGPT | 2024 | 45.3 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior FT SOTA | 2024 | 45.4 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/ChatGPT | 2024 | 46.2 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/Deepseek-V2 | 2024 | 50.2 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| EffiQA w/GPT-4 | 2024 | 51.4 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |
+| Prior tight-coupling SOTA | 2024 | 54.7 | - | - | - | EN | [Dong et al.](https://arxiv.org/pdf/2406.01238) |