Compositional Freebase Questions

Compositional Freebase Questions (CFQ) [1] is a dataset for measuring compositional generalization in semantic parsing. The task is to parse natural-language questions into SPARQL queries executable against Freebase. The dataset provides multiple train/test splits generated with the Distribution-Based Compositionality Assessment (DBCA) method: each split maximizes the divergence between the train and test distributions of compounds while keeping the divergence of primitive elements (atoms) low. In these splits, the test set is constrained to examples containing novel compounds, i.e., new ways of composing the atoms seen during training. The dataset contains 239,357 English question-answer pairs, covering 49,320 question patterns and 34,921 SPARQL query patterns.
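To make the notions of atom and compound divergence concrete, here is a minimal sketch of how a divergence between train and test distributions could be computed. It assumes the Chernoff-coefficient formulation described in [1], with alpha = 0.5 for atoms and alpha = 0.1 for compounds. The function names and the toy atoms and compounds are hypothetical, and the sketch skips the compound weighting scheme that the paper applies, so it illustrates the idea rather than reproducing the official DBCA implementation.

```python
from collections import Counter


def normalize(counts):
    """Turn raw counts into a frequency distribution (assumes non-empty input)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize an empty collection")
    return {key: value / total for key, value in counts.items()}


def chernoff_coefficient(p, q, alpha):
    """Sum of p_k^alpha * q_k^(1 - alpha) over keys present in both distributions."""
    shared = p.keys() & q.keys()
    return sum((p[k] ** alpha) * (q[k] ** (1 - alpha)) for k in shared)


def divergence(train_items, test_items, alpha):
    """Return 1 - Chernoff coefficient between train and test frequencies."""
    p = normalize(Counter(train_items))
    q = normalize(Counter(test_items))
    return 1.0 - chernoff_coefficient(p, q, alpha)


# Toy data (hypothetical): atoms are shared, compounds are disjoint.
train_atoms = ["direct", "produce", "Inception", "direct", "produce", "Inception"]
test_atoms = ["direct", "produce", "Inception"]
train_compounds = ["direct+Inception", "produce+Inception"]
test_compounds = ["direct+produce", "produce+direct"]

atom_divergence = divergence(train_atoms, test_atoms, alpha=0.5)
compound_divergence = divergence(train_compounds, test_compounds, alpha=0.1)
print(f"atom divergence: {atom_divergence:.3f}")
print(f"compound divergence: {compound_divergence:.3f}")
```

In this toy setup the atom frequencies match across train and test while no compound is shared, which is the kind of split the DBCA method aims for: low atom divergence and high compound divergence.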

This dataset can be downloaded via the link.

Leaderboard

References

[1] Keysers, D., Schärli, N., Scales, N., Buisman, H., Furrer, D., Kashubin, S., Momchev, N., Sinopalnikov, D., Stafiniak, L., Tihon, T. and Tsarkov, D., 2019. Measuring compositional generalization: A comprehensive method on realistic data. arXiv preprint arXiv:1912.09713.

Go back to the README
