RDF Syntax (R-Syn): Combination of several scores from tasks in which the LLM has to work with the syntax of RDF serialization formats.
RDF Analytics (R-Ana): Combination of several scores from the different variations of the RdfFriendCount task.
SPARQL Semantics (S-Sem): Combination of the Text2Sparql score and several Sparql2Answer scores.
SPARQL Syntax (S-Syn): The results of the SparqlSyntaxFixing task.
Brevity (Brev): Combination of several scores evaluating whether the LLM returns only the information asked for. Additional text makes the answer harder to parse, and generating it costs additional computing resources.
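
As a rough illustration of how several task scores could be folded into one value per dimension, here is a minimal Python sketch. It assumes an unweighted mean as the combination rule, which this document does not specify. The names DIMENSION_TASKS and compass_scores and all score values are hypothetical. Only the three dimensions whose input tasks are named above are included.

```python
from statistics import mean

# Hypothetical mapping from compass dimension to task-name prefixes.
# R-Syn and Brev are left out because their input tasks are not listed here.
DIMENSION_TASKS = {
    "R-Ana": ["RdfFriendCount"],
    "S-Sem": ["Text2Sparql", "Sparql2Answer"],
    "S-Syn": ["SparqlSyntaxFixing"],
}


def compass_scores(task_scores: dict[str, float]) -> dict[str, float]:
    """Average all task scores whose name starts with a listed prefix.

    Assumption: an unweighted mean; the real combination rule may differ.
    """
    result = {}
    for dimension, prefixes in DIMENSION_TASKS.items():
        matched = [
            score
            for task, score in task_scores.items()
            if any(task.startswith(prefix) for prefix in prefixes)
        ]
        if matched:  # skip dimensions that have no score at all
            result[dimension] = mean(matched)
    return result


# Made-up scores for illustration only, not benchmark results.
example = {
    "RdfFriendCount-turtle-1": 0.5,
    "RdfFriendCount-turtle-2": 1.0,
    "Text2Sparql-Organizational": 0.25,
    "Sparql2Answer-turtle": 0.75,
    "SparqlSyntaxFixingList-LcQuad": 1.0,
}
print(compass_scores(example))
```

A weighted mean or a per-task normalization step would slot into the same structure by replacing the call to mean.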

In the plots, the mean value is indicated by the solid black line, and the blue area represents the variance.
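
The two statistics named above can be computed from repeated runs of a task as in the following sketch. It assumes the population variance, and the function name summarize and the score list are made up for illustration. Whether the plots use population or sample variance, and how exactly the blue area is derived from it, is not stated here.

```python
from statistics import mean, pvariance


def summarize(scores: list[float]) -> tuple[float, float]:
    """Return (mean, population variance) of the scores of one task."""
    return mean(scores), pvariance(scores)


# Hypothetical scores of one model over four runs of the same task.
runs = [0.5, 1.0, 0.0, 0.5]
average, variance = summarize(runs)
print(f"mean={average}, variance={variance}")
```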

Open LLMs

The following table shows an overview of Capability Compass plots for open LLMs. Each row contains one LLM family; the columns order the models by parameter count.

Family	0.5B	1B	1.5B	3B/3.8B	7B/8B	MoE Active 6-14B	14B	32B/33B	70B/72B
Llama 3.0					Meta-Llama-3-8B-Instruct				Meta-Llama-3-70B-Instruct
Llama 3.1					Llama-3.1-8B-Instruct				Llama-3.1-70B-Instruct
Llama 3.2		Llama-3.2-1B-Instruct		Llama-3.2-3B-Instruct
Llama 3.3									Llama-3.3-70B-Instruct
Phi 3.0				Phi-3-mini-128k-instruct	Phi-3-small-128k-instruct		Phi-3-medium-128k-instruct
Phi 3.5				Phi-3.5-mini-instruct		Phi-3.5-MoE-instruct
Qwen2	Qwen2-0.5B-Instruct		Qwen2-1.5B-Instruct		Qwen2-7B-Instruct	Qwen2-57B-A14B-Instruct			Qwen2-72B-Instruct
Qwen2.5	Qwen2.5-0.5B-Instruct		Qwen2.5-1.5B-Instruct	Qwen2.5-3B-Instruct	Qwen2.5-7B-Instruct		Qwen2.5-14B-Instruct	Qwen2.5-32B-Instruct	Qwen2.5-72B-Instruct
Qwen2.5-Coder								Qwen2.5-Coder-32B-Instruct
Infly-OpenCoder					OpenCoder-8B-Instruct
Deepseek-coder								deepseek-coder-33b-instruct

Task Plots

RDF Connection Explain Tasks

plot	caption
	RdfConnectionExplainStatic, graphFormat=jsonld: listTrimF1 score
	RdfConnectionExplainStatic, graphFormat=nt: listTrimF1 score
	RdfConnectionExplainStatic, graphFormat=turtle: listTrimF1 score
	RdfConnectionExplainStatic, graphFormat=xml: listTrimF1 score

RDF Friend Count Tasks

plot	caption
	RdfFriendCount, graphFormat=jsonld, 1 additional link: F1 score
	RdfFriendCount, graphFormat=jsonld, 2 additional links: F1 score
	RdfFriendCount, graphFormat=nt, 1 additional link: F1 score
	RdfFriendCount, graphFormat=nt, 2 additional links: F1 score
	RdfFriendCount, graphFormat=turtle, 1 additional link: F1 score
	RdfFriendCount, graphFormat=turtle, 2 additional links: F1 score
	RdfFriendCount, graphFormat=xml, 1 additional link: F1 score
	RdfFriendCount, graphFormat=xml, 2 additional links: F1 score
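
For readers unfamiliar with the metric in these captions, the sketch below shows the standard F1 score (harmonic mean of precision and recall) over a set of expected and a set of predicted answers. It is a generic textbook formulation, not necessarily the benchmark's own scoring code. The function name f1_score, the node names, and the choice to return 0.0 for empty inputs are assumptions.

```python
def f1_score(expected: set[str], predicted: set[str]) -> float:
    """Standard F1 over two answer sets.

    Assumption: an empty expected or predicted set scores 0.0.
    """
    if not expected or not predicted:
        return 0.0
    true_positives = len(expected & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of answers that are right
    recall = true_positives / len(expected)      # share of right answers found
    return 2 * precision * recall / (precision + recall)


# Hypothetical answers: one correct node plus one superfluous node.
score = f1_score({"anne"}, {"anne", "bob"})
print(round(score, 3))
```

In this example precision is 0.5 and recall is 1.0, so the superfluous answer lowers the score even though the correct node was found.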

RDF Syntax Fix Tasks

plot	caption
	RdfSyntaxFixList, graphFormat=jsonld: max(combined) score
	RdfSyntaxFixList, graphFormat=nt: max(combined) score
	RdfSyntaxFixList, graphFormat=turtle: max(combined) score

SPARQL Syntax Fix Task

plot	caption
	SparqlSyntaxFixingList, dataset=LcQuad: max(combined) score

SPARQL to Answer Tasks

plot	caption
	Sparql2Answer, dataset=Organizational, graphFormat=jsonld: combinedF1 score
	Sparql2Answer, dataset=Organizational, graphFormat=turtle: combinedF1 score

Text to Answer Tasks

plot	caption
	Text2Answer, dataset=Organizational, graphFormat=jsonld: combinedF1 score
	Text2Answer, dataset=Organizational, graphFormat=turtle: combinedF1 score

Text to SPARQL Tasks

plot	caption
	Text2Sparql, dataset=Organizational, graphInfo=turtle graph: max(combined) score
	Text2Sparql, dataset=Orga Numerical, graphInfo=turtle graph + ID-label-mapping: max(combined) score
	Text2Sparql, dataset=Coypu Mini, graphInfo=turtle graph: max(combined) score
	Text2Sparql, dataset=Beastiary, graphInfo=turtle schema: max(combined) score
	Text2Sparql, dataset=Beastiary, graphInfo=turtle subschema: max(combined) score
	Text2Sparql, dataset=Beastiary, graphInfo=turtle subgraph: max(combined) score
