docs

ServiceStack · May 13, 2024 · 4044af9
1 parent 3710c55
commit 4044af9

Showing 2 changed files with 45 additions and 33 deletions.
diff --git a/MyApp/_pages/about.md b/MyApp/_pages/about.md
@@ -8,27 +8,30 @@ Like most developers we're captivated by the amazing things large language model
 have to transform the way we interact with and use technology. One of the areas they can be immediately beneficial with
 is in getting help in learning how to accomplish a task or solving a particular issue.
 
-Previously we would need to seek out answers by scanning the Internet, reading through documentation and blogs to find
-out answers for ourselves. Forums and particularly Stack Overflow have been a great resource for developers in being able
-to get help from other developers who have faced similar issues. But the timeliness and quality of the responses can vary
+Previously we'd need to seek out answers by scanning the Internet, reading through docs, tutorials and blogs to find
+out answers for ourselves. Forums and particularly Stack Overflow have been great resources for developers in being able
+to get help from other devs who have faced similar issues. But the timeliness and quality of the responses can vary
 based on the popularity of the question and the expertise of the person answering. Answers may also not be 100% relevant
 to our specific situation, potentially requiring reading through multiple answers from multiple questions to get the help
 we want.
 
-But now, with the advent of large language models, we can get help in a more natural way by simply asking a question in
-plain English and getting an immediate response that is tailored to our specific needs.
+But with the advent of large language models, we can get help in a more natural way by simply asking a question in
+plain English and getting an immediate response that's tailored to our specific needs. 
+
+With the rate of progress in both the quality of performance of LLMs and the hardware to run them we expect this to become 
+the new normal for how most people will get answers to their questions in future.
 
 ## Person vs Question
 
-[pvq.app](https://pvq.app) was built to provide a useful platform for other developers in this new age by enlisting the help of the
+[pvq.app](https://pvq.app) was created to provide a useful platform for other developers in this new age by enlisting the help of the
 best Open Source and Proprietary large language models available to provide immediate and relevant answers to specific questions.
 Instead of just using a single LLM to provide answers, we're using multiple models to provide different perspectives
 on the same question that we'll use to analyze the strengths of different LLMs at answering different types of questions.
 
 ## Initial Base Line
 
-For our initial dataset we've started with the top 100k questions from StackOverflow and created answers for them using
-the most popular open LLM's that were ideally suited for answering technical and programming questions:
+PvQ's initial dataset started with the **top 100k questions** from StackOverflow and generated **over 1 million answers**
+for them using the most popular open LLMs that were ideally suited for answering technical and programming questions, including:
 
 - [Gemma 2B](https://ai.google.dev/gemma) (2B) by Google
 - [Qwen 1.5](https://github.com/QwenLM/Qwen1.5) (4B) by Qwen Team
@@ -87,12 +90,15 @@ For new questions asked we'll also include access to the best performing proprie
 - [Claude 3 Haiku](https://www.anthropic.com/news/claude-3-haiku) by Anthropic
 - [Llama3 70B](https://llama.meta.com/llama3/) (70B) by Meta
 - [Command-R](https://cohere.com/blog/command-r) (35B) by Cohere
-- [WizardLM2](https://wizardlm.github.io/WizardLM2/) (8x22B) by Microsoft
+- [WizardLM2](https://wizardlm.github.io/WizardLM2/) (8x22B) by Microsoft (Mistral AI base model)
 - [Claude 3 Sonnet](https://www.anthropic.com/news/claude-3-family) by Anthropic
 - [Command-R+](https://cohere.com/blog/command-r-plus-microsoft-azure) (104B) by Cohere
 - [GPT 4 Turbo](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo) by OpenAI
 - [Claude 3 Opus](https://www.anthropic.com/claude) by Anthropic
 
+All models were used to answer the **Top 1000 most voted questions** on StackOverflow to evaluate their performance in
+answering technical questions on our [Leaderboard](/leaderboard).
+
 ## Open Questions and Answers for all
 
 All questions, answers and comments is publicly available for everyone to freely use under the same

diff --git a/MyApp/_posts/2024-04-01_pvq-intro.md b/MyApp/_posts/2024-04-01_pvq-intro.md
@@ -12,27 +12,30 @@ Like most developers we're captivated by the amazing things large language model
 have to transform the way we interact with and use technology. One of the areas they can be immediately beneficial with
 is in getting help in learning how to accomplish a task or solving a particular issue.
 
-Previously we would need to seek out answers by scanning the Internet, reading through documentation and blogs to find
-out answers for ourselves. Forums and particularly Stack Overflow have been a great resource for developers in being able
-to get help from other developers who have faced similar issues. But the timeliness and quality of the responses can vary
+Previously we'd need to seek out answers by scanning the Internet, reading through docs, tutorials and blogs to find
+out answers for ourselves. Forums and particularly Stack Overflow have been great resources for developers in being able
+to get help from other devs who have faced similar issues. But the timeliness and quality of the responses can vary
 based on the popularity of the question and the expertise of the person answering. Answers may also not be 100% relevant
 to our specific situation, potentially requiring reading through multiple answers from multiple questions to get the help
 we want.
 
-But now, with the advent of large language models, we can get help in a more natural way by simply asking a question in
-plain English and getting an immediate response that is tailored to our specific needs.
+But with the advent of large language models, we can get help in a more natural way by simply asking a question in
+plain English and getting an immediate response that's tailored to our specific needs.
+
+With the rate of progress in both the quality of performance of LLMs and the hardware to run them we expect this to become
+the new normal for how most people will get answers to their questions in future.
 
 ## Person vs Question
 
-[pvq.app](https://pvq.app) was built to provide a useful platform for other developers in this new age by enlisting the help of the 
-best Open Source and Proprietary large language models available to provide immediate and relevant answers to specific questions. 
-Instead of just using a single LLM to provide answers, we're using multiple models to provide different perspectives 
+[pvq.app](https://pvq.app) was created to provide a useful platform for other developers in this new age by enlisting the help of the
+best Open Source and Proprietary large language models available to provide immediate and relevant answers to specific questions.
+Instead of just using a single LLM to provide answers, we're using multiple models to provide different perspectives
 on the same question that we'll use to analyze the strengths of different LLMs at answering different types of questions.
 
 ## Initial Base Line
 
-For our initial dataset we've started with the top 100k questions from StackOverflow and created answers for them using
-the most popular open LLM's that were ideally suited for answering technical and programming questions:
+PvQ's initial dataset started with the **top 100k questions** from StackOverflow and generated **over 1 million answers**
+for them using the most popular open LLMs that were ideally suited for answering technical and programming questions, including:
 
 - [Gemma 2B](https://ai.google.dev/gemma) (2B) by Google
 - [Qwen 1.5](https://github.com/QwenLM/Qwen1.5) (4B) by Qwen Team
@@ -43,25 +46,25 @@ the most popular open LLM's that were ideally suited for answering technical and
 - [Gemma 7B](https://ai.google.dev/gemma) (7B) by Google
 - [Llama3 8B](https://llama.meta.com/llama3/) (8B) by Meta
 
-For our initial pass we've evaluated how each of these models performed on the StackOverflow dataset and have published 
-the results on our [Leaderboard](/leaderboard) page which we're also comparing against the highest voted and accepted answers on 
+For our initial pass we've evaluated how each of these models performed on the StackOverflow dataset and have published
+the results on our [Leaderboard](/leaderboard) page which we're also comparing against the highest voted and accepted answers on
 StackOverflow to see how well they measure up against the best human answers.
 
 ### Continuously Improving Models
 
-After evaluating the initial results we decided to remove the worst performing **Phi 2**, **Gemma 2B** and **Qwen 1.5 4B** 
-models from our base model lineup and replaced **Phi2** answers with **Phi3**, upgraded **Gemma 2B** to **Gemma 7B** and included the 
+After evaluating the initial results we decided to remove the worst performing **Phi 2**, **Gemma 2B** and **Qwen 1.5 4B**
+models from our base model lineup and replaced **Phi2** answers with **Phi3**, upgraded **Gemma 2B** to **Gemma 7B** and included the
 newly released **Llama3 8B** and **70B** models from Meta to our lineup.
 
 We'll be continuously evaluating and upgrading our active models to ensure we're using the best models available.
 
 ### Answers are Graded and Ranked
 
-In addition to answering questions, we're also enlisting the help of LLMs to help moderate answers, where all answers 
-(including user contributed answers) are graded and ranked based on how well and how relevant they answer the 
-question asked. 
+In addition to answering questions, we're also enlisting the help of LLMs to help moderate answers, where all answers
+(including user contributed answers) are graded and ranked based on how well and how relevant they answer the
+question asked.
 
-This information is used to rank the best answers for each question which are surfaced to the top, with its grade 
+This information is used to rank the best answers for each question which are surfaced to the top, with its grade
 displayed alongside answers to provide a review on the quality, relevance and critiques of the answer.
 
 ::: {.shadow .hover:shadow-lg}
@@ -91,24 +94,27 @@ For new questions asked we'll also include access to the best performing proprie
 - [Claude 3 Haiku](https://www.anthropic.com/news/claude-3-haiku) by Anthropic
 - [Llama3 70B](https://llama.meta.com/llama3/) (70B) by Meta
 - [Command-R](https://cohere.com/blog/command-r) (35B) by Cohere
-- [WizardLM2](https://wizardlm.github.io/WizardLM2/) (8x22B) by Microsoft
+- [WizardLM2](https://wizardlm.github.io/WizardLM2/) (8x22B) by Microsoft (Mistral AI base model)
 - [Claude 3 Sonnet](https://www.anthropic.com/news/claude-3-family) by Anthropic
 - [Command-R+](https://cohere.com/blog/command-r-plus-microsoft-azure) (104B) by Cohere
 - [GPT 4 Turbo](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo) by OpenAI
 - [Claude 3 Opus](https://www.anthropic.com/claude) by Anthropic
 
+All models were used to answer the **Top 1000 most voted questions** on StackOverflow to evaluate their performance in
+answering technical questions on our [Leaderboard](/leaderboard).
+
 ## Open Questions and Answers for all
 
 All questions, answers and comments is publicly available for everyone to freely use under the same
-[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license used by StackOverflow. 
+[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license used by StackOverflow.
 
 ## Help improve Answers
 
 You can help improve the quality of answers by providing any kind of feedback including asking new questions,
-up voting good answers, down voting bad ones, reporting inappropriate ones, correcting answers with inaccuracies or 
-asking the model for further clarifications on answers that are unclear. 
+up voting good answers, down voting bad ones, reporting inappropriate ones, correcting answers with inaccuracies or
+asking the model for further clarifications on answers that are unclear.
 
-The most active users who help curate and improve the quality of questions and answers will have the opportunity to 
+The most active users who help curate and improve the quality of questions and answers will have the opportunity to
 become moderators where they'll have access to all our models.
 
 We also welcome attempts to **Beat Large Language Models** by providing your own answers to questions. We'll rank
@@ -118,7 +124,7 @@ This feedback will feed back into [LeaderBoard](/leaderboard) and improve the qu
 
 ## Future Work
 
-After having established the initial base line we'll look towards evaluating different strategies and specialized models 
+After having established the initial base line we'll look towards evaluating different strategies and specialized models
 to see if we're able to improve the quality, ranking and grading of answers that can be provided.
 
 ## Feedback ❤️