Structure task output (#174)
# Add Structured Output Feature

This PR adds Structured Output support to KaibanJS, allowing users to
define and enforce specific output structures for their agent tasks
using Zod schemas.

## Changes
- Add outputSchema property to Task class
- Implement schema validation in ReactChampionAgent
- Add OUTPUT_SCHEMA_VALIDATION_ERROR state
- Integrate with workflowLogs for error tracking
- Add schema validation feedback loop in agent processing

## Implementation Details
- Use Zod for schema validation
- Add schema validation to agent's output processing
- Integrate with existing state management system
- Add validation error handling in agent loop
- Track validation events in workflowLogs

## Testing
- Add tests for schema validation
- Add tests for error handling
- Add tests for agent recovery from validation errors
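The validation feedback loop this PR describes can be sketched roughly as follows. This is a simplified illustration, not the actual KaibanJS implementation: the `runTask` and `makeAgent` helpers, the retry limit, and the plain-function stand-in for a Zod schema's `safeParse` are all assumptions made so the example is self-contained.

```javascript
// Sketch of the schema-validation feedback loop described above.
// A plain validator stands in for a real Zod schema so the example
// runs without dependencies; all names here are illustrative.

// Hypothetical stand-in for a Zod schema's safeParse():
const outputSchema = {
  safeParse(data) {
    if (typeof data === 'object' && data !== null && typeof data.summary === 'string') {
      return { success: true, data };
    }
    return { success: false, error: 'expected an object with a string "summary" field' };
  },
};

// Simulated agent: fails validation on the first attempt, then recovers.
function makeAgent() {
  let attempts = 0;
  return () => {
    attempts += 1;
    return attempts === 1
      ? 'free-form text, not structured'
      : { summary: 'Structured result' };
  };
}

// Feedback loop: re-invoke the agent until its output matches the schema,
// recording each failure the way workflowLogs tracks validation errors.
function runTask(agent, schema, maxRetries = 3) {
  const workflowLogs = [];
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const output = agent();
    const result = schema.safeParse(output);
    if (result.success) {
      return { output: result.data, workflowLogs };
    }
    workflowLogs.push({ state: 'OUTPUT_SCHEMA_VALIDATION_ERROR', attempt, error: result.error });
  }
  throw new Error('Task failed schema validation after retries');
}

const { output, workflowLogs } = runTask(makeAgent(), outputSchema);
console.log(output.summary); // → "Structured result"
console.log(workflowLogs.length); // → 1 (one logged validation error before recovery)
```

In the real feature, the error message from the failed parse is fed back to the agent so the LLM can correct its output on the next attempt; here the simulated agent simply recovers on its own.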
darielnoel authored Dec 21, 2024
2 parents 2491618 + e240d19 commit 2d85d7e
Showing 27 changed files with 8,858 additions and 77 deletions.
7 changes: 7 additions & 0 deletions packages/tools/CHANGELOG.md
@@ -5,6 +5,7 @@ All notable changes to the `@kaibanjs/tools` package will be documented in this
## [0.4.1] - 2024-12-19

### Documentation

- Added missing README files for:
  - Exa Search Tool
  - Firecrawl Tool
@@ -14,38 +15,44 @@ All notable changes to the `@kaibanjs/tools` package will be documented in this
## [0.4.0] - 2024-12-19

### Added

- Zapier Webhook Tool for workflow automation integration
- Make Webhook Tool for Make (formerly Integromat) integration

## [0.3.0] - 2024-12-14

### Added

- Simple RAG Tool for basic RAG implementations
- Website Search Tool for semantic website content search
- PDF Search Tool for document analysis
- Text File Search Tool for plain text processing

### Enhanced

- Added support for custom vector stores
- Improved documentation for all tools
- Added comprehensive examples in tool READMEs

## [0.2.0] - 2024-11-17

### Added

- Serper Tool for Google Search API integration
- WolframAlpha Tool for computational queries
- Exa Search Tool for neural search capabilities
- GitHub Issues Tool for repository management

### Improved

- Enhanced error handling across all tools
- Better type definitions and input validation
- Updated documentation with more examples

## [0.1.0] - Initial Release

### Added

- Initial package setup
- Basic tool implementation structure
- Core utility functions
28 changes: 14 additions & 14 deletions packages/tools/README.md
@@ -23,20 +23,20 @@ npm install @kaibanjs/tools

Here's a list of all available tools. Click on the tool names to view their detailed documentation.

| Tool | Description | Documentation |
| ---------------- | ---------------------------------------------------------------------- | --------------------------------------- |
| Exa | AI-focused search engine using embeddings to organize web data | [README](src/exa/README.md) |
| Firecrawl | Web scraping service for extracting structured data | [README](src/firecrawl/README.md) |
| GitHub Issues | GitHub API integration for fetching and analyzing repository issues | [README](src/github-issues/README.md) |
| PDF Search | Extract and search content from PDF documents | [README](src/pdf-search/README.md) |
| Serper | Google Search API integration with support for multiple search types | [README](src/serper/README.md) |
| Simple RAG | Basic Retrieval-Augmented Generation implementation for Q&A | [README](src/simple-rag/README.md) |
| Tavily Search | AI-optimized search engine for comprehensive and accurate results | [README](src/tavily/README.md) |
| Text File Search | Search and analyze content within text files | [README](src/textfile-search/README.md) |
| Website Search | Semantic search within website content using RAG models | [README](src/website-search/README.md) |
| WolframAlpha | Computational intelligence engine for complex queries and calculations | [README](src/wolfram-alpha/README.md) |
| Zapier Webhook | Integration with Zapier for workflow automation | [README](src/zapier-webhook/README.md) |
| Make Webhook | Integration with Make (formerly Integromat) for workflow automation | [README](src/make-webhook/README.md) |

## Development

16 changes: 8 additions & 8 deletions packages/tools/src/exa/README.md
@@ -56,11 +56,11 @@ const tool = new ExaSearch({
```javascript
  type: 'neural',
  useAutoprompt: false,
  numResults: 10,
  category: 'company',
});

const result = await tool._call({
  query: 'AI companies focusing on natural language processing',
});
```

@@ -75,13 +75,13 @@ const tool = new ExaSearch({
  startPublishedDate: '2023-01-01',
  contents: {
    text: { maxCharacters: 1000, includeHtmlTags: false },
    highlights: { numSentences: 3, highlightsPerUrl: 2 },
  },
});

try {
  const result = await tool._call({
    query: 'recent developments in quantum computing',
  });
  console.log(result);
} catch (error) {
@@ -91,4 +91,4 @@ try {

### Disclaimer

Ensure you have proper API credentials and respect Exa's usage terms and rate limits. Some features may require specific subscription tiers.
18 changes: 9 additions & 9 deletions packages/tools/src/firecrawl/README.md
@@ -44,11 +44,11 @@ The output is the scraped content from the specified URL, formatted according to
```javascript
const tool = new Firecrawl({
  apiKey: 'your-api-key',
  format: 'markdown',
});

const result = await tool._call({
  url: 'https://example.com',
});
```

@@ -57,17 +57,17 @@ const result = await tool._call({
```javascript
const tool = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY,
  format: 'markdown',
});

try {
  const result = await tool._call({
    url: 'https://example.com/blog/article',
  });

  // Process the scraped content
  console.log('Scraped content:', result);

  // Use the content with an LLM or other processing
  // ...
} catch (error) {
@@ -77,4 +77,4 @@ try {

### Disclaimer

Ensure you have proper API credentials and respect Firecrawl's usage terms and rate limits. The service offers flexible pricing plans, including a free tier for small-scale use. When scraping websites, make sure to comply with the target website's terms of service and robots.txt directives.
22 changes: 12 additions & 10 deletions packages/tools/src/github-issues/README.md
@@ -26,6 +26,7 @@ The tool uses the following components:
## Authentication

The tool supports two authentication modes:

- Unauthenticated: Works with public repositories (60 requests/hour limit)
- Authenticated: Uses GitHub Personal Access Token (5,000 requests/hour limit)

@@ -36,6 +37,7 @@ The input should be a JSON object with a "repoUrl" field containing the GitHub r
## Output

The output is a structured JSON object containing:

- Repository information (name, URL, owner)
- Metadata (total issues, last updated date, limit)
- Array of issues with details (number, title, URL, labels, description)
@@ -46,11 +48,11 @@ The output is a structured JSON object containing:
```javascript
// Basic usage
const tool = new GithubIssues({
  token: 'github_pat_...', // Optional: GitHub personal access token
  limit: 20, // Optional: number of issues to fetch (default: 10)
});

const result = await tool._call({
  repoUrl: 'https://github.com/owner/repo',
});
```

@@ -59,20 +61,20 @@ const result = await tool._call({
```javascript
const tool = new GithubIssues({
  token: process.env.GITHUB_TOKEN,
  limit: 50,
});

try {
  const result = await tool._call({
    repoUrl: 'https://github.com/facebook/react',
  });

  // Access structured data
  console.log('Repository:', result.repository.name);
  console.log('Total Issues:', result.metadata.totalIssues);

  // Process issues
  result.issues.forEach((issue) => {
    console.log(`#${issue.number}: ${issue.title}`);
    console.log(`Labels: ${issue.labels.join(', ')}`);
    console.log(`URL: ${issue.url}\n`);
@@ -89,4 +91,4 @@ try {

### Disclaimer

Ensure you have proper API credentials if needed and respect GitHub's API rate limits and terms of service. For private repositories, authentication is required.
26 changes: 14 additions & 12 deletions packages/tools/src/serper/README.md
@@ -16,6 +16,7 @@ The tool uses the following components:
## Search Types

The tool supports multiple search types:

- "search" (default): For general search queries
- "images": For image search
- "videos": For video search
@@ -39,6 +40,7 @@ The tool supports multiple search types:
## Input

The input depends on the search type:

- For webpage scraping: A JSON object with a "url" field
- For all other search types: A JSON object with a "query" field

@@ -52,21 +54,21 @@ The output is a structured JSON response from Serper containing search results b
```javascript
// Basic search
const tool = new Serper({
  apiKey: 'your-api-key',
  type: 'search', // Optional, defaults to 'search'
});

const result = await tool._call({
  query: 'latest AI developments',
});

// Webpage scraping
const webScraperTool = new Serper({
  apiKey: 'your-api-key',
  type: 'webpage',
});

const scrapingResult = await webScraperTool._call({
  url: 'https://example.com',
});
```

@@ -77,14 +79,14 @@ const tool = new Serper({
  apiKey: process.env.SERPER_API_KEY,
  type: 'news',
  params: {
    num: 10, // Number of results
    gl: 'us', // Geographic location
  },
});

try {
  const result = await tool._call({
    query: 'artificial intelligence breakthroughs',
  });
  console.log(result);
} catch (error) {
@@ -94,4 +96,4 @@ try {

### Disclaimer

Ensure you have proper API credentials and respect Serper's usage terms and rate limits. The webpage scraping feature is in Beta and may be subject to changes.
18 changes: 9 additions & 9 deletions packages/tools/src/tavily/README.md
@@ -35,11 +35,11 @@ The output is a JSON-formatted string containing an array of search results from
```javascript
const tool = new TavilySearchResults({
  apiKey: 'your-api-key',
  maxResults: 5, // Optional, defaults to 5
});

const result = await tool._call({
  searchQuery: 'What are the latest developments in AI?',
});
```

@@ -48,17 +48,17 @@ const result = await tool._call({
```javascript
const tool = new TavilySearchResults({
  apiKey: process.env.TAVILY_API_KEY,
  maxResults: 10,
});

try {
  const result = await tool._call({
    searchQuery: 'recent breakthroughs in quantum computing',
  });

  // Parse the JSON string back to an object
  const searchResults = JSON.parse(result);

  // Process the results
  searchResults.forEach((item, index) => {
    console.log(`Result ${index + 1}:`, item);
@@ -70,4 +70,4 @@ try {

### Disclaimer

Ensure you have proper API credentials and respect Tavily's usage terms and rate limits. The search results are optimized for current events and may vary based on the time of the query.
14 changes: 7 additions & 7 deletions packages/tools/src/wolfram-alpha/README.md
@@ -39,11 +39,11 @@ The output is the response from WolframAlpha's computational engine, providing d

```javascript
const tool = new WolframAlphaTool({
  appId: 'your-app-id',
});

const result = await tool._call({
  query: 'solve x^2 + 2x + 1 = 0',
});
```

@@ -56,12 +56,12 @@ const result = await tool._call({

```javascript
const tool = new WolframAlphaTool({
  appId: process.env.WOLFRAM_APP_ID,
});

try {
  const result = await tool._call({
    query: 'calculate the orbital period of Mars',
  });
  console.log(result);
} catch (error) {
@@ -71,4 +71,4 @@ try {

### Disclaimer

Ensure you have proper API credentials and respect WolframAlpha's usage terms and rate limits.