
chore: refactor COT for maintainability #58

Merged: 2 commits from cot-refactor into main on Jan 28, 2025

Conversation

ponderingdemocritus (Contributor) commented Jan 28, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced logging capabilities with debug-level log support
    • Improved goal management and context gathering in Chain of Thought processing
  • Styling Updates

    • Updated sidebar components with new background gradients and backdrop effects
    • Added visual enhancements for light and dark themes
  • Performance Improvements

    • Refined goal prioritization and processing mechanisms
    • More modular approach to goal decomposition and action handling
  • Model Configuration

    • Updated model source for language model initialization
  • Accessibility Enhancements

    • Made goal management properties publicly accessible and introduced new methods for handling goal failures and prioritization

vercel bot commented Jan 28, 2025

The latest updates on your projects:

daydreams: ✅ Ready, preview available, updated Jan 28, 2025 5:01am (UTC)

coderabbitai bot (Contributor) commented Jan 28, 2025

Walkthrough

This pull request introduces comprehensive changes across multiple files in the project. The modifications span UI styling, logging configuration, and core functionality improvements. Key updates include enhancing the visual presentation of sidebar components with gradient backgrounds, adding debug logging capabilities, changing the language model source, and significantly refactoring the ChainOfThought and GoalManager classes to improve goal decomposition, context gathering, and goal management processes.

Changes

  • clients/example-ui/src/components/sidebar-right.tsx: Updated className with radial gradient background styles
  • clients/example-ui/src/routes/__root.tsx: Added complex CSS classes for AppSidebar, SidebarInset, and SidebarRight with background and backdrop styling
  • examples/example-basic.ts: Added LogLevel.DEBUG configuration for ChainOfThought
  • examples/example-goal.ts: Changed LLM model from "deepseek/deepseek-r1" to "openrouter:deepseek/deepseek-r1"
  • packages/core/src/core/chain-of-thought.ts: Major refactoring of goal decomposition, context gathering, and thinking process with new private methods
  • packages/core/src/core/goal-manager.ts: Changed goals property visibility, added methods for goal failure processing and priority-based goal retrieval

Sequence Diagram

sequenceDiagram
    participant User
    participant ChainOfThought
    participant GoalManager
    participant LLMClient

    User->>ChainOfThought: Provide objective
    ChainOfThought->>ChainOfThought: Gather relevant context
    ChainOfThought->>GoalManager: Create and link goals
    GoalManager-->>ChainOfThought: Goals created
    ChainOfThought->>LLMClient: Get initial actions
    LLMClient-->>ChainOfThought: Return actions
    ChainOfThought->>ChainOfThought: Process and verify actions
    ChainOfThought->>GoalManager: Update goal status


Poem

🐰 Hopping through code with glee,
Gradients bloom, goals run free!
Sidebars shine with radiant light,
Debugging turned up just right.
A rabbit's code, a pure delight! 🌈


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (21)
packages/core/src/core/chain-of-thought.ts (14)

Line range hint 108-147: Handle edge cases in gatherRelevantContext.
This function gracefully combines relevant docs and experiences. If both come back empty, you may want to log a warning or fallback context to prevent possible downstream issues. Otherwise, this naming and flow are clear.
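A rough sketch of such a fallback (the logger call and the local result names are assumptions, not taken from the diff):

// Inside gatherRelevantContext, after both lookups have resolved.
if (relevantDocs.length === 0 && relevantEpisodes.length === 0) {
    // Assumed logger API — adjust to the project's Logger signature.
    this.logger.warn(
        "gatherRelevantContext",
        "No relevant documents or experiences found; using fallback context",
        { objective }
    );
    return {
        gameStateContext: "No relevant documents were found for this objective.",
        experienceContext: "No prior experiences were found for this objective.",
    };
}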


149-152: Use consistent parameter naming and docstrings.
The function signature generateGoalHierarchy(objective, context) is descriptive, but adding a JSDoc block or clarifying docstrings for context would help ensure future maintainers understand the structure of context.gameStateContext and context.experienceContext.

Also applies to: 159-159, 162-162
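For example, a docstring along these lines (the field descriptions are assumptions about what the context carries, not taken from the diff):

/**
 * Decomposes an objective into a long/medium/short-term goal hierarchy.
 *
 * @param objective - The user-supplied objective to decompose.
 * @param context - Pre-gathered context for the LLM prompt:
 *   context.gameStateContext: relevant documents describing the current state;
 *   context.experienceContext: summaries of similar past episodes.
 * @returns The validated goal hierarchy produced by the LLM.
 */
private async generateGoalHierarchy(
    objective: string,
    context: { gameStateContext: string; experienceContext: string }
) { /* ... */ }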


Line range hint 210-220: Refine retry logging.
The code uses onRetry to log errors when validating the LLM response. This is good, but consider also logging the attempt count inside the error message for quicker correlation.
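For example (assuming the retry hook receives the attempt number; if it only receives the error, the count would need to be tracked in the enclosing scope):

onRetry: (error: Error, attempt: number) => {
    // Assumed logger signature — adjust to the project's Logger API.
    this.logger.error(
        "goal-hierarchy-validation",
        `Attempt ${attempt} failed while validating the LLM response: ${error.message}`
    );
},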


236-288: Factor out large “createAndLinkGoals” logic.
This method is quite long, containing multiple tasks: creating goals, linking dependencies, emitting events, etc. Consider splitting it further or adding inline comments for better readability.


335-341: Ensure prompt size is appropriate.
Building a prompt with all relevant documents and experiences can lead to large payloads. Consider implementing truncation or summarization if the combined text grows too large.
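A small helper along these lines could cap the combined context before it is interpolated into the prompt (the limit and helper name are illustrative):

const MAX_CONTEXT_CHARS = 8_000; // illustrative budget, tune per model

function truncateContext(text: string, limit: number = MAX_CONTEXT_CHARS): string {
    if (text.length <= limit) return text;
    // Keep the leading portion and mark the cut so the LLM knows content was omitted.
    return `${text.slice(0, limit)}\n[...context truncated...]`;
}

The two context strings would then be passed through truncateContext before building the prompt.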


673-678: Improve failure reason logging.
When a goal fails, you emit an event with the error. Consider additional logging or metadata to help track which step or sub-task caused the failure.


1239-1240: Add a descriptive comment for initializeThinkingContext.
This block starts a new logic path for “thinking.” A short comment describing the function’s responsibility (setup memory context, log user query, etc.) helps future maintainers.


1261-1273: Centralize success/failure handling.
The logic for executeAction, handle result, verify completion, etc., is well-structured. As the code grows, consider a single “action lifecycle” function to unify logging and error handling, avoiding repeated patterns in these lines.
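Sketch of what a single lifecycle helper could look like (the method name and event names are illustrative, not from the PR):

// Unifies logging, execution, and verification for one action.
private async runActionLifecycle(userQuery: string, action: CoTAction) {
    this.emit("action:start", { action });
    try {
        const result = await this.executeAction(action);
        const completion = await this.verifyCompletion(userQuery, result, action);
        this.emit("action:complete", { action, result, completion });
        return { result, completion };
    } catch (error) {
        this.emit("action:error", { action, error });
        throw error;
    }
}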


1284-1289: Refine system step naming.
Recording a system step with “continuation” is clear for developers. Consider including more descriptive text about what is being continued to aid debugging.


1294-1306: Limit large iteration loops.
Once the iteration limit is reached, you log an error and emit “think:timeout.” Evaluate whether automatically stopping is suitable or if partial solutions/alternative strategies are needed.


1308-1326: Update JSDoc for initializeThinkingContext.
You retrieve episodes and docs, then log them. The JSDoc could note that it stores the user query as a step for transparency. Minor but clarifies the method’s side effects.


1375-1404: Balance verification cost vs. reactivity.
Frequent calls to validateLLMResponseSchema can be expensive. Evaluate the performance tradeoff between robust LLM-based verification and the overhead it introduces.


Line range hint 1405-1449: Avoid re-building large prompts repeatedly.
buildVerificationPrompt calls this.buildPrompt({ result }). If done frequently, you might want to cache or partially reuse prompt components to reduce overhead.


1464-1474: Enhance final logs on completion.
handleCompletion logs a short message and emits "think:complete". For broader analytics, consider storing final reasoning or success metrics in your logs or memory layer.

examples/example-basic.ts (1)

22-22: Consolidate import usage.
Importing HandlerRole and LogLevel from the same module is good for discoverability. Consider grouping them together in a single import line for brevity.

packages/core/src/core/goal-manager.ts (3)

8-8: Reevaluate public exposure of goals.
Changing goals from private to public can break encapsulation. Consider providing read methods instead of direct public access.
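For example (a sketch that assumes goals is backed by a Map<string, Goal>; an array-backed store would use find and slice instead):

private goals: Map<string, Goal> = new Map();

// Read-only accessors for callers that previously reached into goals directly.
public getGoalById(id: string): Goal | undefined {
    return this.goals.get(id);
}

public getAllGoals(): ReadonlyArray<Goal> {
    return Array.from(this.goals.values());
}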


33-53: Promote consistent error/object logging.
processGoalFailure updates the goal status to “failed” and blocks the parent. Consider logging or emitting an event with the failure reason for easy debugging.
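One possible shape, assuming GoalManager can emit events or log (the reason parameter, event name, and parentGoal field are illustrative additions, not the existing signature):

public processGoalFailure(goal: Goal, reason: string): void {
    goal.status = "failed";
    // If GoalManager is not an EventEmitter, a logger call serves the same purpose.
    this.emit?.("goal:failed", { goalId: goal.id, reason });

    if (goal.parentGoal) {
        const parent = this.getGoalById(goal.parentGoal);
        if (parent) parent.status = "blocked";
    }
}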


54-79: Confirm horizon-based ordering.
The getReadyGoalsByPriority() method prioritizes short > medium > long. Ensure this approach aligns with your strategic or domain-specific requirements.

examples/example-goal.ts (3)

Line range hint 279-380: Consider modularizing the main interaction loop.

The main interaction loop is quite long and handles multiple concerns. Consider extracting the goal execution logic and progress tracking into separate functions for better maintainability.

Example refactor:

async function executeGoals(dreams: ChainOfThought) {
    const stats = {
        completed: 0,
        failed: 0,
        total: 0,
    };
    
    while (true) {
        const { readyGoals, activeGoals, pendingGoals } = getGoalStatus(dreams);
        
        displayProgress(stats, readyGoals, activeGoals, pendingGoals);
        
        if (isAllGoalsComplete(readyGoals, activeGoals, pendingGoals)) {
            console.log(chalk.green("\n✨ All goals completed!"));
            break;
        }
        
        if (isBlockedState(readyGoals, activeGoals)) {
            displayBlockedGoals(dreams, pendingGoals);
            break;
        }
        
        try {
            await dreams.processHighestPriorityGoal();
            stats.completed++;
        } catch (error) {
            stats.failed++;
            if (!await shouldContinueExecution()) {
                break;
            }
        }
        
        stats.total++;
    }
    
    return stats;
}

Line range hint 382-397: Add error handling for success rate calculation.

The success rate calculation could fail if stats.total is zero. Consider adding a guard clause.

-                chalk.blue(
-                    `📈 Success Rate: ${Math.round(
-                        (stats.completed / stats.total) * 100
-                    )}%`
-                )
+                chalk.blue(
+                    `📈 Success Rate: ${
+                        stats.total > 0
+                            ? `${Math.round((stats.completed / stats.total) * 100)}%`
+                            : 'N/A'
+                    }`
+                )

Line range hint 264-278: Extract magic numbers into named constants.

Consider extracting magic numbers like 5 and 3 into named constants for better maintainability and clarity.

+const RECENT_EXPERIENCES_LIMIT = 5;
+const SIMILAR_DOCUMENTS_LIMIT = 3;
+
-            const recentExperiences = await dreams.memory.getRecentEpisodes(5);
+            const recentExperiences = await dreams.memory.getRecentEpisodes(RECENT_EXPERIENCES_LIMIT);
             console.log(chalk.blue("\n🔄 Recent Experiences:"));
             recentExperiences.forEach((exp, index) => {
                 console.log(chalk.blue(`\n${index + 1}. Experience:`));
                 console.log(`   Action: ${exp.action}`);
                 console.log(`   Outcome: ${exp.outcome}`);
                 console.log(`   Importance: ${exp.importance || "N/A"}`);
             });

             const relevantDocs = await dreams.memory.findSimilarDocuments(
                 userInput,
-                3
+                SIMILAR_DOCUMENTS_LIMIT
             );
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a14dbcb and 06690aa.

📒 Files selected for processing (6)
  • clients/example-ui/src/components/sidebar-right.tsx (1 hunks)
  • clients/example-ui/src/routes/__root.tsx (2 hunks)
  • examples/example-basic.ts (2 hunks)
  • examples/example-goal.ts (1 hunks)
  • packages/core/src/core/chain-of-thought.ts (14 hunks)
  • packages/core/src/core/goal-manager.ts (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • clients/example-ui/src/components/sidebar-right.tsx
  • clients/example-ui/src/routes/__root.tsx
🔇 Additional comments (12)
packages/core/src/core/chain-of-thought.ts (10)

82-83: Ensure debug-level logs add value.
Including { objective } in the debug log is good for traceability. Just confirm that it doesn't clutter logs excessively in production use.


85-106: Validate fallback flow for empty or invalid context.
The newly introduced calls to gatherRelevantContext, generateGoalHierarchy, and createAndLinkGoals significantly improve modularity and readability. However, consider how the method behaves if the retrieved context is partially or entirely empty (e.g., no documents, no experiences). Verify whether additional fallback or error handling might be needed.


314-320: Check for memory returns.
When calling findSimilarDocuments and findSimilarEpisodes, consider verifying the results are non-empty before usage. Logging a warning if no matches are found could help catch potential memory indexing issues.


356-356: Avoid capturing sensitive info in logs.
The retry error log might occasionally include input details. Ensure no user PII or tokens are accidentally exposed in stack traces or logs.
[security]


565-567: Check goal completeness condition.
Filtering out completed goals is logical. Ensure that goals that fail or block are handled separately in the calling code so they’re not missed in priority ordering.


1259-1259: Check for null currentAction.
As you shift and pop from pendingActions, it’s guaranteed to exist but consider defensive checks in case of concurrency or if code merges alter the flow.
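A cheap guard along these lines would cover it (sketch; the log wording is illustrative):

const currentAction = pendingActions.shift();
if (!currentAction) {
    // Defensive: the queue should not be empty here, but skip the iteration if it is.
    this.logger.warn("think", "No pending action to execute; skipping iteration");
    continue;
}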


1275-1281: Clarify partial vs. full completion behavior.
You call handleCompletion only when isComplete || !completion.shouldContinue. If partial completions are possible, ensure you convey that nuance to the user or logs.


1328-1357: Validate the shape of initialResponse carefully.
When you spread initialResponse.actions, confirm that actions array is always present, even if empty. Also consider robust error handling if the LLM returns unexpected JSON.
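For example (sketch; treats the parsed response as untyped data from the LLM):

const actions: CoTAction[] = Array.isArray(initialResponse?.actions)
    ? initialResponse.actions
    : [];

if (actions.length === 0) {
    this.logger.warn("think", "LLM returned no initial actions", { initialResponse });
}

pendingActions.push(...actions);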


1359-1373: Automatic knowledge storage might cause unbounded memory growth.
Storing episodes with large payloads may exhaust memory or degrade performance over time. Consider adding rotation, summarization, or cleanup strategies.
[performance]
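One mitigation is a size-bounded store with periodic pruning. The sketch below is hypothetical: countEpisodes and deleteOldestEpisodes are not existing memory APIs, just the shape such a cleanup hook could take:

const MAX_STORED_EPISODES = 1_000; // illustrative cap

async function pruneEpisodes(memory: {
    countEpisodes(): Promise<number>;
    deleteOldestEpisodes(count: number): Promise<void>;
}): Promise<void> {
    const total = await memory.countEpisodes();
    if (total > MAX_STORED_EPISODES) {
        // Drop the oldest entries; summarizing them first would preserve more signal.
        await memory.deleteOldestEpisodes(total - MAX_STORED_EPISODES);
    }
}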


1451-1462: Use typed structure for new actions.
extractNewActions transforms data from the LLM into CoTAction[]. Consider stricter type checks in case the LLM returns malformed data.
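A runtime type guard would reject malformed entries before the cast (sketch; the type and payload field names are assumptions about CoTAction's shape):

function isCoTAction(value: unknown): value is CoTAction {
    if (typeof value !== "object" || value === null) return false;
    const candidate = value as Record<string, unknown>;
    return (
        typeof candidate.type === "string" &&
        typeof candidate.payload === "object" &&
        candidate.payload !== null
    );
}

const newActions: CoTAction[] = rawActions.filter(isCoTAction);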

examples/example-basic.ts (1)

74-83: Good use of logLevel: LogLevel.DEBUG.
Providing a debug level helps clarify workflow steps at runtime. Ensure you switch it to a higher log level in production or external demos to avoid exposing sensitive data.

examples/example-goal.ts (1)

62-62: Verify OpenRouter integration requirements.

The model source has been changed to use OpenRouter as the provider. This change may have implications for:

  1. API compatibility
  2. Authentication
  3. Rate limiting
  4. Costs
  5. Performance characteristics

Run the following script to check for OpenRouter-specific configuration:

✅ Verification successful

OpenRouter integration is properly implemented

The codebase has the necessary OpenRouter support with proper API key configuration and comprehensive provider integration. Consider adding OpenRouter configuration details to the documentation for better maintainability.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify OpenRouter integration requirements and configuration.

# Check for OpenRouter API key configuration
echo "Checking for OpenRouter API key configuration..."
rg -l "OPENROUTER.*KEY|OPENAI.*KEY" .env* || echo "No OpenRouter API key configuration found"

# Check for OpenRouter-specific error handling
echo -e "\nChecking for OpenRouter error handling..."
rg -A 5 "openrouter" packages/core/src/core/llm-client.ts || echo "No OpenRouter-specific error handling found"

# Check for rate limiting configuration
echo -e "\nChecking for rate limiting configuration..."
rg -l "RATE_LIMIT|THROTTLE" .env* || echo "No rate limiting configuration found"

# Check for model configuration documentation
echo -e "\nChecking for model configuration documentation..."
fd -g "*README*" -X grep -l "openrouter" || echo "No OpenRouter documentation found"

Length of output: 3051

Comment on lines 1290 to 1292
this.emit("think:error", { query: userQuery, error });
throw error;
}

🛠️ Refactor suggestion

Propagate action errors gracefully.
Throwing the error is fine, but consider providing more actionable messages for the caller, especially if user queries might fail mid-execution.
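For example, wrapping the original error with execution context before rethrowing (sketch; the message format is illustrative, and the cause option needs an ES2022 target):

} catch (error) {
    this.emit("think:error", { query: userQuery, error });
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(
        `Chain of thought failed while processing "${userQuery}": ${message}`,
        { cause: error }
    );
}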

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
packages/core/src/core/chain-of-thought.ts (3)

282-284: Add type safety to dependency mapping.

The type assertion in the filter could be avoided by properly typing the map operation.

-            const realDeps = llmGoal.dependencies
-                .map((dep: string) => llmIdToRealId.get(dep))
-                .filter((id: string | undefined): id is string => !!id);
+            const realDeps = llmGoal.dependencies
+                .map(dep => llmIdToRealId.get(dep))
+                .filter((id): id is string => id !== undefined);

385-428: Consider extracting prompt templates to a separate file.

The prompt template is quite long and could be maintained better in a dedicated file. This would improve maintainability and make it easier to version control prompt changes.

Consider moving the prompt template to a separate file like prompts/validation-prompt.ts and importing it here.
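For instance, a dedicated module that exports the template as a function (file name, export name, and template text below are illustrative):

// prompts/validation-prompt.ts
export function buildValidationPrompt(objective: string, context: string): string {
    return `
You are validating a goal hierarchy for the following objective:
${objective}

Relevant context:
${context}
`.trim();
}

The call site in chain-of-thought.ts would then shrink to an import plus a single buildValidationPrompt(objective, combinedContext) call, and prompt changes would show up as focused diffs in one file.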


1379-1407: Consider making verification rules configurable.

The verification logic uses hardcoded rules. Consider making these rules configurable to allow for different verification strategies.

Consider adding a configuration parameter to allow customization of verification rules:

interface VerificationConfig {
  rules: string[];
  requiredChecks: string[];
  minConfidenceScore?: number;
}

private async verifyCompletion(
  userQuery: string,
  result: any,
  currentAction: CoTAction,
  config?: VerificationConfig
): Promise<{
  complete: boolean;
  reason: string;
  shouldContinue: boolean;
  newActions: any[];
}>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06690aa and 6e6b69a.

📒 Files selected for processing (1)
  • packages/core/src/core/chain-of-thought.ts (16 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
packages/core/src/core/chain-of-thought.ts

[error] 447-447: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)


[error] 1239-1239: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)

🔇 Additional comments (1)
packages/core/src/core/chain-of-thought.ts (1)

314-319: LGTM! Well-structured parallel operations.

The use of Promise.all for concurrent operations improves performance while maintaining readability.

@@ -1267,68 +1236,16 @@ ${availableOutputs
 public async think(
     userQuery: string,
     maxIterations: number = 10
-): Promise<void> {
+): Promise<void | string> {

🛠️ Refactor suggestion

Replace void with undefined in union types.

Using void in union types can be confusing as it suggests a function that doesn't return anything, while undefined better represents an optional return value.

-    public async think(userQuery: string, maxIterations: number = 10): Promise<void | string>
+    public async think(userQuery: string, maxIterations: number = 10): Promise<undefined | string>
-    ): Promise<void | string>
+    ): Promise<undefined | string>

Also applies to: 447-447

🧰 Tools
🪛 Biome (1.9.4)

[error] 1239-1239: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)

Comment on lines +1294 to +1296
this.emit("think:error", { query: userQuery, error });
return JSON.stringify({ error });
}

🛠️ Refactor suggestion

Improve error serialization in catch blocks.

Direct stringification of error objects might lose important error details. Consider creating a proper error response structure.

-                    return JSON.stringify({ error });
+                    return JSON.stringify({
+                        error: {
+                            message: error instanceof Error ? error.message : String(error),
+                            name: error instanceof Error ? error.name : 'Unknown Error',
+                            stack: error instanceof Error ? error.stack : undefined
+                        }
+                    });

Also applies to: 1307-1309

Comment on lines +236 to +240
private async createAndLinkGoals(goals: {
long_term: any[];
medium_term: any[];
short_term: any[];
}) {

🛠️ Refactor suggestion

Add type definitions for the goals parameter.

The goals parameter uses the any type, which could lead to runtime errors. Consider defining an interface for the goal structure.

-    private async createAndLinkGoals(goals: {
-        long_term: any[];
-        medium_term: any[];
-        short_term: any[];
-    })
+    interface Goal {
+        id: string;
+        description: string;
+        success_criteria: string[];
+        dependencies: string[];
+        priority: number;
+        required_resources: string[];
+        estimated_difficulty: number;
+    }
+    
+    private async createAndLinkGoals(goals: {
+        long_term: Goal[];
+        medium_term: Goal[];
+        short_term: Goal[];
+    })

Committable suggestion skipped: line range outside the PR's diff.

ponderingdemocritus merged commit 80d06c1 into main on Jan 28, 2025
6 checks passed
ponderingdemocritus deleted the cot-refactor branch on January 28, 2025 at 06:29
coderabbitai bot mentioned this pull request on Feb 6, 2025