update ai models. fix bugs #162

Merged
merged 2 commits into from
Nov 19, 2024

Conversation

thomasdavis
Member

@thomasdavis thomasdavis commented Nov 19, 2024

Summary by CodeRabbit

  • New Features

    • Updated models for generating letters and suggestions to a more efficient version, potentially enhancing performance.
  • Bug Fixes

    • Improved error handling in various components to provide clearer feedback and logging during failures.
  • Chores

    • Removed analytics tracking calls from letter and suggestions components, streamlining the analytics functionality.

changeset-bot bot commented Nov 19, 2024

⚠️ No Changeset found

Latest commit: 65c4b5c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

vercel bot commented Nov 19, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name | Status | Preview | Comments | Updated (UTC)
jsonresume-org-homepage2 | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Nov 19, 2024 11:11am
jsonresume-org-registry | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Nov 19, 2024 11:11am

Contributor

coderabbitai bot commented Nov 19, 2024

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** overrides the default block on the dist directory by removing the pattern from both lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The changes in this pull request involve modifications to several components and scripts within the application. Key alterations include the removal of tracking function calls from the @vercel/analytics/server package in the Letter and Suggestions components, affecting analytics functionality. Additionally, updates to various API handler functions involve changing the OpenAI model from gpt-4o-2024-08-06 to gpt-4o-mini, which may influence output characteristics. Enhancements in error handling and default values in calculation functions improve robustness without altering core functionalities.

Changes

File Path | Change Summary
apps/registry/app/[username]/letter/page.js | Removed track function from @vercel/analytics/server in Letter component.
apps/registry/app/[username]/suggestions/page.js | Removed track function from @vercel/analytics/server in Suggestions component.
apps/registry/lib/calculations.js | Added default values for workHistory, improved error handling in totalExperience, averageJobDuration, and careerProgression functions.
apps/registry/pages/api/letter.js | Updated OpenAI model from gpt-4o-2024-08-06 to gpt-4o-mini in API handler.
apps/registry/pages/api/suggestions-beta.js | Updated OpenAI model from gpt-4o-2024-08-06 to gpt-4o-mini in API handler.
apps/registry/pages/api/suggestions.js | Updated OpenAI model from gpt-4o-2024-08-06 to gpt-4o-mini in API handler.
apps/registry/scripts/jobs/gpted.js | Updated OpenAI model from gpt-4o-2024-08-06 to gpt-4o-mini, enhanced error handling in job processing script.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SuggestionsAPI
    participant OpenAI
    User->>SuggestionsAPI: Request suggestions
    SuggestionsAPI->>OpenAI: Generate suggestions with gpt-4o-mini
    OpenAI-->>SuggestionsAPI: Return suggestions
    SuggestionsAPI-->>User: Deliver suggestions

🐇 "In the code, a change we see,
Analytics gone, just let it be.
With models light, our scripts now play,
Errors logged, come what may.
A hop, a skip, through data we glide,
In this update, we take great pride!" 🐇


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (13)
apps/registry/pages/api/letter.js (3)

Line range hint 1-32: Add input validation and rate limiting

The API endpoint lacks crucial security measures:

  1. No validation for required fields (username, jobDescription, tone)
  2. No rate limiting to prevent abuse
  3. No maximum length checks for inputs

Consider implementing the following:

+ import rateLimit from 'express-rate-limit';
+ import { z } from 'zod';

+ const limiter = rateLimit({
+   windowMs: 15 * 60 * 1000, // 15 minutes
+   max: 5 // limit each IP to 5 requests per windowMs
+ });

+ const inputSchema = z.object({
+   username: z.string().min(1).max(100),
+   jobDescription: z.string().max(5000),
+   tone: z.string().max(100)
+ });

  export default async function handler(req, res) {
+   try {
+     await limiter(req, res);
+     const validatedInput = inputSchema.parse(req.body);
+     const { username, jobDescription, tone } = validatedInput;
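For projects that prefer not to add a schema library, the same checks can be approximated by hand. The following is a minimal sketch, not the project's code: `validateLetterInput` and `LIMITS` are hypothetical names, and the length limits simply mirror the zod schema suggested above.

```javascript
// Hypothetical limits, copied from the suggested zod schema above.
const LIMITS = { username: 100, jobDescription: 5000, tone: 100 };

// Returns { ok, errors } instead of throwing, so the handler can
// decide how to turn failures into a 400 response.
function validateLetterInput(body) {
  const errors = [];
  const input = body || {};
  for (const [field, max] of Object.entries(LIMITS)) {
    const value = input[field];
    if (typeof value !== 'string') {
      errors.push(`${field} must be a string`);
    } else if (value.length > max) {
      errors.push(`${field} must be at most ${max} characters`);
    }
  }
  // The suggested schema also requires a non-empty username.
  if (typeof input.username === 'string' && input.username.length === 0) {
    errors.push('username must not be empty');
  }
  return { ok: errors.length === 0, errors };
}
```

A handler could call `validateLetterInput(req.body)` first and return `res.status(400).json({ errors })` when `ok` is false.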

Line range hint 3-7: Move sensitive configuration to environment variables

The Supabase URL is hardcoded in the source code. This should be moved to environment variables for better security and configuration management.

- const supabaseUrl = 'https://itxuhvvwryeuzuyihpkp.supabase.co';
+ const supabaseUrl = process.env.SUPABASE_URL;
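Reading the value from `process.env` returns `undefined` when the variable is not set, which can surface later as a confusing client error. A small guard makes the failure explicit at startup. This is a sketch under that assumption; `requireEnv` is a hypothetical helper, and only the `SUPABASE_URL` name comes from the suggestion above.

```javascript
// Hypothetical helper: fail fast when a required variable is missing.
// The env parameter defaults to process.env and exists mainly for testing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Usage would then look like `const supabaseUrl = requireEnv('SUPABASE_URL');`.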

Line range hint 13-62: Improve error handling and add response caching

The current implementation lacks proper error handling for API calls and database operations. Additionally, implementing caching could improve performance for repeated requests.

Consider implementing:

  export default async function handler(req, res) {
+   try {
+     // Add caching check
+     const cacheKey = `letter:${username}:${jobDescription}:${tone}`;
+     const cachedResponse = await redis.get(cacheKey);
+     if (cachedResponse) {
+       return res.status(200).json(JSON.parse(cachedResponse));
+     }

      const { data } = await supabase
        .from('resumes')
        .select()
        .eq('username', username);

+     if (!data || data.length === 0) {
+       return res.status(404).json({ error: 'Resume not found' });
+     }

      // ... rest of the code ...

      const gptRes = await gpt.sendMessage(prompt.join(''));
+     
+     // Cache the response
+     await redis.setex(cacheKey, 3600, JSON.stringify(gptRes.text));
      
      return res.status(200).send(gptRes.text);
+   } catch (error) {
+     console.error('Error generating letter:', error);
+     return res.status(500).json({ error: 'Failed to generate letter' });
+   }
  }
apps/registry/pages/api/suggestions.js (4)

Line range hint 61-65: Improve error handling to prevent information exposure

The current error handling could potentially expose sensitive system information to clients. The error object is returned directly without proper formatting or status code.

Apply this diff to improve error handling:

  try {
    const content = chat.choices[0].message.content;
    return res.status(200).send(content);
  } catch (e) {
    console.error(e);
-   return e;
+   return res.status(500).json({
+     error: 'Failed to generate suggestions',
+     message: process.env.NODE_ENV === 'development' ? e.message : undefined
+   });
  }
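The response shaping in the diff above can also be pulled into a small pure function so that it is easy to unit test. This is only a sketch: `toErrorResponse` is a hypothetical name, and the status code and message text are taken from the suggestion rather than from the project.

```javascript
// Hypothetical helper: build a client-safe error payload.
// The error message is only included in development, as in the diff above.
function toErrorResponse(err, nodeEnv = process.env.NODE_ENV) {
  const body = { error: 'Failed to generate suggestions' };
  if (nodeEnv === 'development' && err && err.message) {
    body.message = err.message;
  }
  return { status: 500, body };
}
```

Inside the catch block this would be used as `const { status, body } = toErrorResponse(e); return res.status(status).json(body);`.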

Line range hint 1-4: Move Supabase URL to environment variables

The Supabase URL should not be hardcoded in the source code. Move it to environment variables for better security and configuration management.

Apply this diff:

-const supabaseUrl = 'https://itxuhvvwryeuzuyihpkp.supabase.co';
+const supabaseUrl = process.env.SUPABASE_URL;

Line range hint 10-13: Add input validation and rate limiting

The API endpoint lacks input validation for the username parameter and rate limiting, which could lead to security issues and excessive costs.

Consider:

  1. Adding input validation for the username parameter
  2. Implementing rate limiting using a middleware like express-rate-limit
  3. Adding request size limits
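To illustrate the rate-limiting idea without committing to a particular middleware, here is a minimal fixed-window limiter kept in memory. It is a sketch with hypothetical names (`createRateLimiter`, `allow`), and it assumes a single long-lived process; on serverless deployments, where instances do not share memory, a shared store would be needed instead.

```javascript
// Hypothetical fixed-window limiter: at most `max` calls per key per window.
// `now` is injectable so the behaviour can be tested without real timers.
function createRateLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // key -> { count, windowStart }
  return function allow(key) {
    const t = now();
    const entry = hits.get(key);
    // Start a fresh window for unseen keys or when the old window expired.
    if (!entry || t - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: t });
      return true;
    }
    if (entry.count < max) {
      entry.count += 1;
      return true;
    }
    return false;
  };
}
```

A handler could key the limiter on the client IP or the username and respond with status 429 when `allow(key)` returns false.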

Line range hint 15-53: Move prompt template to a separate file

The large prompt template should be moved to a separate file for better maintainability and reusability.

Consider creating a prompts/suggestions.js file and importing the template:

// prompts/suggestions.js
export const getSuggestionsPrompt = (content) => `
  This is a persons resume in the JSON Resume format.
  ${content}
  // ... rest of the prompt
`;
apps/registry/pages/api/suggestions-beta.js (3)

Line range hint 159-164: Improve error handling and status codes.

The current error handling has several issues:

  1. Returns 200 status code for errors, which is misleading
  2. Generic "it failed" message isn't helpful for debugging
  3. Lacks structured error response format

Consider this improvement:

  } catch (e) {
    console.error(e);
-   return res.status(200).send('it failed');
+   return res.status(500).json({
+     error: true,
+     message: 'Failed to process suggestions',
+     details: process.env.NODE_ENV === 'development' ? e.message : undefined
+   });
  }

Line range hint 42-43: Add input validation and rate limiting.

The handler accepts user input without proper validation, which could lead to security issues:

  1. No validation of username format/length
  2. No rate limiting to prevent abuse
  3. No validation of brevity/sentiment values

Consider adding these safeguards:

export default async function handler(req, res) {
+ if (!req.body.username || typeof req.body.username !== 'string' || req.body.username.length > 100) {
+   return res.status(400).json({ error: 'Invalid username' });
+ }
+ if (req.body.brevity && !Object.values(BREVITY).includes(req.body.brevity)) {
+   return res.status(400).json({ error: 'Invalid brevity value' });
+ }
  const username = req.body.username || 'thomasdavis';
  const brevity = req.body.brevity || BREVITY.verbose;

Line range hint 9-23: Improve code structure and documentation.

The TODO comment indicates several pending improvements. Consider:

  1. Moving configuration to a separate file
  2. Adding TypeScript or JSDoc types for better maintainability
  3. Implementing the suggested schema improvements

Example structure:

// config/suggestions.js
export const SUGGESTION_LEVELS = {
  CRITICAL: 'critical',
  WARNING: 'warning',
  INFO: 'info'
};

export const SUGGESTION_TYPES = {
  SPELLING: 'spelling',
  GRAMMAR: 'grammar',
  // ... other types
};

// types/suggestions.js
/**
 * @typedef {Object} SuggestionConfig
 * @property {string} level - One of SUGGESTION_LEVELS
 * @property {string} type - One of SUGGESTION_TYPES
 * @property {string} sentiment - Suggestion tone
 * @property {string} brevity - Suggestion length
 */
apps/registry/lib/calculations.js (1)

Line range hint 1-234: Consider additional architectural improvements.

While the error handling improvements are good, consider these architectural enhancements:

  1. Input Validation: Consider adding a schema validation layer (e.g., using Zod or Joi) to validate the resume object structure at the entry point.
  2. Error Logging: Implement structured error logging instead of silently handling invalid data.
  3. Type Safety: Consider adding TypeScript for better type safety and developer experience.
apps/registry/scripts/jobs/gpted.js (2)

Line range hint 350-370: Enhance error handling robustness

The current error handling could be improved in several ways:

  1. Add specific error types for better debugging
  2. Implement retry logic for transient failures
  3. Preserve error details in the database

Consider this improved implementation:

       } catch (e) {
         console.error(e);
+        const errorDetails = {
+          error: e.message,
+          timestamp: new Date().toISOString(),
+          type: e.name
+        };
         await supabase
           .from('jobs')
           .update({
             gpt_content: 'FAILED',
+            error_details: JSON.stringify(errorDetails),
+            retry_count: (job.retry_count || 0) + 1
           })
           .eq('id', job.id);
+        
+        // Retry with exponential backoff if under max retries
+        if ((job.retry_count || 0) < 3) {
+          await new Promise(resolve => 
+            setTimeout(resolve, Math.pow(2, job.retry_count || 0) * 1000)
+          );
+          continue;
+        }
       }
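The retry decision in the diff above can be separated from the database update so the arithmetic is testable in isolation. The sketch below uses hypothetical helper names; `retry_count` is the column proposed in the suggestion and is not confirmed to exist in the jobs table.

```javascript
const MAX_RETRIES = 3; // same cap as in the suggestion above

// Exponential backoff: 1s, 2s, 4s, ... for retryCount 0, 1, 2, ...
function backoffDelayMs(retryCount, baseMs = 1000) {
  return Math.pow(2, retryCount) * baseMs;
}

// Given a job row, compute the next retry count, whether another
// attempt is allowed, and how long to wait before it.
function nextRetryState(job, maxRetries = MAX_RETRIES) {
  const previous = job.retry_count || 0;
  return {
    retryCount: previous + 1,
    shouldRetry: previous < maxRetries,
    delayMs: backoffDelayMs(previous),
  };
}
```

Keeping this logic pure also makes it clear that the delay is based on the retry count before the failed attempt, which matches the diff.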

Line range hint 1-400: Consider architectural improvements for better scalability

The current implementation has several areas for improvement:

  1. Replace sequential processing with batch processing
  2. Implement proper rate limiting
  3. Add structured logging
  4. Remove debug console.logs

Consider these improvements:

  1. Add structured logging:
const winston = require('winston');
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' })
  ]
});
  2. Implement batch processing:
async function processBatch(jobs, batchSize = 5) {
  const batches = chunk(jobs, batchSize);
  for (const batch of batches) {
    await Promise.all(batch.map(processJob));
    await rateLimiter.wait(); // Implement proper rate limiting
  }
}
  3. Add progress tracking:
const progress = require('cli-progress');
const bar = new progress.SingleBar({}, progress.Presets.shades_classic);
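Note that the batch-processing snippet above calls `chunk` and `rateLimiter.wait`, neither of which is defined there. A dependency-free `chunk` could look like the sketch below; the name matches the call in the snippet, but this implementation is an assumption and not code from the repository.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With this in place, `chunk(jobs, 5)` yields the batches that `processBatch` iterates over; the rate limiter would still need its own implementation.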

Would you like me to create a separate issue to track these improvements?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9184b9b and 627c99b.

📒 Files selected for processing (7)
  • apps/registry/app/[username]/letter/page.js (0 hunks)
  • apps/registry/app/[username]/suggestions/page.js (0 hunks)
  • apps/registry/lib/calculations.js (4 hunks)
  • apps/registry/pages/api/letter.js (1 hunks)
  • apps/registry/pages/api/suggestions-beta.js (1 hunks)
  • apps/registry/pages/api/suggestions.js (1 hunks)
  • apps/registry/scripts/jobs/gpted.js (1 hunks)
💤 Files with no reviewable changes (2)
  • apps/registry/app/[username]/letter/page.js
  • apps/registry/app/[username]/suggestions/page.js
🔇 Additional comments (7)
apps/registry/pages/api/letter.js (1)

28-28: ⚠️ Potential issue

Verify the impact of switching to a smaller model

The change from 'gpt-4o-2024-08-06' to 'gpt-4o-mini' might affect the quality and coherence of generated cover letters. While this could improve response times and costs, it's important to validate the output quality meets expectations.

Run this script to analyze the model change impact across the codebase:

✅ Verification successful

Model change is consistent across the codebase

The change to 'gpt-4o-mini' is already consistently implemented across all GPT-using endpoints:

  • apps/registry/pages/api/letter.js
  • apps/registry/pages/api/suggestions.js
  • apps/registry/pages/api/suggestions-beta.js
  • apps/registry/scripts/jobs/gpted.js

Each file uses the same model with appropriate temperature settings (ranging from 0.7 to 0.85) for their specific use cases. This appears to be an intentional, system-wide model update rather than an isolated change.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consistency in model usage and any related configurations

# Check for other files using ChatGPT models
echo "Checking for model usage across the codebase:"
rg -l "model.*gpt-4" 

# Check for any model-specific configurations or tests
echo "Checking for model-specific configurations:"
rg -l "temperature.*0.85"

Length of output: 474


Script:

#!/bin/bash
# Check the actual model configurations across files
echo "=== Model configuration in letter.js ==="
rg "model.*gpt-4" apps/registry/pages/api/letter.js -A 2 -B 2

echo -e "\n=== Model configuration in suggestions.js ==="
rg "model.*gpt-4" apps/registry/pages/api/suggestions.js -A 2 -B 2

echo -e "\n=== Model configuration in suggestions-beta.js ==="
rg "model.*gpt-4" apps/registry/pages/api/suggestions-beta.js -A 2 -B 2

echo -e "\n=== Model configuration in gpted.js ==="
rg "model.*gpt-4" apps/registry/scripts/jobs/gpted.js -A 2 -B 2

Length of output: 1223

apps/registry/pages/api/suggestions.js (1)

56-56: ⚠️ Potential issue

Verify the OpenAI model name and configuration

The model name gpt-4o-mini is a published OpenAI model identifier, so the name itself should not cause runtime errors; confirm that the API key in use has access to it. The high temperature (0.85), combined with a smaller model, could lead to inconsistent suggestions.

Let's verify the available models and their naming patterns:

apps/registry/pages/api/suggestions-beta.js (1)

148-148: Verify the impact of switching to gpt-4o-mini model.

The change from gpt-4o-2024-08-06 to gpt-4o-mini might affect the quality and depth of resume suggestions. Please ensure this change aligns with the expected output quality and performance requirements.

Consider:

  1. Documenting the rationale for this model change
  2. Adding model configuration to environment variables for easier switching
  3. Implementing A/B testing to compare suggestion quality between models
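As a rough illustration of the second point, the model name could be resolved in one place with a fallback to the value this PR introduces. This is a sketch only: `resolveModel` and the `OPENAI_MODEL` variable name are hypothetical and do not appear in the repository.

```javascript
const DEFAULT_MODEL = 'gpt-4o-mini'; // the model this PR switches to

// Read the model name from the environment, falling back to the default.
// OPENAI_MODEL is a hypothetical variable name chosen for this sketch.
function resolveModel(env = process.env) {
  const configured = env.OPENAI_MODEL;
  if (typeof configured === 'string' && configured.trim() !== '') {
    return configured.trim();
  }
  return DEFAULT_MODEL;
}
```

Each handler would then pass `model: resolveModel()` instead of a string literal, so switching back to gpt-4o-2024-08-06 for comparison needs no code change.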
apps/registry/lib/calculations.js (3)

10-10: LGTM! Good defensive programming practice.

The addition of a default empty array for workHistory prevents a TypeError when resume.work is undefined.


134-134: LGTM! Consistent error handling.

The addition of a default empty array for workHistory is consistent with the error handling pattern used in other functions.


42-44: Verify the impact of filtering invalid date ranges.

While filtering out invalid ranges is good for robustness, we should ensure this doesn't silently hide data issues.

Let's verify if there are any existing test cases or usages that might be affected:

✅ Verification successful

Let me gather more information about the implementation to understand how invalid dates are handled.


Let me search for test files using a different approach.


Based on the analysis of the codebase, I can now provide the final response:

The filter for invalid date ranges is safe and necessary.

The filtering of invalid date ranges is a defensive measure that handles several edge cases correctly:

  • Empty work history is handled via resume.work || []
  • The filter is applied after merging ranges, so it only removes truly invalid entries
  • The code includes proper logging (console.log({ range, mergedRanges })) to track any filtered ranges
  • The function gracefully handles various resume formats as seen in the theme examples

The change improves robustness without silently hiding data issues since:

  1. Invalid entries are logged for debugging
  2. The function still processes all valid work entries
  3. The dashboard UI shows the total number of jobs (totalJobs: resume.work?.length) separately
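One way to make filtered entries visible without a console.log is to return a count of skipped ranges alongside the valid ones. The sketch below is not the project's totalExperience implementation: `collectValidRanges` is a hypothetical helper, and it assumes work entries carry `startDate` and `endDate` strings as in the JSON Resume schema.

```javascript
// Hypothetical helper: separate valid date ranges from invalid ones.
// A missing endDate is treated as "still in this role" (an assumption).
function collectValidRanges(resume, now = new Date()) {
  const workHistory = (resume && resume.work) || []; // default when work is missing
  const ranges = [];
  let skipped = 0;
  for (const job of workHistory) {
    if (!job) {
      skipped += 1;
      continue;
    }
    const start = new Date(job.startDate);
    const end = job.endDate ? new Date(job.endDate) : now;
    const valid =
      !Number.isNaN(start.getTime()) &&
      !Number.isNaN(end.getTime()) &&
      start <= end;
    if (valid) {
      ranges.push({ start, end });
    } else {
      skipped += 1;
    }
  }
  return { ranges, skipped };
}
```

The caller can then decide whether a non-zero `skipped` count should be logged through a structured logger or shown to the user.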
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for test cases and usages of totalExperience
rg -A 5 "totalExperience\(" --type js

# Search for any test data with undefined or invalid date ranges
rg -B 5 -A 5 '"work":\s*\[' --type json

Length of output: 4021


Script:

#!/bin/bash
# Check the full implementation of totalExperience and related functions
ast-grep --pattern 'export function totalExperience($_) {
  $$$
}'

# Look for test files related to calculations
fd test calculations --exec cat {}

Length of output: 3717


Script:

#!/bin/bash
# Look for test files related to calculations
fd test --extension js --extension ts --exec grep -l "calculations\|totalExperience" {} || true

# Check if there are any files that pass undefined or null work entries
rg "work.*\[\s*null|\[\s*undefined|\[\s*\]" --type js -B 2 -A 2

Length of output: 16403

apps/registry/scripts/jobs/gpted.js (1)

273-275: Consider adjusting model configuration for more reliable parsing

  1. The change to gpt-4o-mini might affect parsing accuracy. Consider validating the output quality against the previous model's results.
  2. The temperature of 0.8 is high for structured data extraction tasks. Consider lowering it to 0.2-0.3 for more consistent results.

Let's verify the impact of the model change:

mergedRanges.forEach((range) => {
console.log({ range, mergedRanges });
Contributor

⚠️ Potential issue

Remove debugging console.log statement.

Production code should not contain debugging console.log statements as they can pollute logs and impact performance.

Apply this diff:

-    console.log({ range, mergedRanges });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
console.log({ range, mergedRanges });

@thomasdavis thomasdavis changed the title from "@coderabbit" to "update ai models. fix bugs" Nov 19, 2024
🤖 No test run has been triggered as your Meticulous project has been deactivated (since you haven't viewed any test results in a while). Click here to reactivate.

Last updated for commit 65c4b5c. This comment will update as new commits are pushed.

@thomasdavis thomasdavis merged commit 650b197 into master Nov 19, 2024
7 checks passed
@thomasdavis thomasdavis deleted the misc-updates branch November 19, 2024 11:12
This was referenced Nov 19, 2024