Project AbstractΒ Β Β
DeliverablesΒ Β Β
DemoΒ Β Β
ContributionsΒ Β Β
Pushing LimitsΒ Β Β
DocumentationΒ Β Β
Mentors
I am the open-source contributor for the Google Summer of Code 2024 working on the AI Programmer project for Rocket.Chat. In this repository I will clarify a final report summary of my GSoC work and a quick guide for all future GSoC aspirants working on this project.
Project Repository: https://github.com/RocketChat/Apps.RC.AI.Programmer
The AI Programmer Rocket.Chat App enables users to generate a short piece of code in C/C++, Java, Javascript, Typescript or Python based on their specification. They can switch between different LLMs (Mistral, CodeLlama, etc.) and ask for a new refinement on the code generation result to make augment or fine-tune. A well-designed interactive UX aiming to simplify user's interactions is also implemented. Finally, the app bridges gap between the generated codes and the community using sharing APIs, allowing users to share the code pieces in RC channel and their Github gists/repositories by configuring OAuth and Restful APIs.
The following were the deliverables of this project:
- Interactive User Interface : Unlock the true potential of the UiKit and add an interactive user interface for each of the added features of the application, eliminating the need to type long slash commands for more complex tasks.
- Minimal Slash Command Interface : Adding slash command to trigger the application to manipulate on the AI programmer's functions with minimal interactions.
- User Configuration: Users can specify their personalized configuration for code generation tasks, for example, use which programming language and LLM.
- Code Generation: Allowing users to generate code pieces by providing their requirements and specifications.
- Code Refinement: Users can further ask for a new variation of the code or augment/fine-tune the system for a more precise result.
- Share Code to Github Repository: By configuring the OAuth2 connection with Github, users can upload their generated code pieces to the target Github repository, branch, file. For a better user interaction and make the function more practical to use, we enable the uploading code pieces via Gist APIs.
Extra Features which were suggested by mentors throughout the Google Summer of Code period:
- Share Code in RC Channel: The generated code is not only viewable by the users, but can also be shared into current RC channels for other users to check.
- Compatible with New Version UI-Kit: Since the RC server and app engine's ui-kit update frequently, the entire user interface is well-designed and implemented with new version features and compatible with RC servers.
It's glad to complete all of the above deliverables and meet expected goals within the GSoC period! π
The Minimal Slash commands allow users to handle ai programmer functions using simple slash commands. Users can generate code, set personalized configuration, connect to Github, and share codes. These commands allow users to access the app's features more conveniently.
- Generate code pieces with specific description (please set language and llm correctly first!) ->
/ai-programmer gen code_content
- Set the language you want to use to generate code ->
/ai-programmer set C++
- Switch to the LLM you want to use to generate code ->
/ai-programmer llm llama3-70b
- List the available LLM options ->
/ai-programmer list
- Use the interactive user interface to handle your operations ->
/ai-programmer ui
- Login to Github (You should set OAuth2 settings first!) ->
/ai-programmer gh_login
. - Logout to Github ->
/ai-programmer gh_logout
.
For users who prefers UX interactions, the app also provides access by UX. The user interface will be invoked directly using command `/ai-programmer`, which will open a contexutal bar for users to set their configurations. After the configurations are set and ready, users can progress to other functions.
The friendly user interface implemented using RC app engine's ui-kit allows interactive operations on all features and functions provided by the ai programmer. All UX manipulation opereations can be seen in the video:
Screen.Recording.2024-08-11.at.23.00.24.mp4
For the code generation tasks, we are supposed to invoke LLMs. In Rocket.Chat community, the LLM APIs are configured together in a safe.dev server provided by @devanshu.sharma . This enables the direct invoking of different LLMs via generic GPT-format LLM APIs. Thus, users can swiftly switch between different LLMs and choose the version that works smoothly with their requirements.
To avoid avoid improper injections and ensure the app function as expected, we should make the prompts of LLMs to be more effective. This practice is done by conducting rigorous prompt engineering. A well-designed prompt can offer the users numerous benefits, including improved output accuracy, enhanced control over model responses, reduced hallucinations, better task alignment, and increased efficiency in utilizing the model's capabilities. Effective prompt engineering also helps mitigate potential security risks, ensures consistent performance across various inputs, and allows for more nuanced and context-aware interactions with the LLM.
To achieve this, we used a method called "chain of thought", which asks an LLM to generate an effective prompt for a specific task itself, and human will intefere with the prompts, adding missing information, auding their correctness and effectiveness.
Our experiments show that the prompts we adopt in this ai-programmer is strong enough to sustain possible attacks or injection attempts from illegal users.
After configuring the correct LLM and corresponding LLMs, users can now ask the LLM to perform tasks specified by the prompt. For functions of Code Generation and Refinement, we set a different prompts respectively. When users ask for a code refinement, the app will automatically read the code result generated in last round and make improvements on it according to user's feedbacks.
To integrate users' input information inside the prompt, we set a few examples for LLMs to learn from. In this manner, the LLM could better understand user's meaning and to avoid potential improper actions caused by injections.
To enable users to share code into GitHub repositories, in this Ai Programmer App I have implemented the Authentication mechanism using OAuth2. To implement this feature, we have used the GitHub OAuth along with the RocketChat Apps Oauth2 Client for OAuth2 authentication.
When registering the OAuth application, the callback URL must be set to the server url of which this app is deployed on. (e.g. http://localhost:3000 for local servers). After the GitHub OAuth is successfully setup, open the settings page of Ai Programmer App and enter the correspnding GitHub App OAuth Client Id and Client Secret.
- The users can then log in to their github accounts simply by entering the slash command :
/ai-programmer login
and clicking on thelogin
button. - As soon as the user is logged in, they receive a message notifying the successful connection with Github. User can now upload their generated code to their github repositories.
- The users are automatically logged out after a period of time and the token is deleted. This was done to ensure the scalability of the feature in case of inactive users and to remove old OAuth tokens from the apps limited persistent storage. To achieve this we use the RocketChat Apps Scheduler API.
By configuring the Github OAuth connections, users can now authenticate and access their Github repositories. Now to upload code content into the Github repository, the Ai Programmer app interacts with Github APIs and uses http put methods to implement this function.
Github has enabled the RESTful APIs for creating files, the app will encapsulate user's code content together with the specified repository name, branch, and file path in data and send this put request using this RESTful API to create new files in the target location. However, sometimes when there's already a same file existed in the repository, this API cannot update the original file directly. For existing files, GitHub requires the current file's SHA when updating. It requires us to provide a sha attribute in the data.
The Sha value is a safety measure GitHub uses to prevent unintended overwrites. Therefore, in the first step our app will try to acquire the file from Github repository, and if that succeed, our app will acquire its sha value and upload the new data containing this sha value. Our approach is: First, try to get the file's current content. If it exists, we'll get its SHA. If the file doesn't exist, we'll create it. If it does exist, we'll update it with the new content.
By detecting the response status code we can tell whether our API calls are successful. Finally this sharing code function has been implemented using RESTful APIs and accessing to Sha values of existing files.
Considering the real use case and for a more practical usage, users might be more interested in sharing their code pieces to the Gist, where they can further manage their code pieces and share with other Github community participants.
Thus, we enable the function of sharing generated code content to Github Gist, which allows users to store their generated code contents in a separate place other than repository for a better file organization and managements. The code pieces will be uploaded to Github Gist platform which is enabled to be shared with other users in Github community or integrate with existing repositories.
The method of invoking Gist APIs is similar to Github APIs, which also requires OAuth configurations. Meanwhile, we should also add the scope of 'gist' when acquiring authentication access tokens from OAuth to ensure the correct connection with Gist resources.
Since the code is generated within the channel, it is reasonable for users to share their code pieces within the current channel. We have enabled this function for users to share what they have generated. However, the users should be able to edit the content they are going to share and verify that's the correct information because the content will become public after sharing.
Screen.Recording.2024-08-12.at.21.12.08.mov
PR Link | Description | Status |
---|---|---|
PR #1 | [New] Setup development environment and make prototype demo. Highlights include:
|
|
PR #2 | [Refactor] Refactor the entire app on new framework supporting open-source LLMs. Highlights include:
|
|
PR #14 | [Feature] Interactive UX to allow users to specify their configuration. Highlights include:
|
|
PR #20 | [Feature] Code Refinement & UX Enhancements Highlights include:
|
|
PR #28 | [Feature] Modal Window Design, Prompt Engineering Highlights include:
|
|
PR #37 | [Feature] Code Sharing in RC Channel and Github via OAuth2 Highlights include:
|
Issue Link | Description | Status |
---|---|---|
ISSUE #5 | [Refactor] Refactor the entire code base to support generic GPT-format LLM APIs | |
ISSUE #10 | [Feature] Add dropdown box on user interface for intuitive user configuration function | |
ISSUE #15 | [Feature] Support the use case of asking for code refinement | |
ISSUE #21 | [Feature] Design effective LLM prompts to avoid prompt injections | |
ISSUE #23 | [BUG] The elements on contextual bar will invoke incorrect roomid | |
ISSUE #29 | [Feature] Allow users to share code to Github repository | |
ISSUE #34 | [Feature] Configure OAuth2 features for authentication for Github connection | |
ISSUE #38 | [Feature] Design option to go back and choose another language |
- In order to make this project a success we have pushed Rocket.Chat to the limit. Since Rocket.Chat app engine and server updates frequently, some of the ui-kit functions become deprecated and requires redesigin UI logis using the new APIs. To improve the user interface with the new version's app engine ui-kit, we researched on the new version ui-kit and figure out the logics behind the invoking of different ui components. This task was not so easy and required a lot of research.
- Designing new components on the new version ui-kit and make it more intuitive and friendly for users to interactive requires us to dive deep into different repositories and understanding how fuselage, ui-kit, fuselage-ui-kit and Rocket.Chat.Apps-engine work together to render components inside Rocket.Chat.
I have documented all of the features mentioned above in the Project README. This documentation is proven to be useful to all future Rocket.Chat contributors working on Rocket.Chat Apps, Apps-engine, LLM, Github API, OAuth authentication, and the ui-kit and progress further on the basis of this existing implementation version.
I'm deeply grateful to my mentors for their invaluable guidance before and during GSoC. Their wisdom has imparted lifelong lessons, particularly in considering the end user's perspective and developing scalable products. My heartfelt thanks go to:
- Sing Li - GitHub. LinkedIn
- John Crisp - LinkedIn
- Samad Yar Khan - GitHub. LinkedIn
- Mustafa Hasan Khan - GitHub. LinkedIn
The entire Rocket.Chat community has been incredibly supportive. Special recognition goes to Devanshu Sharma, whose extensive assistance, especially in hosting the Rocket.Chat server with open-source LLM capabilities on safe dev server, was crucial for my GSoC'24 Demo Day presentation.
- Devanshu Sharma - GitHub.
A sincerely big thank to all of the Rocket.chat participant's help throughout GSoC. π
I am a self-motivated student pursuing Masterβs degree of Computer Science at Texas A&M University in USA, interested in Full-Stack DevOps, Cloud, Data Analytics, Machine Learning (AIGC), Game Development, Vision, Graphics, and Finance.
I have been an open-source contributor at Rocket.Chat since March 2024.
Want to discuss about GSoC / Rocket.Chat / Development ? Let's connect!
Student | Ryan Zhou (@ryanbowz) |
---|---|
Organization | Rocket.Chat |
Personal Page | ryanbowz's page |
Project | Ai Programmer App |
GitHub | @ryanbowz |
ryanbowz | |
[email protected] | |
Rocket.Chat | ryan.zhou |