Train or fine-tune models for computer automation agents #11
Comments
Hi @James4Ever0, thanks for getting in touch. We are definitely interested in training a local model to enable faster inference. Would you mind sharing more context and perhaps a snippet of the dataset you created? We welcome cooperation and contributions if this is a good fit.
The terminal dataset comprises a unique trajectory identifier, observations of the terminal, and actions taken by the agent. An observation can be either the full view of the terminal or only the updated lines, with line numbers surrounded by square brackets. The actions taken by the agent are called
Preview of the terminal dataset:
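As a rough illustration of how the "updated lines" observations could be folded back into a full terminal view, here is a minimal sketch. It is not the dataset's actual loader: the `[<line number>] <text>` line layout, the 0-based numbering, and the `TerminalScreen` class are all assumptions made for this example, based only on the description that line numbers are surrounded by square brackets.

```python
import re
from dataclasses import dataclass, field

# Assumed (hypothetical) line layout: "[<line number>] <text>".
# The real dataset may format its updated lines differently.
LINE_RE = re.compile(r"^\[(\d+)\] ?(.*)$")


@dataclass
class TerminalScreen:
    """Keeps the latest known text for each terminal line."""

    lines: dict[int, str] = field(default_factory=dict)

    def apply_update(self, observation: str) -> None:
        """Overwrite the lines mentioned in an 'updated lines' observation."""
        for raw in observation.splitlines():
            match = LINE_RE.match(raw)
            if match is None:
                continue  # skip anything that does not look like "[n] text"
            self.lines[int(match.group(1))] = match.group(2)

    def render(self) -> str:
        """Rebuild the full view, assuming 0-based line numbers."""
        if not self.lines:
            return ""
        last = max(self.lines)
        return "\n".join(self.lines.get(i, "") for i in range(last + 1))
```

Feeding each observation of a trajectory through `apply_update` in order and calling `render` would then give a full-view observation comparable to the dataset's other observation mode, provided the assumed layout matches.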
After extracting the RAR file, you will find a bunch of folders named by timestamps, in which you can find these files:
In hid_record.jsonl you will find:
Even though UFO can handle simple interfaces like Microsoft Word and Calculator, would it be possible to handle games like Cyberpunk 2077 or complex professional software like Premiere Pro and Photoshop? I doubt it, and I think that would need extensive training datasets, a complex training and evaluation regime, and advanced algorithms.
Hello there, Microsoft UFO Team! Excellent work on such a remarkable project, bringing AI closer to the Windows system. I am doing similar work, training custom GPT2 models on computer automation datasets.
I have created two comprehensive datasets, covering terminal and GUI environments. My strategy is to generate data with random keyboard and mouse actions, collect the resulting observations, and mix them with other textual datasets.
This naive attempt shows my strong interest in computer agents. I like the idea of GUI agent benchmark systems like WindowsBench, and have thought of building a reward system based on program exit codes or VimGolf.
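To make the exit-code idea concrete, here is a minimal sketch of what such a reward could look like. It is only an illustration under stated assumptions: the function name `exit_code_reward`, the binary 1.0/0.0 scale, and the 10-second default timeout are hypothetical choices for this example, not part of any existing benchmark.

```python
import subprocess
import sys


def exit_code_reward(command: list[str], timeout: float = 10.0) -> float:
    """Return 1.0 if the command exits with code 0, otherwise 0.0.

    Hypothetical reward shaping: timeouts and launch failures also score 0.0.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    # Use the current Python interpreter as a portable stand-in for a task program.
    print(exit_code_reward([sys.executable, "-c", "raise SystemExit(0)"]))
    print(exit_code_reward([sys.executable, "-c", "raise SystemExit(3)"]))
```

A binary signal like this is sparse, so in practice it would probably need to be combined with a denser score (for example, a keystroke count in the VimGolf case).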
If you find my suggestion useful, I would love to hear back from you! Furthermore, if cooperation is possible, I would be thrilled to join your team in building better computer agents!
Update: Google has published an unsupervised action-space training method called Genie. Consider it highly applicable to the area of computer agents.