Code and Data for an expanded Tau-Bench with training and test sets in a variety of domains

Ephibbs/big-tau

τaurus: A world simulation tool for Tool-Agent-User Interaction in Real-World Domains

Built with heavy inspiration from τ-bench by Sierra Research (https://github.com/sierra-research/tau-bench)

Domains

  1. Airline
    Manage schedules, handle customer inquiries, oversee logistics, and track service requests.

  2. Retail
    Handle stock updates, process orders, assist customer interactions, and manage returns and exchanges.

  3. Healthcare
    Coordinate appointments, manage health records, provide information, and assist with insurance processing.

  4. Legal
    Support document creation, assist with research, provide review services, and give access to relevant references.

  5. Events
    Coordinate attendee management, oversee venue options, plan schedules, and manage event-related communications.

  6. Real Estate
    Schedule property tours, provide property details, support documentation, and facilitate communications.

  7. Finance
    Conduct financial modeling, analyze market trends, generate performance reports, assess risk factors, and provide data-driven insights to support investment and business decisions.

  8. Fitness
    Track activities, suggest routines, provide wellness information, and monitor progress.

  9. Logistics
    Track deliveries, optimize routes, update inventory, and log incident reports.

  10. HR
    Post opportunities, manage schedules, oversee employee data, and support inquiries.

  11. Insurance
    Submit claims, verify policy details, confirm coverage, and track claim statuses.

  12. Space
    Track assets, monitor data, forecast conditions, and coordinate schedules.

  13. Pharmaceuticals
    Track participants, ensure compliance, coordinate testing, and generate reports.

  14. Manufacturing
    Monitor production data, manage operations, assure quality, and report on efficiency.

  15. Cybersecurity
    Track security concerns, manage incident responses, conduct assessments, and provide audit reports.

  16. Hospitality
    Handle reservations, manage preferences, oversee requests, and follow up on feedback.

  17. Banking
    Assist with account inquiries, provide transaction data, manage alerts, and offer guidance.

  18. Engineering
    Manage code repositories, review and merge pull requests, track project issues, automate testing, document technical processes, write unit tests, install dependencies, and edit code across files.

  19. Loans
    Process applications, verify criteria, answer inquiries, and manage payments.

  20. Construction
    Track project timelines, verify credentials, assist with permits, and manage communications.

  21. Agriculture
    Track performance, monitor conditions, manage resources, and assist with analytics.

  22. Advertising
    Track performance, conduct tests, analyze metrics, and adjust targeting.

  23. Sales
    Track engagement, provide follow-ups, update records, and monitor progress.

  24. Education
    Track progress, manage materials, coordinate sessions, and provide learning support.

  25. Accounting
    Manage financial records, reconcile accounts, prepare financial statements, handle budgeting, process invoices, and ensure compliance with tax regulations.

  26. Data Science
    Perform data cleaning and preprocessing, run statistical analyses, create visualizations, develop predictive models, and generate insights for decision-making.

  27. Web Automation
    Automate repetitive web tasks, extract data from websites, fill out and submit online forms, manage web scraping, schedule automated workflows, and monitor web activities for changes or updates.
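
As a rough illustration of how one of the domains above could be modeled for tool-agent-user simulation, here is a minimal sketch. Every name in it (`Domain`, `track_delivery`, `call_tool`, the sample tracking IDs) is hypothetical and chosen for illustration; none of it is taken from this repository's code, and the actual data structures may look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# NOTE: all names here are hypothetical, for illustration only.
# They are not this repository's API.


@dataclass
class Domain:
    """A domain bundles a description with the tools an agent may call."""
    name: str
    description: str
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)


def track_delivery(tracking_id: str) -> str:
    """Toy tool: look up a delivery status in an in-memory table."""
    statuses = {"PKG-1": "in transit", "PKG-2": "delivered"}
    return statuses.get(tracking_id, "unknown")


logistics = Domain(
    name="Logistics",
    description="Track deliveries, optimize routes, update inventory, "
                "and log incident reports.",
    tools={"track_delivery": track_delivery},
)


def call_tool(domain: Domain, tool_name: str, **kwargs) -> str:
    """Dispatch a tool call by name, as an agent might during a simulated chat."""
    if tool_name not in domain.tools:
        raise KeyError(f"{domain.name} has no tool named {tool_name!r}")
    return domain.tools[tool_name](**kwargs)


print(call_tool(logistics, "track_delivery", tracking_id="PKG-1"))
```

Under these assumptions, each domain in the list would get its own set of tools and backing data, while the dispatch logic stays the same across domains.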

License

See ./LICENSE.
