OpenAPI Support | Integrate ML for Realistic Data Generation #85

Open
iskitsas opened this issue Jul 4, 2024 · 3 comments

iskitsas commented Jul 4, 2024

Parent issue: #72

Integrate a machine learning model to generate realistic input data for the cURL commands.

Tasks:

  • Select an appropriate ML model for data generation.
  • Train the model with sample data to generate realistic inputs.
  • Integrate the model with the cURL command generation function.
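
For the last task, a minimal sketch of how generated values might be turned into a cURL command. The function names `shellQuote` and `buildCurl`, the example URL, and the JSON content type are assumptions for illustration; they are not the project's actual cURL generation function.

```javascript
// Hypothetical helper: wrap a value in single quotes so a POSIX shell treats it as one argument.
function shellQuote(value) {
  // Close the quote, emit an escaped apostrophe, reopen the quote.
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build a cURL command string from a method, a URL, and an optional JSON body.
function buildCurl(method, url, body) {
  const parts = ['curl', '-X', method.toUpperCase(), shellQuote(url)];
  if (body !== undefined) {
    parts.push('-H', shellQuote('Content-Type: application/json'));
    parts.push('-d', shellQuote(JSON.stringify(body)));
  }
  return parts.join(' ');
}

// Usage: the body would come from whatever data generator is chosen.
console.log(buildCurl('post', 'https://api.example.com/users', { name: 'Ada' }));
```

Quoting the body once, at the point where the command is assembled, keeps the data generator independent of shell escaping rules.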
@iskitsas iskitsas added the gsoc24 label Jul 4, 2024

AJun01 (Collaborator) commented Jul 30, 2024

Can I use const { faker } = require('@faker-js/faker'); to generate the fake data?

AJun01 (Collaborator) commented Jul 30, 2024

Creating and training our own AI model is not practical, so I suggest a few alternatives. First, use faker to generate realistic data every time an OpenAPI spec is parsed and the .flex files are generated. Second, implement an API that uses GPT to generate the data on every run (I am not sure whether that would cost money). Third, build one large data file of realistic fake values for each major field (name, email, address, password, zip, etc.), generated by GPT in a single pass, and then grab values from that file at random every time a .flex file is generated.
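
A minimal sketch of the third option, assuming the pre-generated values are stored per field as arrays. The `samplePool` object stands in for the data file (its contents here are made up for illustration), and `pickSample` is a hypothetical helper name.

```javascript
// Hypothetical stand-in for the pre-generated data file; in practice this
// could be loaded once from a JSON file on disk.
const samplePool = {
  name: ['Ada Lovelace', 'Alan Turing', 'Grace Hopper'],
  email: ['ada@example.com', 'alan@example.com'],
  zip: ['10001', '94105'],
};

// Return a random value for the given field, or undefined if the pool has none.
// The random source is injectable so the function can be tested deterministically.
function pickSample(pool, field, random = Math.random) {
  const values = pool[field];
  if (!Array.isArray(values) || values.length === 0) {
    return undefined;
  }
  return values[Math.floor(random() * values.length)];
}

// Usage: one value per field each time a .flex file is generated.
console.log(pickSample(samplePool, 'email'));
```

This avoids a paid API call on every run, at the cost of a fixed vocabulary that repeats once the pool is exhausted.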

AJun01 (Collaborator) commented Jul 31, 2024

Here is the mapping for the fields that the faker npm package can cover:

const { faker } = require('@faker-js/faker');

// Maps a field name to a generator function.
// Method names follow the @faker-js/faker v8 API (faker.person, faker.location, faker.string).
const fieldFakerMapping = {
  'name': () => faker.person.fullName(),
  'email': () => faker.internet.email(),
  'phone': () => faker.phone.number(),
  'address': () => faker.location.streetAddress(),
  'city': () => faker.location.city(),
  'state': () => faker.location.state(),
  'country': () => faker.location.country(),
  'zip': () => faker.location.zipCode(),
  'date': () => faker.date.recent().toISOString(),
  'time': () => faker.date.recent().toISOString(),
  'company': () => faker.company.name(),
  'product': () => faker.commerce.productName(),
  'price': () => faker.commerce.price(),
  'transaction': () => faker.finance.transactionDescription(),
  'account': () => faker.finance.accountNumber(),
  'crypto': () => faker.finance.bitcoinAddress(),
  'color': () => faker.color.human(),
  'adjective': () => faker.hacker.adjective(),
  'noun': () => faker.hacker.noun(),
  // v8 name; newer releases use faker.internet.username()
  'username': () => faker.internet.userName(),
  'password': () => faker.internet.password(),
  'url': () => faker.internet.url(),
  'ip': () => faker.internet.ip(),
  'ipv6': () => faker.internet.ipv6(),
  'uuid': () => faker.string.uuid(),
  'text': () => faker.lorem.text(),
  'number': () => faker.number.int({ min: 1, max: 1000 }),
  'boolean': () => faker.datatype.boolean()
};
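
A rough sketch of how a mapping like this could be looked up from OpenAPI property names, which rarely match the keys exactly (for example `userEmail` or `zip_code`). The helpers `tokenize` and `generateValue` and the matching heuristic are assumptions, not part of the project; `stubMapping` is a small stand-in with the same shape as `fieldFakerMapping` so the example runs without faker installed.

```javascript
// Split a property name such as "userEmail" or "zip_code" into lowercase words.
function tokenize(fieldName) {
  return fieldName
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // break camelCase boundaries
    .split(/[^A-Za-z0-9]+/)                 // break on _, -, spaces, etc.
    .filter(Boolean)
    .map((token) => token.toLowerCase());
}

// Hypothetical lookup: try the whole name first, then each word in order,
// and fall back to a default generator when nothing matches.
function generateValue(fieldName, mapping, fallback = () => 'string') {
  const has = (key) => Object.prototype.hasOwnProperty.call(mapping, key);
  const tokens = tokenize(fieldName);
  const candidates = [tokens.join(''), ...tokens];
  for (const key of candidates) {
    if (has(key)) {
      return mapping[key]();
    }
  }
  return fallback();
}

// Stand-in mapping with fixed values; the real one would call faker.
const stubMapping = {
  email: () => 'ada@example.com',
  zip: () => '10001',
  username: () => 'ada_l',
  ip: () => '127.0.0.1',
};

console.log(generateValue('userEmail', stubMapping)); // matched via the "email" word
```

Matching on whole words rather than substrings avoids false hits such as `description` matching the `ip` key, though it will still miss names like `IPAddress` that do not split cleanly.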
