Extend Images API to support Edits and Variations #62

Merged (5 commits) on Nov 10, 2023

Conversation

Contributor

@SunburstEnzo SunburstEnzo commented Apr 24, 2023

What

(I wasn't sure about when it should be called Image or Images, Edit or Edits, etc. but I think I was consistent at least!)

• Added ImageEditsQuery and ImageVariationsQuery (both still use ImagesResult as the response)
• Removed mention of model as a parameter for ImagesQuery in readme as that isn't currently supported

Why

More image generation abilities
Edit: #45 and #46

Affected Areas

Did not alter ImagesQuery but may be best to tweak the wording for consistency? Maybe they should all be "CreateImagesQuery", "CreateImageEditQuery", etc. – not sure

I'm not currently in a situation to use ChatGPT Plus so can't fully test

Happy to make any tweaks 👍

@SunburstEnzo
Contributor Author

SunburstEnzo commented Apr 24, 2023

I was thinking, maybe it's best to keep one images() function and pass in an enum with the query as an associated value?

Something like

enum ImageQueryType {
    case create(ImagesQuery)
    case edit(ImageEditsQuery)
    case variation(ImageVariationsQuery)
}
public func images(queryType: ImageQueryType, completion: @escaping (Result<ImagesResult, Error>) -> Void)

// Call site, e.g.:
// images(queryType: .create(ImagesQuery())) { result in /* handle result */ }
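
The single-entry-point idea above could be sketched roughly as follows. The three query structs here are simplified stand-ins rather than the library's real types, and `endpointPath(for:)` is a hypothetical helper; the paths are taken from the OpenAI Images API reference.

```swift
// Stand-in query types for illustration only; the real queries carry more fields.
struct ImagesQuery { var prompt: String }
struct ImageEditsQuery { var prompt: String }
struct ImageVariationsQuery { var n: Int }

enum ImageQueryType {
    case create(ImagesQuery)
    case edit(ImageEditsQuery)
    case variation(ImageVariationsQuery)
}

// Hypothetical helper: a single switch picks the endpoint for each kind of request.
func endpointPath(for queryType: ImageQueryType) -> String {
    switch queryType {
    case .create: return "/v1/images/generations"
    case .edit: return "/v1/images/edits"
    case .variation: return "/v1/images/variations"
    }
}
```

The trade-off is that one `images()` function keeps the public surface small, while the switch has to grow whenever a new image endpoint is added.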

Collaborator

@Krivoblotsky Krivoblotsky left a comment

Hey, @SunburstEnzo
Thanks for the update.
Looks like we need to upload images, not just pass their names.
The documentation is a bit odd: https://platform.openai.com/docs/api-reference/images/create-edit

Do you have any working example to demonstrate the approach?

@SunburstEnzo
Contributor Author

Yeah, great points. I didn't fully understand it myself, so I was wondering whether any image URL would work or whether the image had to be uploaded using the Files API.

I'm now on the pay-as-you-go OpenAI plan, so I'll try it with a valid API key, see what errors come back, and work backwards.

It may be best for me to add basic Files API support first and then come back to this?

@SunburstEnzo
Contributor Author

OK, I think I've partly worked out how to do it.

The first error was that the request must not be sent as JSON:

Invalid Content-Type header (application/json), expected multipart/form-data. (HINT: If you're using curl, you can pass -H 'Content-Type: multipart/form-data')

So that meant tweaking the header to:

let boundary = UUID().uuidString
request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

Next was working out the format of the httpBody to send (as NSMutableData), since the API then returned:

'image' is a required property

I found a Stack Overflow answer that did pretty much what I wanted (https://stackoverflow.com/a/47571172/1241153):

let body = NSMutableData()
let fname = "whitecat.png"
let mimetype = "image/png"

// First part: a plain-text form field
body.append("--\(boundary)\r\n".data(using: .utf8)!)
body.append("Content-Disposition: form-data; name=\"image\"\r\n\r\n".data(using: .utf8)!)
body.append("hi\r\n".data(using: .utf8)!)

// Second part: the image file, with its filename and MIME type
body.append("--\(boundary)\r\n".data(using: .utf8)!)
body.append("Content-Disposition: form-data; name=\"image\"; filename=\"\(fname)\"\r\n".data(using: .utf8)!)
body.append("Content-Type: \(mimetype)\r\n\r\n".data(using: .utf8)!)
body.append(imageData)
body.append("\r\n".data(using: .utf8)!)

// Closing boundary marks the end of the multipart body
body.append("--\(boundary)--\r\n".data(using: .utf8)!)
request.httpBody = body as Data

And it worked with the hardcoded image request (using the whitecat image in the readme as the base image)!
Screenshot 2023-04-25 at 4 14 57 pm 2

I haven't yet figured out how to pass the size and n properties from the query, but I got a result:
img-J1YDYUniLdkjs8vEdZYan4Yu
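
One way the `size` and `n` properties could be sent is as extra plain-text parts in the same multipart body, one part per field. This is a minimal sketch, assuming the field names `n` and `size` from the OpenAI API reference; `textPart` is a hypothetical helper, not code from this PR.

```swift
import Foundation

// Hypothetical helper (not from this PR): encodes one plain-text form field
// as a single multipart/form-data part.
func textPart(name: String, value: String, boundary: String) -> Data {
    var part = Data()
    part.append(Data("--\(boundary)\r\n".utf8))
    part.append(Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8))
    part.append(Data("\(value)\r\n".utf8))
    return part
}

let boundary = UUID().uuidString
var body = Data()
// The image file part would be appended here, as in the snippet above.
body.append(textPart(name: "n", value: "1", boundary: boundary))
body.append(textPart(name: "size", value: "256x256", boundary: boundary))
body.append(Data("--\(boundary)--\r\n".utf8)) // closing delimiter ends the body
```

Each text part has no filename and no Content-Type line, which is what distinguishes it from the file part.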

Verified API usage with OpenAI/Dall-E
Screenshot 2023-04-25 at 4 16 51 pm

@Krivoblotsky
Collaborator

@SunburstEnzo, you can use MultipartFormDataBodyEncodable protocol for that.
Please see AudioTranslationQuery as a reference.
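
For readers unfamiliar with the pattern, a protocol-based encoder lets each query list its own form fields and share one encoding routine. The sketch below only assumes that general shape: `SketchMultipartEncodable`, `SketchEntry`, and `SketchImageVariationsQuery` are hypothetical stand-ins, not the library's actual `MultipartFormDataBodyEncodable` or `AudioTranslationQuery` definitions.

```swift
import Foundation

// Hypothetical stand-in for a form entry: either a text field or a file.
enum SketchEntry {
    case string(name: String, value: String)
    case file(name: String, filename: String, contentType: String, data: Data)
}

// Hypothetical stand-in protocol: a query lists its entries and inherits the encoding.
protocol SketchMultipartEncodable {
    var entries: [SketchEntry] { get }
}

extension SketchMultipartEncodable {
    func encode(boundary: String) -> Data {
        var body = Data()
        for entry in entries {
            body.append(Data("--\(boundary)\r\n".utf8))
            switch entry {
            case let .string(name, value):
                body.append(Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8))
                body.append(Data(value.utf8))
            case let .file(name, filename, contentType, data):
                body.append(Data("Content-Disposition: form-data; name=\"\(name)\"; filename=\"\(filename)\"\r\n".utf8))
                body.append(Data("Content-Type: \(contentType)\r\n\r\n".utf8))
                body.append(data)
            }
            body.append(Data("\r\n".utf8))
        }
        body.append(Data("--\(boundary)--\r\n".utf8)) // closing delimiter
        return body
    }
}

// Hypothetical query: the filename and content type are illustrative defaults.
struct SketchImageVariationsQuery: SketchMultipartEncodable {
    let image: Data
    let n: Int
    let size: String

    var entries: [SketchEntry] {
        [
            .file(name: "image", filename: "image.png", contentType: "image/png", data: image),
            .string(name: "n", value: String(n)),
            .string(name: "size", value: size),
        ]
    }
}
```

With this shape, the request code only needs to call `encode(boundary:)` and set the matching `Content-Type` header, whatever the query type is.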

@SunburstEnzo
Contributor Author

🤦‍♂️ nice one! Didn't spot that, should've read more on what was already in the project. I'll try and get it working like that 👍

@SunburstEnzo
Contributor Author

@Krivoblotsky I've updated ImageEditsQuery and ImageVariationsQuery to use MultipartFormDataBodyEncodable as mentioned.

Had to edit the MultipartFormDataEntry enum to allow for optional filename and file when using a mask image, but instead I could add an if let just for the mask entry in the BodyBuilder if that’s more favourable.
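
The `if let` alternative mentioned above might look roughly like this. It is a sketch under assumptions: `FormEntry` and `editEntries` are hypothetical names, not the library's `MultipartFormDataEntry` or its body builder, and the filenames are placeholders.

```swift
import Foundation

// Hypothetical entry type whose payloads stay non-optional.
enum FormEntry {
    case string(name: String, value: String)
    case file(name: String, filename: String, data: Data)
}

// Hypothetical helper: builds the entry list for an edit request,
// adding the mask entry only when a mask image is supplied.
func editEntries(image: Data, mask: Data?, prompt: String) -> [FormEntry] {
    var entries: [FormEntry] = [
        .file(name: "image", filename: "image.png", data: image),
        .string(name: "prompt", value: prompt),
    ]
    if let mask = mask {
        entries.append(.file(name: "mask", filename: "mask.png", data: mask))
    }
    return entries
}
```

Keeping the optionality at the call site means the entry type never has to represent a file with no data, at the cost of one conditional per optional field.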


Also another white cat image but sized to 256x256 by Dall-E:
img-SD6nj7ztSNntVx2m7uARyhlS

@Krivoblotsky Krivoblotsky self-requested a review August 22, 2023 11:45

sonarcloud bot commented Nov 9, 2023

Kudos, SonarCloud Quality Gate passed!

• Bugs: 0 (rating A)
• Vulnerabilities: 0 (rating A)
• Security Hotspots: 0 (rating A)
• Code Smells: 6 (rating A)
• Coverage: no coverage information
• Duplication: 0.0%

@ingvarus-bc ingvarus-bc mentioned this pull request Nov 10, 2023
@ingvarus-bc ingvarus-bc merged commit 41b3515 into MacPaw:main Nov 10, 2023
5 checks passed
3 participants