proposal: io: add RuneWriter interface #71027

Open
mateusz834 opened this issue Dec 26, 2024 · 8 comments

@mateusz834
Member

mateusz834 commented Dec 26, 2024

Proposal Details

Today I was surprised to find that io does not define a RuneWriter interface. I think we should add one: it is already implemented by several types in the standard library, and io already has io.RuneReader as well as io.ByteWriter/io.ByteReader.

(*strings.Builder).WriteRune
(*bytes.Buffer).WriteRune
(*bufio.Writer).WriteRune

Proposed API:

type RuneWriter interface {
	// WriteRune writes the UTF-8 encoding of Unicode code point r,
	// and returns the number of bytes written. In case of an error
	// while writing, the WriteRune method might write part of a
	// UTF-8 representation of that rune.
	// If the rune is out of range, it writes the encoding of [utf8.RuneError].
	WriteRune(r rune) (int, error)
}
@gopherbot gopherbot added this to the Proposal milestone Dec 26, 2024

@Jorropo
Member

Jorropo commented Dec 26, 2024

What do we gain from being prescriptive about UTF-8?
It seems to me we could have UTF-16 and UTF-32 implementations of this interface forwarding to an underlying ByteWriter or similar.

@mateusz834
Member Author

What do we gain from being prescriptive about UTF-8?

I don't have a good answer for that, but in my use case I check whether an io.Writer implements RuneWriter; if so I use it, otherwise I fall back to utf8.EncodeRune followed by Write([]byte). In that case UTF-8 makes sense.

@earthboundkid
Contributor

If there's an io.RuneWriter interface, I would expect an io.WriteRune(io.Writer) (int, error) function to go with it.

@mateusz834
Member Author

If there's an io.RuneWriter interface, I would expect an io.WriteRune(io.Writer) (int, error) function to go with it.

FWIW, we currently don't have an io.WriteByte(io.Writer) (int, error) for io.ByteWriter.

@jimmyfrasche
Member

I think the argument against WriteByte is that it encourages small writes against a possibly unbuffered output. Though it would be handy for writing code that does a lot of writes to an arbitrary writer where the caller is responsible for ensuring any required buffering.

WriteRune would have to go in unicode/utf8, presumably just called Write in that case. (:+1:)
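
The concern about small writes can be illustrated with a short sketch. countingWriter, writeRunes, and writeRunesBuffered are hypothetical names introduced here; the sketch assumes the caller is the one who decides whether to add a bufio.Writer.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"unicode/utf8"
)

// countingWriter is a hypothetical unbuffered sink that records how many
// times Write is called.
type countingWriter struct {
	calls int
	data  []byte
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.data = append(c.data, p...)
	return len(p), nil
}

// writeRunes encodes each rune of s and issues one Write per rune.
func writeRunes(w io.Writer, s string) error {
	var buf [utf8.UTFMax]byte
	for _, r := range s {
		n := utf8.EncodeRune(buf[:], r)
		if _, err := w.Write(buf[:n]); err != nil {
			return err
		}
	}
	return nil
}

// writeRunesBuffered does the same work through a bufio.Writer, so the
// underlying writer only sees the data when the buffer is flushed.
func writeRunesBuffered(w io.Writer, s string) error {
	bw := bufio.NewWriter(w)
	for _, r := range s {
		if _, err := bw.WriteRune(r); err != nil {
			return err
		}
	}
	return bw.Flush()
}

func main() {
	direct, buffered := &countingWriter{}, &countingWriter{}
	if err := writeRunes(direct, "héllo"); err != nil {
		panic(err)
	}
	if err := writeRunesBuffered(buffered, "héllo"); err != nil {
		panic(err)
	}
	fmt.Println(direct.calls, buffered.calls) // prints "5 1"
}
```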

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Dec 26, 2024
@mateusz834
Member Author

WriteRune would have to go in unicode/utf8, presumably just called Write in that case. (:+1:)

I wonder whether we want to import io in utf8; currently utf8 does not import anything.

@apparentlymart

Although it's not exactly the same, this proposal made me think of golang.org/x/text/encoding.

Specifically, encoding.Encoder provides a facility to transcode from UTF-8 to some other character encoding. In principle it could also offer a facility to transcode from a single rune to some other character encoding -- at least for the subset of character encodings that have a lossless roundtrip transform through Unicode -- although I acknowledge that the foundation it's currently built on is not really well-suited for individual rune transformation. (It's a stream transformer rather than a character transformer in particular.)

All of that is to say that I agree with the commentary above that this interface ought not to require that all implementers produce UTF-8. A specific implementation of this interface could decide to use UTF-8 as its output format. bufio.Writer in particular effectively implements this interface for UTF-8 around any other io.Writer, but implementations producing other character encodings would be viable too, and in particular x/text might want to somehow offer an implementation that wraps an io.Writer with an arbitrary character encoding, assuming there were some way to adapt its current UTF-8-bytes-specific design to work efficiently with individual runes. (I cannot say whether the maintainers of that library would actually want to offer that, of course... that would be a separate proposal.)
