Update instructions to use -netdev stream #76
Since QEMU 7.2 wrappers are no longer needed[0]; update the documentation to instruct users to prefer the new options added in 2022[1] but not documented until 2024[2]. The client can be entirely removed in a future release. [0] https://john-millikin.com/improved-unix-socket-networking-in-qemu-7.2 [1] qemu/qemu@5166fe0 [2] qemu/qemu@178413a Signed-off-by: Tamir Duberstein <[email protected]>
Thanks! This is very useful.
Looks good, did you test the modified examples?
I did test, yeah. I also updated the test script which I hope runs in CI. There is one downside to removing the client: QEMU's |
We should move to a dgram socket. This simplifies the code and is more efficient, since we eliminate the length prefix on both sides (qemu/lima and socket_vmnet). This matters most for lima, which copies every packet read from the socket_vmnet unix socket to a datagram socket connected to vz. If we support datagram sockets in socket_vmnet, we can pass the file descriptor of the datagram socket connected to vz to socket_vmnet and eliminate the expensive copy in lima.
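To make the length-prefix point concrete, here is a minimal sketch comparing the two transports. It assumes a 4-byte big-endian length header in front of each packet on the stream socket; the helper names (`send_frame`, `recv_frame`, `demo_framing`) are made up for illustration and are not taken from socket_vmnet or lima.

```python
import socket
import struct

# Assumed framing: 4-byte big-endian length, then the packet bytes.
HEADER = struct.Struct("!I")


def send_frame(sock, packet):
    """Stream socket: prepend the length so the peer can find packet boundaries."""
    sock.sendall(HEADER.pack(len(packet)) + packet)


def recv_exact(sock, n):
    """Read exactly n bytes; a stream recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the stream mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock):
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo_framing():
    # Stream socket: boundaries exist only because of the length prefix.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        send_frame(a, b"first")
        send_frame(a, b"second packet")
        stream_result = [recv_frame(b), recv_frame(b)]

    # Datagram socket: each send() is delivered as one packet, no header needed.
    c, d = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with c, d:
        c.send(b"first")
        c.send(b"second packet")
        dgram_result = [d.recv(65536), d.recv(65536)]

    return stream_result, dgram_result
```

With a datagram socket, both the header and the `recv_exact` loop go away, which is the simplification described above.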
I think moving to datagram sockets is going to be a good deal more complex; since dgram communication is inherently unidirectional each VM will need two sockets (one for inbound and one for outbound communication). This is described in the blog I linked above. This means we'll certainly need a client wrapper responsible for allocating an ephemeral address for the receiving socket and we'll also need a way to communicate that socket's address to the All in all it isn't difficult to see why stream sockets were initially chosen. |
Datagram sockets are not unidirectional. You can read and write packets on the same file descriptor. I'm using a single dgram socket to integrate vfkit with vmnet. I'm creating a datagram socketpair. One end is passed to vfkit using:
This is used to create a file handle device attachment in vz. The other fd is connected to a helper process that creates a vmnet interface. The helper process reads packets from the file descriptor and writes them to the vmnet interface, and reads packets from the vmnet interface and writes them to the file descriptor. The same should work with lima and qemu.
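As a rough illustration of the single-descriptor model, the sketch below creates a datagram socketpair and sends one packet in each direction. The names `vm_fd` and `helper_fd` are only labels for this example; no real vmnet interface or vfkit process is involved, and the byte strings stand in for Ethernet frames.

```python
import socket


def demo_bidirectional():
    # Two connected, unnamed datagram sockets; no bind() or address is needed.
    vm_fd, helper_fd = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with vm_fd, helper_fd:
        # VM side writes a frame; the helper would forward it to vmnet.
        vm_fd.send(b"frame from vm")
        to_vmnet = helper_fd.recv(65536)

        # Helper writes a frame read from vmnet back on the same descriptor.
        helper_fd.send(b"frame from vmnet")
        to_vm = vm_fd.recv(65536)

        return to_vmnet, to_vm
```

Each side uses one descriptor for both reading and writing, so a second socket per VM is not required.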
Do you use an abstract socket address to bind vm-fd?
I’m using socketpair(2) - it returns 2 connected sockets without an address.
I see. That's probably using abstract addresses under the covers. Another question: why do we create a single vmnet interface for all clients rather than one per client? |
The sockets have empty addresses. Also, the abstract address namespace is a Linux extension. On macOS there is no such thing.
I'm not sure. I'm creating one vmnet interface per vm, and each helper process forwards packets between one vmnet interface and one vm. This model is much simpler and gives much better performance. Creating one vmnet interface per vm means that vmnet is responsible for forwarding packets between vms. In socket_vmnet we forward packets between vms and from the vmnet interface to all vms. Because socket_vmnet does not know the vms' mac addresses, it copies every packet from each socket to all other sockets and the vmnet interface, and every packet from the vmnet interface to all vms. This scales very badly (#58).
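The scaling difference can be sketched with a back-of-the-envelope count of userspace copies per packet. This only models the behaviour described in the comment above, it is not a measurement, and the function names are hypothetical.

```python
def copies_shared_interface(num_vms, from_vm=True):
    """Copies per packet when one daemon fans every packet out to all peers."""
    if from_vm:
        # To the other (num_vms - 1) VM sockets, plus one to the vmnet interface.
        return (num_vms - 1) + 1
    # A packet arriving from the vmnet interface is copied to every VM socket.
    return num_vms


def copies_interface_per_vm():
    """Copies per packet when each helper forwards between one VM and one vmnet interface."""
    return 1


def copy_table(vm_counts):
    """Return (num_vms, shared copies, per-VM copies) rows for comparison."""
    return [
        (n, copies_shared_interface(n), copies_interface_per_vm())
        for n in vm_counts
    ]
```

In this model the shared design does work proportional to the number of VMs for every packet, while the per-VM design stays constant and leaves inter-VM forwarding to vmnet.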
Right, this is what I expected. Perhaps changing this would be a good place to start? |
This is a major redesign and I'm not sure it is worth the effort, because the helper process model, where every vm has a small and simple helper process forwarding packets between the vm and vmnet, is simpler. For example, we don't need a launchd service, since the helper is created and managed by the program starting the vm (e.g. lima or minikube).

If we want to keep the single daemon serving multiple vms, moving to a vmnet interface per vm seems like the right design. With this model, socket_vmnet needs to keep a control socket for receiving datagram socket descriptors from other processes. When a datagram socket is passed, it will start a vmnet interface for the socket and start forwarding packets between the socket and vmnet. When the vmnet interface is created, we get a mac address from vmnet. This mac address must be used by the vm so vmnet can forward packets back to the vm. This requires a protocol to return the mac address from vmnet to the vm. This can be implemented using the control socket.

Currently lima generates a unique mac address for every instance based on the instance name, so the instance is more likely to get the same IP address each time. With vmnet we cannot control the mac address, but we can specify an interface UUID. Using the same UUID will return the same mac address. lima can generate the UUID in the same way it generates the mac address, so each instance will have a unique and constant UUID. The protocol for starting a vmnet interface will need to accept the UUID and return the mac address provided by vmnet for this UUID. This requires changes in programs using socket_vmnet, like lima or minikube.
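One way the UUID idea could look, as a sketch only: derive a stable UUID from the instance name and send it in a start request, then read the mac address from the reply. The namespace string, the JSON field names (`op`, `interface_id`, `mac_address`) and the function names are all assumptions made for this example; no such protocol exists in socket_vmnet today, and this is not how lima currently derives its mac addresses.

```python
import json
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "socket-vmnet.example")


def interface_uuid(instance_name):
    """Same instance name always yields the same UUID (name-based, SHA-1)."""
    return uuid.uuid5(NAMESPACE, instance_name)


def start_request(instance_name):
    """Encode a hypothetical 'start interface' request for the control socket."""
    return json.dumps(
        {"op": "start", "interface_id": str(interface_uuid(instance_name))}
    ).encode()


def parse_start_response(raw):
    """Extract the mac address the daemon reports for the new interface."""
    return json.loads(raw)["mac_address"]
```

Because the UUID depends only on the instance name, restarting an instance would present the same UUID to vmnet and, per the comment above, get the same mac address back.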
This is of course simpler, but it requires running the helper process as root - or have you found a way to avoid that? |
There is no way to avoid that. This is the limitation set by Apple, and the reason we need socket_vmnet. If you use vz in a program from the app store, you may be able to get the required entitlement and use native networking without vmnet. This is much faster than using vmnet, but it cannot work for an open source project where the program can be built by anyone without getting permission from Apple. You can add a sudoers rule to allow your program to run the helper as root. This is the same solution used by lima when you want to manage socket_vmnet with lima. Lima creates a sudoers rule for you (try |
Yes, makes sense. I think we should probably head toward the redesign you mentioned. XPC seems to do everything we need. Would an XPC interface work for vz? |
Not sure how you want to use xpc. For vz, the interface is a connected datagram socket file descriptor plus a mac address, given when configuring the network device.
The XPC interface would be here. It's the interface over which the client passes the vmnet configuration + the datagram socket. |
The datagram socket is a file descriptor in the process creating the vz virtual machine. We have the other end of the socket pair, which needs to be passed to socket_vmnet, and this requires a unix socket. Maybe XPC supports this, but requiring it means it will be hard to integrate with other tools that try to work on multiple platforms, like lima and minikube. Lima uses this module to pass fds on unix socket: Same code can be used by minikube, so this seems like the right way to communicate with socket_vmnet. The rest can be very simple json messages and responses that are compatible with anything that can use a unix socket and json. I don't think we should use any Apple-only technology as the public interface. This is fine for internal implementation, or if you control the entire system. Let's move this discussion to a new issue.
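A minimal sketch of passing a datagram socket over a unix socket together with a json message, using Python's `socket.send_fds` and `socket.recv_fds` (Python 3.9+). The message shape and the function names are assumptions for illustration; the control channel here is a socketpair standing in for a real listening unix socket, and this is not the module lima uses.

```python
import json
import socket


def send_socket(control, sock, message):
    """Send a json message and one file descriptor in a single SCM_RIGHTS message."""
    socket.send_fds(control, [json.dumps(message).encode()], [sock.fileno()])


def recv_socket(control):
    """Receive the json message and rebuild a socket object from the passed fd."""
    data, fds, _flags, _addr = socket.recv_fds(control, 4096, 1)
    return json.loads(data), socket.socket(fileno=fds[0])


def demo_fd_passing():
    # Stand-in for the control connection between a client and the daemon.
    client_ctl, daemon_ctl = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # vm_end stays with the VM process; daemon_end is handed to the daemon.
    vm_end, daemon_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with client_ctl, daemon_ctl, vm_end, daemon_end:
        send_socket(client_ctl, daemon_end, {"op": "start"})
        message, received = recv_socket(daemon_ctl)
        with received:
            # The received descriptor refers to the same socket as daemon_end.
            vm_end.send(b"frame")
            return message, received.recv(65536)
```

Nothing here is Apple-specific, which is the portability argument made above: any client that can open a unix socket and encode json could speak such a protocol.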
Sounds good. I'll go ahead and close this since we're almost certainly going to keep a client in place. |
I opened #77. |