guestagent over grpc broken with WSL2 #2209

Closed
@pendo324

Description

I previously fixed this issue and confirmed the fix was working (#2118), but while testing off of main I found a new issue: the change to use gRPC appears to have broken guest agent communication with WSL2 over VSOCK.

I confirmed that communication over VSOCK itself is working, using a simple socat echo server inside the VM (socat VSOCK-LISTEN:1234,fork EXEC:cat) and a simple Go program on my host:

Simple program

Comment out the test code, and then uncomment the grpc code for a simple repro.

package main

import (
	"context"
	"fmt"
	"net"

	"github.com/Microsoft/go-winio"
	"github.com/Microsoft/go-winio/pkg/guid"
	"google.golang.org/protobuf/runtime/protoimpl"
)

type IPPort struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	Ip   string `protobuf:"bytes,1,opt,name=ip,proto3" json:"ip,omitempty"`
	Port int32  `protobuf:"varint,2,opt,name=port,proto3" json:"port,omitempty"`
}

type Info struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	LocalPorts []*IPPort `protobuf:"bytes,1,rep,name=local_ports,json=localPorts,proto3" json:"local_ports,omitempty"`
}

func getCon(ctx context.Context) (net.Conn, error) {
	// This was my VM's VM ID. You can get yours by following the logic in this file:
	// https://github.com/lima-vm/lima/blob/master/pkg/windows/wsl_util_windows.go#L13
	// OR running these commands (basically the same thing)
	// Get-CimInstance Win32_Process -Filter "name = 'wslhost.exe'" | Select CommandLine | ConvertTo-Json
	// --distro-id will correlate to the registry keys under
	// Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Lxss
	// Most likely, one is for your default distro, and one for your Lima distro.
	VMIDStr := "a2fae325-ef5c-4ecc-a78e-043ab330a3e6"
	// The port of the currently running lima-guestagent. This can be retrieved from logging
	// into the VM and running systemctl status lima-guestagent
	VSockPort := 123456
	VMIDGUID, err := guid.FromString(VMIDStr)
	if err != nil {
		// Return instead of continuing with a zero-value GUID.
		return nil, fmt.Errorf("converting VM ID to GUID: %w", err)
	}
	sockAddr := &winio.HvsockAddr{
		VMID:      VMIDGUID,
		ServiceID: winio.VsockServiceID(uint32(VSockPort)),
	}
	fmt.Printf("Dialing vsock %s with port %d\n", VMIDStr, VSockPort)
	return winio.Dial(ctx, sockAddr)
}

func main() {
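	conn, err := getCon(context.Background())
	if err != nil {
		// conn is nil on failure, so conn.RemoteAddr() below would panic.
		fmt.Printf("err: %v\n", err)
		return
	}
	defer conn.Close()
	fmt.Printf("remoteAddr: %v\n", conn.RemoteAddr())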
	buf := []byte("Test!")
	bytesWritten, err := conn.Write(buf)
	if err != nil {
		fmt.Printf("Error writing to conn: %v\n", err)
	}
	fmt.Printf("bytesWritten: %d, buf: %s\n", bytesWritten, buf)
	read := make([]byte, bytesWritten)
	bytesRead, err := conn.Read(read)
	if err != nil {
		fmt.Printf("Error reading from conn: %v\n", err)
	}
	fmt.Printf("bytesRead: %d, read: %s\n", bytesRead, read)

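	// The gRPC repro below also needs these imports:
	//   "google.golang.org/grpc"
	//   "google.golang.org/grpc/credentials/insecure"
	//   "google.golang.org/protobuf/types/known/emptypb"
	//
	// opts := []grpc.DialOption{
	// 	grpc.WithContextDialer(func(ctx context.Context, target string) (net.Conn, error) {
	// 		return getCon(ctx)
	// 	}),
	// 	grpc.WithTransportCredentials(insecure.NewCredentials()),
	// }

	// clientConn, err := grpc.Dial("", opts...)
	// if err != nil {
	// 	fmt.Printf("Error dialing gRPC: %v\n", err)
	// 	return
	// }
	// out := new(Info)
	// // Pass the request as a pointer; a bare emptypb.Empty{} value is not a proto message.
	// err = clientConn.Invoke(context.Background(), "/GuestService/GetInfo", &emptypb.Empty{}, out)
	// fmt.Printf("err: %v\n", err)
	// fmt.Printf("out: %v\n", out)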
}

You can monitor established vsock connections by using something like:

watch -n1 "ss | grep 123456"

If you remove the Read from the test program so it only ever writes, the connection will linger as established, making it easier to see in the output of ss. This works even against the lima-guestagent server, which leads me to believe the issue is with the gRPC communication (maybe the client?), not with vsock itself.

So I'm not yet sure why running Lima and trying to use the grpc connection gives the following error:

rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: http2: frame too large"
