Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scandir_windows.go should use faster code #30

Open
karrick opened this issue May 7, 2019 · 4 comments
Open

scandir_windows.go should use faster code #30

karrick opened this issue May 7, 2019 · 4 comments

Comments

@karrick
Copy link
Owner

karrick commented May 7, 2019

HELP WANTED

If anyone has the knowledge and the cycles to figure out how to more quickly read a file system directory in Windows, I'd love to hear from you, whether it's a PR, or a link to some document I can read to learn more.

@karrick
Copy link
Owner Author

karrick commented Oct 3, 2019

After merging #38, it's not readdir_windows.go that would need faster code, but rather scandir_windows.go.

Right now, scandir_windows.go reads a single directory entry from an open file system handle.

There may even be advantages from reading more than a single entry, and if there are, then reading a block of entries into a structure field slice, and enumerating them on each call to Scan might perform better. In such a case, when the structure field has no more items, the Scan code would attempt to read more entries, and eventually run out of entries.

@karrick karrick changed the title readdir_windows.go should use faster code scandir_windows.go should use faster code Oct 3, 2019
@niallnsec
Copy link

First of all thank you for this project, the behaviour of filepath.Walk on Windows has been driving me up the wall.

I think you are spot on when you say

There may even be advantages from reading more than a single entry

You can potentially reduce the enumeration to a single system call in many cases, but it depends on how low level you are happy to go.

You could use the API NtQueryDirectoryFile which is exposed in ntdll.dll. You can use this API to enumerate a full directory in one go, even allowing for filtering with wildcards.

I am currently using this API as part of a project to enumerate the named pipes on Windows (since named pipes are exposed just like a normal filesystem with directories and files) and it works very well.

I should note, that although this API is only officially documented for Kernel mode (see ZwQueryDirectoryFile) the Nt version is just a User mode wrapper and has been available in Windows since XP. I have tested the use of this function on every version of Windows from XP to 10 (build 1909) and have found no issues whatsoever. As it has stayed the same for more than two decades now I would not expect Microsoft to make any breaking changes to it in the foreseeable future.

The prototype for the function I am using looks like this:

import (
	"syscall"
	"unsafe"

	"golang.org/x/sys/windows"
)

type (
	FILE_DIRECTORY_INFORMATION struct {
		NextEntryOffset uint32
		FileIndex       uint32
		CreationTime    uint64
		LastAccessTime  uint64
		LastWriteTime   uint64
		ChangeTime      uint64
		EndOfFile       uint64
		AllocationSize  uint64
		FileAttributes  uint32
		FileNameLength  uint32
		FileName        [1]uint16
	}

	FILE_NAMES_INFORMATION struct {
		NextEntryOffset uint32
		FileIndex       uint32
		FileNameLength  uint32
		FileName        [1]uint16
	}

	UNICODE_STRING struct {
		Length    uint16
		MaxLangth uint16
		Buffer    uintptr
	}

	IO_STATUS_BLOCK struct {
		StatusPointer uintptr
		Information   uintptr
	}
)

var (
	modntdll = windows.NewLazySystemDLL("ntdll.dll")
	procNtQueryDirectoryFile = modntdll.NewProc("NtQueryDirectoryFile")
)

const (
	STATUS_NO_MORE_FILES = syscall.Errno(0x80000006)

	FileDirectoryInformation = uint32(1)
	FileNamesInformation = uint32(12)
)

func NtQueryDirectoryFile(hFile syscall.Handle, hEvent syscall.Handle, apcRoutine uintptr,
	apcContext uintptr, ioStatusBlock *IO_STATUS_BLOCK, finfo uintptr, length uint32,
	fiClass uint32, singleEntry uint32, fname *UNICODE_STRING, restart uint32) error {
	r1, _, _ := syscall.Syscall12(
		procNtQueryDirectoryFile.Addr(),
		11,
		uintptr(hFile),
		uintptr(hEvent),
		apcRoutine,
		apcContext,
		uintptr(unsafe.Pointer(ioStatusBlock)),
		finfo,
		uintptr(length),
		uintptr(fiClass),
		uintptr(singleEntry),
		uintptr(unsafe.Pointer(fname)),
		uintptr(restart),
		0,
	)
	if r1 != 0 {
		return syscall.Errno(r1)
	}
	return nil
}

Note: You can remove the dependency on x/sys/windows by using syscall.NewLazyDll. The only reason to use windows.NewLazySystemDLL is because it restricts the paths the OS will load the DLL from, ensuring that the system copy is used and not a malicious DLL with the same name placed in the OS search path. This advantage is probably irrelevant to ntdll.dll since it is automatically loaded into every process and so there is no risk of loading an incorrect version here.

You can call it in a loop until you get an error code of STATUS_NO_MORE_FILES. If you provide it with a relatively large buffer then you have a good chance of getting all of the information in one go.

I think the loop would look something like this (pulled out from a test project):

var status IO_STATUS_BLOCK
buf := make([]byte, 1024*64)
for {
	err = NtQueryDirectoryFile(
		hFile,
		0,
		0,
		0,
		&status,
		uintptr(unsafe.Pointer(&buf[0])),
		1024*64,
		FileDirectoryInformation,
		0,
		nil,
		0,
	)
	if err != nil {
		if err.(syscall.Errno) == STATUS_NO_MORE_FILES {
			break
		} else {
			panic(err)
		}
	}
	if status.Information == 0 {
		panic("no data")
	}
	var offset uint32
	fdi := (*FILE_DIRECTORY_INFORMATION)(unsafe.Pointer(&buf[0]))
	for { //Walk the returned buffer
		var name []uint16
		var sname string
		var file FileInformation
		if fdi.FileNameLength > 0 {
			//Convert the variable length field FileName to a slice so it can be passed to UTF16ToString
			h := (*reflect.SliceHeader)(unsafe.Pointer(&name))
			h.Data = uintptr(unsafe.Pointer(&fdi.FileName[0]))
			h.Len = int(fdi.FileNameLength) / 2
			h.Cap = h.Len
			sname = syscall.UTF16ToString(name)
		}
		file.Name = sname
		file.Size = int32(fdi.AllocationSize)

		output = append(output, &file)

		offset += fdi.NextEntryOffset
		if fdi.NextEntryOffset == 0 || offset > uint32(status.Information) {
			break
		}
		fdi = (*FILE_DIRECTORY_INFORMATION)(unsafe.Pointer(&buf[offset]))
	}
}

Note: This example is using FileDirectoryInformation as the information type and so we recieve a buffer of FILE_DIRECTORY_INFORMATION structures. However, if the file name is the ONLY piece of information required the call to NtQueryDirectoryFile could be changed to use FileNamesInformation which returns an array of FILE_NAMES_INFORMATION structures and should be slightly quicker.

The output variable in this example would be a storage buffer for Scan to read the information from.

Scan could be implemented such that it reads from the storage buffer unless it is empty, in which case it makes a single call to NtQueryDirectoryFile. If the call returns STATUS_NO_MORE_FILES then Scan would return false as there are no more items to be read, otherwise it would repopulate the storage buffer with the returned data and allow you to keep iterating.

In theory this should be the fastest possible way to enumerate a directory on Windows as it bypasses the overhead added by both the Go standard library and the FindFirstFile/FindNextFile APIs by going directly to the root data source in the Nt API.

If this is not going too low level I'd be happy to try putting together a PR using this approach, although time is a little tight right now so I probably won't be able to do it right away.

@karrick
Copy link
Owner Author

karrick commented Jan 28, 2020

@niallnsec, thanks for your feedback. I greatly appreciate the efforts you put into the above. I just did a first run improvement on how ReadDirents and ReadDirnames run on Windows, by having them invoke the syscall with a -1 limit argument, which has them return all the os.FileInfo instances. This should be a bit more efficient, but I still need to give your above code examples a thorough read through, and see about adding them.

I appreciate your help, and have not stopped working on this issue.

@kapsh
Copy link

kapsh commented Oct 14, 2021

@karrick hi, was this implemented yet? We noticed that telegraf filecount input doesn't perform well on Windows with large nested files count (millions specifically). I believe that godirwalk 1.16.1 works under the hood there, so speeding this up would benefit many it's users.
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants