
Add more pathlib benchmarks #261

Open
wants to merge 1 commit into
base: main

Conversation

@zmievsa

zmievsa commented Jan 15, 2023

A small extension of the current pathlib benchmarks.

My current implementation is very barebones and dumb because I am quite new to writing pyperformance benchmarks and benchmarks in general. Any and all opinions on how to improve my work are welcome!

@mdboom
Contributor

mdboom commented Feb 7, 2023

I have no comment one way or the other about these changes, but we should rename the benchmark to pathlib2 if we do this so that comparisons won't be misleading.

@zmievsa
Author

zmievsa commented Feb 9, 2023

@mdboom could you help me understand what you mean by "so that comparisons won't be misleading"?

@mdboom
Contributor

mdboom commented Feb 9, 2023

> @mdboom could you help me understand what you mean by "so that comparisons won't be misleading"?

When a benchmark is changed significantly like this, previously run baseline benchmarks are no longer meaningful to compare against, and could lead someone to make the wrong decision based on the comparison.

@barneygale

Sorry for taking so long to look at this.

Thanks for putting this together. I'm already using it locally, with some modifications :)

IIUC, pyperformance isn't designed for function-by-function benchmarking. I think we might want to merge some of these test cases. Overall, I think I'm aiming for:

  • pathlib: previous code for this benchmark per @mdboom's comment
  • pathlib_construct: cover PurePath(), Path(), joinpath(), fspath(path)
  • pathlib_normalize: cover drive, root, anchor, parts, name, suffix, suffixes, stem, with_name(), with_stem(), with_suffix(), relative_to(), is_relative_to(), parent, parents, is_reserved(), match()
  • pathlib_string: cover str(path), as_posix(), bytes(path), path.as_uri(), repr(path)
  • pathlib_compare: cover a == b, hash(a), a < b
  • pathlib_fs: cover absolute(), stat(), open(), touch(), mkdir(), unlink(), rmdir(), exists(), is_dir(), is_file()
  • pathlib_fs_walk: cover iterdir(), glob(), rglob(), walk()

(pyperformance folks, please correct me if I'm doing this wrong)

This PR could add pathlib_construct and pathlib_normalize.
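As a rough illustration of what merging the per-function cases into pathlib_construct and pathlib_normalize might look like, here is a minimal sketch. It only covers a subset of the operations listed above, the sample paths and function names (`SAMPLE_PATHS`, `bench_construct`, `bench_normalize`) are hypothetical, and it uses only pure path classes so it runs on any platform. Each function takes a loop count and returns elapsed seconds, which is the shape pyperf's `Runner.bench_time_func` expects.

```python
import os
import time
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical sample inputs; a real benchmark would want a broader mix.
SAMPLE_PATHS = [
    (PurePosixPath, "/usr/lib/python3/site-packages/pkg/module.py"),
    (PurePosixPath, "docs/index.rst"),
    (PureWindowsPath, "C:\\Users\\me\\project\\archive.tar.gz"),
    (PureWindowsPath, "src\\main.c"),
]


def bench_construct(loops, samples=SAMPLE_PATHS):
    """Time path construction, joinpath() and os.fspath() as one group."""
    t0 = time.perf_counter()
    for _ in range(loops):
        for cls, raw in samples:
            path = cls(raw)
            path.joinpath("child", "leaf.txt")
            os.fspath(path)
    return time.perf_counter() - t0


def bench_normalize(loops, samples=SAMPLE_PATHS):
    """Time a batch of parsing-related accessors as one group."""
    t0 = time.perf_counter()
    for _ in range(loops):
        for cls, raw in samples:
            # A fresh object per iteration, so construction cost is included
            # and previously cached attributes are not reused.
            path = cls(raw)
            path.drive, path.root, path.anchor, path.parts
            path.name, path.suffix, path.suffixes, path.stem
            path.parent, path.parents
            path.with_name("other.txt")
            path.with_suffix(".bak")
            path.match("*.py")
    return time.perf_counter() - t0
```

With pyperf installed, each function could presumably be registered with something like `runner.bench_time_func("pathlib_construct", bench_construct)`. Whether `bench_normalize` should build a fresh path on every iteration is a design choice that would need discussion.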

We need more realistic test cases for path construction. We need a variety of paths: POSIX and Windows, absolute and relative, short and (some) long. I'd imagine that the average path length falls off quite rapidly, so most of our paths should be <5 components long, with very few >10. It would be good to generate some realistic-looking file extensions too (the benchmark already does something like this, but for concrete paths).
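One way the path generation described above could be sketched: draw the component count from a distribution that falls off quickly, then build a mix of POSIX and Windows, absolute and relative pure paths. The name list, extension list, and probabilities here are illustrative assumptions, not measured data, and the helper names (`random_component_count`, `make_path`, `make_paths`) are hypothetical.

```python
import random
from pathlib import PurePosixPath, PureWindowsPath

# All names and weights below are illustrative assumptions, not measured data.
NAMES = ["src", "lib", "docs", "tests", "data", "build", "utils", "core"]
EXTENSIONS = [".py", ".txt", ".c", ".json", ".tar.gz", ""]


def random_component_count(rng, max_components=12):
    """Draw a length skewed toward short paths (geometric-style falloff)."""
    count = 1
    while count < max_components and rng.random() < 0.6:
        count += 1
    return count


def make_path(rng):
    """Build one pure path: POSIX or Windows, absolute or relative."""
    count = random_component_count(rng)
    parts = [rng.choice(NAMES) for _ in range(count - 1)]
    parts.append(rng.choice(NAMES) + rng.choice(EXTENSIONS))
    windows = rng.random() < 0.5
    absolute = rng.random() < 0.5
    if windows:
        prefix = "C:\\" if absolute else ""
        return PureWindowsPath(prefix + "\\".join(parts))
    prefix = "/" if absolute else ""
    return PurePosixPath(prefix + "/".join(parts))


def make_paths(n, seed=0):
    """Generate n paths; a fixed seed keeps benchmark inputs reproducible."""
    rng = random.Random(seed)
    return [make_path(rng) for _ in range(n)]
```

With a continuation probability of 0.6, roughly 87% of paths should have fewer than 5 components and well under 1% more than 10. The real ratio would ideally be tuned against paths observed in practice.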
