Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: stop using DIRECTORY_SEPARATOR #30

Open
Spreeuw opened this issue Aug 8, 2021 · 5 comments
Open

Suggestion: stop using DIRECTORY_SEPARATOR #30

Spreeuw opened this issue Aug 8, 2021 · 5 comments
Labels

Comments

@Spreeuw
Copy link

Spreeuw commented Aug 8, 2021

I do my development primarily on Windows and have had various path issues with Mozart and Strauss before but it keeps regressing.
The current release does not work on Windows either. I've hacked together a quick fix here, but it feels like a patch at best.

The recurring theme is simple: code assumes forward slashes when in fact the path (on Windows) could have both forward and backward slashes, mainly due to inconsistent usage of DIRECTORY_SEPARATOR.

I propose to stop using DIRECTORY_SEPARATOR altogether and exclusively use forward slashes. PHP itself handles this fine on Windows, and as long as there are no things shell commands where you really need the correct separator, it's simply much easier to work with.

At the same time I think we could benefit from a simple Path helper class, with a few simple functions that replace a lot of recurring operations throughout Strauss (which at the moment also have slightly differing syntaxes, making it harder to debug path related issues):

  • Path::normalize( $path ) to replace all backward slashes with forward slash. For performance I would suggest using a simple str_replace rather than preg_replace. (similar to wp_normalize_path in WordPress).
  • Path::ensureTrailingSlash( $path ) to append a trailing slash if not present. (similar to trailingslashit in WordPress, which is actually referenced in some of the comments!)
  • Path::join( [ $path_1, $path_2 ] ) joining one or more paths with a /, avoiding ugly . '/' . string concatenations.
  • Path::getRelativePath( $path, $basePath ) get the relative path of $path to $basePath.

Function names, class structure etc. don't really matter, but I think this would make all path/file related code a lot more readable and more resistant to OS related issues. I looked for existing libraries handling this but couldn't find something that fits the bill, and also realized that it's pretty simple stuff to begin with.

@Spreeuw
Copy link
Author

Spreeuw commented Aug 11, 2021

while running some tests I discovered one more reason to stop using DIRECTORY_SEPARATOR (and normalize all paths with forward slashes), which is that this is also used in the autoloaders, which should be platform independent.

Random example; when requiring composer/ca-bundle in a project, the Strauss generated autoloader-classmap.php gets this on Linux:

return array(
    'WPO\Straussfogel\Vendor\Composer\CaBundle\CaBundle' => $lib . '/composer/ca-bundle/src/CaBundle.php',
);

but this on windows:

return array(
    'WPO\Straussfogel\Vendor\Composer\CaBundle\CaBundle' => $lib . '\composer\ca-bundle\src\CaBundle.php',
);

the Linux version with forward slashes work fine on Windows.

@BrianHenryIE
Copy link
Owner

I've pushed a change to master that should set all slashes to / in autoload-classmap.php. I'd like to write a test against it before making a new release.

I do my development primarily on Windows and have had various path issues with Mozart and Strauss before but it keeps regressing.

I would really appreciate some failing tests so when I think things are fixed, they don't regress.

I'll take some time to check the DIRECTORY_SEPARATOR issues from Mozart were fixed and remain fixed. I know it's possible to run the tests on Windows on GitHub Actions but I haven't had the time to. Unfortunately, I've been working on an M1 Mac recently, so it's not as easy to run the Windows VM I sometimes used to test with.

The suggested helper functions are alright, but I think can be achieved with convention+testing. Data read in should be normalized, and data printed should be too. I've tried to keep all paths with a trailing slash, and all relative paths without a leading slash. As you've seen, rtrim($path, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR; is trailingslashit() so I'm not mad about adding a utility class for a set of single line functions.

I'm not totally opposed either, TBH I don't 100% understand why the issue is recurring. I've read some documentation telling me most PHP functions can handle either slash. In general, the file handling code needs to be refactored for tests, presumably with some dependency injection. Integration tests could lead the way with unit tests on refactored code following. The flysystem file handling is something form Mozart that I carried on with – I don't totally understand its implications in terms of cross-platform, e.g. could it hide bugs?!

I propose to stop using DIRECTORY_SEPARATOR altogether and exclusively use forward slashes.

I doubt this is true, or there would never have been any issues to begin with, nor would DIRECTORY_SEPARATOR be the PHP solution to this. My guess is DIRECTORY_SEPARATOR is not the problem, but that the problem spurs from places it's not applied. Again, a robust suite of unit tests would be my approach to this.

Function names, class structure etc. don't really matter, but I think this would make all path/file related code a lot more readable

I'm open to ideas around this. I sometimes call it "prefixer", sometimes "renamer". Tests are sometimes camelCase, sometime snake_case. I'm certainly open to any and all suggestions around readability.

@Spreeuw
Copy link
Author

Spreeuw commented Sep 2, 2021

The suggested helper functions are alright, but I think can be achieved with convention+testing

I'm thinking from the contributer perspective here, and helper functions just mean that there's less to worry about and path related bugs can be tackled centrally.

As you've seen, rtrim($path, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR; is trailingslashit() so I'm not mad about adding a utility class for a set of single line functions.

That function is not the same as trailingslashit(), because when applied to a path like some/unix/path/ on Windows, it results in some/unix/path/\.
This in itself is not the issue (or doesn't have to be, it would be a relatively simple change in the oneliner). But most bugs occur when paths are being recombined (code like this: $packagePath = $this->workingDir . $this->vendorDir . $dependency->getPath() . DIRECTORY_SEPARATOR;) and the assumption is being made that they are already fully normalized. Given the different sources of paths and different processing trajectories throughout Strauss, they may not or may not fully be at some point. Using a helper function improves readability and makes sure fixes are applied everywhere in the code instead of needing to replace it. At first I thought it would a simple fix, but when I noticed the sheer amount of different path operations and sources I thought this would benefit from being fixed a little more high level.

I propose to stop using DIRECTORY_SEPARATOR altogether and exclusively use forward slashes.

I doubt this is true, or there would never have been any issues to begin with, nor would DIRECTORY_SEPARATOR be the PHP solution to this.

Forward slashes always work in PHP, backward (Windows style) slashes don't AFAIK. The reason there are issues is because of inconsistent usage of DIRECTORY_SEPARATOR. The current code tries to use DIRECTORY_SEPARATOR everywhere and when this is done correctly, there are indeed no bugs, other than that Windows output is different from *nix output, which I think is not ideal (especially for testability).

My guess is DIRECTORY_SEPARATOR is not the problem, but that the problem spurs from places it's not applied.

Correct (or not applied 'in time').

Again, a robust suite of unit tests would be my approach to this.

In my opinion for code maintainability and readability, having helper functions will simply make these unit tests fail less during development. IF you manage to let the unit tests take both Windows and *nix output into account. Using the same directory separator regardless of the system Strauss runs on will just make this less of a headache (less checks to do).

@BrianHenryIE
Copy link
Owner

Forward slashes always work in PHP, backward (Windows style) slashes don't AFAIK. The reason there are issues is because of inconsistent usage of DIRECTORY_SEPARATOR

Ahh! Now I get it.

@Spreeuw
Copy link
Author

Spreeuw commented Sep 6, 2021

Here's an example implementation of the helper class, as you can see the benefit here is especially in the combined operations where path normalization is automatically done when one of the operations is performed (these things are easily missed and are probably the reason these bugs are being introduced in the first place).

class Path
{
    /**
     * Normalize path to use forward slashes ('/') as directory separator
     * @param string $path path to be normalized
     * @return string $path normalized path
     */
    public static function normalize($path)
    {
        // Standardise all paths to use '/'.
        $path = str_replace( '\\', '/', $path );
    
        // Replace multiple slashes down to a singular
        $path = preg_replace( '|(?<=.)/+|', '/', $path );
    
        // Windows paths should uppercase the drive letter.
        if ( ':' === substr( $path, 1, 1 ) ) {
            $path = ucfirst( $path );
        }
        return $path;
    }

    /**
     * Make sure a path ends with a (forward) trailing slash
     * @param string $path path with or without trailing slash
     * @return string $path path with trailing slash
     */
    public static function ensureTrailingSlash( $path )
    {
        return rtrim( $path, '/\\' ) . '/';
    }

    /**
     * Join partial paths (for example a relative path to its base path)
     * @param array $paths partial paths to be joined with the directory separator
     * @return string $path joined path, normalized
     */
    public static function join( $paths )
    {
        $paths = array_map( array( __CLASS__, 'ensureTrailingSlash' ), $paths );
        $path = self::normalize( implode( "/", $paths ) );
        return $path;
    }

    /**
     * get path relative to a base path
     * @param string $path full (absolute) path
     * @param string $basePath path to which the full path should be relative
     * @return string $path relative path to base path
     */
    public static function getRelativePath( $path, $basePath )
    {
        $path = str_replace( self::normalize( $basePath ), '', self::normalize( $path ) );
        return $path;
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants