Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filesystem interaction #874

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

minhqdao
Copy link
Contributor

This PR introduces filesystem interactions by pragmatically invoking the system shell. It is part of the broader strategy outlined in #865 (see #865 (review)) to support file zipping and unzipping. These functionalities are essential for handling .npz files.

src/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Contributor

@zoziha zoziha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work, @minhqdao. Looking forward to the Stdlib starting to integrate filesystem-related routines. This is the beginning of a new category. In addition to the two noted issues, the function interfaces introduced by this PR should be documented in the specifications later.

(see also fpm::filesystem)

test/io/test_filesystem.f90 Outdated Show resolved Hide resolved
stat = 0

if (.not. exists(temp_dir)) then
call run('mkdir '//temp_dir, stat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that under Windows OS, mkdir temp will create a temp folder under the current path pwd. Is this appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a very pragmatic solution, but I think it should be fine for now

src/stdlib_io_filesystem.f90 Outdated Show resolved Hide resolved
!>
!> Run a command in the shell.
!> [Specification](../page/specs/stdlib_io.html#run)
subroutine run(command, iostat, iomsg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe an option to redirect the command output (stdout, stderr) to a string variable would be very useful. It is being done in the fpm version of this function already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it slightly exceeds the scope of this PR but it can easily be done in a follow-up PR

test/io/test_filesystem.f90 Outdated Show resolved Hide resolved
@arjenmarkus
Copy link
Member

arjenmarkus commented Sep 19, 2024 via email

@minhqdao minhqdao marked this pull request as draft September 20, 2024 18:07
@minhqdao
Copy link
Contributor Author

Changes made:

  • Activated cpp by using capitalized .F90 suffix.
  • Check is_windows during runtime.
  • Add platform-agnostic path_separator (/ vs. \).
  • Use rmdir \s\q on Windows to remove directories.
  • Use dir \b on Windows.
  • All tested.
  • Add documentation.

@zoziha @jalvesz @arjenmarkus @perazz

@minhqdao minhqdao marked this pull request as ready for review September 21, 2024 07:29
@jalvesz
Copy link
Contributor

jalvesz commented Oct 2, 2024

Hi I went through the PR again and would like to make a few comments.

The first one is purely cosmetic: I think that all procedures should have end function or end subroutine, same for the module. While a standalone end works, it is not a common practice and it felt somehow unnatural. For the sake of coherence with the rest of the library, I would suggest to keep the full closing statement.

The second point: I would like to come back to the preprocessing... I know that it is a sad state of affairs that gfortran does not expose gcc defined macros. Yet, on one hand, I find terribly worrisome to add all those operations just to detirme the OS and then to get the proper path separator. I find it hard to justify all those runtime operations for something that can be a compile-time constant. So I would like to come back to proposing setting the path separator as a simple character(1), parameter, public constant which value is defined by the C preprocessor mechanism. Same for is_windows which can be a logical, parameter, public constant.

This does indeed impose: either relying on CMake, or documenting that one should pass the appropriate definition flags if using plain Makefiles or fpm. This build-time "burden" seems to me more reasonable compared to the runtime operations.

I've opened a discussion to see if fpm could also help improving this preprocessing situation here fortran-lang/fpm#1084

While I find it "tolerable" for fpm having to rely on the runtime implementation as a tool for building, I'm expecting stdlib to be used for more computationally intensive tasks. Thus, it seems to me better to shift the issue to the building documentation and not the library.

@jvdp1
Copy link
Member

jvdp1 commented Oct 6, 2024

Thank you @minhqdao for this PR. In addition to all other comments, I think that the proposed features in this PR should have their own modules and specs , and not be related to stdlib_io. what about calling it stdlib_filesystem or even stdlib_os?
See also #201 and this initial project

@minhqdao
Copy link
Contributor Author

@zoziha @jalvesz @jvdp1

  • Runtime checks to determine the OS are eliminated.

But please note for future apis that check OS: When building with CMake, we use CMAKE_SYSTEM_NAME and when building with fpm, we use platform.system() (python). The results might not be the same. It seems to work for OS == "Windows", though.

Further:

  • Added end annotations.
  • Updated tests.
  • Extracted stdlib_filesystem.

However, is_windows is not a filesystem feature but a os one. exists gives information on the filesystem. And run is ... neither (?). I don't want to have too many files as it might end up a bit messy (which is what happened in fpm), so maybe we should just generalize it under stdlib_os? Or stdlib_fs_os? What are your thoughts?

@jalvesz
Copy link
Contributor

jalvesz commented Oct 16, 2024

Thanks @minhqdao for the updates. And sorry for the late replay!

I think that generalizing under stdlib_os module is a good idea!

Just one last request, could we keep the preprocessing as cpp instead of fypp? this would enable compiling the same fypp pre-pre-processed code on different OS. I personally do this systematically thanks to the WSL on windows.

With CMake nothing to do (revert change in current PR).

With the python script for fpm fypp_deployment.py add the platform module and change the function fpm_build:

import os
import platform
...

def fpm_build(args,unknown):
    import subprocess
    #==========================================
    # check compilers
    FPM_FC  = os.environ['FPM_FC']  if "FPM_FC"  in os.environ else "gfortran"
    FPM_CC  = os.environ['FPM_CC']  if "FPM_CC"  in os.environ else "gcc"
    FPM_CXX = os.environ['FPM_CXX'] if "FPM_CXX" in os.environ else "gcc"
    #==========================================
    # Filter out flags
    preprocessor = { 'gfortran':'-cpp ' , 'ifort':'-fpp ' , 'ifx':'-fpp ' }
    flags = preprocessor[FPM_FC]
    for idx, arg in enumerate(unknown):
        if arg.startswith("--flag"):
            flags= flags + unknown[idx+1]
    flags = flags + "-D{}".format(platform.system().upper())
    #==========================================
    # build with fpm
    subprocess.run("fpm build"+
                   " --compiler "+FPM_FC+
                   " --c-compiler "+FPM_CC+
                   " --cxx-compiler "+FPM_CXX+
                   " --flag \"{}\"".format(flags), shell=True, check=True)
    return

In stdlib_system.F90 we can change the preprocessing line for:

#if defined(WIN32) || defined(WIN64) || defined(WINDOWS)

(the first two are provided by CMake, the latter is what the python script would add to the flags)
This same idea can be used in the current module.

This enables building with fpm (throught the script) with

python .\config\fypp_deployment --build

I tested it on PowerShell / CMD / Ubuntu (WSL) and it works.

EDIT:
I continued testing and now I'm not sure that CMake is actually exposing gcc macros correctly. To be on the safe side, the following can be used in the CMakeLists.txt to pass the system name as a cpp definition:

# Convert CMAKE_SYSTEM_NAME to uppercase
string(TOUPPER "${CMAKE_SYSTEM_NAME}" SYSTEM_NAME_UPPER)

# Pass the uppercase system name as a macro
add_compile_options(-D${SYSTEM_NAME_UPPER})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants