Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opcache_is_script_cached_in_file_cache(string $filename) #16979

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

iamacarpet
Copy link
Contributor

A potential change to opcache_is_script_cached(...) that allows checking the file cache instead of SHM.

It's adding a new parameter, bool $file_cache = false, defaulting to false so the default / existing behaviour remains the same: when the second parameter is true, it'll check JUST the file cache, not SHM.

This should give us a better ability to test things relating to file caching, and hopefully enable some better unit testing for use with opcache.file_cache_read_only.

There are a couple of bits missing, like updating the docs, but I wanted to float the idea first.

@dstogov, @iluuu1994 , @arnaud-lb , @nielsdos : what do you think?

@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 5 times, most recently from 56f1c82 to 1cad353 Compare November 28, 2024 12:04
@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 3 times, most recently from 15dd23c to e77866e Compare November 28, 2024 15:23
@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch from e77866e to 798a07d Compare November 28, 2024 15:44
@iluuu1994
Copy link
Member

Hi @iamacarpet! No objections in terms of functionality, although I think this is something that should be mentioned on the mailing list. The parameter itself is a bit confusing. It's not obvious whether it additionally checks for file cache, or exclusively file cache. Since it's the latter, it may be better to create a separate method.

@iamacarpet
Copy link
Contributor Author

iamacarpet commented Nov 28, 2024

Thanks @iluuu1994 , I'll do that, and yes, I'm not averse to making it its own function if that's preferable.

I wish I could get the automated test to pass so I knew I was along the right lines - I have no idea where I'm going wrong.

If you've got enough time, does anything stand out to you?

EDIT:

I think I've got it to pass now, will just need to wait for this latest build.

It appears like it's coming up false when opcache.file_cache_only is set.

I'm not 100% this is a bug with what this function is checking, rather than it being a bug with opcache.file_cache_only, as I seem to think when I was testing opcache.file_cache_read_only by hand (generating file cache on the CLI), it wasn't generating the cache correctly with that option set then either.

@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch from 6cf19dc to 30641d6 Compare November 28, 2024 16:36
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections from me functionality-wise.
I don't have time to deeply look into this now though.

Comment on lines 1021 to 1023
if (zend_parse_parameters(ZEND_NUM_ARGS(), "S|b", &script_name, &file_cache) == FAILURE) {
RETURN_THROWS();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the change away from fast-ZPP to regular ZPP is due to not knowing how this works instead of a deliberate change? (see https://wiki.php.net/rfc/fast_zpp)

Suggested change
if (zend_parse_parameters(ZEND_NUM_ARGS(), "S|b", &script_name, &file_cache) == FAILURE) {
RETURN_THROWS();
}
ZEND_PARSE_PARAMETERS_START(1, 2)
Z_PARAM_STR(script_name)
Z_PARAM_OPTIONAL
Z_PARAM_BOOL(file_cache)
ZEND_PARSE_PARAMETERS_END();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the help with this!

zend_stream_init_filename_ex(&handle, filename);
handle.opened_path = realpath;

zend_persistent_script *persistent_script = zend_file_cache_script_load(&handle, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this lead to loading from the file cache?
This look weird.

Copy link
Contributor Author

@iamacarpet iamacarpet Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of, the , true parameter was to skip loading into SHM, so it'll effectively fully load the file & unserialize it, just not load it into SHM.

I'll agree this probably wasn't ideal.

How about this latest commit, swapping to zend_file_cache_script_validate(...) ?

I was unhappy with the code re-use, as ideally I'd want to replace the validation code within zend_file_cache_script_load(...) with a call to zend_file_cache_script_validate(...):

However, this had the side effect that I'd either be opening, reading & locking the cache file twice, or, we'd have to return a struct from zend_file_cache_script_validate(...) - something like this:

// New struct to hold validation results and the file descriptor
typedef struct _zend_file_cache_validation_result {
    zend_file_cache_metainfo info;
    int fd;
    off_t file_size;
} zend_file_cache_validation_result;


// Modified validation function to return the file descriptor and file size
static zend_file_cache_validation_result *zend_file_cache_script_validate(zend_file_handle *file_handle)
{
    ...
}

and that felt even more complicated.... So I'm at a loss really.

How would you handle it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a parameter to zend_file_cache_script_load() would be the simplest solution, but a separate function as you did is a better API (IMHO).

I like your suggestion of making the validate function return a struct, but I suggest to pass the struct as parameter so that the caller can allocate it on the stack:

zend_result zend_file_cache_script_validate(zend_file_cache_validation_result *result);

Maybe rename zend_file_cache_script_validate() to zend_file_cache_open(), and zend_file_cache_validation_result to zend_file_cache_handle, and add a small wrapper of zend_file_cache_open() that doesn't take this parameter:

zend_always_inline zend_result zend_file_cache_open(zend_file_handle *file_handle, zend_file_cache_handle *cache_handle);

zend_result zend_file_cache_validate(zend_file_handle *file_handle)
{
    zend_file_cache_handle cache_handle;
    return zend_file_cache_open(file_handle, &cache_handle);
}

// maybe also:
zend_always_inline zend_file_cache_close(zend_file_cache_handle *cache_handle);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arnaud-lb ,

I've just pushed my first try at implementing the way you've suggested.

I actually really like that approach and it makes a lot more logical sense.

In the end, rather than passing around a file descriptor, since we were already reading the whole file to potentially checksum it, I've opted to pass around the zend_arena_checkpoint and zend_arena_alloc values:

I won't pretend I understand this very well, so if I've done something wrong, please speak up!

Copy link
Member

@arnaud-lb arnaud-lb Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I overlooked the arena usage before my previous comment. Unfortunately, passing around the arena checkpoint would result in fragile code.

Alternatives would be to allocate the buffer in some other way, or to mmap the file, but this is too much changes and this may negatively impact the performance of zend_file_cache_script_load().

At this point I think that adding a validate_only parameter to zend_file_cache_script_load() is a better solution, after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, getting your originally suggested method to build & work has been a great learning experience 😃

Thanks for your time reviewing and helping to move this along, it's really appreciated.

At this point I think that adding a validate_only parameter to zend_file_cache_script_load() is a better solution, after all.

This is pretty much what I'd done in my first attempt that @dstogov didn't seem to like, so before I switch back:

@dstogov are you happy with that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there still any interest in us being able to implement this please?

And if so, @dstogov , are you able to weigh in on the above about what your preferred implementation is please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some questions about consistency. I haven't gotten to them yet.

@iamacarpet
Copy link
Contributor Author

That's building & running the tests fine now at-least.

So as I see it, a few things are outstanding:

  • Do we create a new function instead of opcache_is_script_cached(string $filename, bool $file_cache = false)?
    • If we do this, what would it be called? opcache_is_script_cached_in_file_cache(string $filename)?
  • Is there a preference for implementing via zend_file_cache_script_validate(...) rather than an additional parameter to prevent loading into SHM on zend_file_cache_script_load(...)?
  • What's going on with opcache.file_cache_only?
    • Cache files aren't being created / generated when this option is enabled - why?
    • ZCG(accelerator_enabled) evaluates to false when this option is enabled - is that intentional?

I'd love some help / advice if anyone has time please? @iluuu1994 / @dstogov

@iamacarpet iamacarpet requested a review from dstogov December 2, 2024 13:59
@arnaud-lb
Copy link
Member

arnaud-lb commented Dec 2, 2024

Hi @iamacarpet

  • Do we create a new function instead of opcache_is_script_cached(string $filename, bool $file_cache = false)?

    • If we do this, what would it be called? opcache_is_script_cached_in_file_cache(string $filename)?

I agree that two separate functions is better, and in my opinion this name is good

  • What's going on with opcache.file_cache_only?

    • Cache files aren't being created / generated when this option is enabled - why?
    • ZCG(accelerator_enabled) evaluates to false when this option is enabled - is that intentional?

Could you share your opcache settings ? Anyting in the opcache logs ? I'm not observing this behavior when running php -n -dzend_extension=opcache.so -dopcache.enable_cli=1 -dopcache.file_cache=/tmp/filecache -dopcache.file_cache_only=1 test.php (or with -r 'opcache_compile_file("test.php");').

@iamacarpet iamacarpet changed the title opcache_is_script_cached(string $filename, bool $file_cache = false) opcache_is_script_cached_in_file_cache(string $filename) Dec 2, 2024
@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 3 times, most recently from e39e394 to 1882314 Compare December 3, 2024 09:48
@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 3 times, most recently from 9d69f45 to 12e7461 Compare December 3, 2024 11:24
@iamacarpet
Copy link
Contributor Author

Just FYI, I've added some tests that cover GH-16551 (opcache.file_cache_read_only).

Hopefully I've done it right, but please let me know if not.

I deliberately made it fail to check it was running as expected:

image

Since then, I've pushed another update to ensure it passes again.

@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 2 times, most recently from 97eff1b to 6e819bb Compare December 3, 2024 11:52
cache_it = false;
}

ZCG(mem) = ((char*)mem + info.mem_size);
script = (zend_persistent_script*)((char*)buf + info.script_offset);
ZCG(mem) = ((char*)cache_handle.mem + cache_handle.info.mem_size);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably nothing to worry about if we won't go with this method anyway, but, for my own learning experience, does this look correct?:

ZCG(mem)

I'm not sure that feels right when the other variable is now cache_handle.mem, however, it compiles and runs fine.

@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch 3 times, most recently from 6799f49 to 6be025b Compare December 3, 2024 17:00
@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch from 6be025b to 6df1562 Compare December 3, 2024 18:35
opcache_invalidate($uncached_file);

var_dump(
opcache_is_script_cached($uncached_file)
Copy link
Contributor Author

@iamacarpet iamacarpet Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused:

The CircleCI unit test for ARM is failing reliably on this line (it's returning true, when it should be false, as the file shouldn't have been cached yet).

image

I've added in the call to opcache_invalidate(...) above, which should remove it from the SHM if it's already there, and yet, it's still returning true.

Is this potentially a bug on ARM?

I'll sleep on it and see if I can figure out what's happening tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't re-create a problem with opcache_is_script_cached(...) in the 4 tests I created to probe this, so it's obviously something I was doing wrong.

I've removed the unnecessary tests and re-written the failing test to remove the part causing the issue.

Comment on lines 1816 to 1822
zend_persistent_script *zend_file_cache_script_load(zend_file_handle *file_handle)
zend_result zend_file_cache_validate_and_open(zend_file_handle *file_handle, zend_file_cache_handle *cache_handle)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of all these changes below, I would just add bool validate_only argument to zend_file_cache_script_load() and return true after header validation and before arena allocation and actual script loading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion @dstogov !

I've made the modifications to do it this way, however, since zend_file_cache_script_load() returns zend_persistent_script *, I couldn't return true - I've had to mock that by returning a fixed, static instance of the zend_persistent_script struct.

Does this fit your expectations?

@iamacarpet iamacarpet force-pushed the opcache/opcache_is_script_cached_file branch from 6bca50d to 7cff70a Compare February 3, 2025 16:15
Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this looks fine

@iamacarpet
Copy link
Contributor Author

Thanks @dstogov ,

@kocsismate , are you happy with this now?

If you've got time for a review please, that would be amazing! :).

$uncached_file = __DIR__ . '/gh16551_999.inc';

var_dump(
opcache_is_script_cached_in_file_cache($file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests look problematic to me. They will fail on first execution (confirmed locally).

========DIFF========
001- bool(true)
001+ bool(false)
     9
     bool(false)
     8
========DONE========
FAIL GH-16551: opcache file cache read only: read file cache with limited checks [ext/opcache/tests/gh16551_003.phpt]

========DIFF========
001- bool(true)
002- bool(true)
001+ bool(false)
002+ bool(false)
     9
========DONE========
FAIL GH-16551: opcache file cache read only: ensure we can't invalidate [ext/opcache/tests/gh16551_004.phpt]

I don't understand why they don't fail on CI. It's also possible they are order dependent, i.e. one job will prime the cache, another will check that the cache is not primed.

One solution might be to set opcache.file_cache to some sub-directory and clean it in the --CLEAN-- section to ensure a consistent state.

zend_persistent_script *zend_file_cache_script_load(zend_file_handle *file_handle)
static zend_persistent_script file_cache_validate_success_script;

zend_persistent_script *zend_file_cache_script_load(zend_file_handle *file_handle, bool validate_only)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer renaming this to zend_file_cache_script_load_ex() and keeping the signature for zend_file_cache_script_load() the same so we don't need to adjust all callers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants