Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New PPC: Attributes v2 #56

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

New PPC: Attributes v2 #56

wants to merge 11 commits into from

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Oct 7, 2024

Abstract

Define a new way that attribute definitions can be introduced such that the parser can invoke the third-party custom logic they provide; and additionally permit them to be attached to more types of target than Perl currently allows.


Still currently in draft as I work out a few more minor questions, but now at the stage where it should be suitable for comments.

};
```

Additionally, the `ver` and `flags` fields are added for future flexibility. The `ver` field must be initialised by the code that defines the structure, to indicate what ABI version it is intended for. A suitable value is derived from the Perl version number; for example Perl version 5.41.3 would set the value `(5 << 16) | (41 << 8) | (3)`. In this way, a later version of the interpreter can operate correctly with code expecting earlier structure layouts. It is likely that a macro would be provided to compute the correct value at compile-time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of ver it will be better to put it into registration function and struct, eg:

register_attribute_540 (struct Perl_Attribute_Definition_540 *);

that will allow to change structure definition along with version which will be easier to validate using C-compiler capabilities.


Additionally, the `ver` and `flags` fields are added for future flexibility. The `ver` field must be initialised by the code that defines the structure, to indicate what ABI version it is intended for. A suitable value is derived from the Perl version number; for example Perl version 5.41.3 would set the value `(5 << 16) | (41 << 8) | (3)`. In this way, a later version of the interpreter can operate correctly with code expecting earlier structure layouts. It is likely that a macro would be provided to compute the correct value at compile-time.

The `flags` field may contain either of two mutually-exclusive flags. `PERLATTR_NO_VALUE` indicates that the attribute definition does not expect to receive a value at all and asks that it be a parser error if the user supplied anything. Alternatively, `PERLATTR_MUST_VALUE` indicates that a value is required; it shall be a parser error for there not to be a value. If neither flag is supplied then any such value becomes optional - the `attrvalue` parameter may be given a valid SV, or may be `NULL`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to implement scenario where it make sense to define default value? for example attribute required, default value true

cannot parse be reused for that? (getting NULL when there is no value?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easiest just done in the apply function by noting the lack of a (defined) value being passed in.


The `flags` field may contain either of two mutually-exclusive flags. `PERLATTR_NO_VALUE` indicates that the attribute definition does not expect to receive a value at all and asks that it be a parser error if the user supplied anything. Alternatively, `PERLATTR_MUST_VALUE` indicates that a value is required; it shall be a parser error for there not to be a value. If neither flag is supplied then any such value becomes optional - the `attrvalue` parameter may be given a valid SV, or may be `NULL`.

In order to create a new attribute, a third-party module author would create such a C structure containing a pointers to the C functions to handle the attribute, and wrap it in some kind of SV - whose type is still yet to be determined (see "Open Issues" below). This wrapping SV is then placed into the importing scope's lexical pad, using a `:` sigil and the name the attribute should use. It is important to stress that this SV represents the abstract concept of the attribute _in general_, rather than its application to any particular target. As the definition of an attribute itself is not modified or consumed by any particular application of it, a single SV to represent it can be shared and reused by any module that imports it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about to define universal SV to hold any kind of information identified by U32 ?
Such value can be used as prefix in pad as well and defines implementation pattern for any kind of symbol extension (eg Named regexp patterns, in fact same thing I need for "independent symbol spaces")


When attached to a package variable or package-named function, the target can be the entity itself (or maybe indirectly, its GV). When attached to a lexical, it is important to pass in the abstract concept of the target in general, rather than the current item in its scope pad. There is currently no suitable kind of SV to represent this - this remains another interesting open issue.

Additionally, new callsites can be added to the parser to invoke attribute callbacks in new situations that previously were not permitted; such as package declarations or subroutine parameters.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible context:

  • regexp attributes
s
    : multiline
    : extended
    : non_capture
    : return
    ( )
    ( )
    ;
  • use statement, for example:
use My::Almighty::Set::Of::Utils
   :attributes(import grammar of attributes to be imported)
   :pattern(C_Comment CPP_Comment Perl_Comment)
   :other_symbol_space(import grammar of attributes to be imported)
   ;
  • require statement
require :module q (Foo);
require :relative q (file-to-require.pl);
  • smartmatch expression
- given ($foo :using(Str_Or_Regex)) { ... }
if ($foo ~~ :using(Str_Or_Regex)) { ... }
  • function invocation scope, for example:
select
    :table(foo)
    :join(bar ON (foo.id = bar.foo_id and bar.valid_from > $valid_from))
    :columns(foo.*)
    :where(foo.column BETWEEN $min AND $max)
;

# with alternative attribute syntax
select (
    := table   => foo
    := join    => bar ON (foo.id = bar.foo_id and bar.valid_from > $valid_from)
    := columns => foo.*
    := where   => foo.column BETWEEN $min AND $max
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all things to consider in the future, yes. The point of this particular proposal isn't to allow new syntax, but to create a better implementation base such that in future new syntax could be considered.

I've added words to that effect.


Additionally, new callsites can be added to the parser to invoke attribute callbacks in new situations that previously were not permitted; such as package declarations or subroutine parameters.

Note that this specification does not provide a mechanism by which attributes can declare what kinds of targets they are applicable to. Any particular named attribute will be attempted for _any_ kind of target that the parser finds. This is intentional, as it leads to a simpler model both for the interpreter and for human readers of the code, as it means any particular attribute name has at most one definition; its definition does not depend on the type of target to which it is applied. It is the job of the attribute callback itself to enquire what kind of target it has been invoked on, and reject it if necessary - likely with a `croak()` call of some appropriate message.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when parse is invoked, it knows context, so why not to defined enumeration of context identifiers and send it along with text, so it can return value aware of such context?

should this "enumeration be extendable, such mechanism can be reused by 3rd party extensions to declare new contexts for their use (along with core contexts) and use attribute handling provided by core


### Perl-Level API

The mechanisms described in this document exist entirely at the C level, intended for XS modules to make use of. It should be possible to provide wrappings of most of the relevant parts, at least for simple cases, for use directly by Perl authors without writing C code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to define reusable (from core as well from 3rd party modules) syntax to declare additional symbols (taken from my "independent symbol spaces"). For attributes:

declare attribute Foo as
    on CODE   => { ... do something ... }
    on SCALAR => { ... do something ... }
    on ARRAY | HASH => { ... do something ... }
  ;

ppcs/ppcTODO-attributes-v2.md Outdated Show resolved Hide resolved
ppcs/ppcTODO-attributes-v2.md Outdated Show resolved Hide resolved
ppcs/ppcTODO-attributes-v2.md Outdated Show resolved Hide resolved
Comment on lines 57 to 63
struct PerlAttributeDefinition
{
U32 ver;
U32 flags;
SV * (*parse)(pTHX_ SV *text); /* optional, may be NULL */
void (*apply)(pTHX_ SV *target, SV *attrvalue);
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, a "function value" should be represented in C as a pair of (function pointer, context pointer) to enable closure-like behavior. So I would expect to see something more like

    void *parse_ctx;
    SV *(*parse)(pTHX_ void *ctx, SV *text);
    void *apply_ctx;
    void (*apply)(pTHX_ void *ctx, SV *target, SV *attrvalue);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahyes, some sort of void *ctx pointer in the functions would be useful, though I don't think its value should come from the struct PerlAttributeDefinition itself. If following similar patterns from many of my XS modules, that struct would likely be static const in the .xs file itself. A void *ctx would probably be supplied some kind of registration function, or from the SV in the scope pad, or somesuch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I've added this now in the latest wording. The pointer value is supplied as an argument to newSVinternal_attribute_definition(), and the callback functions in the struct all extended to use it.

{
U32 ver;
U32 flags;
SV * (*parse)(pTHX_ SV *text); /* optional, may be NULL */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does parse return? If parse isn't supposed to modify the text parameter in place, shouldn't it be marked const?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mauke good point ... in fact, shouldn't parse be rather Perl_keyword_plugin_t ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahyes, good idea. It could just return the SV itself or it could return some new one. Would have to think a bit about refcounts, but yes it shouldn't modify the one passed in.


The mechanisms described in this document exist entirely at the C level, intended for XS modules to make use of. It should be possible to provide wrappings of most of the relevant parts, at least for simple cases, for use directly by Perl authors without writing C code.

Existing experience with [`XS::Parse::Keyword::FromPerl`](https://metacpan.org/pod/XS::Parse::Keyword::FromPerl) shows that while such an interface could easily be provided, in practice it is unlikely anyone would make use of it as it requires understanding details of its operation in sufficient depth that any author would be just as well served by writing the XS code directly. Therefore no such wrapping is yet proposed by this document but could be added later on, either by a core-provided or CPAN module.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for Perl-level operations it makes much more sense to have the opposite of this proposal: Attributes that apply to Perl entities (variables, subroutines), not parser/interpreter artifacts (GVs, pad slots, CVs, protosubs). Since most of these entities (lexical variables, anonymous closures) only exist at runtime, it would mean providing a runtime hook that is invoked when the entity is instantiated.

But that's a very different proposal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since most of these entities (lexical variables, anonymous closures) only exist at runtime, it would mean providing a runtime hook that is invoked when the entity is instantiated.

But that's a very different proposal.

Well, quite ;) That's the "Hooks" part of the idea, which is separate enough (and rather more vague) to warrant its own separate PPC doc. The attributes here aren't directly related, but they're likely useful when combined.

@leonerd
Copy link
Contributor Author

leonerd commented Oct 10, 2024

Many small updates clarifying wording, narrowing down a few open issues, adding more specification in terms of new API support functions and behaviours.

{
U32 ver;
U32 flags;
SV * (*parse)(pTHX_ const SV *text, void *data); /* optional, may be NULL */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about change parse to custom_grammar returning something that may or may not be OP * ? with following behaviour:

  • when perl starts:
    • push grammar stack unbalanced grammar
    • push grammar stack perl grammar
  • when encountering places where switch to known custom grammar is needed
    • push grammar stack known custom grammar (custom grammar can be perl as well)

there already is similar mechanism in toke.c - Perl_lex_start. This mechanism will just provide generic API to that plus it will provide Perl grammar itself as custom grammar.

That itself allows to implement use VERSION as custom grammar which may allow to introduce changes will require incompatible changes to perly.y.

Example: hypothetical import syntax may reuse this attribute mechanism for itself (by introducing custom attribute context)

use builtin
  :functions (
    foo :as (bar)
    baz
  )
  :attributes (magic)
  ;

U32 ver;
U32 flags;
SV * (*parse)(pTHX_ const SV *text, void *data); /* optional, may be NULL */
void (*apply)(pTHX_ SV *target, SV *attrvalue, void *data);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about replacing this with some set (linked list? tree? ...) of structures:

struct attribute_apply {
    U32 attribute_context_id;
    Attribute_Apply_Function apply;
};

@leonerd
Copy link
Contributor Author

leonerd commented Nov 13, 2024

There's not been any new comments on this for a while, so I'm going to declare it ready-for-review. We'll see how it looks.

@leonerd leonerd marked this pull request as ready for review November 13, 2024 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants