Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow collation to be a part of data type #7748

Merged
merged 9 commits into from
Nov 24, 2023
5 changes: 5 additions & 0 deletions doc/sql.extensions/README.ddl.txt
Original file line number Diff line number Diff line change
Expand Up @@ -618,3 +618,8 @@ ALTER PROCEDURE <name> SQL SECURITY {DEFINER | INVOKER} | DROP SQL SECURITY
(Alexander Zhdanov)

ALTER PACKAGE <name> SQL SECURITY {DEFINER | INVOKER} | DROP SQL SECURITY

27) COLLATE clause can be used as a part of character data type as per SQL standard.
(Dmitry Sibiryakov)

If is used twice, an error is returned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not correspond to changes below. For parameters you seem to allow both and use the latest one. What about local variables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is intended for table fields only, I'm not sure if parameters and variables are covered. Should they?

Copy link
Member

@dyemanov dyemanov Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have added the code to ParameterClause::ctor with a comment:
// Explicit collation has precedence over data type collation
This does not match the text about error being raised.

And parameters/variables are declared using:
column_domain_or_non_array_type collate_clause ...
where column_domain_or_non_array_type now includes a collate_clause itself after your changes...
domain_or_non_array_type -> domain_or_non_array_type_name -> non_array_type -> simple_type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment applies to cases when data type is derived from domain or column including collation. Old code silently reset collation to default if no explicit COLLATE was provided so I've added the condition and the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For parameters you seem to allow both and use the latest one.

Indeed, I'll fix that but it is not really related to the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about local variables?

Local variables seems to be ok:

SQL> create procedure p2 as
CON> declare a varchar(10) character set win1251 collate pxw_cyrl not null;
CON> declare b varchar(10) character set win1251 not null collate pxw_cyrl;
CON> declare c vu collate unicode_ci_ai;
CON> declare d vu;
CON> declare e vc not null collate win1251;
CON> declare f vc;
CON> begin end^
SQL> create procedure p3 as
CON> declare a varchar(10) character set win1251 collate pxw_cyrl not null collate win1251;
CON> begin end^
Statement failed, SQLSTATE = 42000
SQL error code = -637
-duplicate specification of COLLATE - not supported

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pity that setClause() is not available outside of parser. Can they be made static?

IMHO, they (along with setClauseFlag / checkDuplicateClause / isDuplicateClause) should be static, as they don't access the parser state. @asfernandes , do you agree?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would question the need in collate_clause after domain_or_non_array_type and column_domain_or_non_array_type, as now it's also included inside these parser rules. The only regression would be that NOT NULL must be written after COLLATE, not before as now. Is it an acceptable breakage, given the low probability that both co-exist in input/output parameters / return values in real-world scripts? And there will be no need to check for duplicated definitions then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If nodes are going to call Parser::setClause, I don't agree. It's layer violation.
In this case I would move them to Node class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collation can be removed from ParameterClause constructor parameters and assigned later. Is it ok?

18 changes: 8 additions & 10 deletions src/dsql/DdlNodes.epp
Original file line number Diff line number Diff line change
Expand Up @@ -1119,14 +1119,13 @@ void DdlNode::storeGlobalField(thread_db* tdbb, jrd_tra* transaction, MetaName&
//----------------------


ParameterClause::ParameterClause(MemoryPool& pool, dsql_fld* field, const MetaName& aCollate,
ParameterClause::ParameterClause(MemoryPool& pool, dsql_fld* field,
ValueSourceClause* aDefaultClause, ValueExprNode* aParameterExpr)
: name(pool, field ? field->fld_name : ""),
type(field),
defaultClause(aDefaultClause),
parameterExpr(aParameterExpr)
{
type->collate = aCollate;
}

string ParameterClause::internalPrint(NodePrinter& printer) const
Expand Down Expand Up @@ -6454,13 +6453,18 @@ void RelationNode::defineField(thread_db* tdbb, DsqlCompilerScratch* dsqlScratch
computedSource, computedValue);
}

field->collate = clause->collate;
field->resolve(dsqlScratch);

// Generate a domain.
storeGlobalField(tdbb, transaction, fieldDefinition.fieldSource, field,
computedSource, computedValue);
}
else
{
// Resolve possible additional collation for domains. For plain types it is already resolved above.
if (field->collate.hasData())
DDL_resolve_intl_type(dsqlScratch, field, field->collate);
}

if ((relation->rel_flags & REL_external) &&
(field->dtype == dtype_blob || field->dtype == dtype_array || field->dimensions))
Expand All @@ -6473,9 +6477,6 @@ void RelationNode::defineField(thread_db* tdbb, DsqlCompilerScratch* dsqlScratch
Arg::Gds(isc_dsql_type_not_supp_ext_tab) << typeName << name << field->fld_name);
}

if (clause->collate.hasData())
DDL_resolve_intl_type(dsqlScratch, field, clause->collate);

if (clause->identityOptions)
{
if (clause->identityOptions->increment.value_or(1) == 0)
Expand Down Expand Up @@ -6519,10 +6520,7 @@ void RelationNode::defineField(thread_db* tdbb, DsqlCompilerScratch* dsqlScratch
}

fieldDefinition.defaultValue = defaultValue;

if (clause->collate.hasData())
fieldDefinition.collationId = field->collationId;

fieldDefinition.collationId = field->collationId;
fieldDefinition.store(tdbb, transaction);

// Define the field constraints.
Expand Down
2 changes: 1 addition & 1 deletion src/dsql/DdlNodes.h
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,7 @@ class ExternalClause : public Printable
class ParameterClause : public Printable
{
public:
ParameterClause(MemoryPool& pool, dsql_fld* field, const MetaName& aCollate,
ParameterClause(MemoryPool& pool, dsql_fld* field,
ValueSourceClause* aDefaultClause = NULL, ValueExprNode* aParameterExpr = NULL);

public:
Expand Down
5 changes: 5 additions & 0 deletions src/dsql/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -339,6 +339,11 @@ class Parser : public Firebird::PermanentStorage
return clause.hasData();
}

void setCollate(TypeClause* fld, MetaName* name)
{
if (name)
setClause(fld->collate, "COLLATE", *name);
}
void checkTimeDialect();

// start - defined in btyacc_fb.ske
Expand Down
2 changes: 1 addition & 1 deletion src/dsql/parse-conflicts.txt
Original file line number Diff line number Diff line change
@@ -1 +1 @@
69 shift/reduce conflicts, 21 reduce/reduce conflicts.
71 shift/reduce conflicts, 21 reduce/reduce conflicts.
51 changes: 34 additions & 17 deletions src/dsql/parse.y
Original file line number Diff line number Diff line change
Expand Up @@ -1449,7 +1449,7 @@ arg_desc_list($parameters)
arg_desc($parameters)
: udf_data_type param_mechanism
{
$parameters->add(newNode<ParameterClause>($1, MetaName()));
$parameters->add(newNode<ParameterClause>($1));
$parameters->back()->udfMechanism = $2;
}
;
Expand All @@ -1472,7 +1472,7 @@ return_value1($function)
return_value($function)
: udf_data_type return_mechanism
{
$function->returnType = newNode<ParameterClause>($1, MetaName());
$function->returnType = newNode<ParameterClause>($1);
$function->returnType->udfMechanism = $2;
}
| PARAMETER pos_short_integer
Expand Down Expand Up @@ -1740,13 +1740,12 @@ domain_clause
{
$3->fld_name = *$1;
$<createDomainNode>$ = newNode<CreateDomainNode>(
newNode<ParameterClause>($3, MetaName(), $4));
newNode<ParameterClause>($3, $4));
}
domain_constraints_opt($5) collate_clause
{
$$ = $5;
if ($7)
$$->nameType->type->collate = *$7;
setCollate($3, $7);
}
;

Expand Down Expand Up @@ -2346,8 +2345,7 @@ column_def($relationNode)
}
column_constraint_clause(NOTRIAL($<addColumnClause>4)) collate_clause
{
if ($6)
$<addColumnClause>4->collate = *$6;
setCollate($2, $6);
}
| symbol_column_name data_type_or_domain identity_clause
{
Expand All @@ -2360,8 +2358,7 @@ column_def($relationNode)
}
column_constraint_clause(NOTRIAL($<addColumnClause>4)) collate_clause
{
if ($6)
$<addColumnClause>4->collate = *$6;
setCollate($2, $6);
}
| symbol_column_name non_array_type def_computed
{
Expand Down Expand Up @@ -2797,7 +2794,10 @@ input_proc_parameters($parameters)
%type input_proc_parameter(<parametersClause>)
input_proc_parameter($parameters)
: column_domain_or_non_array_type collate_clause default_par_opt
{ $parameters->add(newNode<ParameterClause>($1, optName($2), $3)); }
{
setCollate($1, $2);
$parameters->add(newNode<ParameterClause>($1, $3));
}
;

%type output_proc_parameters(<parametersClause>)
Expand All @@ -2809,7 +2809,10 @@ output_proc_parameters($parameters)
%type output_proc_parameter(<parametersClause>)
output_proc_parameter($parameters)
: column_domain_or_non_array_type collate_clause
{ $parameters->add(newNode<ParameterClause>($1, optName($2))); }
{
setCollate($1, $2);
$parameters->add(newNode<ParameterClause>($1));
}
;

%type <legacyField> column_domain_or_non_array_type
Expand Down Expand Up @@ -2881,7 +2884,8 @@ function_clause_start
RETURNS domain_or_non_array_type collate_clause deterministic_clause_opt
{
$$ = $2;
$$->returnType = newNode<ParameterClause>($5, optName($6));
$$->returnType = newNode<ParameterClause>($5);
setCollate($5, $6);
$$->deterministic = $7;
}
;
Expand Down Expand Up @@ -3220,7 +3224,8 @@ local_declaration_subfunc_start
RETURNS domain_or_non_array_type collate_clause deterministic_clause_opt
{
$$ = $4;
$$->dsqlBlock->returns.add(newNode<ParameterClause>($<legacyField>7, optName($8)));
setCollate($7, $8);
$$->dsqlBlock->returns.add(newNode<ParameterClause>($<legacyField>7));
$$->dsqlDeterministic = $9;
}
;
Expand All @@ -3235,8 +3240,10 @@ local_declaration_item
var_declaration_item
: column_domain_or_non_array_type collate_clause var_declaration_initializer
{
// Set collate before node allocation to prevent memory leak on throw
setCollate($1, $2);
DeclareVariableNode* node = newNode<DeclareVariableNode>();
node->dsqlDef = newNode<ParameterClause>($1, optName($2), $3);
node->dsqlDef = newNode<ParameterClause>($1, $3);
$$ = node;
}
;
Expand Down Expand Up @@ -3916,7 +3923,10 @@ block_parameters($parameters)
%type block_parameter(<parametersClause>)
block_parameter($parameters)
: column_domain_or_non_array_type collate_clause '=' parameter
{ $parameters->add(newNode<ParameterClause>($1, optName($2), (ValueSourceClause*) NULL, $4)); }
{
setCollate($1, $2);
$parameters->add(newNode<ParameterClause>($1, (ValueSourceClause*) NULL, $4));
}
;

// CREATE VIEW
Expand Down Expand Up @@ -4899,20 +4909,27 @@ array_range
%type <legacyField> simple_type
simple_type
: non_charset_simple_type
| character_type charset_clause
| character_type charset_clause collate_clause
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether it works in our codebase, but accordingly to the SQL specification collate_clause must also be supported for text blobs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but Firebird has no standard character blobs (CLOB or CHARACTER LARGE OBJECT), only binary blobs so this part is not for this PR.

$$ = $1;
if ($2)
{
$$->charSet = *$2;
$$->flags |= FLD_has_chset;
}
if ($3)
$$->collate = *$3;
}
;

%type <legacyField> non_charset_simple_type
non_charset_simple_type
: national_character_type
: national_character_type collate_clause
{
$$ = $1;
if ($2)
$$->collate = *$2;
}
| binary_character_type
| numeric_type
| float_type
Expand Down