HTML API: Fix CDATA lookalike matching invalid CDATA
When `next_token()` was introduced to the HTML Tag Processor, it started
classifying comments that look like they were intended to be CDATA sections.
In one of the changes made during development, however, a typo slipped
through code review that treated comments as CDATA even if they only
ended in `]>` and not the required `]]>`.

The consequences of this defect were minor: in both cases the span is still
treated as an HTML comment arising from invalid syntax, and only the reported
comment type differs. This patch adds the missing check so that only comments
ending in `]]>` are reported as CDATA lookalikes.
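
As an illustration only, the corrected rule can be sketched as a standalone function. `looks_like_cdata()` is a hypothetical helper, not part of WordPress, and it assumes the bogus comment ends at the first `>` after the opener. It is a minimal sketch of the two-bracket check, not the Tag Processor's actual implementation.

```php
<?php
/**
 * Hypothetical helper (not a WordPress API): reports whether the markup
 * starting at $token_starts_at looks like a CDATA section, i.e. it opens
 * with `<![CDATA[` and its first `>` is immediately preceded by `]]`.
 */
function looks_like_cdata( string $html, int $token_starts_at = 0 ): bool {
	$opener = '<![CDATA[';

	// The token must begin with the literal opener.
	if ( substr( $html, $token_starts_at, strlen( $opener ) ) !== $opener ) {
		return false;
	}

	// Assumption: the comment is closed by the first `>` after the opener.
	$closer_at = strpos( $html, '>', $token_starts_at + strlen( $opener ) );
	if ( false === $closer_at ) {
		return false;
	}

	// Both characters before `>` must be `]`; checking only one of them
	// is the defect this patch fixes.
	return ']' === $html[ $closer_at - 1 ] && ']' === $html[ $closer_at - 2 ];
}

var_dump( looks_like_cdata( '<![CDATA[proper closer]]>' ) );  // bool(true)
var_dump( looks_like_cdata( '<![CDATA[missing a bracket]>' ) ); // bool(false)
```

With only the `$closer_at - 1` comparison, the second input would also have been reported as a CDATA lookalike.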

Follow-up to [57348]

Props jonsurrell
Fixes #60406



git-svn-id: https://develop.svn.wordpress.org/trunk@57506 602fd350-edb4-49c9-b593-d223f7449a82
dmsnell committed Feb 1, 2024
1 parent 5e33f4b commit 4a2aa99
Showing 2 changed files with 40 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/wp-includes/html-api/class-wp-html-tag-processor.php
@@ -1762,7 +1762,8 @@ private function parse_next_tag() {
'T' === $html[ $this->token_starts_at + 6 ] &&
'A' === $html[ $this->token_starts_at + 7 ] &&
'[' === $html[ $this->token_starts_at + 8 ] &&
- ']' === $html[ $closer_at - 1 ]
+ ']' === $html[ $closer_at - 1 ] &&
+ ']' === $html[ $closer_at - 2 ]
) {
$this->parser_state = self::STATE_COMMENT;
$this->comment_type = self::COMMENT_AS_CDATA_LOOKALIKE;
38 changes: 38 additions & 0 deletions tests/phpunit/tests/html-api/wpHtmlTagProcessor-token-scanning.php
@@ -347,6 +347,38 @@ public function test_basic_assertion_cdata_section() {
);
}

/**
* Ensures that a CDATA lookalike closed with `]>` instead of `]]>` is parsed as an invalid HTML comment.
*
* @ticket 60406
*
* @since 6.5.0
*
* @covers WP_HTML_Tag_Processor::next_token
*/
public function test_cdata_comment_with_incorrect_closer() {
$processor = new WP_HTML_Tag_Processor( '<![CDATA[this is missing a closing square bracket]>' );
$processor->next_token();

$this->assertSame(
'#comment',
$processor->get_token_name(),
"Should have found comment token but found {$processor->get_token_name()} instead."
);

$this->assertSame(
WP_HTML_Processor::COMMENT_AS_INVALID_HTML,
$processor->get_comment_type(),
'Should have detected invalid HTML comment.'
);

$this->assertSame(
'[CDATA[this is missing a closing square bracket]',
$processor->get_modifiable_text(),
'Found incorrect modifiable text.'
);
}

/**
* Ensures that abruptly-closed CDATA sections are properly parsed as comments.
*
@@ -366,6 +398,12 @@ public function test_basic_assertion_abruptly_closed_cdata_section() {
"Should have found a bogus comment but found {$processor->get_token_name()} instead."
);

$this->assertSame(
WP_HTML_Processor::COMMENT_AS_INVALID_HTML,
$processor->get_comment_type(),
'Should have detected invalid HTML comment.'
);

$this->assertNull(
$processor->get_tag(),
'Should not have been able to query tag name on non-element token.'
