AbbrTreeprocessor now sorts the abbreviation list by length before processing the element tree

This ensures that multi-word abbreviations are
implemented even if an abbreviation exists for
one of those component words.

To test this,
added `test_abbr_superset_vs_subset()` to
`tests.test_syntax.extensions.test_abbr`.
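
The effect of the sort can be reproduced with the standard `re` module alone. The sketch below is illustrative, not the extension's code: `build_pattern` is a hypothetical helper that mirrors the regex built in `run()`, and the dictionary reuses the definitions from the new test. Python's regex alternation tries alternatives left to right, so a shorter key listed first wins over a longer key that starts with it.

```python
import re

# Abbreviations in definition order, as in the new test case.
abbrs = {
    "abbr": "Abbreviation Definition",
    "abbr-SS": "Abbreviation Superset Definition",
    "SS": "Superset Definition",
}


def build_pattern(keys):
    """Hypothetical helper: join keys into one alternation, in the order given."""
    return re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")


text = "abbr, SS, and abbr-SS"

# Definition order: "abbr" is tried before "abbr-SS", so the longer key never matches.
unsorted_matches = build_pattern(abbrs).findall(text)

# Longest first: "abbr-SS" is tried before its component words.
sorted_keys = sorted(abbrs, key=len, reverse=True)
sorted_matches = build_pattern(sorted_keys).findall(text)

print(unsorted_matches)  # ['abbr', 'SS', 'abbr', 'SS']
print(sorted_matches)    # ['abbr', 'SS', 'abbr-SS']
```

Sorting by length is enough here because a key can only shadow another key that it is a prefix of, and a prefix is always the shorter of the two.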
nbanyan committed May 22, 2024
1 parent ec8c305 commit 37922a7
Showing 3 changed files with 25 additions and 1 deletion.
4 changes: 4 additions & 0 deletions docs/changelog.md
@@ -23,6 +23,10 @@ better reflects what it is. `AbbrPreprocessor` has been deprecated.

 A call to `Markdown.reset()` now clears all previously defined abbreviations.

+Abbreviations are now sorted by length before executing `AbbrTreeprocessor`
+to ensure that multi-word abbreviations are implemented even if an abbreviation
+exists for one of those component words.
+
 ### Fixed

 * Fixed links to source code on GitHub from the documentation (#1453).
4 changes: 3 additions & 1 deletion markdown/extensions/abbr.py
@@ -92,7 +92,9 @@ def run(self, root: etree.Element) -> etree.Element | None:
             # No abbreviations defined. Skip running processor.
             return
         # Build and compile regex
-        self.RE = re.compile(f"\\b(?:{ '|'.join(re.escape(key) for key in self.abbrs) })\\b")
+        abbr_list = list(self.abbrs.keys())
+        abbr_list.sort(key=len, reverse=True)
+        self.RE = re.compile(f"\\b(?:{ '|'.join(re.escape(key) for key in abbr_list) })\\b")
         # Step through tree and modify on matches
         self.iter_element(root)

18 changes: 18 additions & 0 deletions tests/test_syntax/extensions/test_abbr.py
@@ -383,6 +383,24 @@ def test_abbr_with_attr_list(self):
             extensions=['abbr', 'attr_list']
         )

+    def test_abbr_superset_vs_subset(self):
+        self.assertMarkdownRenders(
+            self.dedent(
+                """
+                abbr, SS, and abbr-SS should have different definitions.
+                *[abbr]: Abbreviation Definition
+                *[abbr-SS]: Abbreviation Superset Definition
+                *[SS]: Superset Definition
+                """
+            ),
+            self.dedent(
+                """
+                <p><abbr title="Abbreviation Definition">abbr</abbr>, <abbr title="Superset Definition">SS</abbr>, and <abbr title="Abbreviation Superset Definition">abbr-SS</abbr> should have different definitions.</p>
+                """
+            )
+        )
+
     def test_abbr_reset(self):
         ext = AbbrExtension()
         md = Markdown(extensions=[ext])