HTML image handling #27

Closed
@eneroth

Possibly related to #7, or maybe not.

I'm trying to deal with the fact that images are sometimes inlined as <img …> HTML tags rather than as Markdown images, e.g. ![some alt text](some.png "Some title").

So I would like to convert any <img …> tags I encounter into proper AST image nodes.

The first thing I'm coming up against is getting HTML emitted at all. When I run the following example, both img tags are missing from the parsed document.

(def image-test
  "### Images
   <img src=\"https://www.example.com/image1.jpg\" alt=\"High-Efficiency Antenna\">
   <img src=\"https://www.example.com/image2.jpg\" alt=\"5G Network Coverage Map\">")

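;; Assumes the aliases md -> nextjournal.markdown and
;; md.parser -> nextjournal.markdown.parser (presumably; the requires are not shown).
(md.parser/parse md.parser/empty-doc
  (md/tokenize image-test))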

;; =>
{:footnotes []
 :type      :doc
 :title     "Images"
 :content   [{:type          :heading
              :heading-level 3
              :attrs         {:id "images"}
              :content       [{:type :text
                               :text "Images"}]}]
 :toc       {:type     :toc
             :children [{:type     :toc
                         :children [{:type     :toc
                                     :children [{:type          :toc
                                                 :content       [{:type :text :text "Images"}]
                                                 :heading-level 3
                                                 :attrs         {:id "images"}
                                                 :path          [:content 0]}]}]}]}}

If I look at the tokenization, both tags are registered as a single HTML block:

{:children nil
 :block    true
 :meta     nil
 :content  "   <img src=\"https://www.example.com/image1.jpg\" alt=\"High-Efficiency Antenna\">
            <img src=\"https://www.example.com/image2.jpg\" alt=\"5G Network Coverage Map\">"
 :type     "html_block"
 :markup   ""
 :level    0
 :hidden   false
 :info     ""
 :attrs    nil
 :tag      ""
 :nesting  0
 :map      [1 3]}

So the HTML block is tokenized correctly, but it seems to get dropped somewhere between tokenization and the parsed document. Any advice would be great!

My scope for now is only to convert HTML images -> AST. If there's a better way to do this than the one I'm heading down, I'd greatly appreciate being steered in another direction.
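
To make the goal concrete, here is a rough sketch of the conversion step on its own, independent of how it would be hooked into the parser. It takes an html_block token like the one above and pulls out image nodes with a regex. Everything in it is an assumption for illustration: the helper names (parse-attrs, img->node, html-block->image-nodes) are made up, the node shape (:type :image, :src under :attrs, alt text as a :text child) is my guess at what a Markdown image parses to, and the regex only handles double-quoted attributes, so it is not a real HTML parser.

```clojure
(require '[clojure.string :as str])

;; Matches an <img ...> tag and captures its raw attribute string.
(def img-tag-re #"(?i)<img\b([^>]*)>")

;; Matches name="value" pairs; single-quoted and unquoted values are ignored.
(def attr-re #"([\w-]+)\s*=\s*\"([^\"]*)\"")

(defn parse-attrs
  "Turns a raw attribute string into a map with lower-cased keyword keys."
  [attr-str]
  (into {}
        (map (fn [[_ k v]] [(keyword (str/lower-case k)) v]))
        (re-seq attr-re attr-str)))

(defn img->node
  "Builds a hypothetical image AST node from parsed <img> attributes."
  [{:keys [src alt title]}]
  {:type    :image
   :attrs   (cond-> {:src src}
              title (assoc :title title))
   :content (if alt [{:type :text :text alt}] [])})

(defn html-block->image-nodes
  "Returns a vector of image nodes, one per <img> tag found in the token's :content."
  [{:keys [content]}]
  (->> (re-seq img-tag-re (or content ""))
       (mapv (comp img->node parse-attrs second))))

;; Usage with a token shaped like the html_block above:
(html-block->image-nodes
 {:type    "html_block"
  :content "<img src=\"https://www.example.com/image1.jpg\" alt=\"High-Efficiency Antenna\">"})
```

What I still don't know is where a function like this should plug in so that html_block tokens reach it instead of being dropped.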
