Untitled notebook

Section

xml = """
<SteamedHam price="1">
  <ReadyDate>2021-09-11</ReadyDate>
  <ReadyTime>15:50:07.123Z</ReadyTime>
  <Sauce Name="burger sauce">spicy</Sauce>
  <Type>medium rare</Type>
  <Salads>
    <Salad Name="ceasar">
      <Cheese Mouldy="true">Blue</Cheese>
    </Salad>
    <Salad Name="cob">
      <Leaf type="lambs lettuce">washed</Leaf>
    </Salad>
  </Salads>
</SteamedHam>
"""

{:ok, x} = Saxy.XmerlMap.parse_string(xml, atom_fun: &String.to_atom/1)
x
p = "/SteamedHam/ReadyDate/text()"
splits = String.split(p, "/", trim: true)

# This needs to be close enough to XPath that we can benchmark it against sweet_xml
# and compare memory usage too.

# Being able to lose the parents is huge in itself. But I strongly suspect we could make the
# data format a lot better so that this XPath stuff gets easier. The easiest thing to
# query is a map with dynamic keys, where the keys are node names. But that comes with
# problems when building the data.

# This version expects `x` to look something like:

# %{
#   attributes: [%{name: "price", value: "1"}],
#   content: [
#     %{attributes: [], content: ["2021-09-11"], name: "ReadyDate"},
#     %{attributes: [], content: ["15:50:07.123Z"], name: "ReadyTime"},
#     %{attributes: [%{name: "Name", value: "burger sauce"}], content: ["spicy"], name: "Sauce"},
#     %{attributes: [], content: ["medium rare"], name: "Type"},
#     %{
#       attributes: [],
#       content: [
#         %{
#           attributes: [%{name: "Name", value: "ceasar"}],
#           content: [
#             %{attributes: [%{name: "Mouldy", value: "true"}], content: ["Blue"], name: "Cheese"}
#           ],
#           name: "Salad"
#         },
#         %{
#           attributes: [%{name: "Name", value: "cob"}],
#           content: [
#             %{
#               attributes: [%{name: "type", value: "lambs lettuce"}],
#               content: ["washed"],
#               name: "Leaf"
#             }
#           ],
#           name: "Salad"
#         }
#       ],
#       name: "Salads"
#     }
#   ],
#   name: "SteamedHam"
# }

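defmodule XxxPath do
  # Walks `path` (a list of steps such as ["SteamedHam", "ReadyDate", "text()"])
  # through the parsed document and returns the matched node, its content for
  # a "text()" step, or :not_found.
  def query(data, path) do
    # Start from a virtual document node so the root element is matched like any
    # other child. The previous version compared each step against the current
    # node's own name, so paths deeper than two elements always gave :not_found.
    Enum.reduce_while(path, %{content: [data]}, fn
      "text()", %{content: content} ->
        {:halt, content}

      step, %{content: content} ->
        # Only element maps can match; text nodes (binaries) are skipped.
        case Enum.find(content, &match?(%{name: ^step}, &1)) do
          nil -> {:halt, :not_found}
          child -> {:cont, child}
        end
    end)
  end
end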

# We can make the key for the content `text()` and the key for each attribute `/@the_attr`,
# making both easy to query.
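
The dynamic-key idea in the comment above could be sketched roughly as follows. `KeyedSketch` and `to_keyed/1` are hypothetical names, not part of Saxy, and the sketch assumes nodes shaped like the commented map earlier (`:name`, `:attributes`, `:content`). Collecting repeated siblings into a list is one possible answer to the "problems when creating the stuff", not a settled design.

```elixir
defmodule KeyedSketch do
  # Hypothetical helper: re-keys one parsed node by its own name.
  # Attributes become "/@attr" keys and text becomes the "text()" key.
  def to_keyed(%{name: name, attributes: attrs, content: content}) do
    attr_map = Map.new(attrs, fn %{name: n, value: v} -> {"/@" <> n, v} end)

    body =
      Enum.reduce(content, attr_map, fn
        text, acc when is_binary(text) ->
          # Several text nodes under one element are concatenated.
          Map.update(acc, "text()", text, &(&1 <> text))

        child, acc ->
          [{child_name, child_body}] = Map.to_list(to_keyed(child))
          # Repeated siblings are collected into a list, in document order.
          Map.update(acc, child_name, child_body, &(List.wrap(&1) ++ [child_body]))
      end)

    %{name => body}
  end
end

sauce = %{
  name: "Sauce",
  attributes: [%{name: "Name", value: "burger sauce"}],
  content: ["spicy"]
}

keyed = KeyedSketch.to_keyed(sauce)
get_in(keyed, ["Sauce", "/@Name"])
get_in(keyed, ["Sauce", "text()"])
```

With this shape a query is a plain `get_in/2` over string keys. The cost is that a key can hold either a single map or a list of maps, so callers have to handle both.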

# We probably want to support `//` and `.` as well, though we can be a bit flexible about
# it IMO. Enforcing a subset of XPath is perfectly fine.

XxxPath.query(x, splits)
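
For the `//` step mentioned in the comments, a minimal sketch of a descendant search over the same node shape might look like this. `DescendantSketch` and `descendants/2` are hypothetical names, and the sketch assumes text nodes appear as plain binaries inside `:content`, as in the commented map earlier.

```elixir
defmodule DescendantSketch do
  # Hypothetical helper: returns every element named `name` at any depth below
  # `node` (and `node` itself if it matches), in document order.
  def descendants(%{name: node_name, content: content} = node, name) do
    found = Enum.flat_map(content, &descendants(&1, name))
    if node_name == name, do: [node | found], else: found
  end

  # Text nodes are plain binaries and have no descendants.
  def descendants(text, _name) when is_binary(text), do: []
end

doc = %{
  name: "Salads",
  attributes: [],
  content: [
    %{
      name: "Salad",
      attributes: [],
      content: [%{name: "Cheese", attributes: [], content: ["Blue"]}]
    },
    %{name: "Salad", attributes: [], content: ["plain"]}
  ]
}

DescendantSketch.descendants(doc, "Cheese")
```

This walks the whole subtree on every call, which is probably acceptable for a restricted XPath subset but is worth measuring in the sweet_xml comparison.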