Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicated edges from Graph.edges(g, v) when v has a self-reference? #65

Open
dsschneidermann opened this issue Jan 25, 2023 · 5 comments

Comments

@dsschneidermann
Copy link

dsschneidermann commented Jan 25, 2023

Hi and first of all, thanks for maintaining this library. It has turned out to be essential to me since I started using it.

I'm a big fan of pattern matching over edges in the graph and it's a very intuitive way to write complicated case handling. However I just discovered that I'm getting duplicated edges when I am matching on, eg.

case graph |> Graph.edges(metric_key) do
    %{v1: ^metric_key, v2: ^metric_key} -> ...
    ... other cases
end

From a glance at the code, a possible cause could be if a self-referencing edge could be described by both v_in and v_out:

v_all = MapSet.union(v_in, v_out)

But the description would have to differ, otherwise the MapSet would make it unique.. Do you think this is a correct cause and if so, is it an intended one?

I don't suppose there is any inherent disadvantage in just matching instead on all edges as such:

case graph |> Graph.edges() do
    %{v1: ^metric_key, v2: ^metric_key} -> ...
    ... other cases
    _ -> [] # not the stuff we're interested in
end |> List.flatten()

Looking at the code, it seems to be pretty much the same amount of work being done -- and in my situation, I always need to flat_map anyway in what I am doing.

However, someone else may be tripped up on it and have a situation where getting duplicates could cause confusion down the line.

Best regards,
Dennis

@stevensonmt
Copy link
Contributor

Can you give an example of this issue? I am not sure that I understand your explanation of the issue. That said, I think any duplicate edge representation is caused not by the code you referenced but by this portion of the code:

    e_in =
      Enum.flat_map(v_all, fn v2_id ->
        case Map.get(meta, {v2_id, v_id}) do
          nil ->
            []

          edge_meta when is_map(edge_meta) ->
            v2 = Map.get(vs, v2_id)

            for {label, weight} <- edge_meta do
              Edge.new(v2, v, label: label, weight: weight)
            end
        end
      end)

    e_out =
      Enum.flat_map(v_all, fn v2_id ->
        case Map.get(meta, {v_id, v2_id}) do
          nil ->
            []

          edge_meta when is_map(edge_meta) ->
            v2 = Map.get(vs, v2_id)

            for {label, weight} <- edge_meta do
              Edge.new(v, v2, label: label, weight: weight)
            end
        end
      end)

    e_in ++ e_out

Concatenating the two lists allows for duplicate entries. This makes sense as intended behavior for a directed graph but maybe not for an undirected graph. I suspect if we wanted to avoid duplicates the solution is something like:

  Enum.reduce(v_all, MapSet.new(), fn v2_id, es ->  get_edges(v2_id, v_id, meta) |> MapSet.union(es) end) |> Enum.to_list()
...

defp get_edges(v2_id, v_id, meta) do 
  [{v2_id, v_id}, {v_id, v2_id}]
  |> Enum.reduce(MapSet.new(), fn {vi_id, vo_id} = vs, edges ->
  case Map.get(meta, vs) do
          nil ->
            edges

          edge_meta when is_map(edge_meta) ->
            v = Map.get(vs, vi_id)
           v2 = Map.get(vs, vo_id)
            
            for {label, weight} <- edge_meta do
              Edge.new(v, v2, label: label, weight: weight)
            end
            |> MapSet.new()
            |> MapSet.union(edges)
        end
    end
  end

Haven't tested the above just sort of thinking out loud. Can put a PR together if this approach seems to address a real problem, but I'm not sure that it does for a directed graph.

@dsschneidermann
Copy link
Author

dsschneidermann commented Mar 19, 2024

Hello, yes sorry I can see it is not very clear :)

Here is a reproduction:

defmodule GraphingTest do
  use ExUnit.Case, async: true

  test "graphlib" do
    {graph, edges} =
      for metric_key <- ["a", "b"], reduce: {Graph.new(), []} do
        acc ->
          input_edges =
            for input <- ["a"] do
              {input, metric_key}
            end

          {elem(acc, 0) |> Graph.add_vertex(metric_key), input_edges ++ elem(acc, 1)}
      end

    IO.puts("expected: ")

    graph =
      for {v1, v2} <- edges, reduce: graph do
        acc ->
          {v1, v2} |> inspect |> IO.puts()
          Graph.add_edge(acc, %Graph.Edge{v1: v1, v2: v2})
      end

    IO.puts("actual: ")

    for edge <- graph |> Graph.edges("a") do
      case edge do
        %{v1: v1, v2: v2} -> {v1, v2} |> inspect |> IO.puts()
      end
    end
  end
end

Output:

expected: 
{"a", "b"}
{"a", "a"}
actual: 
{"a", "a"}
{"a", "a"}
{"a", "b"}

Double "a"->"a" is unexpected here. I ended up matching on the source information instead of using the graph, but I can tell from small testing that it only happens when asking specifically for Graph.edges("a") edges - not when getting all Graph.edges()

@stevensonmt
Copy link
Contributor

I think that is the expected behavior given docs for Graph.edges/2:

Returns a list of all edges inbound or outbound from vertex v.

For an edge with a vertex incident upon itself you would have the duplication, whereas with Graph.edges/1 it is only considering the out-edges and not yielding the duplicate. I still think this makes sense for directed graphs since you need to consider a -> a as both directions being valid.

@dsschneidermann
Copy link
Author

I have to disagree with the expectation here. It may be technically correct according to the wording of the docs, but it's a really weird edge case (hah) to actually support. What's the beneficial use of having the same edge returned twice? There simply isn't two edges in a directed graph A -> A, or in an undirected graph A - A.

Ultimately it is your choice to make, but I would definitely ask it flagged it in the docs specifically. IMO any reading of the docs will always assume that Graph.edges/1 returns a subset of Graph.edges/0.

Also, for what its worth, I definitely understand any hesitation on your part to add any fixes as a breaking change, but maybe it could be scheduled as a future change.

@stevensonmt
Copy link
Contributor

Just to be clear, I'm not the maintainer, just an interested user who is happy to contribute pull requests for issues. I think this is an easy enough fix that would not be a breaking change. If there is agreement that A -> A is always considered a single edge I can try to work on the fix. My real job is a bit crazy at the moment, so I probably wouldn't have time until after 2-3 weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants