Skip to content
jsichi edited this page Jun 20, 2012 · 3 revisions

equals/hashCode for graphs

Since version 0.8.4 JGraphT supplies equals(Object) and hashCode() methods for graphs.

The hash code of graph is defined based on the sum of the hash codes of vertices and edges in the graph. It is also based on graph topology and edges weights. It is very important to know the general contract of hash codes - "equality of two graphs means that they have the same hash codes, however inequality of two graphs doesn't mean that they have different hash codes".

An equality of two graphs means they are instances of the same graph class, and have equal vertex and edges sets with the same weights. Note that set equality here is based on equality of the component vertices and edges across the two graphs. For example, if your vertex class does not override equals, then the same vertex objects must be used in both graphs, since in that case Java object identity (System.identityHashCode) is used for the vertex equality test. The same goes for edges.

In general, an exact copy of a graph object via Graphs.addGraph will be equal to the original. However, for copy via serialization followed by deserialization, this won't hold unless both the vertex and edge classes override equals/hashCode.

Also note that graph equality comparison is not the same as a graph isomorphism detection, which would be prohibitively expensive.

There is one exception you should know about regarding equality of weighted graphs with a custom edge class which overrides equals/hashCode. (If your custom edge class does not override equals/hashCode, you do not need to worry about this exception.) Here is an example of broken equals(Object) logic.

// user defined weighted edge class 
class CustomWeightedEdge
	extends DefaultWeightedEdge
{
	private static final long serialVersionUID = 1L;
	private String label;

	public CustomWeightedEdge(String label)
	{
		this.label = label; 
	}

	public int hashCode()
	{
		return label.hashCode();
	}

	public boolean equals(Object obj)
	{
		if (this == obj) {
			return true;
		}
		if (obj == null) {
			return false;
		}
		if (!(obj instanceof CustomWeightedEdge)) {
			return false;
		}

		CustomWeightedEdge edge = (CustomWeightedEdge) obj;
		return label.equals(edge.label);
	}
}

// test weighted graphs with custom edges
WeightedGraph<String, CustomWeightedEdge> g1 =
	new DefaultDirectedWeightedGraph<String, CustomWeightedEdge>(
	CustomWeightedEdge.class);
g1.addVertex("v1");
g1.addVertex("v2");
CustomWeightedEdge e112 = new CustomWeightedEdge("v1-v2"); 
g1.addEdge(v1, v2, e112);
g1.setEdgeWeight(e112, 10.0);

WeightedGraph<String, CustomWeightedEdge> g2 = 
	new DefaultDirectedWeightedGraph<String, CustomWeightedEdge>(
	CustomWeightedEdge.class);
g2.addVertex("v2");
g2.addVertex("v1");
CustomWeightedEdge e212 = new CustomWeightedEdge("v1-v2");
g3.addEdge(v1, v2, e212);
g3.setEdgeWeight(e212, 20.0);

g1.equals(g2); // returns "true" where you would expect it to return "false"

This case is one of the pitfalls already associated with DefaultWeightedEdge (e.g. you can't reuse an instance in two graphs, otherwise the weight will change in both), even without custom edge sub-classing. The main reason for this behavior is that DefaultWeightedEdge stores edge weight itself and method AbstractGraph.getEdgeWeight(E) just retrieves its weight. So a comparison of weight for two identical edges in equals(Object) in fact compares weights of the same edge.

So if you need graph equals to work correctly, then either your custom edge class should not override equals/hashCode, or else it should not subclass DefaultWeightedEdge.

equals/hashCode for vertex/edge objects

There are many articles available on the web (e.g. http://www.ibm.com/developerworks/java/library/j-jtp05273.html) explaining the need to keep equals and hashCode methods consistent for each class, and other important rules such as avoiding the use of mutable classes as keys.

This is especially important for any Vertex and Edge objects used with JGraphT since the default graph implementations rely on HashSet/HashMap for efficiently storing and accessing graph structures.

If you make a mistake with these methods, the observed behavior will typically be very confusing. In many cases, bugs have been logged against JGraphT in situations where investigation eventually determined that the cause of the problem was actually in the user-defined Vertex/Edge classes.

For example, suppose your Vertex class overrides equals, but forgets to override hashCode. You have vertices v1, v2, v3, v4, v5, where v3.equals(v5), and you add them all to a graph, with edges v1→v2→v3→v4→v5. If hashCode had been defined correctly, then the attempt to add v5 would have discarded it, since an equal vertex (v3) was already present in the graph.

However, the hash codes for v3 and v5 are different (since they are different objects, and the default hashCode implementation is based on object identity). Suppose these hash codes do not collide when folded down into the array used by the HashMap inside of the graph. As a result, when you add v5 to the graph, HashMap won't even compare it against v3, since it only calls equals in case of a collision. Now you have a corrupt graph, since it contains two vertex objects which are equals; this violates the contract of the graph.

Under these circumstances, you may not notice anything wrong at first. However, if you use the CycleDetector class on this graph, you'll be in for a surprise, since it relies on ArrayList.indexOf, and this doesn't do any hashing, so it will see that v3 and v5 are equals, and detect a cycle (whereas you are thinking v3 and v5 are different vertices).

Tracking these problems down can be time-consuming, so JGraphT supplies a class ParanoidGraph to help you. You can use it to wrap one of your own graphs; by calling the addVertex/addEdge methods on ParanoidGraph (rather than on your graph directly), you get automatic verification as part of each object addition; an IllegalArgumentException will be thrown if a bad object is detected. Note that this checking is very expensive, so only use it during debugging or sanity checking.

Example from a unit test:

	public void testParanoidGraph()
	{
	   BrokenVertex v1 = new BrokenVertex(1);
	   BrokenVertex v2 = new BrokenVertex(2);
	   BrokenVertex v3 = new BrokenVertex(1);

	   SimpleGraph<BrokenVertex, DefaultEdge> g =
	       new SimpleGraph<BrokenVertex, DefaultEdge>(DefaultEdge.class);
	    ParanoidGraph<BrokenVertex, DefaultEdge> pg =
	       new ParanoidGraph<BrokenVertex, DefaultEdge>(g);
	   pg.addVertex(v1);
	   pg.addVertex(v2);
	   try {
	       pg.addVertex(v3);
	       // should not get here
	       assertFalse();
	   } catch (IllegalArgumentException ex) {
	       // expected, swallow
	   }
	}

	private class BrokenVertex
	{
	   private int x;

	   BrokenVertex(int x)
	   {
	       this.x = x;
	   }

	   public boolean equals(Object other)
	   {
	       if (!(other instanceof BrokenVertex)) {
	           return false;
	       }
	       return x == ((BrokenVertex) other).x;
	   }
	}
Clone this wiki locally