This repository was archived by the owner on Jan 3, 2023. It is now read-only.

Commit 0f0c367

Yeonsily/l2l test (#280)

yeonsily authored and avijit-nervana committed

* Modify the model test script
  - Change input numbers between 0-255
  - Load both pb and pbtxt type graph
  - Set back_end as parameter
  - Save test result and tensor array to .npy file

1 parent 4dd7a89

File tree: 4 files changed (+91 -33 lines)


diagnostics/model_test/README.md (+9 -7)

@@ -1,21 +1,23 @@
-# Compare model output between Tensorflow and NGraph
+# Compare model output between two different backends
 
-### This model_test tool will run the model inference seperately on TF and NGraph, and the desired output from TF and NGraph should match given the same inputs. It can be used as a debugging tool, and also a verification that NGraph produces the same output as Tensorflow.
+### This model_test tool will run the model inference separately on the two backends specified in the json file (e.g. Tensorflow and nGraph), and the outputs from the two backends should match given the same inputs. It can be used as a debugging tool for layer-by-layer comparison, and also as verification that nGraph produces the same output as Tensorflow.
 
 # Required files to use the tool:
 * A json file: Provide model specific parameters. Look at the example ```mnist_cnn.json```. You can start with the ```template.json``` and modify it to match your model
-* A tensorflow frozen graph: A model frozen graph(.pb file) with trained weights and model architecture
+* A tensorflow model file: Either a .pb or a .pbtxt file
 
-## To prepare the required json file:"
+## To prepare the required json file:
+* Specify the ```reference_backend``` and ```testing_backend```. For Tensorflow on CPU, use 'CPU'; for nGraph, use 'NGRAPH_[desired backend name]' (e.g. use 'NGRAPH_CPU' for nGraph on CPU)
 * You will need the names of the input/output tensors of the model. Currently we are supporting
-multiple input tensors and one output tensor. Put the input tensor names as a list in the ```input_tensor_name``` field of the json file, and the output tensor name as a string in the ```output_tensor_name``` field of the json file
+multiple input tensors and multiple output tensors. Put the input tensor names as a list in the ```input_tensor_name``` field of the json file, and the output tensor names as a list in the ```output_tensor_name``` field. If no outputs are specified in ```output_tensor_name```, all output tensors will be compared
 * You will need the input dimensions for all the input tensors provided. Put the dimensions information as a list in the ```input_dimension``` field of the json file, and the corresponding order of the ```input_tensor_name``` list should match the ```input_dimension``` list. Therefore, the length of the ```input_tensor_name``` list should match the length of the ```input_dimension``` list
-* Specify the the location of the frozen graph in the ```frozen_graph_location``` of the json file
+* Specify the location of the graph file in the ```graph_location``` field of the json file
 * Specify the ```batch_size``` field in the json file to the desired batch size for inference
 * Specify the tolerance between the TF and NGraph outputs at ```l1_norm_threshold```, ```l2_norm_threshold``` and ```inf_norm_threshold``` in the json file
+* Specify the ```random_val_range``` field; inputs are randomly generated in the range 0 to random_val_range
 
 # To run the model test tool:
 python verify_model.py --json_file="/path/to/your/json/file"
 
 # Result Metrics
-### The model_test tool will run the model inference and compare the outputs from TF and NGraph in terms of L1, L2 and Inf norm. If the corresponding norm is smaller than the matching tolerance specified in the json file, then the test passes. Otherwise, the test failes. In the situation of test failure, feel free to report the problem at the ngraph-tf github issue section.
+### The model_test tool will run the model inference and compare the outputs in terms of L1, L2 and Inf norm. If the corresponding norm is smaller than the matching tolerance specified in the json file, the test passes; otherwise it fails. Each output tensor will be saved as a .npy file in a folder named '[reference_backend_name]-[testing_backend_name]' (e.g. CPU-NGRAPH_CPU). In case of test failure, feel free to report the problem in the ngraph-tf github issues section.

diagnostics/model_test/mnist_cnn.json (+4 -1)

@@ -1,8 +1,11 @@
 {
     "model_name": "mnist_cnn",
-    "frozen_graph_location": "/nfs/site/home/mingshan/frozen_graphs/frozen_mnist.pb",
+    "reference_backend": "CPU",
+    "testing_backend": "NGRAPH_CPU",
+    "graph_location": "/nfs/site/home/mingshan/frozen_graphs/frozen_mnist.pb",
     "input_tensor_name": ["import/Placeholder:0"],
     "output_tensor_name": ["import/Placeholder:0", "import/fc2/yconv:0"],
+    "random_val_range": 255,
     "l1_norm_threshold": 0.01,
     "l2_norm_threshold": 0.01,
     "inf_norm_threshold": 0.01,

diagnostics/model_test/template.json (+4 -1)

@@ -1,8 +1,11 @@
 {
     "model_name": "name_of_your_model",
+    "reference_backend": "your_reference_backend",
+    "testing_backend": "your_testing_backend",
     "frozen_graph_location": "/path/to/the/frozen_graph.pb",
     "input_tensor_name": ["input_tesor_1","input_tensor_2"],
-    "output_tensor_name":"output_tensor_name",
+    "output_tensor_name": ["output_tensor_name_1","output_tensor_name_2"],
+    "random_val_range": 255,
     "l1_norm_threshold": 0.01,
     "l2_norm_threshold": 0.01,
     "inf_norm_threshold": 0.01,

diagnostics/model_test/verify_model.py (+74 -24)

@@ -18,14 +18,37 @@
 import argparse
 import numpy as np
 import ngraph
+from google.protobuf import text_format
 import json
 import os
 
 
+def createFolder(directory):
+    try:
+        if not os.path.exists(directory):
+            os.makedirs(directory)
+    except OSError:
+        print('Error: Creating directory. ' + directory)
+
+
+def set_os_env(select_device):
+    if select_device == 'CPU':
+        # run on TF only
+        ngraph.disable()
+    else:
+        if not ngraph.is_enabled():
+            ngraph.enable()
+
+        assert select_device[:7] == "NGRAPH_", \
+            "Expecting device name to start with NGRAPH_"
+        back_end = select_device.split("NGRAPH_")
+        os.environ['NGRAPH_TF_BACKEND'] = back_end[1]
+
+
 def calculate_output(param_dict, select_device, input_example):
-    """Calculate the output of the imported frozen graph given the input.
+    """Calculate the output of the imported graph given the input.
 
-    Load the graph def from frozen_graph_file on selected device, then get the tensors based on the input and output name from the graph,
+    Load the graph def from the graph file on the selected device, then get the tensors based on the input and output names from the graph,
     then feed the input_example to the graph and retrieve the output vector.
 
     Args:
@@ -36,23 +59,22 @@ def calculate_output(param_dict, select_device, input_example):
     Returns:
         The output vector obtained from running the input_example through the graph.
     """
-    frozen_graph_filename = param_dict["frozen_graph_location"]
+    graph_filename = param_dict["graph_location"]
     output_tensor_name = param_dict["output_tensor_name"]
 
-    if not tf.gfile.Exists(frozen_graph_filename):
-        raise Exception("Input graph file '" + frozen_graph_filename +
+    if not tf.gfile.Exists(graph_filename):
+        raise Exception("Input graph file '" + graph_filename +
                         "' does not exist!")
 
-    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
-        graph_def = tf.GraphDef()
-        graph_def.ParseFromString(f.read())
-
-    if select_device == 'CPU':
-        ngraph.disable()
+    graph_def = tf.GraphDef()
+    if graph_filename.endswith("pbtxt"):
+        with open(graph_filename, "r") as f:
+            text_format.Merge(f.read(), graph_def)
     else:
-        # run on NGRAPH
-        if not ngraph.is_enabled():
-            ngraph.enable()
+        with open(graph_filename, "rb") as f:
+            graph_def.ParseFromString(f.read())
+
+    set_os_env(select_device)
 
     with tf.Graph().as_default() as graph:
         tf.import_graph_def(graph_def)
@@ -111,8 +133,12 @@ def calculate_norm(ngraph_output, tf_output, desired_norm):
     if desired_norm not in [1, 2, np.inf]:
         raise Exception('Only L1, L2, and inf norms are supported')
 
-    return np.linalg.norm((ngraph_output_flatten - tf_output_flatten),
-                          desired_norm)
+    n = np.linalg.norm((ngraph_output_flatten - tf_output_flatten),
+                       desired_norm)
+    if desired_norm is np.inf:
+        return n
+    else:
+        return n / len(ngraph_output_flatten)
 
 
 def parse_json():
@@ -139,6 +165,23 @@ def parse_json():
 
 parameters = parse_json()
 
+# Get reference/testing backend to compare
+device1 = parameters["reference_backend"]
+device2 = parameters["testing_backend"]
+
+# Get L1/L2/Inf threshold values
+l1_norm_threshold = parameters["l1_norm_threshold"]
+l2_norm_threshold = parameters["l2_norm_threshold"]
+inf_norm_threshold = parameters["inf_norm_threshold"]
+
+# Create a folder to save output tensor arrays
+output_folder = device1 + "-" + device2
+createFolder(output_folder)
+os.chdir(output_folder)
+print("Model name: " + parameters["model_name"])
+print("L1/L2/Inf norm configuration: {}, {}, {}".format(
+    l1_norm_threshold, l2_norm_threshold, inf_norm_threshold))
+
 # Generate random input based on input_dimension
 np.random.seed(100)
 input_dimension = parameters["input_dimension"]
@@ -149,26 +192,33 @@ def parse_json():
     input_tensor_name
 ), "input_tensor_name dimension should match input_dimension in json file"
 
+# Get random value range
+rand_val_range = parameters["random_val_range"]
+
 # Matches the input tensor names with their required dimensions
 input_tensor_dim_map = {}
 for (dim, name) in zip(input_dimension, input_tensor_name):
-    random_input = np.random.random_sample([bs] + dim)
+    random_input = np.random.randint(
+        rand_val_range, size=[bs] + dim).astype('float32')
     input_tensor_dim_map[name] = random_input
 
-# Run the model on tensorflow
+# Run the model on the reference backend
 result_tf_graph_arrs, out_tensor_names_cpu = calculate_output(
-    parameters, "CPU", input_tensor_dim_map)
-# Run the model on ngraph
+    parameters, device1, input_tensor_dim_map)
+# Run the model on the testing backend
 result_ngraph_arrs, out_tensor_names_ngraph = calculate_output(
-    parameters, "NGRAPH", input_tensor_dim_map)
+    parameters, device2, input_tensor_dim_map)
 
 assert all(
     [i == j for i, j in zip(out_tensor_names_cpu, out_tensor_names_ngraph)])
-l1_norm_threshold = parameters["l1_norm_threshold"]
-l2_norm_threshold = parameters["l2_norm_threshold"]
-inf_norm_threshold = parameters["inf_norm_threshold"]
 for tname, result_ngraph, result_tf_graph in zip(
         out_tensor_names_cpu, result_ngraph_arrs, result_tf_graph_arrs):
+    new_out_layer = tname.replace("/", "_")
+    nparray_tf = np.array(result_tf_graph)
+    nparray_ngraph = np.array(result_ngraph)
+    np.save(device1 + "-" + new_out_layer + ".npy", nparray_tf)
+    np.save(device2 + "-" + new_out_layer + ".npy", nparray_ngraph)
+
     l1_norm = calculate_norm(result_ngraph, result_tf_graph, 1)
    l2_norm = calculate_norm(result_ngraph, result_tf_graph, 2)
     inf_norm = calculate_norm(result_ngraph, result_tf_graph, np.inf)
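Because each output tensor is dumped to disk, a failing layer can also be examined offline after a run. A minimal sketch, assuming the folder and file naming conventions from the script above and the example tensor name from ```mnist_cnn.json``` (a real run may produce different file names):

```python
import numpy as np

# The script saves into '[reference_backend]-[testing_backend]/' with file
# names '[backend]-[tensor name, "/" replaced by "_"].npy'.
folder = "CPU-NGRAPH_CPU"
layer = "import/fc2/yconv:0".replace("/", "_")

ref = np.load(folder + "/CPU-" + layer + ".npy")
test = np.load(folder + "/NGRAPH_CPU-" + layer + ".npy")

# Quick per-layer look at the largest element-wise discrepancy.
print("shape:", ref.shape)
print("max abs diff:", np.abs(ref - test).max())
```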
