Description
Background
I have a dataframe with DICOM file paths. I would like to create a Keras layer that reads a DICOM file and does some preprocessing (resizing, normalization, etc.), returning a 2D input tensor. I am trying to combine this layer with the actual model to build a final model. The purpose is to leverage the GPU for the image processing.
Issue
In the call method of this layer, I tried to use tf.vectorized_map, but after running on some files it raised an error:
InvalidArgumentError: PartialTensorShape: Incompatible shapes during merge: [1,5355,4915,1] vs. [1,2776,2082,1] [[{{node DecodeDICOMImage/for/TensorListConcatV2}}]]
It might be due to the variable image shapes after decoding the DICOM files. And tf.vectorized_map might have an issue with the concat step, as it has known limitations.
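For illustration, here is a minimal, self-contained sketch (no DICOM files involved; the sizes and the stand-in decode function are made up) of why tf.map_fn copes with per-element variable intermediate shapes when the output signature is fixed: each element is resized to a common shape inside the mapped function before stacking, whereas tf.vectorized_map traces the function once and expects uniform shapes across elements.

```python
import tensorflow as tf

def decode_like(n):
    # Stand-in for a decode step that yields a per-element variable size.
    img = tf.ones([n, n, 1], tf.float32)
    # Resizing inside the mapped fn gives every element the same output shape.
    return tf.image.resize(img, [2, 2])

sizes = tf.constant([3, 5, 7])
out = tf.map_fn(
    decode_like,
    sizes,
    fn_output_signature=tf.TensorSpec([2, 2, 1], tf.float32),
)
print(out.shape)  # (3, 2, 2, 1)
```

Passing the same function to tf.vectorized_map fails, because the traced body produces tensors of different spatial sizes across elements before the resize is reached.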
import tensorflow as tf
import tensorflow_io as tfio
from tensorflow.keras import layers

class DicomReadAndResizer(layers.Layer):
    def __init__(self, input_height, input_width, **kwargs):
        super().__init__(**kwargs)
        self.input_height = input_height
        self.input_width = input_width

    def read_and_resize_dicom(self, x):
        image_bytes = tf.io.read_file(x)
        # decode_dicom_image returns shape [frames, height, width, channels]
        image = tfio.image.decode_dicom_image(image_bytes, dtype=tf.float32)
        image = tf.image.grayscale_to_rgb(image)
        image = tf.image.resize(
            image, size=[self.input_height, self.input_width]
        )
        return image

    def call(self, inputs):
        # Works, somewhat
        x = tf.map_fn(
            self.read_and_resize_dicom, inputs, fn_output_signature=tf.float32,
        )
        # Doesn't work
        x = tf.vectorized_map(self.read_and_resize_dicom, inputs)
        return x
Note, to make tf.map_fn work, I have to place the image resizing and the DICOM decoding in the same method (read_and_resize_dicom above); otherwise tf.map_fn also has issues with the variable-sized intermediate tensors. But even that doesn't work with tf.vectorized_map.
Questions
- Could there be any fix on the tf.io side (maybe using tf.stack instead of tf.concat)? Are there any hacks that can be used to make the vectorized function work?
- Even if I use such a layer inside a model (hoping to leverage the GPU), will this layer still somehow fall back to CPU? When I run the code (with tf.map_fn), CPU usage still gets very high and memory consumption grows steadily.
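Regarding the second question, one small diagnostic (not part of the original code) is to have TensorFlow log the device each op is placed on; ops without a GPU kernel, which to my knowledge includes the tensorflow-io DICOM decode op, are silently placed on CPU.

```python
import tensorflow as tf

# Log the device each subsequent op is placed on; ops lacking a GPU
# kernel fall back to CPU, which the log makes visible.
tf.debugging.set_log_device_placement(True)

a = tf.constant([1.0, 2.0])
b = a * 2.0  # the placement of this multiply is logged

tf.debugging.set_log_device_placement(False)
print(b.numpy())  # [2. 4.]
```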
About the data loader: it is a very simple tf.data pipeline. I pass a dataframe that holds the DICOM file paths, so there is no heavy load on the CPU side. Could you please give me some pointers? What could be the reason? Thanks.
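For completeness, a minimal sketch of the loader described above (the column name, file names, and batch size are assumptions):

```python
import pandas as pd
import tensorflow as tf

# Hypothetical dataframe of DICOM paths; no pixel data is read here.
df = pd.DataFrame({"path": ["scan_001.dcm", "scan_002.dcm"]})

ds = tf.data.Dataset.from_tensor_slices(df["path"].values).batch(2)
# Each batch is just a string tensor of file paths, so the pipeline
# itself should be light on the CPU.
for batch in ds:
    print(batch.dtype)  # tf.string
```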
model = keras.Sequential(
    [
        keras.layers.InputLayer(dtype=tf.string),
        DicomReadAndResizer(*INP_SIZE),  # output [None, h, w, 3]
        trained_model,  # input [None, h, w, 3]
    ]
)