Commit 8832c50 — Merge pull request #417 from abin-thomas-by/master
RFC: Modify BulkInferrer TFX component
Authored by ematejska · 2 parents 73e791e + c401894

1 file changed: +229 −0 lines
# Modify BulkInferrer TFX component

| Status        | Proposed |
| :------------ | :------- |
| **RFC #**     | [NNN](https://github.com/tensorflow/community/pull/NNN) (update when you have community PR #) |
| **Author(s)** | Abin Thomas ([email protected]), Iain Stitt ([email protected]) |
| **Sponsor**   | Robert Crowe ([email protected]) |
| **Updated**   | 2020-06-20 |

## Objective

Modify the [BulkInferrer](https://github.com/tensorflow/tfx/tree/master/tfx/components/bulk_inferrer) TFX component.

Changes:

* Store only a subset of features in the `output_examples` artifact.
* Support inference on multiple models.

## Motivation

The BulkInferrer TFX component performs batch inference on unlabeled tf.Examples.
The generated output examples contain the original features and the prediction results.
Keeping all original features in the output is problematic when dealing with feature-heavy models.
For most use cases, only the example identifiers and the predictions are needed in the output.

In machine learning it is common practice to train multiple models on the same feature set to perform different tasks (sometimes the same task).
It would be convenient for BulkInferrer to support multi-model inference: the component should take a list of models and produce predictions for all of them.

## User Benefit

Filtering down the number of features in the output reduces the storage space needed for artifacts.
It also allows larger batch sizes in downstream processing and reduces the chance of OOM issues.

Running inference for each model separately requires joining the outputs on some identifier, which is computationally expensive and operationally awkward.
With this update the user can post-process the output directly, without joining separate outputs.

## Design Proposal

### Filter Output Features

The component decides whether to keep all features based on an additional field in the `OutputExampleSpec` proto.
The updated proto will look like this:

```protobuf
message OutputExampleSpec {
  // Defines how the inference results map to columns in the output example.
  repeated OutputColumnsSpec output_columns_spec = 3;

  // List of features to keep in the output examples. If empty, all features
  // are kept.
  repeated string example_features = 5;

  reserved 1, 2, 4;
}
```

`example_features` expects a list of feature names to be persisted in the output; the component will not filter if an empty list is provided.
The check and filtering will be performed in [prediction_to_example_utils.py](https://github.com/tensorflow/tfx/blob/master/tfx/components/bulk_inferrer/prediction_to_example_utils.py#L86).

The check:

```python
def convert(prediction_log: prediction_log_pb2.PredictionLog,
            output_example_spec: _OutputExampleSpecType) -> tf.train.Example:

  ...  # Existing parsing logic that produces `example` and `output_features`.

  if output_example_spec.example_features:
    example = _filter_columns(example, output_example_spec)

  return _add_columns(example, output_features)
```

The `_filter_columns` function:

```python
def _filter_columns(example: tf.train.Example,
                    output_example_spec: _OutputExampleSpecType) -> tf.train.Example:
  """Removes features not listed in output_example_spec.example_features."""
  all_features = list(example.features.feature)
  for feature in all_features:
    if feature not in output_example_spec.example_features:
      del example.features.feature[feature]
  return example
```
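
For illustration, here is a minimal sketch of the filtering behavior. It assumes `_filter_columns` as defined above; `_FakeSpec` is a hypothetical stand-in for the proto spec (anything exposing an `example_features` list works):

```python
import tensorflow as tf


class _FakeSpec:
  """Hypothetical stand-in for OutputExampleSpec in this sketch."""
  example_features = ['example_id']


example = tf.train.Example()
example.features.feature['example_id'].bytes_list.value.append(b'id-1')
example.features.feature['embedding'].float_list.value.extend([0.1] * 512)

# Only 'example_id' survives; 'embedding' is removed in place.
filtered = _filter_columns(example, _FakeSpec())
assert list(filtered.features.feature) == ['example_id']
```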

### Multi-model Inference

For multi-model inference, the component will expect a union channel of models as input.
[RunInference](https://github.com/tensorflow/tfx/blob/master/tfx/components/bulk_inferrer/executor.py#L253) will be performed using the [RunInferencePerModel](https://github.com/tensorflow/tfx-bsl/blob/master/tfx_bsl/public/beam/run_inference.py#L101) method from tfx-bsl.
This method returns a tuple of prediction logs instead of a single log.
In subsequent steps these logs are merged to produce a single tf.Example.
If raw inference results are expected, the component will save the prediction logs in `inference_result` subdirectories.

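A rough sketch of the executor change, assuming the tfx-bsl public Beam API and its `InferenceSpecType` proto (the model paths and the input file pattern are placeholders):

```python
import apache_beam as beam
import tensorflow as tf
from tfx_bsl.public.beam import run_inference
from tfx_bsl.public.proto import model_spec_pb2

# One inference spec per model; the position of each model's PredictionLog in
# the output tuples follows the ordering of these specs.
spec_a = model_spec_pb2.InferenceSpecType(
    saved_model_spec=model_spec_pb2.SavedModelSpec(model_path='/path/to/model_a'))
spec_b = model_spec_pb2.InferenceSpecType(
    saved_model_spec=model_spec_pb2.SavedModelSpec(model_path='/path/to/model_b'))

with beam.Pipeline() as pipeline:
  examples = (
      pipeline
      | 'ReadExamples' >> beam.io.ReadFromTFRecord('/path/to/examples*.gz')
      | 'ParseExamples' >> beam.Map(tf.train.Example.FromString))
  # Emits a PCollection of Tuple[PredictionLog, ...], one entry per model.
  prediction_log_tuples = (
      examples
      | 'RunInferencePerModel' >> run_inference.RunInferencePerModel(
          [spec_a, spec_b]))
```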
#### Changes to input protos

The `model_spec` and `output_example_spec` parameters expect `ModelSpec` and `OutputExampleSpec` protos respectively.
To support multiple models while keeping backward compatibility, self-referencing proto definitions can be used.

`model_spec`:

```protobuf
message ModelSpec {
  // Specifies the signature name to run the inference with. If multiple
  // signature names are specified (ordering doesn't matter), inference is done
  // as a multi-head model. If nothing is specified, the default serving
  // signature is used as a single-head model.
  repeated string model_signature_name = 2;

  // Tags to select the metagraph from the saved model. If unspecified, the
  // default tag selects a metagraph to run inference on CPUs. See some valid
  // values in tensorflow.saved_model.tag_constants.
  repeated string tag = 5;

  // Handles multiple ModelSpecs, one per model.
  repeated ModelSpec model_specs = 7;

  reserved 1, 3, 4, 6;
}
```

`output_example_spec`:

```protobuf
message OutputExampleSpec {
  // Defines how the inference results map to columns in the output example.
  repeated OutputColumnsSpec output_columns_spec = 3;

  // List of features to keep in the output examples.
  repeated string example_features = 5;

  // Handles multiple OutputExampleSpecs, one per model.
  repeated OutputExampleSpec output_example_specs = 6;

  reserved 1, 2, 4;
}
```

Parsing both protos requires additional validation to determine whether a single spec or multiple nested specs were provided.

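A minimal normalization sketch for the `model_spec` case; the `_normalize_model_specs` helper is hypothetical, and the nested `model_specs` field is the one proposed above:

```python
from typing import List

from tfx.proto import bulk_inferrer_pb2


def _normalize_model_specs(
    model_spec: bulk_inferrer_pb2.ModelSpec
) -> List[bulk_inferrer_pb2.ModelSpec]:
  """Returns one ModelSpec per model, validating single vs. multi usage."""
  if model_spec.model_specs:
    # Multi-model case: top-level fields and nested specs are mutually
    # exclusive, so reject specs that set both.
    if model_spec.model_signature_name or model_spec.tag:
      raise ValueError('Top-level ModelSpec fields cannot be combined with '
                       'nested model_specs: %s' % model_spec)
    return list(model_spec.model_specs)
  # Single-model (backward-compatible) case.
  return [model_spec]
```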
#### Changes to input channels

The `model` and `model_blessing` parameters can be either of type [BaseChannel](https://github.com/tensorflow/tfx/blob/master/tfx/types/channel.py#L51) or [UnionChannel](https://github.com/tensorflow/tfx/blob/master/tfx/types/channel.py#L363).
If a BaseChannel is passed as input, the component will convert it to a single-item UnionChannel before invoking the executor.

```python
if model and not isinstance(model, types.channel.UnionChannel):
  model = types.channel.union([model])
if model_blessing and not isinstance(model_blessing, types.channel.UnionChannel):
  model_blessing = types.channel.union([model_blessing])
```
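
For example, a user could pass multiple models through a union channel. This is a hypothetical pipeline snippet using the `types.channel.union` helper from the conversion above; the upstream node names are assumptions:

```python
from tfx import types
from tfx.components import BulkInferrer

# `example_gen`, `trainer_a`, `trainer_b`, `evaluator_a`, and `evaluator_b`
# are assumed upstream pipeline nodes.
bulk_inferrer = BulkInferrer(
    examples=example_gen.outputs['examples'],
    model=types.channel.union(
        [trainer_a.outputs['model'], trainer_b.outputs['model']]),
    model_blessing=types.channel.union(
        [evaluator_a.outputs['blessing'], evaluator_b.outputs['blessing']]))
```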

If any of the models is not blessed, the executor will return without performing inference.

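A sketch of that executor-side check, assuming the blessing artifacts resolve to a list (one per model) and reusing TFX's existing `model_utils.is_model_blessed` helper; `_all_blessed` is a hypothetical name:

```python
from tfx.components.util import model_utils


def _all_blessed(model_blessings) -> bool:
  """Returns True iff every resolved ModelBlessing artifact is blessed."""
  return all(
      model_utils.is_model_blessed(blessing) for blessing in model_blessings)

# Inside Executor.Do(): skip inference unless every model is blessed, e.g.
#   if not _all_blessed(model_blessings):
#     logging.info('Skipping inference: not all models are blessed.')
#     return
```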
#### Changes to the Beam pipeline that writes `inference_result`

If raw inference results are expected, the component will save each model's prediction logs in a separate `inference_result` subdirectory:

```python
if inference_result:
  data = (
      data_list
      | 'FlattenInferenceResult' >> beam.Flatten(pipeline=pipeline))
  for i in range(len(inference_endpoints)):
    _ = (
        data
        # Bind i at definition time to avoid the late-binding closure pitfall.
        | 'SelectPredictionLog[{}]'.format(i) >> beam.Map(
            lambda x, idx=i: x[idx])
        | 'WritePredictionLogs[{}]'.format(i) >> beam.io.WriteToTFRecord(
            os.path.join(inference_result.uri, str(i),
                         _PREDICTION_LOGS_FILE_NAME),
            file_name_suffix='.gz',
            coder=beam.coders.ProtoCoder(prediction_log_pb2.PredictionLog)))
```
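
Assuming `_PREDICTION_LOGS_FILE_NAME` keeps its current value of `'prediction_logs'`, model `i`'s output would land under `<inference_result.uri>/<i>/prediction_logs-*.gz`, with the subdirectory index matching the model's position in the input list.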

#### Changes to the prediction-to-example `convert` function

When there are multiple prediction logs, the original features are extracted from the first one (all models run over the same input examples, so any log carries the same original features):
```python
def convert(prediction_logs: Tuple[prediction_log_pb2.PredictionLog, ...],
            output_example_spec: _OutputExampleSpecType) -> tf.train.Example:
  """Converts the given prediction logs to a `tf.train.Example`.

  Args:
    prediction_logs: The input prediction logs, one per model.
    output_example_spec: The spec for how to map prediction results to columns
      in the example.

  Returns:
    A `tf.train.Example` converted from the given prediction logs.

  Raises:
    ValueError: If the inference type or signature name in the spec does not
      match that in the prediction logs, or if the spec is invalid.
  """
  is_single_output_example_spec = bool(output_example_spec.output_columns_spec)
  is_multiple_output_example_spec = bool(output_example_spec.output_example_specs)

  if is_single_output_example_spec and not is_multiple_output_example_spec:
    specs = [output_example_spec]
  elif is_multiple_output_example_spec and not is_single_output_example_spec:
    specs = output_example_spec.output_example_specs
    if len(prediction_logs) != len(specs):
      raise ValueError('inference result / spec length mismatch '
                       'output_example_spec: %s' % output_example_spec)
  else:
    # Neither or both of the single and multiple forms were set.
    raise ValueError('Invalid output_example_spec: %s' % output_example_spec)

  example = _parse_examples(prediction_logs[0])
  output_features = [
      _parse_output_feature(prediction_log, example_spec.output_columns_spec)
      for prediction_log, example_spec in zip(prediction_logs, specs)
  ]

  if output_example_spec.example_features:
    example = _filter_columns(example, output_example_spec)

  return _add_columns(example, output_features)
```
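
Illustratively (hypothetical column names): for two models whose `output_columns_spec` entries name the columns `score_a` and `score_b`, the merged output example contains the kept input features (e.g. `example_id`) plus both `score_a` and `score_b`.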

### Alternatives Considered

### Performance Implications

Neutral.

### Dependencies

No new dependencies are introduced.

### Engineering Impact

### Platforms and Environments

No special considerations across different platforms and environments.

### Best Practices

No change in best practices.

### Tutorials and Examples

API docs will be updated.

### Compatibility

The proto and input changes are backward compatible.

### User Impact

## Questions and Discussion Topics

* Is it okay to use self-referencing proto definitions for backward compatibility?
