- N/A
- Depends on `tensorflow 2.17`
- Depends on `protobuf>=4.25.2,<6.0.0` for Python 3.11 and on `protobuf>4.21.6,<6.0.0` for 3.9 and 3.10.
- Depends on `apache-beam[gcp]>=2.53.0,<3` for Python 3.11 and on `apache-beam[gcp]>=2.50.0,<2.51.0` for 3.9 and 3.10.
- macOS wheel publishing is temporarily paused due to missing ARM64 support.
- N/A
- N/A
- N/A
- Depends on `tensorflow 2.16`
- Relax dependency on Protobuf to include version 5.x
- N/A
- N/A
- Added support for sparse labels in AMI vocabulary computation.
- Bumped the Ubuntu version on which `tensorflow_transform` is tested to 20.04 (previously was 16.04).
- Explicitly use Keras 2 or `tf_keras` if Keras 3 is installed.
- Added python 3.11 support.
- Depends on `tensorflow 2.15`.
- Enable passing `tf.saved_model.SaveOptions` to model saving functionality.
- Census and sentiment examples updated to only use Keras instead of estimator.
- Depends on `apache-beam[gcp]>=2.53.0,<3` for Python 3.11 and on `apache-beam[gcp]>=2.47.0,<3` for 3.9 and 3.10.
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>3.20.3,<5` for 3.9 and 3.10.
- Existing analyzer cache is automatically invalidated.
- Deprecated python 3.8 support.
- Adds a `reserved_tokens` parameter to vocabulary APIs, a list of tokens that must appear in the vocabulary and maintain their order at the beginning of the vocabulary.
- `approximate_vocabulary` now returns tokens with the same frequency in reverse lexicographical order (similarly to `tft.vocabulary`).
- Transformed data batches are now sliced into smaller chunks if their size exceeds 200MB.
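
The ordering rules described above (reserved tokens first, then tokens by descending frequency with ties broken in reverse lexicographical order) can be sketched in plain Python. This is only an illustration of the ordering, not the library's implementation; the helper name `build_vocabulary` is hypothetical, and dropping a reserved token from the frequency-ranked tail when it also occurs in the data is an assumption.

```python
from collections import Counter


def build_vocabulary(tokens, reserved_tokens=()):
    """Hypothetical helper: order a vocabulary the way the notes describe."""
    counts = Counter(tokens)
    # Sorting (count, token) pairs in reverse puts higher counts first and,
    # among equal counts, lexicographically larger tokens first.
    ranked = [
        token
        for token, _ in sorted(
            counts.items(), key=lambda kv: (kv[1], kv[0]), reverse=True
        )
    ]
    reserved = list(reserved_tokens)
    # Assumption: a reserved token keeps its reserved slot and is not repeated.
    return reserved + [token for token in ranked if token not in reserved]


vocab = build_vocabulary(
    ["b", "a", "c", "a", "b", "d"], reserved_tokens=["<pad>", "<unk>"]
)
print(vocab)
```

Here `a` and `b` both occur twice, so `b` is listed before `a`; likewise `d` precedes `c`.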
- Depends on `pyarrow>=10,<11`.
- Depends on `apache-beam>=2.47,<3`.
- Depends on `numpy>=1.22.0`.
- Depends on `tensorflow>=2.13.0,<3`.
- Vocabulary related APIs now require passing non-positional parameters by key.
- N/A
- `RaggedTensor`s can now be automatically inferred for variable length features by setting `represent_variable_length_as_ragged=true` in TFMD schema.
- New experimental APIs added for annotating sparse output tensors: `tft.experimental.annotate_sparse_output_shape` and `tft.experimental.annotate_true_sparse_output`.
- `DatasetKey.non_cacheable` added to allow for some datasets to not produce cache. This may be useful for gradual cache generation when operating on a large rolling range of datasets.
- Vocabularies produced by `compute_and_apply_vocabulary` can now store frequencies. Controlled by the `store_frequency` parameter.
- Depends on `numpy~=1.22.0`.
- Depends on `tensorflow>=2.12.0,<2.13`.
- Depends on `protobuf>=3.20.3,<5`.
- Depends on `tensorflow-metadata>=1.13.1,<1.14.0`.
- Depends on `tfx-bsl>=1.13.0,<1.14.0`.
- Modifies `get_vocabulary_size_by_name` to return a minimum of 1.
- N/A
- Deprecated python 3.7 support.
- N/A
- Depends on `tensorflow>=2.11,<2.12`
- Depends on `tensorflow-metadata>=1.12.0,<1.13.0`.
- Depends on `tfx-bsl>=1.12.0,<1.13.0`.
- N/A
- N/A
- This is the last version that supports TensorFlow 1.15.x. TF 1.15.x support will be removed in the next version. Please check the TF2 migration guide to migrate to TF2.
- Introduced `tft.experimental.document_frequency` and `tft.experimental.idf` which map each term to its document frequency and inverse document frequency in the same order as the terms in documents.
- `schema_utils.schema_as_feature_spec` now supports struct features as a way to describe `tf.SequenceExample` data.
- TensorRepresentations in schema used for `schema_utils.schema_as_feature_spec` can now share name with their source features.
- Introduced `tft_beam.EncodeTransformedDataset` which can be used to easily encode transformed data in preparation for materialization.
- Depends on `tensorflow>=1.15.5,<2` or `tensorflow>=2.10,<2.11`
- Depends on `apache-beam[gcp]>=2.41,<3`.
- N/A
- N/A
- N/A
- Assign different `close_to_resources` resource hints to both original and cloned PTransforms in deep copy optimization. These resource hints prevent root Reads that are generated from deep copy from being merged due to common subexpression elimination.
- Depends on `apache-beam[gcp]>=2.40,<3`.
- Depends on `pyarrow>=6,<7`.
- Depends on `tensorflow-metadata>=1.10.0,<1.11.0`.
- Depends on `tfx-bsl>=1.10.0,<1.11.0`.
- N/A
- N/A
- Adds element-wise scaling support to `scale_by_min_max_per_key`, `scale_to_0_1_per_key` and `scale_to_z_score_per_key` for `key_vocabulary_filename = None`.
- Depends on `tensorflow>=1.15.5,<2` or `tensorflow>=2.9,<2.10`
- Depends on `tensorflow-metadata>=1.9.0,<1.10.0`.
- Depends on `tfx-bsl>=1.9.0,<1.10.0`.
- N/A
- N/A
- Adds `tft.DatasetMetadata` and its factory method `from_feature_spec` as public APIs to be used when using the "instance dict" data format.
- Depends on `apache-beam[gcp]>=2.38,<3`.
- Depends on `tensorflow-metadata>=1.8.0,<1.9.0`.
- Depends on `tfx-bsl>=1.8.0,<1.9.0`.
- N/A
- N/A
- Introduced `tft.experimental.compute_and_apply_approximate_vocabulary` which computes and applies an approximate vocabulary.
- Fix an issue where `tft.experimental.approximate_vocabulary` with `text` output format would not filter out tokens with newline characters.
- Add a dummy value to the result of `tft.experimental.approximate_vocabulary` as is done for the exact variant, in order for downstream code to easily handle it.
- Update `tft.get_analyze_input_columns` to ensure its output includes `preprocessing_fn` inputs which are not used in any TFT analyzers, but end up in a control dependency (automatic control dependencies are not present in TF1, hence this change will only affect the native TF2 implementation).
- Assign different resource hint tags to both original and cloned PTransforms in deep copy optimization. These tags prevent root Reads that are generated from deep copy from being merged due to common subexpression elimination.
- Fixed an issue where large int64 values would be incorrectly bucketized in `tft.apply_buckets`.
- Depends on `apache-beam[gcp]>=2.36,<3`.
- Depends on `tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<2.9`.
- Depends on `tensorflow-metadata>=1.7.0,<1.8.0`.
- Depends on `tfx-bsl>=1.7.0,<1.8.0`.
- N/A
- N/A
- N/A
- Depends on `tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<2.9`.
- N/A
- N/A
- Introduced `tft.experimental.get_vocabulary_size_by_name` that can retrieve the size of a vocabulary computed using `tft.vocabulary` within the `preprocessing_fn`.
- `tft.experimental.ptransform_analyzer` now supports analyzer cache using the newly added `tft.experimental.CacheablePTransformAnalyzer` container.
- `tft.bucketize_per_key` now supports weights.
- Depends on `numpy>=1.16,<2`.
- Depends on `apache-beam[gcp]>=2.35,<3`.
- Depends on `absl-py>=0.9,<2.0.0`.
- Depends on `tensorflow-metadata>=1.6.0,<1.7.0`.
- Depends on `tfx-bsl>=1.6.0,<1.7.0`.
- Depends on `tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<2.8`.
- N/A
- N/A
- Introduced `tft.experimental.approximate_vocabulary` analyzer that is an approximate version of `tft.vocabulary` which is more efficient with a smaller number of unique elements or `top_k` threshold.
- Raise a RuntimeError if order of analyzers in traced Tensorflow Graph is non-deterministic in TF2.
- Fix issue where a `tft.experimental.ptransform_analyzer`'s output dtype could be propagated incorrectly if it was a primitive as opposed to `np.ndarray`.
- Depends on `apache-beam[gcp]>=2.34,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<2.8`.
- Depends on `tensorflow-metadata>=1.5.0,<1.6.0`.
- Depends on `tfx-bsl>=1.5.0,<1.6.0`.
- N/A
- N/A
- N/A
- Depends on `future` package.
- N/A
- N/A
- Added `tf.RaggedTensor` support to all analyzers and mappers with `reduce_instance_dims=True`.
- Fix re-loading a transform graph containing pyfuncs exported as a TF1 `SavedModel` (added using `tft.apply_pyfunc`) in TF2.
- Depends on `pyarrow>=1,<6`.
- Depends on `tensorflow-metadata>=1.4.0,<1.5.0`.
- Depends on `tfx-bsl>=1.4.0,<1.5.0`.
- Depends on `apache-beam[gcp]>=2.33,<3`.
- N/A
- Deprecated python 3.6 support.
- N/A
- `tft.quantiles`, `tft.mean` and `tft.var` now ignore NaNs and infinite input values. Previously, these would lead to incorrect output calculation.
- Improved error message for `tft_beam.AnalyzeDataset`, `tft_beam.AnalyzeAndTransformDataset` and `tft_beam.AnalyzeDatasetWithCache` when the input metadata is empty.
- Added best-effort TensorFlow Decision Forests (TF-DF) and Struct2Tensor op registration when loading transformation graphs.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,<2.7`.
- Depends on `tfx-bsl>=1.3.0,<1.4.0`.
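
To illustrate why skipping NaN and infinite inputs matters for the mean and variance change noted above, here is a minimal pure-Python sketch. It is not the analyzer implementation: the helper name `finite_mean_and_var` is hypothetical, population variance is assumed, and returning zeros when no finite value remains is also an assumption.

```python
import math


def finite_mean_and_var(values):
    """Hypothetical helper: mean and population variance over finite values."""
    finite = [v for v in values if math.isfinite(v)]  # drops NaN and +/-inf
    if not finite:
        return 0.0, 0.0  # assumption: neutral defaults when nothing is left
    mean = sum(finite) / len(finite)
    var = sum((v - mean) ** 2 for v in finite) / len(finite)
    return mean, var


mean, var = finite_mean_and_var([1.0, float("nan"), 3.0, float("inf")])
print(mean, var)
```

Without the filter, a single NaN or infinity in the input would propagate into both statistics.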
- Existing `tft.mean` and `tft.var` caches are automatically invalidated.
- N/A
- Added `RaggedTensor` support to output schema inference and transformed tensors conversion to instance dicts and `pa.RecordBatch` with TF 2.x.
- Depends on `apache-beam[gcp]>=2.31,<3`.
- Depends on `tensorflow-metadata>=1.2.0,<1.3.0`.
- Depends on `tfx-bsl>=1.2.0,<1.3.0`.
- N/A
- N/A
- N/A
- Depends on `google-cloud-bigquery>=1.28.0,<2.21`.
- Depends on `tfx-bsl>=1.1.0,<1.2.0`.
- N/A
- N/A
- Improved resource usage for `tft.vocabulary` when `top_k` is set by removing stages performing repetitive sorting.
- Support invoking Keras models inside the `preprocessing_fn` using `tft.make_and_track_object` when `force_tf_compat_v1=False` with TF2 behaviors enabled.
- Fix an issue when computing the metadata for a function with automatic control dependencies added, where dependencies on inputs which should not be evaluated were being retained.
- Census TFT example: wrapped table initialization with a `tf.init_scope()` in order to avoid reinitializing the table for each batch of data.
- Stopped depending on `six`.
- Depends on `protobuf>=3.13,<4`.
- Depends on `tensorflow-metadata>=1.1.0,<1.2.0`.
- Depends on `tfx-bsl>=1.1.0,<1.2.0`.
- N/A
- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.29,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<2.6`.
- Depends on `tensorflow-metadata>=1.0.0,<1.1.0`.
- Depends on `tfx-bsl>=1.0.0,<1.1.0`.
- `tft.ptransform_analyzer` has been moved under `tft.experimental`. The order of args in the API has also been changed.
- `tft_beam.PTransformAnalyzer` has been moved under `tft_beam.experimental`.
- The default value of the `drop_unused_features` parameter to `TFTransformOutput.transform_raw_features` is now True.
- N/A
- N/A
- Removed the `dataset_schema` module, most methods in it have been deprecated since version 0.14.
- Fix a bug where having an analyzer operate on the output of `tft.vocabulary` would cause it to evaluate incorrectly when `force_tf_compat_v1=False` with TF2 behaviors enabled.
- Depends on `tensorflow-metadata>=0.30.0,<0.31.0`.
- Depends on `tfx-bsl>=0.30.0,<0.31.0`.
- `DatasetMetadata` no longer accepts a dict as its input schema. `schema` is expected to be a `Schema` proto now.
- TF 1.15 specific APIs `apply_saved_model` and `apply_function_with_checkpoint` were removed from the `tft` namespace. They are still available under the `pretrained_models` module.
- `tft.AnalyzeDataset`, `tft.AnalyzeDatasetWithCache`, `tft.AnalyzeAndTransformDataset` and `tft.TransformDataset` will use the native TF2 implementation of tf.transform unless TF2 behaviors are explicitly disabled. The previous behaviour can still be obtained by setting `tft.Context.force_tf_compat_v1=True`.
- N/A
- `tft.AnalyzeAndTransformDataset` and `tft.TransformDataset` can now output `pyarrow.RecordBatch`es. This is controlled by a parameter `output_record_batches` which is set to `False` by default.
- Added `tft.make_and_track_object` to load and track `tf.Trackable` objects created inside the `preprocessing_fn` (for example, tf.hub models). This API should only be used when `force_tf_compat_v1=False` and TF2 behavior is enabled.
- The `decode` method of the available coders (`tft.coders.CsvCoder` and `tft.coders.ExampleProtoCoder`) has been removed. It was deprecated in the 0.25 release. Canned TFXIO implementations should be used to read and decode data instead.
- Previously deprecated APIs were removed: `tft.uniques` (replaced by `tft.vocabulary`), `tft.string_to_int` (replaced by `tft.compute_and_apply_vocabulary`), `tft.apply_vocab` (replaced by `tft.apply_vocabulary`), and `tft.apply_function` (identity function).
- Removed the `always_return_num_quantiles` arg of `tft.quantiles` and `tft.bucketize` which was deprecated in version 0.26.
- Added support for `count_params` method to the `TransformFeaturesLayer`. This allows calling a Keras Model's `summary()` method if the model is using the `TransformFeaturesLayer`.
- Depends on `absl-py>=0.9,<0.13`.
- Depends on `tensorflow-metadata>=0.29.0,<0.30.0`.
- Depends on `tfx-bsl>=0.29.0,<0.30.0`.
- Existing caches (for all analyzers) are automatically invalidated.
- N/A
- Large vocabularies are now computed faster due to partially parallelizing `VocabularyOrderAndWrite`.
- Generic `tf.SparseTensor` input support has been added to `tft.scale_to_0_1`, `tft.scale_to_z_score`, `tft.scale_by_min_max`, `tft.min`, `tft.max`, `tft.mean`, `tft.var`, `tft.sum`, `tft.size` and `tft.word_count`.
- Optimize SavedModel written out by `tf.Transform` when using native TF2 to speed up loading it.
- Added `tft_beam.PTransformAnalyzer` as a base PTransform class for `tft.ptransform_analyzer` users who wish to have access to a base temporary directory.
- Fix an issue where >2D `SparseTensor`s may be incorrectly represented in instance_dicts format.
- Added support for out-of-vocabulary keys for per_key mappers.
- Added `tft.get_num_buckets_for_transformed_feature` which provides the number of buckets for a transformed feature if it is a direct output of `tft.bucketize`, `tft.apply_buckets`, `tft.compute_and_apply_vocabulary` or `tft.apply_vocabulary`.
- Depends on `apache-beam[gcp]>=2.28,<3`.
- Depends on `numpy>=1.16,<1.20`.
- Depends on `tensorflow-metadata>=0.28.0,<0.29.0`.
- Depends on `tfx-bsl>=0.28.1,<0.29.0`.
- Autograph is disabled when the preprocessing fn is traced using tf.function when `force_tf_compat_v1=False` and TF2 behavior is enabled.
- Added `QuantilesCombiner.compact` method that moves some amount of work done by `tft.quantiles` from non-parallelizable to parallelizable stage of the computation.
- Strip only newlines instead of all whitespace in the TFTransformOutput vocabulary_by_name method.
- Switch analyzers that output asset files to return an eager tensor containing the asset file path instead of a tf.saved_model.Asset object when `force_tf_compat_v1=False`. If this file is then used to initialize a table, this ensures the input to the `tf.lookup.TextFileInitializer` is the file path as the initializer handles wrapping this in a `tf.saved_model.Asset` object.
- Added `tft.annotate_asset` for annotating asset files with a string key that can be used to retrieve them in `tft.TFTransformOutput`.
- Depends on `apache-beam[gcp]>=2.27,<3`.
- Depends on `pyarrow>=1,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<2.5`.
- Depends on `tensorflow-metadata>=0.27.0,<0.28.0`.
- Depends on `tfx-bsl>=0.27.0,<0.28.0`.
- N/A
- Parameter `use_tfxio` in the initializer of `Context` is removed (it was deprecated in 0.24.0).
- Initial support added of >2D `SparseTensor`s as inputs and outputs of the `preprocessing_fn`. Note that mappers and analyzers may not support those yet, and output >2D `SparseTensor`s will have an unknown dense shape.
- Switched to calling tables and initializers within `tf.init_scope` when the `preprocessing_fn` is traced using `tf.function` to avoid re-initializing them on every invocation of the traced `tf.function`.
- Switched to a (notably) faster and more accurate implementation of `tft.quantiles` analyzer.
- Fix an issue where graphs become non-hermetic if a TF2 transform_fn is loaded in a TF1 Graph context, by making sure all assets are added to the `ASSET_FILEPATHS` collection.
- Depends on `apache-beam[gcp]>=2.25,!=2.26.*,<3`.
- Depends on `pyarrow>=0.17,<0.18`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,<2.4`.
- Depends on `tensorflow-metadata>=0.26.0,<0.27.0`.
- Depends on `tfx-bsl>=0.26.0,<0.27.0`.
- Existing `tft.quantiles`, `tft.min` and `tft.max` caches are invalidated.
- Parameter `always_return_num_quantiles` of `tft.quantiles` and `tft.bucketize` is now deprecated. Both now always generate the requested number of buckets. Setting `always_return_num_quantiles` will have no effect and it will be removed in the next version.
- Updated the "Getting Started" guide and examples to demonstrate the support for both the "instance dict" and the "TFXIO" format. Users are encouraged to start using the "TFXIO" format, especially in cases where pre-canned TFXIO implementations are available, as it offers better performance.

- From this release TFT will also be hosting nightly packages on https://pypi-nightly.tensorflow.org. To install the nightly package use the following command:

  `pip install --extra-index-url https://pypi-nightly.tensorflow.org/simple tensorflow-transform`

  Note: These nightly packages are unstable and breakages are likely to happen. The fix could often take a week or more depending on the complexity involved for the wheels to be available on the PyPI cloud service. You can always use the stable version of TFT available on PyPI by running the command `pip install tensorflow-transform`.

- `TFTransformOutput.transform_raw_features` and `TransformFeaturesLayer` can be used when a transform fn is exported as a TF2 SavedModel and imported in graph mode.
- Utility methods in `tft.inspect_preprocessing_fn` now take an optional parameter `force_tf_compat_v1`. If this is False, the `preprocessing_fn` is traced using tf.function in TF 2.x when TF 2 behaviors are enabled.
- Switching to a wrapper for `collections.namedtuple` to ensure compatibility with PySpark which modifies classes produced by the factory.
- Caching has been disabled for `tft.tukey_h_params`, `tft.tukey_location` and `tft.tukey_scale` due to the cached accumulator being non-deterministic.
- Track variables created within the `preprocessing_fn` in the native TF 2 implementation.
- `TFTransformOutput.transform_raw_features` returns a wrapped python dict that overrides pop to return None instead of raising a KeyError when called with a key not found in the dictionary. This is done as preparation for switching the default value of `drop_unused_features` to True.
- Vocabularies written in `tfrecord_gzip` format no longer filter out entries that are empty or that include a newline character.
- Depends on `apache-beam[gcp]>=2.25,<3`.
- Depends on `tensorflow-metadata>=0.25,<0.26`.
- Depends on `tfx-bsl>=0.25,<0.26`.
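
The `pop` behavior described above for the dict returned by `TFTransformOutput.transform_raw_features` can be pictured with a small stand-alone sketch. The class name `PopNoneDict` is hypothetical and this is not the library's wrapper; it only shows how a dict subclass can return None for a missing key instead of raising.

```python
class PopNoneDict(dict):
    """Hypothetical dict whose pop() tolerates missing keys."""

    def pop(self, key, default=None):
        # dict.pop raises KeyError only when no default is supplied,
        # so always supplying one makes a missing key return None.
        return super().pop(key, default)


features = PopNoneDict({"age": 42})
missing = features.pop("unused_feature")  # key absent: returns None
present = features.pop("age")             # key present: returns its value
print(missing, present)
```

Code that pops features which may later be dropped keeps working either way, which is the stated motivation for the change.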
- N/A
- The `decode` method of the available coders (`tft.coders.CsvCoder` and `tft.coders.ExampleProtoCoder`) has been deprecated and removed. Canned TFXIO implementations should be used to read and decode data instead.
- N/A
- Depends on `apache-beam[gcp]>=2.24,<3`.
- Depends on `tfx-bsl>=0.24.1,<0.25`.
- N/A
- N/A
- Added native TF 2 implementation of Transform's Beam APIs: `tft.AnalyzeDataset`, `tft.AnalyzeDatasetWithCache`, `tft.AnalyzeAndTransformDataset` and `tft.TransformDataset`. The default behavior will continue to use Tensorflow's compat.v1 APIs. This can be overridden by setting `tft.Context.force_tf_compat_v1=False`. The default behavior for TF 2 users will be switched to the new native implementation in a future release.
- Added a small fanout to analyzers' `CombineGlobally` for improved performance.
- `TransformFeaturesLayer` can be called after being saved as an attribute to a Keras Model, even if the layer isn't used in the Model.
- Depends on `absl-py>=0.9,<0.11`.
- Depends on `protobuf>=3.9.2,<4`.
- Depends on `tensorflow-metadata>=0.24,<0.25`.
- Depends on `tfx-bsl>=0.24,<0.25`.
- N/A
- Deprecating Py3.5 support.
- Parameter `use_tfxio` in the initializer of `Context` is deprecated. TFT Beam APIs now accept both "instance dicts" and "TFXIO" input formats. Setting it will have no effect and it will be removed in the next version.
- Added `tft.scale_to_gaussian` to transform input to standard gaussian.
- Vocabulary related analyzers and mappers now accept a `file_format` argument allowing the vocabulary to be saved in TFRecord format. The default format remains text (TFRecord format requires tensorflow>=2.4).
- Enable `SavedModelLoader` to import and apply TF2 SavedModels.
- `tft.min`, `tft.max`, `tft.sum`, `tft.covariance` and `tft.pca` now have default output values to properly process empty analysis datasets.
- `tft.scale_by_min_max`, `tft.scale_to_0_1` and the corresponding per-key versions now apply a sigmoid function to scale tensors if the analysis dataset is either empty or contains a single distinct value.
- Added best-effort tf.text op registration when loading transformation graphs.
- Vocabularies computed over numerical features will now assign values to entries with equal frequency in reverse lexicographical order as well, similarly to string features.
- Fixed an issue that causes the `TABLE_INITIALIZERS` graph collection to contain a tensor instead of an op when a TF2 SavedModel or a TF2 Hub Module containing a table is loaded inside the `preprocessing_fn`.
- Fixes an issue where the output tensors of `tft.TransformFeaturesLayer` would all have unknown shapes.
- Stopped depending on `avro-python3`.
- Depends on `apache-beam[gcp]>=2.23,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,<2.4`.
- Depends on `tensorflow-metadata>=0.23,<0.24`.
- Depends on `tfx-bsl>=0.23,<0.24`.
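
The sigmoid fallback noted above for `tft.scale_to_0_1` can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `scale_to_0_1_sketch` is hypothetical, and applying the plain logistic function directly to the input is an assumption, since the notes do not give the exact form the library uses.

```python
import math


def scale_to_0_1_sketch(values, analysis_values):
    """Hypothetical sketch: min-max scaling with a sigmoid fallback."""
    if not analysis_values or min(analysis_values) == max(analysis_values):
        # Degenerate analysis data (empty, or one distinct value): min-max
        # scaling would divide by zero, so squash with a sigmoid instead.
        return [1.0 / (1.0 + math.exp(-v)) for v in values]
    lo, hi = min(analysis_values), max(analysis_values)
    return [(v - lo) / (hi - lo) for v in values]


regular = scale_to_0_1_sketch([0.0, 5.0, 10.0], analysis_values=[0.0, 10.0])
degenerate = scale_to_0_1_sketch([0.0], analysis_values=[3.0, 3.0])
empty = scale_to_0_1_sketch([0.0], analysis_values=[])
print(regular, degenerate, empty)
```

Either way the output stays within the unit interval, so downstream code does not need a special case for degenerate analysis data.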
- Existing caches (for all analyzers) are automatically invalidated.
- Deprecating Py2 support.
- Note: We plan to remove Python 3.5 support after this release.
- `tft.bucketize_per_key` no longer assumes that the keys during transformation existed in the analysis dataset. If a key is missing then the assigned bucket will be -1.
- `tft.estimated_probability_density`, when `categorical=True`, no longer assumes that the values during transformation existed in the analysis dataset, and will assume 0 density in that case.
- Switched analyzer cache representation of dataset keys from using a primitive str to a DatasetKey class.
- `tft_beam.analyzer_cache.ReadAnalysisCacheFromFS` can now filter cache entry keys when given a `cache_entry_keys` parameter. `cache_entry_keys` can be produced by utilizing `get_analysis_cache_entry_keys`.
- Reduced number of shuffles via packing multiple combine merges into a single Beam combiner.
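
The missing-key behavior of `tft.bucketize_per_key` described above can be illustrated with a small pure-Python sketch. The function name, the `boundaries_by_key` mapping, and the boundary convention (a value equal to a boundary falls into the higher bucket) are assumptions made for illustration, not the library's implementation.

```python
import bisect


def bucketize_per_key_sketch(value, key, boundaries_by_key):
    """Hypothetical sketch: per-key bucket lookup with a -1 sentinel."""
    boundaries = boundaries_by_key.get(key)
    if boundaries is None:
        return -1  # key was not seen in the analysis dataset
    # Number of boundaries that are <= value gives the bucket index.
    return bisect.bisect_right(boundaries, value)


boundaries_by_key = {"a": [10.0, 20.0]}  # learned during analysis (assumed)
known = bucketize_per_key_sketch(15.0, "a", boundaries_by_key)
unknown = bucketize_per_key_sketch(15.0, "z", boundaries_by_key)
print(known, unknown)
```

Downstream consumers of the bucket index therefore need to handle -1 explicitly, for example when the index feeds an embedding lookup.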
- Switch `tft.TransformFeaturesLayer` to use the TF 2 `tf.saved_model.load` API to load a previously exported SavedModel.
- Adds `tft.sparse_tensor_left_align` as a utility which aligns `tf.SparseTensor`s to the left.
- Depends on `avro-python3>=1.8.1,!=1.9.2.*,<2.0.0` for Python3.5 + MacOS.
- Depends on `apache-beam[gcp]>=2.20.0,<3`.
- Depends on `tensorflow>=1.15,!=2.0.*,<2.3`.
- Depends on `tensorflow-metadata>=0.22.0,<0.23.0`.
- Depends on `tfx-bsl>=0.22.0,<0.23.0`.
- `tft.AnalyzeDatasetWithCache` no longer accepts a flat pcollection as an input. Instead it will flatten the datasets in the `input_values_pcoll_dict` input if needed.
- `tft.TransformFeaturesLayer` no longer takes a parameter `drop_unused_features`. Its default behavior is now equivalent to having set `drop_unused_features` to `True`.
- Expanded capability for per-key analyzers to analyze larger sets of keys that would not fit in memory, by storing the key-value pairs in vocabulary files. This is enabled by passing a `per_key_filename` to `tft.count_per_key` and `tft.scale_to_z_score_per_key`.
- Added `tft.TransformFeaturesLayer` and `tft.TFTransformOutput.transform_features_layers` to allow transforming features for a TensorFlow Keras model.
- `tft.apply_buckets_with_interpolation` now handles NaN values by imputing with the middle of the normalized range.
- Depends on `tfx-bsl>=0.21.3,<0.22`.
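
The NaN handling noted above for `tft.apply_buckets_with_interpolation` can be pictured with the following sketch. It is a simplified stand-in, not the library's code: the function name is hypothetical, the normalized range is assumed to be [0, 1] (so its middle is 0.5), boundaries are assumed to be strictly increasing, and the piecewise-linear interpolation shown is an assumption about how values map between boundaries.

```python
import bisect
import math


def interpolate_buckets_sketch(x, boundaries):
    """Hypothetical sketch: map x to [0, 1] by linear interpolation."""
    if math.isnan(x):
        return 0.5  # middle of the assumed [0, 1] normalized range
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    # Index of the boundary interval containing x.
    i = bisect.bisect_right(boundaries, x) - 1
    lo, hi = boundaries[i], boundaries[i + 1]
    # Each interval covers an equal share of the output range.
    return (i + (x - lo) / (hi - lo)) / (len(boundaries) - 1)


boundaries = [0.0, 10.0, 20.0]
results = [
    interpolate_buckets_sketch(v, boundaries) for v in (5.0, 15.0, float("nan"))
]
print(results)
```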
- Added a new version of the census example to demonstrate usage in TF 2.0.
- New mapper `estimated_probability_density` to compute either exact probabilities (for discrete categorical variable) or approximate density over fixed intervals (continuous variables).
- New analyzers `count_per_key` and `histogram` to return counts of unique elements or values within predefined ranges. Calling `tft.histogram` on non-categorical value will assign each data point to the appropriate fixed bucket and then count for each bucket.
- Provided capability for per-key analyzers to analyze larger sets of keys that would not fit in memory, by storing the key-value pairs in vocabulary files. This is enabled by passing a `per_key_filename` to `tft.scale_by_min_max_per_key` and `tft.scale_to_0_1_per_key`.
- Added beam counters to log analyzer and mapper usage.
- Cleanup deprecated APIs used in census and sentiment examples.
- Support windows style paths in `analyzer_cache`.
- `tft_beam.WriteTransformFn` and `tft_beam.WriteMetadata` have been made idempotent to allow retrying them in case of a failure.
- `tft_beam.WriteMetadata` takes an optional argument `write_to_unique_subdir` and returns the path to which metadata was written. If `write_to_unique_subdir` is True, metadata is written to a unique subdirectory under `path`, otherwise it is written to `path`.
- Support non utf-8 characters when reading vocabularies in `tft.TFTransformOutput`
- `tft.TFTransformOutput.vocabulary_by_name` now returns bytes instead of str with python 3.
- This release introduces initial beta support for TF 2.0. TF 2.0 programs running in "safety" mode (i.e. using TF 1.X APIs through the `tensorflow.compat.v1` compatibility module) are expected to work. Newly written TF 2.0 programs may not work if they exercise functionality that is not yet supported. If you do encounter an issue when using `tensorflow-transform` with TF 2.0, please create an issue https://github.com/tensorflow/transform/issues with instructions on how to reproduce it.
- Performance improvements for `preprocessing_fns` with many Quantiles analyzers.
- `tft.quantiles` and `tft.bucketize` are now using new TF core quantiles ops instead of contrib ops.
- Performance improvements due to packing multiple combine analyzers into a single Beam Combiner.
- Existing analyzer cache is invalidated.
- Saved transforms now support composite tensors (such as `tf.RaggedTensor`).
- Vocabulary's cache coder now supports non utf-8 encodable tokens.
- Fixes encoding of the `tft.covariance` accumulator cache.
- Fixes encoding per-key analyzers accumulator cache.
- Make various utility methods in `tft.inspect_preprocessing_fn` support `tf.RaggedTensor`.
- Moved beam/shared lib to `tfx-bsl`. If running with latest master, `tfx-bsl` must also be latest master.
- `preprocessing_fn`s now have beta support of calls to `tf.function`s, as long as they don't contain calls to `tf.Transform` analyzers/mappers or table initializers.
- `tft.quantiles` and `tft.bucketize` are now using core TF ops.
- Depends on `tfx-bsl>=0.15,<0.16`.
- Depends on `tensorflow-metadata>=0.15,<0.16`.
- Depends on `apache-beam[gcp]>=2.16,<3`.
- Depends on `tensorflow>=0.15,<2.2`.
  - Starting from 1.15, package `tensorflow` comes with GPU support. Users won't need to choose between `tensorflow` and `tensorflow-gpu`.
  - Caveat: `tensorflow` 2.0.0 is an exception and does not have GPU support. If `tensorflow-gpu` 2.0.0 is installed before installing `tensorflow-transform`, it will be replaced with `tensorflow` 2.0.0. Re-install `tensorflow-gpu` 2.0.0 if needed.
- `always_return_num_quantiles` changed to default to True in `tft.quantiles` and `tft.bucketize`, resulting in exact bucket count returned.
- Removes the `input_fn_maker` module which has been deprecated since TFT 0.11. For idiomatic construction of `input_fn`, see `tensorflow_transform` examples.
- New `tft.word_count` mapper to identify the number of tokens for each row (for pre-tokenized strings).
- All `tft.scale_to_*` mappers now have per-key variants, along with analyzers for `mean_and_var_per_key` and `min_and_max_per_key`.
- New `tft_beam.AnalyzeDatasetWithCache` allows analyzing ranges of data while producing and utilizing cache. `tft.analyzer_cache` can help read and write such cache to a filesystem between runs. This caching feature is worth using when analyzing a rolling range in a continuous pipeline manner. This is an experimental feature.
- Added `reduce_instance_dims` support to `tft.quantiles` and `elementwise` to `tft.bucketize`, while avoiding separate beam calls for each feature.
- `sparse_tensor_to_dense_with_shape` now accepts an optional `default_value` parameter.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now support `fingerprint_shuffle` to sort the vocabularies by fingerprint instead of counts. This is useful for load balancing the training parameter servers. This is an experimental feature.
- Fix numerical instability in `tft.vocabulary` mutual information calculations.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now support computing vocabularies over integer categoricals and multivalent input features, and computing mutual information for non-binary labels.
- New numeric normalization method available: `tft.apply_buckets_with_interpolation`.
- Changes to make this library more compatible with TensorFlow 2.0.
- Fix sanitizing of vocabulary filenames.
- Emit a friendly error message when context isn't set.
- Analyzer output dtypes are enforced to be TensorFlow dtypes, and by extension `ptransform_analyzer`'s `output_dtypes` is enforced to be a list of TensorFlow dtypes.
- Make `tft.apply_buckets_with_interpolation` support SparseTensors.
- Adds an experimental api for analyzers to annotate the post-transform schema.
- `TFTransformOutput.transform_raw_features` now accepts an optional `drop_unused_features` parameter to exclude unused features in output.
- If not specified, the min_diff_from_avg parameter of `tft.vocabulary` now defaults to a reasonable value based on the size of the dataset (relevant only if computing vocabularies using mutual information).
- Convert some `tf.contrib` functions to be compatible with TF2.0.
- New `tft.bag_of_words` mapper to compute the unique set of ngrams for each row (for pre-tokenized strings).
- Fixed a bug in `tf_utils.reduce_batch_count_mean_and_var` that caused it, and as a result the `mean_and_var` analyzer, to miscalculate variance for the sparse elementwise=True case.
- Added test utility `tft_unit.cross_named_parameters` for creating parameterized tests that involve the cartesian product of various parameters.
- Depends on `tensorflow-metadata>=0.14,<0.15`.
- Depends on `apache-beam[gcp]>=2.14,<3`.
- Depends on `numpy>=1.16,<2`.
- Depends on `absl-py>=0.7,<2`.
- Allow `preprocessing_fn` to emit a `tf.RaggedTensor`. In this case, the output `Schema` proto will not be able to be converted to a feature spec, and so the output data will not be able to be materialized with `tft.coders`.
- Ability to directly set exact `num_buckets` with new parameter `always_return_num_quantiles` for `analyzers.quantiles` and `mappers.bucketize`, defaulting to False in general but True when `reduce_instance_dims` is False.
- `tf_utils.reduce_batch_count_mean_and_var`, which feeds into `tft.mean_and_var`, now returns 0 instead of inf for empty columns of a sparse tensor.
- `tensorflow_transform.tf_metadata.dataset_schema.Schema` class is removed. Wherever a `dataset_schema.Schema` was used, users should now provide a `tensorflow_metadata.proto.v0.schema_pb2.Schema` proto. For backwards compatibility, `dataset_schema.Schema` is now a factory method that produces a `Schema` proto. Updating code should be straightforward because the `dataset_schema.Schema` class was already a wrapper around the `Schema` proto.
- Only explicitly public analyzers are exported to the `tft` module, e.g. combiners are no longer exported and have to be accessed directly through `tft.analyzers`.
- Requires pre-installed TensorFlow >=1.14,<2.
- `DatasetSchema` is now a deprecated factory method (see above).
- `tft.tf_metadata.dataset_schema.from_feature_spec` is now deprecated. Equivalent functionality is provided by `tft.tf_metadata.schema_utils.schema_from_feature_spec`.
- Now `AnalyzeDataset`, `TransformDataset` and `AnalyzeAndTransformDataset` can accept input data that only contains columns needed for that operation as opposed to all columns defined in schema. Utility methods to infer the list of needed columns are added to `tft.inspect_preprocessing_fn`. This makes it easier to take advantage of columnar projection when data is stored in columnar storage formats.
- Python 3.5 is supported.
- Version is now accessible as `tensorflow_transform.__version__`.
- Depends on `apache-beam[gcp]>=2.11,<3`.
- Depends on `protobuf>=3.7,<4`.
- Coders now return index and value features rather than a combined feature for `SparseFeature`.
- Requires pre-installed TensorFlow >=1.13,<2.
- Python 3.5 readiness complete (all tests pass). Full Python 3.5 compatibility is expected to be available with the next version of Transform (after Apache Beam 2.11 is released).
- Performance improvements for vocabulary generation when using top_k.
- New optimized highly experimental API for analyzing a dataset was added, `AnalyzeDatasetWithCache`, which allows reading and writing analyzer cache.
- Update `DatasetMetadata` to be a wrapper around the `tensorflow_metadata.proto.v0.schema_pb2.Schema` proto. TensorFlow Metadata will be the schema used to define data parsing across TFX. The serialized `DatasetMetadata` is now the `Schema` proto in ascii format, but the previous format can still be read.
- Change `ApplySavedModel` implementation to use `tf.Session.make_callable` instead of `tf.Session.run` for improved performance.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now support filtering based on adjusted mutual information when `use_adjusted_mutual_info` is set to True.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now take a regularization term `min_diff_from_avg` that adjusts mutual information to zero whenever the difference between the count of the feature with any label and its expected count is lower than the threshold.
- Added an option to `tft.vocabulary` and `tft.compute_and_apply_vocabulary` to compute a coverage vocabulary, using the new `coverage_top_k`, `coverage_frequency_threshold` and `key_fn` parameters.
- Added `tft.ptransform_analyzer` for advanced use cases.
- Modified `QuantilesCombiner` to use `tf.Session.make_callable` instead of `tf.Session.run` for improved performance.
- ExampleProtoCoder now also supports non-serialized Example representations.
- `tft.tfidf` now accepts a scalar Tensor as `vocab_size`.
- `assertItemsEqual` in unit tests are replaced by `assertCountEqual`.
- `NumPyCombiner` now outputs TF dtypes in output_tensor_infos instead of numpy dtypes.
- Adds function `tft.apply_pyfunc` that provides limited support for `tf.py_func`. Note that this is incompatible with serving. See documentation for more details.
- `CombinePerKey` now adds a dimension for the key.
- Depends on `numpy>=1.14.5,<2`.
- Depends on `apache-beam[gcp]>=2.10,<3`.
- Depends on `protobuf==3.7.0rc2`.
- `ExampleProtoCoder.encode` now converts a feature whose value is `None` to an empty value, where before it did not accept `None` as a valid value.
- `AnalyzeDataset`, `AnalyzeAndTransformDataset` and `TransformDataset` can now accept dictionaries which contain `None`, and which will be interpreted the same as an empty list. They will never produce an output containing `None`.
- `ColumnSchema` and related classes (`Domain`, `Axis` and `ColumnRepresentation` and their subclasses) have been removed. In order to create a schema, use `from_feature_spec`. In order to inspect a schema use the `as_feature_spec` and `domains` methods of `Schema`. The constructors of these classes are replaced by functions that still work when creating a `Schema` but this usage is deprecated.
- Requires pre-installed TensorFlow >=1.12,<2.
- `ExampleProtoCoder.decode` now converts a feature with empty value (e.g. `features { feature { key: "varlen" value { } } }`) or missing key for a feature (e.g. `features { }`) to a `None` in the output dictionary. Before it would represent these with an empty list. This better reflects the original example proto and is consistent with TensorFlow Data Validation.
- Coders now return a `list` instead of an `ndarray` for a `VarLenFeature`.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now support filtering based on mutual information when `labels` is provided.
- Export all package level exports of `tensorflow_transform`, from the `tensorflow_transform.beam` subpackage. This allows users to just import the `tensorflow_transform.beam` subpackage for all functionality.
- Adding API docs.
- Fix bug where Transform returned a different dtype for a VarLenFeature with 0 elements.
- Depends on `apache-beam[gcp]>=2.8,<3`.
- Requires pre-installed TensorFlow >=1.11,<2.
- All functions in `tensorflow_transform.saved.input_fn_maker` are deprecated. See the examples for how to construct the `input_fn` for training and serving. Note that the examples demonstrate the use of the `tf.estimator` API. The functions named `*_serving_input_fn` were for use with the `tf.contrib.estimator` API which is now deprecated. We do not provide examples of usage of the `tf.contrib.estimator` API, instead users should upgrade to the `tf.estimator` API.
- Performance improvements for vocabulary generation when using top_k.
- Utility to deep-copy Beam `PCollection`s was added to avoid unnecessary materialization.
- Utilize deep_copy to avoid unnecessary materialization of pcollections when the input data is immutable. This feature is currently off by default and can be enabled by setting `tft.Context.use_deep_copy_optimization=True`.
- Add bucketize_per_key which computes separate quantiles for each key and then bucketizes each value according to the quantiles computed for its key.
- `tft.scale_to_z_score` is now implemented with a single pass over the data.
- Export schema_utils package to convert from the `tensorflow-metadata` package to the (soon to be deprecated) `tf_metadata` subpackage of `tensorflow-transform`.
- Memory reduction during vocabulary generation.
- Clarify documentation on return values from `tft.compute_and_apply_vocabulary` and `tft.string_to_int`.
- `tft.unit` now explicitly creates Beam PCollections and validates the transformed dataset by writing and then reading it from disk.
- `tft.min`, `tft.size`, `tft.sum`, `tft.scale_to_z_score` and `tft.bucketize` now support `tf.SparseTensor`.
- Fix to `tft.scale_to_z_score` so it no longer attempts to divide by 0 when the variance is 0.
- Fix bug where internal graph analysis didn't handle the case where an operation has control inputs that are operations (as opposed to tensors).
- `tft.sparse_tensor_to_dense_with_shape` added which allows densifying a `SparseTensor` while specifying the resulting `Tensor`'s shape.
- Add `load_transform_graph` method to `TFTransformOutput` to load the transform graph without applying it. This has the effect of adding variables to the checkpoint when calling it from the training `input_fn` when using `tf.Estimator`.
- `tft.vocabulary` and `tft.compute_and_apply_vocabulary` now accept an optional `weights` argument. When `weights` is provided, weighted frequencies are used instead of frequencies based on counts.
- `tft.quantiles` and `tft.bucketize` now accept an optional `weights` argument. When `weights` is provided, weighted count is used for quantiles instead of the counts themselves.
- Updated examples to construct the schema using `dataset_schema.from_feature_spec`.
- Updated the census example to allow the 'education-num' feature to be missing and fill in a default value when it is.
- Depends on `tensorflow-metadata>=0.9,<1`.
- Depends on `apache-beam[gcp]>=2.6,<3`.
- We now validate a `Schema` in its constructor to make sure that it can be converted to a feature spec. In particular only `tf.int64`, `tf.string` and `tf.float32` types are allowed.
- We now disallow default values for `FixedColumnRepresentation`.
- It is no longer possible to set a default value in the Schema, and validation of shape parameters will occur earlier.
- Removed Schema.as_batched_placeholders() method.
- Removed all components of DatasetMetadata except the schema, and removed all related classes and code.
- Removed the merge method for DatasetMetadata and related classes.
- read_metadata can now only read from a single metadata directory and read_metadata and write_metadata no longer accept the `versions` parameter. They now only read/write the JSON format.
- Requires pre-installed TensorFlow >=1.9,<2.
- `apply_function` is no longer needed and is deprecated. `apply_function(fn, *args)` is now equivalent to `fn(*args)`. tf.Transform is able to handle while loops and tables without the user wrapping the function call in `apply_function`.
- Add TFTransformOutput utility class that wraps the output of tf.Transform for use in training. This makes it easier to consume the output written by tf.Transform (see updated examples for usage).
- Increase efficiency of `quantiles` (and therefore `bucketize`).
- Change `tft.sum`/`tft.mean`/`tft.var` to only support basic numeric types.
- Widen the output type of `tft.sum` for some input types to avoid overflow and/or to preserve precision.
- For int32 and int64 input types, change the output type of `tft.mean`/`tft.var`/`tft.scale_to_z_score` from float64 to float32.
- Change the output type of `tft.size` to be always int64.
- `Context` now accepts passthrough_keys which can be used when additional information should be attached to dataset instances in the pipeline which should not be part of the transformation graph, for example: instance keys.
- In addition to using TFTransformOutput, the examples demonstrate new workflows where a vocabulary is computed, but not applied, in the `preprocessing_fn`.
- Added dependency on the absl-py package.
- `TransformTestCase` test cases can now be parameterized.
- Add support for partitioned variables when loading a model.
- Export the `coders` subpackage so that users can access it as `tft.coders`, e.g. `tft.coders.ExampleProtoCoder`.
- Setting dtypes for numpy arrays in `tft.coders.ExampleProtoCoder` and `tft.coders.CsvCoder`.
- `tft.mean`, `tft.max` and `tft.var` now support `tf.SparseTensor`.
- Update examples to use "core" TensorFlow estimator API (`tf.estimator`).
- Depends on `protobuf>=3.6.0,<4`.
- `apply_saved_transform` is removed. See note on `partially_apply_saved_transform` in the `Deprecations` section.
- No longer set `vocabulary_file` in `IntDomain` when using `tft.compute_and_apply_vocabulary` or `tft.apply_vocabulary`.
- Requires pre-installed TensorFlow >=1.8,<2.
- The `expected_asset_file_contents` of `TransformTestCase.assertAnalyzeAndTransformResults` has been deprecated, use `expected_vocab_file_contents` instead.
- `transform_fn_io.TRANSFORMED_METADATA_DIR` and `transform_fn_io.TRANSFORM_FN_DIR` should not be used, they are now aliases for `TFTransformOutput.TRANSFORMED_METADATA_DIR` and `TFTransformOutput.TRANSFORM_FN_DIR` respectively.
- `partially_apply_saved_transform` is deprecated, users should use the `transform_raw_features` method of `TFTransformOutput` instead. These differ in that `partially_apply_saved_transform` can also be used to return both the input placeholders and the outputs. But users do not need this functionality because they will typically create the input placeholders themselves based on the feature spec.
- Renamed `tft.uniques` to `tft.vocabulary`, `tft.string_to_int` to `tft.compute_and_apply_vocabulary` and `tft.apply_vocab` to `tft.apply_vocabulary`. The existing methods will remain for a few more minor releases but are now deprecated; users should migrate away from them.
- Depends on `apache-beam[gcp]>=2.4,<3`.
- Trim min/max value in `tft.bucketize` where the computed number of bucket boundaries is more than requested. Updated documentation to clearly indicate that the number of buckets is computed using approximate algorithms, and that computed number can be more or less than requested.
- Change the namespace used for Beam metrics from `tensorflow_transform` to `tfx.Transform`.
- Update Beam metrics to also log vocabulary sizes.
- `CsvCoder` updated to support unicode.
- Update examples to not use the `coder` argument for IO, and instead use a separate `beam.Map` to encode/decode data.
- Requires pre-installed TensorFlow >=1.6,<2.
- Batching of input instances is now done automatically and dynamically.
- Added analyzers to compute covariance matrices (`tft.covariance`) and principal components for PCA (`tft.pca`).
- CombinerSpec and combine_analyzer now accept multiple inputs/outputs.
- Depends on `apache-beam[gcp]>=2.3,<3`.
- Fixes a bug where TransformDataset would not return correct output if the output DatasetMetadata contained deferred values (such as vocabularies).
- Added checks that the preprocessing function's outputs all have the same size in the batch dimension.
- Added `tft.apply_buckets` which takes an input tensor and a list of bucket boundaries, and returns bucketized data.
- `tft.bucketize` and `tft.apply_buckets` now set metadata for the output tensor, which means the resulting tf.Metadata for the output of these functions will contain min and max values based on the number of buckets, and also be set to categorical.
- Testing helper function assertAnalyzeAndTransformResults can now also test the content of vocabulary files and other assets.
- Reduces the number of beam stages needed for certain analyzers, which can be a performance bottleneck when transforming many features.
- Performance improvements in `tft.uniques`.
- Fix a bug in `tft.bucketize` where the bucket boundary could be same as a min/max value, and was getting dropped.
- Allows scaling individual components of a tensor independently with `tft.scale_by_min_max`, `tft.scale_to_0_1`, and `tft.scale_to_z_score`.
- Fix a bug where `apply_saved_transform` could only be applied in the global name scope.
- Add a warning when `frequency_threshold` is <= 1. This is a no-op and generally reflects mistaking `frequency_threshold` for a relative frequency where in fact it is an absolute frequency.
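
The `frequency_threshold` note above can be illustrated with a small pure-Python sketch. It is not the tf.Transform implementation: `filter_vocabulary` is a hypothetical helper, and it assumes the threshold is compared against each token's absolute count with `>=`.

```python
from collections import Counter


def filter_vocabulary(tokens, frequency_threshold):
    """Hypothetical helper: keep tokens whose absolute count meets the threshold."""
    counts = Counter(tokens)
    # Order by descending count, then by token, so the result is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [token for token, count in ordered if count >= frequency_threshold]


tokens = ["a", "b", "a", "c", "a", "b"]

# Every token that occurs at all has a count of at least 1, so a threshold
# of 1 (or a "relative" value such as 0.5) filters nothing.
kept_at_1 = filter_vocabulary(tokens, frequency_threshold=1)
kept_at_half = filter_vocabulary(tokens, frequency_threshold=0.5)

# A threshold of 2 drops "c", which occurs only once.
kept_at_2 = filter_vocabulary(tokens, frequency_threshold=2)
```

Under these assumptions, a value such as 0.5 behaves exactly like 1, which is why a threshold at or below 1 is worth a warning.
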
- The interfaces of CombinerSpec and combine_analyzer have changed to allow for multiple inputs/outputs.
- Requires pre-installed TensorFlow >=1.5,<2.
- Added a combine_analyzer() that supports user provided combiner, conforming to beam.CombineFn(). This allows users to implement custom combiners (e.g. median), to complement analyzers (like min, max) that are prepackaged in TFT.
- Quantiles Analyzer (`tft.quantiles`), with a corresponding `tft.bucketize` mapper.
- Depends on `apache-beam[gcp]>=2.2,<3`.
- Fixes some KeyError issues that appeared in certain circumstances when one would call AnalyzeAndTransformDataset (due to a now-fixed Apache Beam [bug](https://issues.apache.org/jira/projects/BEAM/issues/BEAM-2966)).
- Allow all functions that accept and return tensors, to accept an optional name scope, in line with TensorFlow coding conventions.
- Update examples to construct input functions by hand instead of using helper functions.
- Change scale_by_min_max/scale_to_0_1 to return the average(min, max) of the range in case all values are identical.
- Added export of serving model to examples.
- Use "core" version of feature columns (tf.feature_column instead of tf.contrib) in examples.
- A few bug fixes and improvements for coders regarding Python 3.
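
The `scale_by_min_max`/`scale_to_0_1` change in this list can be sketched in plain Python. This is only an illustration of the described behaviour, not the library code: `scale_by_min_max_sketch` is a hypothetical name, and it assumes that "the range" means the requested output range (`output_min`, `output_max`).

```python
def scale_by_min_max_sketch(values, output_min=0.0, output_max=1.0):
    """Hypothetical sketch: linearly rescale values into [output_min, output_max]."""
    lowest, highest = min(values), max(values)
    if lowest == highest:
        # All values are identical, so there is no spread to scale by.
        # Return the midpoint of the output range instead of dividing by zero.
        midpoint = (output_min + output_max) / 2.0
        return [midpoint for _ in values]
    scale = (output_max - output_min) / (highest - lowest)
    return [output_min + (value - lowest) * scale for value in values]


constant_column = scale_by_min_max_sketch([3.0, 3.0, 3.0])
varied_column = scale_by_min_max_sketch([1.0, 2.0, 3.0], output_min=-1.0, output_max=1.0)
```

With the default range this maps a constant column to 0.5, which corresponds to the `scale_to_0_1` case.
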
- Requires pre-installed TensorFlow >= 1.4.
- No longer distributing a WHL file in PyPI. Only doing a source distribution which should however be compatible with all platforms (i.e. you are still able to `pip install tensorflow-transform` and use `requirements.txt` or `setup.py` files for environment setup).
- Some functions now introduce a new name scope when they did not before so the names of tensors may change. This will only affect you if you directly lookup tensors by name in the graph produced by tf.Transform.
- Various Analyzer Specs (_NumericCombineSpec, _UniquesSpec, _QuantilesSpec) are now private. Analyzers are accessible only via the top-level TFT functions (min, max, sum, size, mean, var, uniques, quantiles).
- The `serving_input_fn`s in `tensorflow_transform/saved/input_fn_maker.py` will be removed in a future version and should not be used in new code, see the `examples` directory for details on how to migrate your code to define your own serving functions.
- We now provide helper methods for creating `serving_input_receiver_fn` for use with tf.estimator. These mirror the existing functions targeting the legacy tf.contrib.learn.estimators, i.e. for each `*_serving_input_fn()` in input_fn_maker there is now also a `*_serving_input_receiver_fn()`.
- Introduced `tft.apply_vocab`, which allows users to separately apply a single vocabulary (as generated by `tft.uniques`) to several different columns.
- Provide a source distribution tar `tensorflow-transform-X.Y.Z.tar.gz`.
- The default prefix for `tft.string_to_int` `vocab_filename` changed from `vocab_string_to_int` to `vocab_string_to_int_uniques`. To make your pipelines resilient to implementation details please set `vocab_filename` if you are using the generated vocab_filename on a downstream component.
- Added hash_strings mapper.
- Write vocabularies as asset files instead of constants in the SavedModel.
- 'tft.tfidf' now adds 1 to idf values so that terms in every document in the corpus have a non-zero tfidf value.
- Performance and memory usage improvement when running with Beam runners that use multi-threaded workers.
- Performance optimizations in ExampleProtoCoder.
- Depends on `apache-beam[gcp]>=2.1.1,<3`.
- Depends on `protobuf>=3.3,<4`.
- Depends on `six>=1.9,<1.11`.
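
The `tft.tfidf` change in this list (adding 1 to idf values) can be illustrated with a short sketch. The exact idf formula used by tf.Transform, including any smoothing, is not stated here, so the plain `log(N / df)` form and the helper name `idf_plus_one` are assumptions made for illustration only.

```python
import math


def idf_plus_one(documents):
    """Hypothetical sketch: idf = log(N / document_frequency) + 1 for each term."""
    corpus_size = len(documents)
    document_frequency = {}
    for document in documents:
        # Count each term once per document, however often it repeats.
        for term in set(document):
            document_frequency[term] = document_frequency.get(term, 0) + 1
    return {
        term: math.log(corpus_size / frequency) + 1.0
        for term, frequency in document_frequency.items()
    }


documents = [["the", "cat"], ["the", "dog"]]
idf = idf_plus_one(documents)

# "the" appears in every document: log(2 / 2) is 0, so without the added 1
# its idf, and therefore its tfidf weight, would be zero.
idf_of_common_term = idf["the"]
```

In this sketch a term present in every document keeps an idf of 1, so its tfidf value reduces to its term frequency rather than vanishing.
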
- Requires pre-installed TensorFlow >= 1.3.
- Removed `tft.map`; use `tft.apply_function` instead (as needed).
- Removed `tft.tfidf_weights`; use `tft.tfidf` instead.
- `beam_metadata_io.WriteMetadata` now requires a second `pipeline` argument (see examples).
- A Beam bug will now affect users who call AnalyzeAndTransformDataset in certain circumstances. Roughly speaking, if you call `beam.Pipeline()` at some point (as all our examples do) you will not experience this bug. The bug is characterized by an error similar to `KeyError: (u'AnalyzeAndTransformDataset/AnalyzeDataset/ComputeTensorValues/Extract[Maximum:0]', None)`. This bug will be fixed in Beam 2.2.
- Add json-example serving input functions to TF.Transform.
- Add variance analyzer to tf.transform.
- Remove duplication in output of `tft.tfidf`.
- Ensure ngrams output dense_shape is greater than or equal to 0.
- Alters the behavior and interface of tensorflow_transform.mappers.ngrams.
- Depends on `apache-beam[gcp]>=2,<3`.
- Making TF Parallelism runner-dependent.
- Fixes issue with csv serving input function.
- Various performance and stability improvements.
- `tft.map` will be removed on version 0.2.0, see the `examples` directory for instructions on how to use `tft.apply_function` instead (as needed).
- `tft.tfidf_weights` will be removed on version 0.2.0, use `tft.tfidf` instead.
- Refactor internals to remove Column and Statistic classes
- Remove collections from graph to avoid warnings
- Return float32 from `tfidf_weights`.
- Update tensorflow_transform to use `tf.saved_model` APIs.
- Add default values on example proto coder.
- Various performance and stability improvements.