AttributeError: 'SequenceRecordInputter' object has no attribute 'input_depth'

Gerd · March 18, 2019, 7:18pm

Greetings.

I’m running into problems using a multisource model with one source being a TFRecord (each line of text corresponds to a sequence of floating-point vectors) and the other source being text.

I’m using OpenNMT-tf v1.21.0, Python 3.4, and TensorFlow 1.13.1.

My current problem is shown in the log at the bottom of the post: “AttributeError: ‘SequenceRecordInputter’ object has no attribute ‘input_depth’”. Any help would be appreciated.

Lesser problems (not of immediate concern, but might be related to the above):

A compressed (GZIP or ZLIB) TFRecord file does not seem to load properly (“tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0”). The vectors are sparse, so compression would help a lot.
I had to set source embedding_size equal to the depth of the TFRecord’s vectors to get it to work. I was hoping for a larger embedding size than the vector’s 400-dimensional representation.

Thanks in advance!

Here is some I/O. Happy to provide more info or try some experiments on this end.

Command-line:

onmt-main train_and_eval \
  --config config/enmt_wordshape_transformer.yml \
  --model config/model/wordshape_transformer.py \
  --auto_config \
  --gpu_allow_growth \
  --num_gpus 2

yml:

model_dir: srctgt_wordshape_transformer

data:
  train_features_file: [data/train.wv200.src, data/train.sp32k.src]
  train_labels_file: data/train.sp32k.tgt
  eval_features_file: [data/valid.wv200.src, data/valid.sp32k.src]
  eval_labels_file: data/valid.sp32k.tgt
  source_words_vocabulary: data/enmt-src-32k.onmt.vocab
  target_words_vocabulary: data/enmt-tgt-32k.onmt.vocab

train:
  save_checkpoints_steps: 1000
  exporters: last

eval:
  eval_delay: 3600  # Every 1 hour
  external_evaluators: BLEU

infer:
  batch_size: 32

model_description:

import opennmt as onmt

def model():
  return onmt.models.Transformer(
      source_inputter=onmt.inputters.ParallelInputter([
          onmt.inputters.SequenceRecordInputter(),
          onmt.inputters.WordEmbedder(
              vocabulary_file_key="source_words_vocabulary",
              embedding_size=400)]),
      target_inputter=onmt.inputters.WordEmbedder(
          vocabulary_file_key="target_words_vocabulary",
          embedding_size=512),
      num_layers=2,
      num_units=512,
      num_heads=8,
      ffn_inner_dim=1024,
      dropout=0.1,
      attention_dropout=0.1,
      relu_dropout=0.1,
      share_encoders=True)

Log:

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:You provided a model configuration but a checkpoint already exists. The model configuration must define the same model as the one used for the initial training. However, you can change non structural values like dropout.
INFO:tensorflow:Using parameters:
data:
  eval_features_file:
  - data/valid.wv200.src
  - data/valid.sp32k.src
  eval_labels_file: data/valid.sp32k.tgt
  source_words_vocabulary: data/srctgt-src-32k.onmt.vocab
  target_words_vocabulary: data/srctgt-tgt-32k.onmt.vocab
  train_features_file:
  - data/train.wv200.src
  - data/train.sp32k.src
  train_labels_file: data/train.sp32k.tgt
eval:
  batch_size: 32
  eval_delay: 3600
  exporters: last
  external_evaluators: BLEU
infer:
  batch_size: 32
  bucket_width: 5
model_dir: srctgt_wordshape_transformer
params:
  average_loss_in_time: true
  beam_width: 4
  decay_params:
    model_dim: 512
    warmup_steps: 8000
  decay_type: noam_decay_v2
  label_smoothing: 0.1
  learning_rate: 2.0
  length_penalty: 0.6
  optimizer: LazyAdamOptimizer
  optimizer_params:
    beta1: 0.9
    beta2: 0.998
score:
  batch_size: 64
train:
  average_last_checkpoints: 8
  batch_size: 3072
  batch_type: tokens
  bucket_width: 1
  effective_batch_size: 25000
  exporters: last
  keep_checkpoint_max: 8
  maximum_features_length: 100
  maximum_labels_length: 100
  sample_buffer_size: -1
  save_checkpoints_steps: 1000
  save_summary_steps: 100
  train_steps: 500000

INFO:tensorflow:Accumulate gradients of 5 iterations to reach effective batch size of 25000
2019-03-15 08:19:52.109281: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-15 08:19:52.534444: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x4471720 executing computations on platform CUDA. Devices:
2019-03-15 08:19:52.534533: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Tesla M40, Compute Capability 5.2
2019-03-15 08:19:52.534580: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (1): Tesla M40, Compute Capability 5.2
2019-03-15 08:19:52.541476: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2599970000 Hz
2019-03-15 08:19:52.547350: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x456f210 executing computations on platform Host. Devices:
2019-03-15 08:19:52.547415: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-03-15 08:19:52.548121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla M40 major: 5 minor: 2 memoryClockRate(GHz): 1.112
pciBusID: 0000:02:00.0
totalMemory: 11.18GiB freeMemory: 11.07GiB
2019-03-15 08:19:52.548637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla M40 major: 5 minor: 2 memoryClockRate(GHz): 1.112
pciBusID: 0000:81:00.0
totalMemory: 11.18GiB freeMemory: 11.07GiB
2019-03-15 08:19:52.548778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:19:52.551544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:19:52.551594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:19:52.551637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:19:52.551668: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:19:52.552640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:19:52.554089: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
INFO:tensorflow:Using config: {'_task_type': 'worker', '_train_distribute': None, '_eval_distribute': None, '_keep_checkpoint_max': 8, '_tf_random_seed': None, '_log_step_count_steps': 500, '_task_id': 0, '_service': None, '_protocol': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_device_fn': None, '_experimental_distribute': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fd476fd7c18>, '_master': '', '_global_id_in_cluster': 0, '_keep_checkpoint_every_n_hours': 10000, '_save_checkpoints_secs': None, '_num_worker_replicas': 1, '_session_config': gpu_options {
  allow_growth: true
}
allow_soft_placement: true
graph_options {
  rewrite_options {
    layout_optimizer: OFF
  }
}
, '_num_ps_replicas': 0, '_evaluation_master': '', '_model_dir': 'srctgt_wordshape_transformer', '_is_chief': True}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/record_inputter.py:38: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
INFO:tensorflow:Training on 5191494 examples
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/encoders/self_attention_encoder.py:59: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/layers/transformer.py:136: conv1d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv1d instead.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Number of trainable parameters: 17008945
INFO:tensorflow:Graph was finalized.
2019-03-15 08:21:56.222115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:21:56.222216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:21:56.222232: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:21:56.222242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:21:56.222250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:21:56.222843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:21:56.223124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from srctgt_wordshape_transformer/model.ckpt-0
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into srcen_wordshape_transformer/model.ckpt.
2019-03-15 08:22:21.999207: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-15 08:22:32.388179: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 361008 of 5191494
2019-03-15 08:22:42.388177: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 668594 of 5191494
2019-03-15 08:22:52.388135: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 991006 of 5191494
2019-03-15 08:23:02.388121: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1302235 of 5191494
2019-03-15 08:23:12.388138: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1597622 of 5191494
2019-03-15 08:23:22.388134: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1907869 of 5191494
2019-03-15 08:23:32.388124: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2205092 of 5191494
2019-03-15 08:23:42.388143: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2539193 of 5191494
2019-03-15 08:23:52.388129: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2834096 of 5191494
2019-03-15 08:24:02.388115: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3183796 of 5191494
2019-03-15 08:24:12.388188: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3532660 of 5191494
2019-03-15 08:24:22.388290: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3871255 of 5191494
2019-03-15 08:24:32.388126: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4211576 of 5191494
2019-03-15 08:24:42.388177: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4541068 of 5191494
2019-03-15 08:24:52.388133: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4910743 of 5191494
2019-03-15 08:25:00.026547: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:140] Shuffle buffer filled.
INFO:tensorflow:loss = 8.203028, step = 0
INFO:tensorflow:loss = 7.2150645, step = 100 (149.425 sec)
INFO:tensorflow:loss = 6.8113074, step = 200 (143.945 sec)
INFO:tensorflow:target_words/sec: 21509
INFO:tensorflow:source_words/sec: 33238
INFO:tensorflow:loss = 6.4717984, step = 300 (143.768 sec)
INFO:tensorflow:target_words/sec: 21536
INFO:tensorflow:source_words/sec: 33338
INFO:tensorflow:loss = 6.194136, step = 400 (143.209 sec)
INFO:tensorflow:target_words/sec: 21622
INFO:tensorflow:source_words/sec: 33532
INFO:tensorflow:global_step/sec: 0.691394
INFO:tensorflow:loss = 5.8775764, step = 500 (143.132 sec)
INFO:tensorflow:target_words/sec: 21630
INFO:tensorflow:source_words/sec: 33457
INFO:tensorflow:loss = 5.720784, step = 600 (145.018 sec)
INFO:tensorflow:target_words/sec: 21350
INFO:tensorflow:source_words/sec: 33097
INFO:tensorflow:loss = 5.582198, step = 700 (141.451 sec)
INFO:tensorflow:target_words/sec: 21886
INFO:tensorflow:source_words/sec: 33814
INFO:tensorflow:loss = 5.302073, step = 800 (143.275 sec)
INFO:tensorflow:target_words/sec: 21609
INFO:tensorflow:source_words/sec: 33490
INFO:tensorflow:loss = 5.1950145, step = 900 (142.375 sec)
INFO:tensorflow:target_words/sec: 21748
INFO:tensorflow:source_words/sec: 33625
INFO:tensorflow:Saving checkpoints for 1000 into srctgt_wordshape_transformer/model.ckpt.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/ops/metrics_impl.py:363: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Starting evaluation at 2019-03-15T12:49:09Z
INFO:tensorflow:Graph was finalized.
2019-03-15 08:49:09.501931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:49:09.502042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:49:09.502061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:49:09.502073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:49:09.502083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:49:09.502340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:49:09.502561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
INFO:tensorflow:Restoring parameters from srctgt_wordshape_transformer/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation predictions saved to srctgt_wordshape_transformer/eval/predictions.txt.1000
INFO:tensorflow:BLEU evaluation score: 0.410000
INFO:tensorflow:Finished evaluation at 2019-03-15-12:55:08
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 4.695636
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: srctgt_wordshape_transformer/model.ckpt-1000
Traceback (most recent call last):
  File "/home/Gerd/tf3/bin/onmt-main", line 11, in <module>
    load_entry_point('OpenNMT-tf', 'console_scripts', 'onmt-main')()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/bin/main.py", line 172, in main
    runner.train_and_evaluate(checkpoint_path=args.checkpoint_path)
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/runner.py", line 295, in train_and_evaluate
    result = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 611, in run
    return self.run_local()
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 712, in run_local
    saving_listeners=saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model_default
    saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1407, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 676, in run
    run_metadata=run_metadata)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1171, in run
    run_metadata=run_metadata)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1270, in run
    raise six.reraise(*original_exc_info)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
    return self._sess.run(*args, **kwargs)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1335, in run
    run_metadata=run_metadata))
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 582, in after_run
    if self._save(run_context.session, global_step):
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 607, in _save
    if l.after_save(session, step):
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 517, in after_save
    self._evaluate(global_step_value)  # updates self.eval_result
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 537, in _evaluate
    self._evaluator.evaluate_and_export())
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 925, in evaluate_and_export
    is_the_final_export)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 958, in _export_eval_result
    is_the_final_export=is_the_final_export))
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/exporter.py", line 473, in export
    is_the_final_export)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/exporter.py", line 126, in export
    strip_default_attrs=self._strip_default_attrs)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1645, in export_savedmodel
    experimental_mode=model_fn_lib.ModeKeys.PREDICT)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 723, in export_saved_model
    checkpoint_path=checkpoint_path)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 827, in experimental_export_all_saved_models
    save_variables, mode=model_fn_lib.ModeKeys.PREDICT)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 890, in _add_meta_graph_for_mode
    input_receiver = input_receiver_fn()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/estimator.py", line 30, in _fn
    return local_model.features_inputter.get_serving_input_receiver()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/inputter.py", line 130, in get_serving_input_receiver
    receiver_tensors = self.get_receiver_tensors()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/inputter.py", line 386, in get_receiver_tensors
    tensors = inputter.get_receiver_tensors()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/record_inputter.py", line 42, in get_receiver_tensors
    "tensor": tf.placeholder(self.dtype, shape=(None, None, self.input_depth)),
AttributeError: 'SequenceRecordInputter' object has no attribute 'input_depth'

guillaumekln · March 19, 2019, 8:30am

Hi,

This regression was fixed in v1.21.4. The error occurs when trying to export the model.
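One plausible reading of the traceback is that `input_depth` only exists after `make_dataset` has run on a given inputter instance, while the export path calls `get_receiver_tensors` on an instance that never built a dataset. The toy class below is a hypothetical stand-in (it is not the OpenNMT-tf class) that reproduces this failure pattern under that assumption:

```python
class ToyRecordInputter:
    """Hypothetical stand-in used only to illustrate the failure pattern."""

    def make_dataset(self, depth):
        # The attribute is created here, as a side effect of building the dataset.
        self.input_depth = depth
        return depth

    def get_receiver_tensors(self):
        # Raises AttributeError if make_dataset was never called on this instance.
        return (None, None, self.input_depth)


# Training path: the dataset is built first, so the attribute exists.
trained = ToyRecordInputter()
trained.make_dataset(400)
trained_shape = trained.get_receiver_tensors()

# Export path (assumed): a fresh instance that never built a dataset.
exported = ToyRecordInputter()
try:
    exported.get_receiver_tensors()
    export_error = None
except AttributeError as err:
    export_error = str(err)

print(trained_shape)
print(export_error)
```

This would also explain why training and evaluation ran for 1000 steps before the error appeared: nothing touched the attribute until the first export.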

On compressed TFRecords: the dataset instance is currently constructed with the default arguments (i.e. no compression). I will look into whether compression options can be cleanly exposed.
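To illustrate why a reader with no compression configured reports a corrupted record at offset 0, here is a minimal standard-library sketch of the TFRecord framing (an 8-byte little-endian length, a masked CRC-32C of that length, the payload, and a masked CRC-32C of the payload). The helper names are made up for this example and the CRC is a slow bitwise version. It only mimics the header check and is not TensorFlow's reader:

```python
import gzip
import struct


def crc32c(data):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """Apply the rotate-and-add mask used by the TFRecord format."""
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


def frame_record(payload):
    """Wrap a payload in TFRecord framing."""
    length = struct.pack("<Q", len(payload))
    return (length + struct.pack("<I", masked_crc(length))
            + payload + struct.pack("<I", masked_crc(payload)))


def header_is_valid(raw):
    """Check the length CRC of the first record, as an uncompressed reader would."""
    if len(raw) < 12:
        return False
    stored = struct.unpack("<I", raw[8:12])[0]
    return masked_crc(raw[:8]) == stored


plain = frame_record(b"example payload")
compressed = gzip.compress(plain)

print(header_is_valid(plain))                        # framing read as written
print(header_is_valid(compressed))                   # gzip bytes read as framing
print(header_is_valid(gzip.decompress(compressed)))  # decompressed first
```

The gzip stream starts with its own header bytes, so interpreting them as a record length and checksum is expected to fail the very first check, which matches an error reported at record 0.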

On the embedding size: hmm, I think this should work for multi-source training. What was the error?

Gerd · March 19, 2019, 6:22pm

Thanks!

Updating to 1.21.6 fixed my input_depth problem. I could not replicate the embedding_size problem, so that must have been a figment of my imagination.

I would be happy to have compressed TFRecords supported. I tried hard-coding a hack to enable this (return tf.data.TFRecordDataset(data_file, compression_type="GZIP")) in record_inputter.py, but it did not work for me.

Thanks again,
Gerd

guillaumekln · March 20, 2019, 8:34am

That should be the way to do it. Just to make sure, to generate the compressed record file you configured the options argument of the TFRecordWriter, right?

https://www.tensorflow.org/api_docs/python/tf/io/TFRecordWriter

Gerd · March 20, 2019, 7:37pm

That sounds like what I tried:

import tensorflow as tf
import opennmt as onmt
import numpy as np

# `outfile` (the output path) and `line` (the current input line) are set elsewhere.
options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
writer = tf.io.TFRecordWriter(outfile, options)

while line:

  # Define vectorList, a list of vectors (numpy arrays) for the line

  sentenceRecord = np.vstack(vectorList)
  onmt.inputters.write_sequence_record(sentenceRecord, writer)

writer.close()

The file is definitely compressed (100x compression). I’ll try the hard-coded fix again when I have fewer systems in progress. I don’t think I tried it after the latest update.

Thanks!

Gerd · March 22, 2019, 8:08pm

Reading compressed TFRecords now works for me, with the changes below in record_inputter.py. Compression options are not cleanly exposed, so this “solution” is bad for people who prefer not to compress with gzip.

  def make_dataset(self, data_file, training=None):
    options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
    first_record = next(compat.tf_compat(v1="python_io.tf_record_iterator")(data_file, options))
    first_record = tf.train.Example.FromString(first_record)
    shape = first_record.features.feature["shape"].int64_list.value
    self.input_depth = shape[-1]
    return tf.data.TFRecordDataset(data_file, compression_type="GZIP")

  def get_dataset_size(self, data_file):
    options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
    return sum(1 for _ in compat.tf_compat(v1="python_io.tf_record_iterator")(data_file, options))
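One way to avoid hard-coding gzip might be to sniff the file's first two bytes and pick the compression type from that. The sketch below uses only the standard library. `guess_compression_type` is a hypothetical helper, not part of OpenNMT-tf, and it only distinguishes gzip from "no compression". It is a heuristic: an uncompressed file whose first record length happens to start with the bytes 0x1f 0x8b would be misdetected.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def guess_compression_type(path):
    """Return "GZIP" if the file starts with the gzip magic number, else ""."""
    with open(path, "rb") as f:
        return "GZIP" if f.read(2) == GZIP_MAGIC else ""


# Small demonstration on two temporary files.
tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.records")
gzip_path = os.path.join(tmp_dir, "compressed.records")

with open(plain_path, "wb") as f:
    f.write(b"\x05\x00\x00\x00\x00\x00\x00\x00 not a gzip stream")
with gzip.open(gzip_path, "wb") as f:
    f.write(b"any payload")

plain_guess = guess_compression_type(plain_path)
gzip_guess = guess_compression_type(gzip_path)
print(repr(plain_guess), repr(gzip_guess))
```

The returned string could then be passed as `compression_type` to `tf.data.TFRecordDataset`, which accepts `""` for uncompressed files, so uncompressed data would keep working with the same code path.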