AttributeError: 'SequenceRecordInputter' object has no attribute 'input_depth'

Greetings.

I’m running into problems using a multisource model with one source being a TFRecord (each line of text corresponds to a sequence of floating-point vectors) and the other source being text.

I’m using OpenNMT-tf v1.21.0, Python 3.4, and TensorFlow 1.13.1.

My current problem is shown in the log at the bottom of the post: “AttributeError: ‘SequenceRecordInputter’ object has no attribute ‘input_depth’”. Any help would be appreciated.

Lesser problems (not of immediate concern, but might be related to the above):

  1. A compressed (GZIP or ZLIB) TFRecord file does not seem to load properly (“tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0”). The vectors are sparse, so compression would help a lot.
  2. I had to set source embedding_size equal to the depth of the TFRecord’s vectors to get it to work. I was hoping for a larger embedding size than the vector’s 400-dimensional representation.

Thanks in advance!

Here is some input and output. Happy to provide more info or try some experiments on this end.

Command-line:

onmt-main train_and_eval \
  --config config/enmt_wordshape_transformer.yml \
  --model config/model/wordshape_transformer.py \
  --auto_config \
  --gpu_allow_growth \
  --num_gpus 2

yml:

model_dir: srctgt_wordshape_transformer

data:
  train_features_file: [data/train.wv200.src, data/train.sp32k.src]
  train_labels_file: data/train.sp32k.tgt
  eval_features_file: [data/valid.wv200.src, data/valid.sp32k.src]
  eval_labels_file: data/valid.sp32k.tgt
  source_words_vocabulary: data/enmt-src-32k.onmt.vocab
  target_words_vocabulary: data/enmt-tgt-32k.onmt.vocab

train:
  save_checkpoints_steps: 1000
  exporters: last

eval:
  eval_delay: 3600  # Every 1 hour
  external_evaluators: BLEU

infer:
  batch_size: 32

model_description:

import opennmt as onmt

def model():
  return onmt.models.Transformer(
      source_inputter=onmt.inputters.ParallelInputter([
          onmt.inputters.SequenceRecordInputter(),
          onmt.inputters.WordEmbedder(
              vocabulary_file_key="source_words_vocabulary",
              embedding_size=400)]),
      target_inputter=onmt.inputters.WordEmbedder(
          vocabulary_file_key="target_words_vocabulary",
          embedding_size=512),
      num_layers=2,
      num_units=512,
      num_heads=8,
      ffn_inner_dim=1024,
      dropout=0.1,
      attention_dropout=0.1,
      relu_dropout=0.1,
      share_encoders=True)

Log:

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:You provided a model configuration but a checkpoint already exists. The model configuration must define the same model as the one used for the initial training. However, you can change non structural values like dropout.
INFO:tensorflow:Using parameters:
data:
  eval_features_file:
  - data/valid.wv200.src
  - data/valid.sp32k.src
  eval_labels_file: data/valid.sp32k.tgt
  source_words_vocabulary: data/srctgt-src-32k.onmt.vocab
  target_words_vocabulary: data/srctgt-tgt-32k.onmt.vocab
  train_features_file:
  - data/train.wv200.src
  - data/train.sp32k.src
  train_labels_file: data/train.sp32k.tgt
eval:
  batch_size: 32
  eval_delay: 3600
  exporters: last
  external_evaluators: BLEU
infer:
  batch_size: 32
  bucket_width: 5
model_dir: srctgt_wordshape_transformer
params:
  average_loss_in_time: true
  beam_width: 4
  decay_params:
    model_dim: 512
    warmup_steps: 8000
  decay_type: noam_decay_v2
  label_smoothing: 0.1
  learning_rate: 2.0
  length_penalty: 0.6
  optimizer: LazyAdamOptimizer
  optimizer_params:
    beta1: 0.9
    beta2: 0.998
score:
  batch_size: 64
train:
  average_last_checkpoints: 8
  batch_size: 3072
  batch_type: tokens
  bucket_width: 1
  effective_batch_size: 25000
  exporters: last
  keep_checkpoint_max: 8
  maximum_features_length: 100
  maximum_labels_length: 100
  sample_buffer_size: -1
  save_checkpoints_steps: 1000
  save_summary_steps: 100
  train_steps: 500000

INFO:tensorflow:Accumulate gradients of 5 iterations to reach effective batch size of 25000
2019-03-15 08:19:52.109281: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-15 08:19:52.534444: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x4471720 executing computations on platform CUDA. Devices:
2019-03-15 08:19:52.534533: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Tesla M40, Compute Capability 5.2
2019-03-15 08:19:52.534580: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (1): Tesla M40, Compute Capability 5.2
2019-03-15 08:19:52.541476: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2599970000 Hz
2019-03-15 08:19:52.547350: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x456f210 executing computations on platform Host. Devices:
2019-03-15 08:19:52.547415: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-03-15 08:19:52.548121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla M40 major: 5 minor: 2 memoryClockRate(GHz): 1.112
pciBusID: 0000:02:00.0
totalMemory: 11.18GiB freeMemory: 11.07GiB
2019-03-15 08:19:52.548637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla M40 major: 5 minor: 2 memoryClockRate(GHz): 1.112
pciBusID: 0000:81:00.0
totalMemory: 11.18GiB freeMemory: 11.07GiB
2019-03-15 08:19:52.548778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:19:52.551544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:19:52.551594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:19:52.551637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:19:52.551668: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:19:52.552640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:19:52.554089: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
INFO:tensorflow:Using config: {'_task_type': 'worker', '_train_distribute': None, '_eval_distribute': None, '_keep_checkpoint_max': 8, '_tf_random_seed': None, '_log_step_count_steps': 500, '_task_id': 0, '_service': None, '_protocol': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_device_fn': None, '_experimental_distribute': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fd476fd7c18>, '_master': '', '_global_id_in_cluster': 0, '_keep_checkpoint_every_n_hours': 10000, '_save_checkpoints_secs': None, '_num_worker_replicas': 1, '_session_config': gpu_options {
  allow_growth: true
}
allow_soft_placement: true
graph_options {
  rewrite_options {
    layout_optimizer: OFF
  }
}
, '_num_ps_replicas': 0, '_evaluation_master': '', '_model_dir': 'srctgt_wordshape_transformer', '_is_chief': True}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/record_inputter.py:38: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`
INFO:tensorflow:Training on 5191494 examples
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/encoders/self_attention_encoder.py:59: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /home/Gerd/OpenNMT-tf_v1.21.0/opennmt/layers/transformer.py:136: conv1d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv1d instead.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Number of trainable parameters: 17008945
INFO:tensorflow:Graph was finalized.
2019-03-15 08:21:56.222115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:21:56.222216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:21:56.222232: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:21:56.222242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:21:56.222250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:21:56.222843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:21:56.223124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from srctgt_wordshape_transformer/model.ckpt-0
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into srcen_wordshape_transformer/model.ckpt.
2019-03-15 08:22:21.999207: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-15 08:22:32.388179: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 361008 of 5191494
2019-03-15 08:22:42.388177: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 668594 of 5191494
2019-03-15 08:22:52.388135: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 991006 of 5191494
2019-03-15 08:23:02.388121: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1302235 of 5191494
2019-03-15 08:23:12.388138: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1597622 of 5191494
2019-03-15 08:23:22.388134: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 1907869 of 5191494
2019-03-15 08:23:32.388124: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2205092 of 5191494
2019-03-15 08:23:42.388143: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2539193 of 5191494
2019-03-15 08:23:52.388129: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 2834096 of 5191494
2019-03-15 08:24:02.388115: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3183796 of 5191494
2019-03-15 08:24:12.388188: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3532660 of 5191494
2019-03-15 08:24:22.388290: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 3871255 of 5191494
2019-03-15 08:24:32.388126: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4211576 of 5191494
2019-03-15 08:24:42.388177: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4541068 of 5191494
2019-03-15 08:24:52.388133: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:101] Filling up shuffle buffer (this may take a while): 4910743 of 5191494
2019-03-15 08:25:00.026547: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:140] Shuffle buffer filled.
INFO:tensorflow:loss = 8.203028, step = 0
INFO:tensorflow:loss = 7.2150645, step = 100 (149.425 sec)
INFO:tensorflow:loss = 6.8113074, step = 200 (143.945 sec)
INFO:tensorflow:target_words/sec: 21509
INFO:tensorflow:source_words/sec: 33238
INFO:tensorflow:loss = 6.4717984, step = 300 (143.768 sec)
INFO:tensorflow:target_words/sec: 21536
INFO:tensorflow:source_words/sec: 33338
INFO:tensorflow:loss = 6.194136, step = 400 (143.209 sec)
INFO:tensorflow:target_words/sec: 21622
INFO:tensorflow:source_words/sec: 33532
INFO:tensorflow:global_step/sec: 0.691394
INFO:tensorflow:loss = 5.8775764, step = 500 (143.132 sec)
INFO:tensorflow:target_words/sec: 21630
INFO:tensorflow:source_words/sec: 33457
INFO:tensorflow:loss = 5.720784, step = 600 (145.018 sec)
INFO:tensorflow:target_words/sec: 21350
INFO:tensorflow:source_words/sec: 33097
INFO:tensorflow:loss = 5.582198, step = 700 (141.451 sec)
INFO:tensorflow:target_words/sec: 21886
INFO:tensorflow:source_words/sec: 33814
INFO:tensorflow:loss = 5.302073, step = 800 (143.275 sec)
INFO:tensorflow:target_words/sec: 21609
INFO:tensorflow:source_words/sec: 33490
INFO:tensorflow:loss = 5.1950145, step = 900 (142.375 sec)
INFO:tensorflow:target_words/sec: 21748
INFO:tensorflow:source_words/sec: 33625
INFO:tensorflow:Saving checkpoints for 1000 into srctgt_wordshape_transformer/model.ckpt.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From /home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/ops/metrics_impl.py:363: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Starting evaluation at 2019-03-15T12:49:09Z
INFO:tensorflow:Graph was finalized.
2019-03-15 08:49:09.501931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-03-15 08:49:09.502042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-15 08:49:09.502061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-03-15 08:49:09.502073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N
2019-03-15 08:49:09.502083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N
2019-03-15 08:49:09.502340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla M40, pci bus id: 0000:02:00.0, compute capability: 5.2)
2019-03-15 08:49:09.502561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10765 MB memory) -> physical GPU (device: 1, name: Tesla M40, pci bus id: 0000:81:00.0, compute capability: 5.2)
INFO:tensorflow:Restoring parameters from srctgt_wordshape_transformer/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation predictions saved to srctgt_wordshape_transformer/eval/predictions.txt.1000
INFO:tensorflow:BLEU evaluation score: 0.410000
INFO:tensorflow:Finished evaluation at 2019-03-15-12:55:08
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 4.695636
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: srctgt_wordshape_transformer/model.ckpt-1000
Traceback (most recent call last):
  File "/home/Gerd/tf3/bin/onmt-main", line 11, in <module>
    load_entry_point('OpenNMT-tf', 'console_scripts', 'onmt-main')()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/bin/main.py", line 172, in main
    runner.train_and_evaluate(checkpoint_path=args.checkpoint_path)
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/runner.py", line 295, in train_and_evaluate
    result = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 611, in run
    return self.run_local()
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 712, in run_local
    saving_listeners=saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model_default
    saving_listeners)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1407, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 676, in run
    run_metadata=run_metadata)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1171, in run
    run_metadata=run_metadata)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1270, in run
    raise six.reraise(*original_exc_info)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
    return self._sess.run(*args, **kwargs)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 1335, in run
    run_metadata=run_metadata))
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 582, in after_run
    if self._save(run_context.session, global_step):
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 607, in _save
    if l.after_save(session, step):
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 517, in after_save
    self._evaluate(global_step_value)  # updates self.eval_result
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 537, in _evaluate
    self._evaluator.evaluate_and_export())
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 925, in evaluate_and_export
    is_the_final_export)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/training.py", line 958, in _export_eval_result
    is_the_final_export=is_the_final_export))
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/exporter.py", line 473, in export
    is_the_final_export)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/exporter.py", line 126, in export
    strip_default_attrs=self._strip_default_attrs)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1645, in export_savedmodel
    experimental_mode=model_fn_lib.ModeKeys.PREDICT)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 723, in export_saved_model
    checkpoint_path=checkpoint_path)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 827, in experimental_export_all_saved_models
    save_variables, mode=model_fn_lib.ModeKeys.PREDICT)
  File "/home/Gerd/tf3/lib64/python3.4/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 890, in _add_meta_graph_for_mode
    input_receiver = input_receiver_fn()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/estimator.py", line 30, in _fn
    return local_model.features_inputter.get_serving_input_receiver()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/inputter.py", line 130, in get_serving_input_receiver
    receiver_tensors = self.get_receiver_tensors()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/inputter.py", line 386, in get_receiver_tensors
    tensors = inputter.get_receiver_tensors()
  File "/home/Gerd/OpenNMT-tf_v1.21.0/opennmt/inputters/record_inputter.py", line 42, in get_receiver_tensors
    "tensor": tf.placeholder(self.dtype, shape=(None, None, self.input_depth)),
AttributeError: 'SequenceRecordInputter' object has no attribute 'input_depth'

Hi,

This regression was fixed in v1.21.4. The error occurs when trying to export the model.
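One plausible reading of the traceback, sketched with a toy class rather than the actual OpenNMT-tf source: `input_depth` appears to be assigned only when a dataset is built, while the export path calls `get_receiver_tensors` on an inputter that never built one. `ToyRecordInputter`, `try_export`, and the method bodies below are hypothetical stand-ins for illustration.

```python
class ToyRecordInputter:
    """Hypothetical stand-in for an inputter that infers its depth lazily."""

    def make_dataset(self, first_record_shape):
        # The depth is only learned here, from the shape of the first record.
        self.input_depth = first_record_shape[-1]
        return first_record_shape

    def get_receiver_tensors(self):
        # The export path reaches this method directly; if make_dataset()
        # was never called on this instance, the attribute does not exist.
        return {"tensor_shape": (None, None, self.input_depth)}


def try_export(inputter):
    """Return the receiver spec, or the error message if the lookup fails."""
    try:
        return inputter.get_receiver_tensors()
    except AttributeError as error:
        return str(error)


fresh = ToyRecordInputter()
failure = try_export(fresh)        # attribute missing on a fresh instance

trained = ToyRecordInputter()
trained.make_dataset((25, 400))    # 25 time steps, 400-dimensional vectors
success = try_export(trained)      # depth is now known
```

This would also explain why training itself ran fine for 1000 steps and the failure only showed up at the first export.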

The dataset instance is currently constructed with the default arguments (i.e. no compression). I will check whether compression options can be cleanly exposed.

Regarding the embedding size: I think this should work for multi-source training. What was the error?

Thanks!

Updating to 1.21.6 fixed my input_depth problem. I could not replicate the embedding_size problem, so that must have been a figment of my imagination.

I would be happy to have compressed TFRecords supported. I tried hard-coding a hack to enable this (return tf.data.TFRecordDataset(data_file, compression_type="GZIP")) in record_inputter.py, but it did not work for me.

Thanks again,
Gerd

That should be the way to do it. Just to make sure, to generate the compressed record file you configured the options argument of the TFRecordWriter, right?

https://www.tensorflow.org/api_docs/python/tf/io/TFRecordWriter

That sounds like what I tried:

import tensorflow as tf
import opennmt as onmt
import numpy as np

# outfile is the path of the TFRecord file to write.
options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
writer = tf.io.TFRecordWriter(outfile, options)

while line:

  # Define vectorList, a list of vectors (numpy arrays) for the line

  sentenceRecord = np.vstack(vectorList)
  onmt.inputters.write_sequence_record(sentenceRecord, writer)

writer.close()

The file is definitely compressed (100x compression). I’ll try the hard-coded fix again when I have fewer systems in progress. I don’t think I tried it after the latest update.
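To illustrate why sparse vectors compress so well, here is a minimal sketch using only the Python standard library. It does not write a real TFRecord; it just packs a mostly-zero float32 matrix and compares the raw size with the gzip-compressed size. The 50-step sequence length and the one-non-zero-per-row pattern are assumptions chosen for illustration, so the ratio it produces says nothing about the actual data.

```python
import gzip
import struct

DEPTH = 400       # vector dimensionality mentioned in the thread
TIME_STEPS = 50   # assumed sequence length, for illustration only


def sparse_sequence(time_steps, depth):
    """Build a list of vectors with a single non-zero entry per time step."""
    rows = []
    for t in range(time_steps):
        row = [0.0] * depth
        row[(7 * t) % depth] = 1.0
        rows.append(row)
    return rows


def pack_float32(rows):
    """Serialize the vectors as little-endian float32 values."""
    flat = [value for row in rows for value in row]
    return struct.pack("<%df" % len(flat), *flat)


raw = pack_float32(sparse_sequence(TIME_STEPS, DEPTH))
compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)
```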

Thanks!

Reading compressed TFRecords now works for me, with the changes below in record_inputter.py. Compression options are not cleanly exposed, so this “solution” is bad for people who prefer not to compress with gzip.

  def make_dataset(self, data_file, training=None):
    options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
    first_record = next(compat.tf_compat(v1="python_io.tf_record_iterator")(data_file, options))
    first_record = tf.train.Example.FromString(first_record)
    shape = first_record.features.feature["shape"].int64_list.value
    self.input_depth = shape[-1]
    return tf.data.TFRecordDataset(data_file, compression_type="GZIP")

  def get_dataset_size(self, data_file):
    options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
    return sum(1 for _ in compat.tf_compat(v1="python_io.tf_record_iterator")(data_file, options))
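
Since the hard-coded GZIP setting breaks uncompressed files, one possible workaround is to guess the compression from the first bytes of the file. The sketch below is a heuristic under stated assumptions, not part of OpenNMT-tf: `guess_compression_type` and `_write_temp` are hypothetical helpers, the gzip check relies on the standard magic bytes `1f 8b`, and the zlib check only inspects the two-byte header, which an uncompressed file could match by chance.

```python
import gzip
import os
import tempfile
import zlib


def guess_compression_type(path):
    """Guess a compression string ("GZIP", "ZLIB" or "") from the file header."""
    with open(path, "rb") as handle:
        header = handle.read(2)
    if header == b"\x1f\x8b":
        return "GZIP"
    # A zlib stream starts with a CMF byte whose low nibble is 8 (deflate),
    # and its 16-bit header is a multiple of 31. This is only a heuristic.
    if (len(header) == 2 and header[0] & 0x0F == 8
            and (header[0] * 256 + header[1]) % 31 == 0):
        return "ZLIB"
    return ""


def _write_temp(payload):
    """Write bytes to a temporary file and return its path."""
    handle, path = tempfile.mkstemp()
    with os.fdopen(handle, "wb") as out:
        out.write(payload)
    return path


# A fake uncompressed payload: an 8-byte little-endian length, then data.
plain = (16).to_bytes(8, "little") + b"\x00" * 16
paths = {
    "": _write_temp(plain),
    "GZIP": _write_temp(gzip.compress(plain)),
    "ZLIB": _write_temp(zlib.compress(plain)),
}
guesses = {expected: guess_compression_type(path) for expected, path in paths.items()}
for path in paths.values():
    os.remove(path)
```

The returned string uses the same values that the compression_type argument of tf.data.TFRecordDataset accepts, so it could replace the hard-coded "GZIP" in that call.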