Types of generalizations learned by NMT?

What types of generalizations do we expect NMT to learn?
Is there a good paper on this?

E.g. 1: Suppose we train on the following sentences:

I like dogs.
They are cute.
I eat sandwiches because they are tasty.

plus other examples that use the remaining words in the sandwich sentence.

Do we expect it to correctly translate:

I like dogs because they are cute.

If so, how exactly do we think it is doing this?
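
To make example 1 concrete, here is a minimal sketch that checks what is actually new in the test sentence relative to the training sentences. The tokenizer and helper names (tokens, bigrams) are my own assumptions for illustration, not part of any NMT toolkit. Under this crude word-level view, every word in the test sentence has been seen, and the only unseen adjacent pair is "dogs because", so the question is whether the model can cross that one new junction.

```python
import re

def tokens(sentence):
    # Crude lowercase word tokenizer (an assumption; real systems use subwords).
    return re.findall(r"[a-z']+", sentence.lower())

def bigrams(toks):
    # Set of adjacent word pairs in a token list.
    return set(zip(toks, toks[1:]))

train = [
    "I like dogs.",
    "They are cute.",
    "I eat sandwiches because they are tasty.",
]
test = "I like dogs because they are cute."

# Words in the test sentence that never occur in training.
train_vocab = {t for s in train for t in tokens(s)}
unseen_words = [t for t in tokens(test) if t not in train_vocab]

# Adjacent word pairs in the test sentence that never occur in training.
train_bigrams = set().union(*(bigrams(tokens(s)) for s in train))
unseen_bigrams = sorted(bigrams(tokens(test)) - train_bigrams)

print("unseen words:", unseen_words)
print("unseen bigrams:", unseen_bigrams)
```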

E.g. 2: Suppose it has never seen a ‘because’ sentence with LONG clauses on both sides of the ‘because’, but it has seen ‘because’ sentences with SHORT clauses. If it can translate the left-hand and right-hand LONG clauses correctly on their own, will it also translate a ‘because’ sentence that joins those LONG clauses correctly?

Again, how exactly do we think it is doing this?
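
One way to test example 2 empirically would be to build a train/held-out split along these lines. This is only a sketch under my own assumptions: the helper names (split_because, bucket), the word-count threshold max_short=4, and the toy sentences are all made up for illustration. Training keeps the standalone clauses and the SHORT ‘because’ sentences; the LONG ‘because’ sentences are held out for evaluation.

```python
def clause_len(clause):
    # Clause length in whitespace-separated words.
    return len(clause.split())

def split_because(sentence):
    # Return (left, right) clauses around the first ' because ', or None.
    left, sep, right = sentence.partition(" because ")
    return (left, right) if sep else None

def bucket(sentence, max_short=4):
    # max_short is an arbitrary threshold chosen for this illustration.
    parts = split_because(sentence)
    if parts is None:
        return "no_because"
    if all(clause_len(p) <= max_short for p in parts):
        return "short_because"
    if all(clause_len(p) > max_short for p in parts):
        return "long_because"
    return "mixed_because"  # one short and one long clause; left out of both sets

corpus = [
    "I eat sandwiches because they are tasty",
    "the old dog sleeps on the porch every afternoon",
    "the neighbours always leave the gate wide open",
    "the old dog sleeps on the porch every afternoon because "
    "the neighbours always leave the gate wide open",
]

# Train on standalone clauses and SHORT 'because' sentences only.
train = [s for s in corpus if bucket(s) in {"no_because", "short_because"}]
# Evaluate on 'because' sentences whose clauses are both LONG.
held_out = [s for s in corpus if bucket(s) == "long_because"]

print(len(train), "training sentences;", len(held_out), "held out")
```

Comparing translation quality on held_out against quality on the same LONG clauses translated in isolation would show whether the ‘because’ pattern learned from SHORT clauses transfers.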