An idf is constant per corpus and accounts for the ratio of documents that include the term "this". In this case, we have a corpus of two documents, and both of them include the term "this", so its idf is log(2/2) = 0.

Because Dataset.repeat does not signal where one epoch ends and the next begins, a Dataset.batch applied right after Dataset.repeat will generate batches that straddle epoch boundaries.
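The effect of batching after repeating can be sketched without TensorFlow. The block below is a pure-Python stand-in, not the tf.data implementation: the helper names `repeat` and `batch` are hypothetical and only mimic the behavior described above, assuming an epoch of five elements, two epochs, and a batch size of three.

```python
from itertools import chain, islice


def repeat(epoch, count):
    """Concatenate `count` passes over `epoch` with no marker between passes."""
    return chain.from_iterable(epoch for _ in range(count))


def batch(items, batch_size):
    """Group consecutive items into lists of at most `batch_size` elements."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


epoch = list(range(5))  # one epoch: 0, 1, 2, 3, 4

# Repeat, then batch: the stream has no epoch marker, so a batch can mix
# the tail of one epoch with the head of the next.
repeat_then_batch = list(batch(repeat(epoch, 2), 3))

# Batch, then repeat: each epoch is batched on its own, so no batch
# crosses an epoch boundary (the last batch of each epoch is shorter).
batch_then_repeat = list(repeat(list(batch(epoch, 3)), 2))

print(repeat_then_batch)
print(batch_then_repeat)
```

In this sketch the second batch of `repeat_then_batch` contains elements from both epochs, while `batch_then_repeat` keeps them apart. In tf.data the corresponding orderings would be `dataset.repeat(2).batch(3)` and `dataset.batch(3).repeat(2)`.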