Sunday, October 15, 2023

"A decoder-only architecture is being used, but right-padding was detected!"

I was playing with Hugging Face transformers and kept getting the warning "A decoder-only architecture is being used, but right-padding was detected!". I finally found a solution in a StackOverflow reply, credited at the end:

Padding in this context refers to tokenizer.eos_token. You are currently adding it to the right of the user input, and the warning is saying that, for correct results, the padding should go on the left. You need to do this:

new_user_input_ids = tokenizer.encode(tokenizer.eos_token + input(">> User:"), return_tensors='pt')
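To see why the order matters, here is a minimal sketch in plain Python with made-up token ids. It rests on two assumptions that the reply does not spell out: that the pad token is set to the same id as the EOS token, and that the warning is triggered when the last token of the input equals the pad token. The name ends_with_pad and the ids are hypothetical and not part of transformers.

```python
# Toy illustration with made-up token ids; this is not the transformers API.
EOS_ID = 50256   # hypothetical id standing in for tokenizer.eos_token
PAD_ID = EOS_ID  # assumption: the pad token reuses the EOS id


def ends_with_pad(input_ids, pad_id=PAD_ID):
    """Return True if the final token is the pad id (the input looks right-padded)."""
    return len(input_ids) > 0 and input_ids[-1] == pad_id


user_tokens = [15496, 11, 703]       # pretend encoding of the user's text

eos_after = user_tokens + [EOS_ID]   # text + eos_token
eos_before = [EOS_ID] + user_tokens  # eos_token + text

print(ends_with_pad(eos_after))   # True
print(ends_with_pad(eos_before))  # False
```

Under those assumptions, putting the eos_token first means the input no longer ends in something that looks like padding.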

While I originally thought the fix was to set the parameter padding_side='left', it turned out to be about the order in which the input and the eos_token are concatenated.
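For comparison, this is roughly what a left or right padding side means when several inputs of different lengths are batched together. The sketch is plain Python with a hypothetical helper pad_batch and made-up ids; it is not how the tokenizer is implemented, only an illustration of where the pad tokens end up.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad every sequence to the longest length; return (ids, attention_mask)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max(len(seq) for seq in sequences)
    ids, mask = [], []
    for seq in sequences:
        pad = [pad_id] * (width - len(seq))
        real = [1] * len(seq)    # 1 marks a real token
        zeros = [0] * len(pad)   # 0 marks a padding position
        if side == "left":
            ids.append(pad + list(seq))
            mask.append(zeros + real)
        else:
            ids.append(list(seq) + pad)
            mask.append(real + zeros)
    return ids, mask


batch = [[5, 6, 7], [8]]  # made-up token ids
left_ids, left_mask = pad_batch(batch, pad_id=0, side="left")
right_ids, _ = pad_batch(batch, pad_id=0, side="right")

print(left_ids)   # [[5, 6, 7], [0, 0, 8]]
print(right_ids)  # [[5, 6, 7], [8, 0, 0]]
```

With left padding every row ends in a real token, so a decoder-only model continues from the text rather than from padding.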

Thank you user Travis Thayer on StackOverflow:
https://stackoverflow.com/questions/74748116/huggingface-automodelforcasuallm-decoder-only-architecture-warning-even-after/74972288#74972288
