Generation config boolean defaults (#43000)
run-slow: llama, gemma2, qwen2, bart, t5, whisper

This comment contains models: ["models/bart", "models/gemma2", "models/llama", "models/qwen2", "models/t5", "models/whisper"]

CI Results: Model CI Report - ❌ Failed tests
```python
if key == "use_cache":
    continue  # common key for most models
```
btw, could we remove use_cache as a key from config and instead keep it only in generation_config? Imo it isn't really a model attribute
Hmm unsure. I understand the point but that would be hugely breaking for a lot of users and external libraries - on the other hand we could include this for v5 if you feel motivated 🤔
it will definitely break stuff 😿 Just a thought for now
So, this is a workaround we have so that we don't throw errors when a model config contains generation params. We won't throw errors anyway if use_cache is in the model config's own signature, but unfortunately many models don't have it documented.
Gotcha, maybe at some point we can 👀
src/transformers/trainer.py
Outdated
```python
if getattr(self.model, "config", None) is not None:
    self.model.config.use_cache = self.args.use_cache
if getattr(self.model, "generation_config", None) is not None:
    self.model.generation_config.use_cache = self.args.use_cache
```
The model config might not contain use_cache unless it is in the config's own init; we don't assign generation attributes in the base config class anymore. So checking generation_config is more reliable.
What's the current logic here then? kwarg > config > generation config? Seems like another case of bad relations here 😭
It does feel a bit out of place tho to me. I'd like to keep this PR strictly to the default values, this semi progresses the point of removing reliance on use_cache in the config. On another note, we use the check_model_inputs decorator with the config use_cache so it might be hard to disentangle this completely
We are not removing use_cache yet in this PR. I hacked around it with the above check so that use_cache isn't included in the generation params.
I will revert it back, I forgot to do so after testing some stuff.
The logic stays as is, so yes, kwarg > config > generation config
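For what it's worth, that priority can be sketched in plain Python (`resolve_use_cache` and `Cfg` are hypothetical stand-ins for illustration, not transformers APIs):

```python
# Hypothetical helper, not part of transformers: shows the resolution order
# kwarg > model config > generation config for use_cache.
def resolve_use_cache(kwargs, config, generation_config):
    if "use_cache" in kwargs:                             # 1. explicit kwarg wins
        return kwargs["use_cache"]
    if getattr(config, "use_cache", None) is not None:    # 2. then the model config
        return config.use_cache
    return getattr(generation_config, "use_cache", None)  # 3. then the generation config


class Cfg:
    """Minimal stand-in for a config object in this demo."""
    def __init__(self, **kw):
        self.__dict__.update(kw)


print(resolve_use_cache({"use_cache": False}, Cfg(use_cache=True), Cfg(use_cache=True)))  # False
print(resolve_use_cache({}, Cfg(), Cfg(use_cache=True)))                                  # True
```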
vasqu
left a comment
Left a few comments. I'd be pro reverting the use_cache changes (at least for now) - see my last comment.
Also, do we still need transformers/src/transformers/generation/utils.py, lines 1800 to 1812 in a616b91?
I added this only because these values were not None in their defaults, but now we should update them properly without that workaround, no?
Last note, semi-unrelated, let's also keep track of https://github.com/huggingface/transformers/pull/42702/files#r2635806800 in another PR cc @ebezzam
```diff
 generation_mode = GenerationMode.CONSTRAINED_BEAM_SEARCH
 elif self.num_beams is None or self.num_beams == 1:
-    if not self.do_sample:
+    if self.do_sample is not True:
```
```suggestion
if self.do_sample is False:
```
super nit, seen this multiple times
Actually, it should be if not self.do_sample haha
`if not do_sample` will result in false positives when do_sample is None 😢 Gotta explicitly check for `is not True`, since a value of None is the same as having False, i.e. the new default value.
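A quick illustration of the subtlety (plain Python, not tied to the actual GenerationConfig class): with None as the new default, an `is False` check would miss the unset case, while `is not True` treats None the same as False:

```python
# For each possible value of do_sample (True, False, or the new default None),
# compare two ways of detecting the "greedy" branch.
results = {}
for do_sample in (True, False, None):
    results[do_sample] = {
        "is False": do_sample is False,        # misses the unset (None) case
        "is not True": do_sample is not True,  # treats None the same as False
    }

print(results[None])  # {'is False': False, 'is not True': True}
```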
Can we add a small comment then, that's such a weird case lol.
Are these changes really relevant? Seems unrelated to me
Some tests were failing without it because the re-loaded model has an incorrect config type, and we couldn't find generation keys in it.
It was luckily passing on the main branch.
Ok, then let's keep it. Not wrong, I was just surprised.
vasqu
left a comment
LGTM, could you just double check if we still need
transformers/src/transformers/generation/utils.py
Lines 1800 to 1812 in a616b91
Now that these values are defaulting to None as well, I suspect we might not need it anymore
my bad, didn't see the comment on it. Not needed!

[For maintainers] Suggested jobs to run (before merge): run-slow: bart, clvp
```diff
-self.num_return_sequences = kwargs.pop("num_return_sequences", 1)
-self.output_attentions = kwargs.pop("output_attentions", False)
-self.output_hidden_states = kwargs.pop("output_hidden_states", False)
-self.output_scores = kwargs.pop("output_scores", False)
-self.output_logits = kwargs.pop("output_logits", False)
-self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", False)
+self.num_return_sequences = kwargs.pop("num_return_sequences", None)
+self.output_attentions = kwargs.pop("output_attentions", None)
+self.output_hidden_states = kwargs.pop("output_hidden_states", None)
+self.output_scores = kwargs.pop("output_scores", None)
+self.output_logits = kwargs.pop("output_logits", None)
+self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", None)
```
Sorry, I'm a bit lost on the motivation for this PR and for making those values None (which is usually not a super good idea if they are boolean by essence) - could you give me a bit of context on the PR?
The motivation was to fix a ton of edge cases when creating a generation config. In the past we had ugly workarounds to allow users to change a generation param back to its default if a different value was saved on the Hub (for ex #42762).
That was fixed in #42702, where I deleted a lot of code and defined a clear priority list. The next step will be to disallow passing both a generation config and kwargs, and instead ask users to put all kwargs in the config. The below is accepted right now and is bad practice imo:
```python
conf = GenerationConfig(do_sample=False)
text = model.generate(input_ids, generation_config=conf, max_new_tokens=128, temperature=1.6)
```
What does this PR do?
Continues #42958 and makes sure the opposite case, when the config has a non-default value saved but the user wants to set it back to a default boolean, is also covered.
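To make that concrete, here is a minimal sketch of the idea (the `resolve` helper and hard-coded defaults are hypothetical, not the actual transformers implementation): with None meaning "unset", a user can force a boolean back to its library default even when a non-default value was saved:

```python
# Library-level defaults (illustrative subset only)
HARD_DEFAULTS = {"do_sample": False, "output_scores": False}

def resolve(name, user_value, saved_config):
    """None means 'unset': user value > saved config value > library default."""
    if user_value is not None:
        return user_value
    saved = saved_config.get(name)
    if saved is not None:
        return saved
    return HARD_DEFAULTS[name]

saved = {"do_sample": True}  # a non-default value saved on the Hub

print(resolve("do_sample", False, saved))     # False: user forces it back to the default
print(resolve("do_sample", None, saved))      # True: unset, so the saved value wins
print(resolve("output_scores", None, saved))  # False: nothing set anywhere, library default
```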