Fix DeepSpeed model preparation logic in Trainer class #43780
Conversation
@ArthurZucker @Cyrilvallez can we have this in the next patch release please (if any)?
albertvillanova left a comment:
Thanks for the fix. Just a question below.
```python
if self.is_deepspeed_enabled:
    from accelerate.utils import DummyScheduler
    ...
    if isinstance(self.lr_scheduler, DummyScheduler):
```
Not sure why we need this condition first.
This is a special case: here the lr_scheduler is created by DeepSpeed rather than by us / users.
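The dispatch this implies can be sketched as follows. This is a minimal, self-contained illustration, not the actual Trainer code: `DummyScheduler` here is a stand-in for `accelerate.utils.DummyScheduler`, and the `objects_to_prepare` helper is hypothetical.

```python
class DummyScheduler:
    """Stand-in for accelerate.utils.DummyScheduler: a placeholder whose real
    scheduler is created by DeepSpeed from the DeepSpeed config."""


def objects_to_prepare(model, optimizer, lr_scheduler, is_deepspeed_enabled):
    """Return the tuple of objects that should go through accelerator.prepare().

    Hypothetical helper for illustration only.
    """
    if is_deepspeed_enabled and isinstance(lr_scheduler, DummyScheduler):
        # The scheduler is defined in the DeepSpeed config, so it must be
        # passed to prepare() together with the model so DeepSpeed can
        # instantiate the real scheduler.
        return (model, optimizer, lr_scheduler)
    # Otherwise the model still needs to be prepared (the step the regression
    # skipped); the scheduler is handled separately.
    return (model, optimizer)
```

The point of the extra `isinstance` check is only to decide whether the scheduler travels through `prepare()` with the model; the model itself must be prepared in both branches.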
ArthurZucker left a comment:
cc @SunMarc do you know if our tests just did not catch this, or if they are slow tests?
SunMarc left a comment:
Sorry for the regression! The changes make sense.
They are slow tests, but we definitely need to work on improving our tests. Coming soon!
…3780)

Fix deepspeed model preparation logic in Trainer class

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
The changes in #43711 caused the model to never be prepared when using DeepSpeed, which leads to an error during training. This PR fixes it.