[Docs] Complete missing Llama4 configuration docs #43460
Cyrilvallez merged 5 commits into huggingface:main
Conversation
Pull request overview
This PR completes missing documentation by replacing TODO placeholders in the Llama4 configuration file with proper parameter descriptions. The changes add documentation for MoE (Mixture of Experts) parameters in Llama4TextConfig and Vision Projector settings in Llama4VisionConfig.
Changes:
- Added documentation for vision projector parameters (vision_feature_select_strategy, pixel_shuffle_ratio, projector dimensions, dropout settings)
- Added documentation for MoE parameters (num_experts_per_tok, num_local_experts, moe_layers, router settings)
- Added documentation for attention-related parameters (attention_dropout, use_qk_norm, attention_chunk_size, floor_scale, attn_scale)
- Applied automatic code formatting via `make style` to several lines
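To make the documented MoE parameters concrete, here is a simplified sketch of top-k expert routing, the mechanism that `num_experts_per_tok` (k) and `num_local_experts` (the number of experts to choose from) control. This is an illustration of the general technique, not the actual Llama4 routing code:

```python
def route_tokens(router_logits, num_experts_per_tok):
    """Pick the top-k experts for each token from raw router logits.

    `router_logits` has one row per token and one column per local expert
    (so the row width plays the role of `num_local_experts`). Simplified
    illustration only -- the real Llama4 router differs in detail.
    """
    routed = []
    for logits in router_logits:
        # Rank expert indices by logit, highest first, and keep the top k.
        ranked = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
        routed.append(ranked[:num_experts_per_tok])
    return routed


# 2 tokens, 4 local experts, 1 expert per token
logits = [[0.1, 2.0, -1.0, 0.3],
          [1.5, 0.0, 0.2, 1.4]]
print(route_tokens(logits, num_experts_per_tok=1))  # [[1], [0]]
```

Increasing `num_experts_per_tok` sends each token through more experts (more compute per token), while `num_local_experts` sets how many experts exist to route between.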
Update: I fixed the type hints. However, I noticed that the failing tests are due to an unrelated issue.
stevhliu left a comment
Thanks for filling in the TODOs!
I think it'd be even more helpful if you could add to each argument description what effect changing that specific arg has. For example, what does increasing or decreasing `pixel_shuffle_ratio` do?
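As background to the reviewer's question, a sketch of the usual pixel-shuffle trade-off that a `pixel_shuffle_ratio` controls. The formula and numbers below are illustrative of the general technique, not taken from the Llama4 implementation:

```python
def pixel_shuffle_shape(num_patches, hidden_dim, pixel_shuffle_ratio):
    """Show how a pixel-shuffle ratio r trades tokens for channels.

    A ratio r merges neighboring patches, so the token count shrinks by a
    factor of r**2 while the per-token channel dimension grows by 1/r**2
    (the total number of features is preserved). Simplified illustration,
    not the actual Llama4 vision code.
    """
    r = pixel_shuffle_ratio
    new_patches = int(num_patches * r * r)
    new_dim = int(hidden_dim / (r * r))
    return new_patches, new_dim


# With r = 0.5: 4x fewer vision tokens, each 4x wider
print(pixel_shuffle_shape(576, 1024, 0.5))  # (144, 4096)
```

Under this reading, decreasing the ratio means fewer but higher-dimensional vision tokens are handed to the projector, reducing sequence length at the cost of per-token width.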
Force-pushed from ee21085 to 2ff1d8a
Thanks for the review, Steven! I've updated the vague parameter descriptions. I rebased on main, so it's ready for a re-review!
stevhliu
left a comment
Super nice! Made some suggestions to make them a bit more concise :)
Force-pushed from 2ff1d8a to 4b94610
Thanks for the better wording! I've applied the suggestions to each one of them and amended the previous commit to keep the commit history clean.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@ydshieh, can you help merge this please? This PR adds missing arg descriptions to docstrings, and the failing tests are unrelated 🙏
[For maintainers] Suggested jobs to run (before merge): `run-slow: llama4`
What does this PR do?
This PR addresses the `TODO` placeholders left in `src/transformers/models/llama4/configuration_llama4.py`. The `Llama4VisionConfig` and `Llama4TextConfig` classes contained several "TODO" markers in their docstrings (specifically for MoE parameters and Vision Projector settings). I have replaced these with standard parameter descriptions consistent with the MoE architecture and similar models. I also ran `make style`, which automatically reformatted a few lines in the file.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
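For readers unfamiliar with the docstring format this PR fills in, here is a hypothetical example of the shape such an entry takes in transformers config classes. The parameter names come from the PR summary above, but the wording and default values are illustrative, not the exact text that was merged:

```python
class ExampleConfig:
    r"""Illustrative config docstring in the transformers `Args:` style.

    The parameter names below are those documented by this PR; the
    descriptions and defaults here are hypothetical placeholders.

    Args:
        num_experts_per_tok (`int`, *optional*, defaults to 1):
            Number of experts each token is routed to in MoE layers.
        pixel_shuffle_ratio (`float`, *optional*, defaults to 0.5):
            Downsampling ratio applied to vision tokens before the projector.
    """


# The docstring is plain class metadata, inspectable at runtime.
print("num_experts_per_tok" in ExampleConfig.__doc__)  # True
```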
Who can review?
@ArthurZucker @Cyrilvallez @stevhliu