[Docs] Complete missing Llama4 configuration docs#43460

Merged
Cyrilvallez merged 5 commits into huggingface:main from udaymehta:fix-llama4-docstrings
Feb 2, 2026
Conversation

@udaymehta
Contributor

What does this PR do?

This PR addresses the TODO placeholders left in src/transformers/models/llama4/configuration_llama4.py.

The Llama4VisionConfig and Llama4TextConfig classes contained several "TODO" markers in their docstrings (specifically for MoE parameters and Vision Projector settings). I have replaced these with standard parameter descriptions consistent with the MoE architecture and similar models.
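For context, parameter docstrings in transformers follow a standard convention (name, backtick-quoted type, optionality, then an indented description). A hypothetical before/after for one of the MoE parameters might look like this; `num_experts_per_tok` is a real field named in this PR, but the wording below is illustrative, not the exact merged text:

```python
# Hypothetical before/after for one MoE parameter docstring entry.
# `num_experts_per_tok` is a real Llama4TextConfig field; the description
# shown is illustrative, not the exact text merged in this PR.
before = """\
num_experts_per_tok (`int`, *optional*):
    TODO
"""
after = """\
num_experts_per_tok (`int`, *optional*):
    The number of experts selected per token by the MoE router.
"""
```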

I also ran make style, which automatically reformatted a few lines in the file.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @Cyrilvallez @stevhliu

Copilot AI review requested due to automatic review settings January 24, 2026 07:17

Copilot AI left a comment

Pull request overview

This PR completes missing documentation by replacing TODO placeholders in the Llama4 configuration file with proper parameter descriptions. The changes add documentation for MoE (Mixture of Experts) parameters in Llama4TextConfig and Vision Projector settings in Llama4VisionConfig.

Changes:

  • Added documentation for vision projector parameters (vision_feature_select_strategy, pixel_shuffle_ratio, projector dimensions, dropout settings)
  • Added documentation for MoE parameters (num_experts_per_tok, num_local_experts, moe_layers, router settings)
  • Added documentation for attention-related parameters (attention_dropout, use_qk_norm, attention_chunk_size, floor_scale, attn_scale)
  • Applied automatic code formatting via make style to several lines
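As a rough illustration of how two of the documented MoE parameters interact, the sketch below shows top-k routing: `num_local_experts` is the size of the expert pool and `num_experts_per_tok` is the routing width. This is a toy sketch, not the transformers implementation:

```python
# Illustrative sketch (not the transformers code) of MoE top-k routing.
# num_local_experts = size of the expert pool (len of the logits vector);
# num_experts_per_tok = how many experts each token is dispatched to.
import math

def route_token(router_logits, num_experts_per_tok):
    """Pick the top-k experts for one token; return (indices, weights)."""
    probs = [math.exp(l) for l in router_logits]
    total = sum(probs)
    probs = [p / total for p in probs]  # softmax over the expert pool
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    topk = topk[:num_experts_per_tok]
    norm = sum(probs[i] for i in topk)
    # renormalize so the selected experts' weights sum to 1
    return topk, [probs[i] / norm for i in topk]

# 4 local experts, 2 experts per token
indices, weights = route_token([0.1, 2.0, -1.0, 1.5], num_experts_per_tok=2)
# indices -> [1, 3]; weights sum to 1.0
```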

@udaymehta
Contributor Author

Update: I fixed the type hints (from `int` to `float` where appropriate) and the `moe_layers` type based on the Copilot suggestions.

However, I noticed some tests failing due to `PIL.UnidentifiedImageError` and timeout errors. These seem unrelated to the docstring changes in this PR.

Member

@stevhliu stevhliu left a comment

thanks for filling in the todo's!

i think it'd be even more helpful if you could add to the argument description what effect changing a specific arg has. for example, what does increasing or decreasing pixel_shuffle_ratio do?

@udaymehta udaymehta force-pushed the fix-llama4-docstrings branch from ee21085 to 2ff1d8a Compare January 26, 2026 05:40
@udaymehta
Contributor Author

for example, what does increasing or decreasing pixel_shuffle_ratio do?

Thanks for the review, Steven!

I've updated the vague parameters (including pixel_shuffle_ratio) to explicitly explain how changing the values impacts the behavior.
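As a quick illustration of the trade-off (hypothetical numbers, not the transformers implementation): pixel shuffle with ratio `r` scales each spatial dimension of the vision token grid by `r` and the channel dimension by `1 / r**2`, so a smaller ratio means fewer, wider tokens while the total volume of values is preserved.

```python
# Hypothetical sketch (not the library code) of how pixel_shuffle_ratio
# reshapes the vision token grid: (h, w, c) -> (h*r, w*r, c / r**2).
def pixel_shuffle_shape(h, w, c, ratio):
    # the scaled spatial dims must stay integral for the shuffle to be valid
    assert (h * ratio).is_integer() and (w * ratio).is_integer()
    return int(h * ratio), int(w * ratio), int(c / ratio**2)

# ratio=0.5 quarters the token count and quadruples the channel dim
print(pixel_shuffle_shape(24, 24, 1024, 0.5))  # (12, 12, 4096)
```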

I rebased on main, so it's ready for a re-review!

Member

@stevhliu stevhliu left a comment

super nice! made some suggestions to make them a bit more concise :)

@udaymehta udaymehta force-pushed the fix-llama4-docstrings branch from 2ff1d8a to 4b94610 Compare January 26, 2026 17:48
@udaymehta
Contributor Author

Thanks for the better wording! I've applied the suggestions to each of them and amended the previous commit to keep the commit history clean.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@stevhliu stevhliu left a comment

lgtm, thanks!

@stevhliu
Member

@ydshieh, can you help merge this please? this pr adds missing arg descriptions to docstrings and the failing tests are unrelated 🙏

@Cyrilvallez Cyrilvallez enabled auto-merge (squash) February 2, 2026 16:13
@github-actions
Contributor

github-actions bot commented Feb 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: llama4

@Cyrilvallez Cyrilvallez disabled auto-merge February 2, 2026 16:52
@Cyrilvallez Cyrilvallez merged commit 0294359 into huggingface:main Feb 2, 2026
22 of 25 checks passed
@udaymehta udaymehta deleted the fix-llama4-docstrings branch March 3, 2026 04:14