TinderZ

👋 Hi, I’m TinderZ

🔍 Research Area	📖 Description
🧠 LLM Reasoning	Exploring the mechanisms and limits of LLMs in complex logical reasoning, including test-time scaling and reinforcement learning for LLMs (RL4LLM).
⚙️ Auxiliary Policy Model	Integrating lightweight auxiliary policy models, trained via Reinforcement Learning, into LLM architectures to enhance model capabilities.
🤖 MARL	Focusing on multi-agent systems, including both LLM-based and traditional agents, to address coordination problems and game-theoretic equilibria.

I'm always open to collaboration and discussion on these topics.
Email: b23042510@njupt.edu.cn ✅
Feel free to reach out if you share similar interests or have exciting projects in mind!