Decentralized LLM collaboration lets specialized models divide complex tasks according to their strengths, run inference in parallel to improve efficiency, and be deployed more flexibly. We introduce a theory of fully decentralized LLM collaboration and propose several multi-agent reinforcement learning (MARL) algorithms for optimizing it. We also develop simplified applications as a proof of concept that decentralized small language models (SLMs) fine-tuned with MARL can collaborate effectively and match or exceed the performance of a single LLM [ref1], [ref2].
Mean-field game (MFG) theory was developed to study decision-making strategies in multi-agent systems with very large populations by connecting stochastic modeling with distributed control. In autonomous vehicle navigation, each vehicle acts as an agent and makes decisions about velocity control and route choice [ref] according to the current population density distribution. The actions of all vehicles jointly drive the evolution of the density, which in turn shapes the vehicles' next decisions. This process repeats until it converges to the mean-field equilibrium. We have proposed several approaches to address practical challenges such as fine granularity [ref], scalability, and computational efficiency [ref1], [ref2].
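The loop described above (agents respond to the density, the density responds to the agents, repeat until nothing changes) can be illustrated with a deliberately small sketch. It is not the method of the cited work: it assumes a static route-choice setting rather than full velocity control, a linear congestion cost, a logit (softmax) response in place of an exact best response, and a damped fixed-point update. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def route_costs(density, free_flow, congestion):
    """Cost of each route: free-flow time plus a linear congestion term (assumed model)."""
    return [t0 + k * m for t0, k, m in zip(free_flow, congestion, density)]


def soft_best_response(costs, temperature):
    """Logit choice over routes: cheaper routes get more mass (assumed response model)."""
    lowest = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def solve_mean_field_equilibrium(free_flow, congestion, temperature=0.1,
                                 step=0.1, tol=1e-10, max_iters=10_000):
    """Damped fixed-point iteration on the population density over routes.

    Returns (density, iterations). The density is an approximate fixed point
    of the response map when the gap falls below tol.
    """
    n = len(free_flow)
    density = [1.0 / n] * n  # start from a uniform split across routes
    for iteration in range(max_iters):
        costs = route_costs(density, free_flow, congestion)
        response = soft_best_response(costs, temperature)
        # Gap between how agents want to distribute and how they are distributed.
        gap = max(abs(r - m) for r, m in zip(response, density))
        if gap < tol:
            return density, iteration
        # Move the density a small step toward the agents' response.
        density = [(1 - step) * m + step * r for m, r in zip(density, response)]
    return density, max_iters


if __name__ == "__main__":
    # Two hypothetical routes: route 0 is faster when empty but congests more.
    density, iterations = solve_mean_field_equilibrium(
        free_flow=[1.0, 2.0], congestion=[2.0, 1.0])
    print("density:", density, "iterations:", iterations)
```

The damping factor `step` matters here: jumping straight to the response can oscillate between routes, while a small step lets the density settle. A dynamic formulation with velocity control would replace the one-shot cost with a value function evolving over time, which this sketch does not attempt.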
