Citation: LI Wei, LI Qiusheng, LI Tianyi, LU Min, LIU Haixuan. Reinforcement Learning-based Heterogeneous Business Resource Allocation Method[J]. Electric Power Information and Communication Technology, 2024, 22(12): 40-48. DOI: 10.16543/j.2095-641x.electric.power.ict.2024.12.06

Reinforcement Learning-based Heterogeneous Business Resource Allocation Method

Abstract: In 5G and beyond wireless networks, enhanced mobile broadband (eMBB) and ultra-reliable low-latency communication (URLLC) are two core services with different quality-of-service requirements, and making them coexist on limited radio resources is a challenging problem. Building on the puncturing technique introduced by 3GPP, this paper proposes an intelligent resource allocation framework that models eMBB/URLLC resource allocation as an optimization problem: maximize the eMBB user data rate subject to the URLLC reliability constraint. To account for wireless channel uncertainty and the impact of random URLLC puncturing, a proportional fairness (PF) algorithm is introduced to trade off total throughput against user fairness, and a PF-based deep reinforcement learning algorithm, the proportional fairness-twin delayed deep deterministic policy gradient algorithm (PF-TD3A), is proposed to allocate resources to the two services intelligently. Experimental results show that the proposed algorithm further increases the eMBB user data rate, by approximately 7.4% on average, while meeting the eMBB reliability requirements.
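The throughput-versus-fairness trade-off attributed to the PF algorithm can be illustrated with the textbook proportional fair scheduling rule: in each slot, serve the user with the largest ratio of instantaneous rate to long-term average rate, then update the averages with an exponential moving average. The sketch below is a generic illustration only and is not the PF-TD3A algorithm of the paper; the function names, the averaging window, and the exponential fading model are all assumptions introduced here.

```python
import random


def pf_schedule(inst_rates, avg_rates, eps=1e-9):
    """Return the index of the user with the largest PF metric r_i / R_i.

    inst_rates: achievable rate of each user in the current slot.
    avg_rates:  long-term average served rate of each user.
    """
    return max(
        range(len(inst_rates)),
        key=lambda i: inst_rates[i] / (avg_rates[i] + eps),
    )


def update_averages(avg_rates, inst_rates, scheduled, window=100.0):
    """Exponential moving average; only the scheduled user is credited."""
    beta = 1.0 / window
    return [
        (1.0 - beta) * avg + beta * (inst_rates[i] if i == scheduled else 0.0)
        for i, avg in enumerate(avg_rates)
    ]


def simulate(mean_rates, slots=2000, window=100.0, seed=0):
    """Run PF scheduling over hypothetical exponentially distributed rates."""
    rng = random.Random(seed)
    avg = [1.0] * len(mean_rates)  # arbitrary positive starting averages
    served = [0] * len(mean_rates)
    for _ in range(slots):
        # Assumed channel model: per-slot rate drawn around each user's mean.
        inst = [rng.expovariate(1.0 / m) for m in mean_rates]
        k = pf_schedule(inst, avg)
        served[k] += 1
        avg = update_averages(avg, inst, k, window)
    return served, avg


if __name__ == "__main__":
    served, avg = simulate([10.0, 2.0])
    print("slots served per user:", served)
    print("average rate per user:", [round(a, 3) for a in avg])
```

A pure max-rate scheduler would almost always pick the user with the stronger channel. Dividing by the running average raises the priority of a user that has been starved, which is the fairness side of the trade-off mentioned in the abstract.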

     
