Joint Bidding and Pricing Strategy Optimization for Electricity Retailers Based on Multi-agent Policy Gradient

徐弘升, 钱涛, 王珂, 谢章天, 胡秦然

Electric Power Construction, 2025, Vol. 46, Issue 9: 190-202.

DOI: 10.12204/j.issn.1000-7229.2025.09.015
Section: Electric Power Economics


Abstract

[Objective] In an electricity spot market with two-sided bidding, the core business of an electricity retailer is to maximize profit by optimizing its power purchasing and selling strategies. [Methods] For the joint optimization of a retailer's purchase bidding and retail pricing strategies, a Markov decision process is first established. On this basis, a modified counterfactual multi-agent deep reinforcement learning with soft actor-critic algorithm (mCOMA-SAC), a modified counterfactual-baseline multi-agent policy gradient algorithm applicable to continuous action spaces, is proposed and used to solve the joint optimization problem. In addition, the joint bidding and pricing decisions of price-maker and price-taker retailers are compared, and the impact of different degrees of line congestion on the decisions of price-maker retailers is analyzed. [Results] Case study results verify the advantages of mCOMA-SAC in learning performance, computational performance, and optimization performance. [Conclusions] The proposed mCOMA-SAC algorithm can effectively optimize the joint purchase bidding and retail pricing decisions of electricity retailers. It addresses the simulation of strategic purchasing and selling behavior by price-maker retailers in two-sided bidding electricity spot markets and provides methodological support for controlling retailers' market power in the electricity market.

Key words

electricity market / electricity retailer / bidding strategy / pricing strategy / joint strategy optimization / multi-agent reinforcement learning

Cite this article

XU Hongsheng, QIAN Tao, WANG Ke, et al. Joint Bidding and Pricing Strategy Optimization for Electricity Retailers Based on Multi-agent Policy Gradient[J]. Electric Power Construction. 2025, 46(9): 190-202 https://doi.org/10.12204/j.issn.1000-7229.2025.09.015
CLC number: TM73
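
The abstract names a counterfactual baseline as the core of the proposed algorithm but does not give its form. As a rough illustration only, the sketch below shows the counterfactual advantage from the original COMA method (Foerster et al.) for a discrete action set, followed by a sampled variant that is one plausible way to carry the idea over to continuous actions. The sampled variant, and every function and parameter name here, are assumptions made for illustration. This is not the mCOMA-SAC implementation described in the paper.

```python
def counterfactual_advantage(q_values, policy, action):
    """COMA-style advantage for one agent with a discrete action set.

    q_values[k]: centralized critic value Q(s, (u_others, k)), i.e. the joint
                 action value when this agent plays k and all other agents'
                 actions are held fixed.
    policy[k]:   probability this agent's policy assigns to action k.
    action:      index of the action the agent actually took.
    """
    # Baseline: expected Q under the agent's own policy, others held fixed.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - baseline


def sampled_counterfactual_advantage(q_fn, sample_action, action, n_samples=256):
    """Hypothetical Monte Carlo variant for a continuous action (an assumption).

    q_fn(a):         critic value when this agent plays a, others held fixed.
    sample_action(): draws one action from the agent's current policy.
    """
    # Approximate the expectation over the policy by averaging sampled actions.
    samples = [sample_action() for _ in range(n_samples)]
    baseline = sum(q_fn(a) for a in samples) / n_samples
    return q_fn(action) - baseline
```

In both functions the baseline marginalizes out only the agent's own action, so the advantage isolates that agent's contribution to the joint value. With a continuous bid or price the sum over actions is no longer available, which is why the second function falls back on sampling.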


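Funding

National Natural Science Foundation of China (52307085)
Basic Research Program of Jiangsu Province (Natural Science Foundation) (BK20210002)
Open Project of the Key Laboratory of Transport Industry of Comprehensive Transportation Theory (Nanjing Modern Multimodal Transportation Laboratory) (MTF2023002)

Editor: 景贺峰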