Proximal Policy Optimization-based Bidding Strategy for Thermal Power Generators Participating in Energy and Frequency Regulation Markets
ZHANG Bin, CAO Fan, XIAO Kun, SONG Yin, GUO Ying, YE Yujian, XU Dezhi
Electric Power Construction ›› 2026, Vol. 47 ›› Issue (4) : 82-92.
[Objective] With China’s ongoing electricity market reforms and the pursuit of carbon peaking and carbon neutrality goals, renewable energy penetration in the power system is increasing rapidly. While this supports the clean energy transition, it also introduces marked electricity price volatility and market uncertainty, which greatly complicates bidding strategy development for power producers, particularly those relying on spot trading. To help traditional thermal power enterprises and other diverse market players develop optimal bidding strategies in the joint energy and frequency regulation ancillary services market, a bidding strategy optimization method based on proximal policy optimization (PPO) is proposed. [Methods] First, a bi-level optimization model is established for the joint energy-frequency regulation market, integrating multiple generation types and renewable energy with storage. Storage smooths price fluctuations through charge-discharge control, improving the risk response capability of market players such as wind-storage coalitions. In this framework, the upper-level power producers develop bidding strategies that maximize profit, while the lower-level market clearing model performs joint dispatch with the objective of minimizing system operating costs. Second, the bidding problem is formulated as a Markov decision process (MDP) within a deep reinforcement learning (DRL) framework, where the PPO algorithm is employed to achieve autonomous learning and dynamic optimization of bidding strategies. [Results] Comparative analysis against the theoretical optimal solution in typical cases demonstrates that the proposed approach effectively increases thermal power enterprises’ revenues, mitigates the risks resulting from renewable energy price fluctuations, reduces system operating costs, and enhances frequency regulation efficiency.
[Conclusions] Compared with benchmark solutions, the proposed approach demonstrates superior economic performance and higher real-time computational efficiency in a joint market.
power generator bidding / electricity market risk response / deep reinforcement learning (DRL) / proximal policy optimization (PPO) / actor-critic architecture
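The abstract names PPO but does not state its loss function. As a minimal sketch, the standard clipped surrogate objective of PPO is shown below; it may or may not match the exact variant used in the paper. The function name `ppo_clip_objective`, the default `clip_eps=0.2`, and the sample inputs are illustrative assumptions, not values taken from the article.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the sampled bidding actions
    under the current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(ln - lo)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Illustrative call with made-up numbers: one sample whose probability
# rose sharply under the new policy and whose advantage is positive.
example = ppo_clip_objective([1.0], [0.0], [1.0])
```

With a positive advantage the gain from raising an action's probability is capped at `1 + clip_eps`, which limits how far a single update can move the bidding policy; with a negative advantage the unclipped term is kept, so a harmful change is still penalized in full.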
Conflict of Interests: All authors declare that there is no conflict of interest.