EnergyQuantResearch/Optimal-Energy-System-Scheduling-Combining-Mixed-Integer-Programming-and-Deep-Reinforcement-Learning


Optimal Energy System Scheduling Using A Constraint-Aware Reinforcement Learning Algorithm

This repository accompanies the paper Optimal Energy System Scheduling Using A Constraint-Aware Reinforcement Learning Algorithm.

What Is In The Repository

  • MIP_DQN.py Default training entrypoint. It now routes environment interaction through Actor_MIP, which projects actor outputs into the MIP-constrained action space.
  • random_generator_battery.py Single-battery environment used by the default training flow.
  • random_generator_more_battery.py Multi-battery environment variant kept for follow-up experiments.
  • Parameters.py Unit and battery parameter definitions.
  • data/ Historical PV, price, and load time series.

Core Idea

The neural actor proposes a continuous action as usual. Actor_MIP then solves a mixed-integer surrogate problem over the critic network so that the executed action respects the action-space constraint implemented in code:

  • power balance within grid exchange limits

The repository keeps this historical constraint scope; constraints from the paper that were never encoded in code have not been added.
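As a rough illustration of the projection step, the sketch below replaces the Pyomo/OMLT formulation with a simple grid search: it scores candidate battery actions with a stand-in critic and keeps only those whose implied grid exchange satisfies power balance within limits. All names and values here (`project_action`, `GRID_EXCHANGE_MAX`, the sign convention) are illustrative assumptions, not the repository's actual API; the real limits live in `Parameters.py`.

```python
import numpy as np

# Illustrative limits (assumptions, not the values in Parameters.py).
GRID_EXCHANGE_MAX = 100.0   # max grid import/export, kW
BATTERY_POWER_MAX = 50.0    # max battery charge/discharge, kW

def project_action(proposed: float, pv: float, load: float,
                   q_fn=None, n_candidates: int = 201) -> float:
    """Return the feasible battery power with the highest critic value.

    Feasibility (power balance within grid exchange limits), assuming
    positive battery power means discharge:
        grid = load - pv - battery_power,  |grid| <= GRID_EXCHANGE_MAX
    """
    if q_fn is None:
        # Stand-in critic: prefers actions near the actor's proposal.
        q_fn = lambda a: -(a - proposed) ** 2
    candidates = np.linspace(-BATTERY_POWER_MAX, BATTERY_POWER_MAX, n_candidates)
    grid = load - pv - candidates
    feasible = candidates[np.abs(grid) <= GRID_EXCHANGE_MAX]
    scores = np.array([q_fn(a) for a in feasible])
    return float(feasible[int(np.argmax(scores))])
```

In the repository this search is instead posed as a mixed-integer program, with the critic network itself embedded in the model, which is what the pyomo/omlt/onnx dependencies are for.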

Runtime Dependencies

Base training flow:

  • numpy
  • pandas
  • torch

Actor_MIP path:

  • pyomo
  • omlt
  • onnx
  • gurobi

Experiment logging:

  • wandb (optional)

Development tooling:

  • pytest
  • ruff
  • pre-commit

Running The Training Script

The default entrypoint remains:

python MIP_DQN.py

Useful environment overrides for quick checks:

MIP_DQN_RANDOM_SEEDS=1234
MIP_DQN_NUM_EPISODES=1
MIP_DQN_TARGET_STEP=2
MIP_DQN_INITIAL_BUFFER_SIZE=2
MIP_DQN_BATCH_SIZE=1
MIP_DQN_REPEAT_TIMES=1
MIP_DQN_ENABLE_WANDB=0
MIP_DQN_SAVE_NETWORK=0
MIP_DQN_SAVE_RECORDS=0

To bypass the MIP projection path in a lightweight debug run:

MIP_DQN_USE_ACTOR_MIP=0
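These overrides are plain environment variables. The helpers below show one way a script might read them with safe fallbacks; the helper names and default values are illustrative assumptions, not the repository's actual code.

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an integer override from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw is not None else default

def env_flag(name: str, default: bool) -> bool:
    """Treat '0'/'1'-style variables as booleans."""
    raw = os.environ.get(name)
    return default if raw is None else raw not in ("0", "false", "False")

# Placeholder defaults, not the script's real values.
num_episodes = env_int("MIP_DQN_NUM_EPISODES", 1000)
use_actor_mip = env_flag("MIP_DQN_USE_ACTOR_MIP", True)
enable_wandb = env_flag("MIP_DQN_ENABLE_WANDB", False)
```

For a quick smoke test, the overrides can be combined on one command line, e.g. `MIP_DQN_NUM_EPISODES=1 MIP_DQN_ENABLE_WANDB=0 python MIP_DQN.py`.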

Development Checks

ruff check .
pytest
pre-commit run --all-files

Common Failure Modes

  • Actor_MIP requires the 'gurobi' solver... The MIP projection path is enabled, but Gurobi is not available in the current environment. Either install Gurobi or set MIP_DQN_USE_ACTOR_MIP=0 for a lightweight run.
  • wandb is not installed... Training will continue without experiment logging when MIP_DQN_ENABLE_WANDB=0 or wandb is absent.
  • Data file path errors... The environments resolve CSV files relative to the module location, so the script can be launched from outside the repository root.
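The module-relative resolution mentioned above typically looks like the sketch below; the actual loader in random_generator_battery.py may differ, and "PV.csv" is an illustrative filename.

```python
from pathlib import Path

# Directory of the module itself, independent of the current working directory.
MODULE_DIR = Path(__file__).resolve().parent

def data_path(filename: str) -> Path:
    """Resolve a CSV under the repository's data/ folder relative to this module."""
    return MODULE_DIR / "data" / filename

# e.g. pd.read_csv(data_path("PV.csv"))
```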

Citation

The preprint is available here: arXiv:2305.05484

  • Hou Shengren
  • Pedro P. Vergara
  • Edgar Mauricio Salazar
  • Peter Palensky

If you use this repository, please cite the paper or preprint.

About

Source code for the paper "Optimal Energy System Scheduling Combining Mixed-Integer Programming and Deep Reinforcement Learning". Keywords: safe reinforcement learning, energy management.
