Defending computer networks from cyber attack requires coordinating actions
across multiple nodes based on imperfect indicators of compromise while
minimizing disruptions to network operations. Advanced attacks can progress
with few observable signals over several months before execution. The resulting
sequential decision problem has large observation and action spaces and a long
time-horizon, making it difficult to solve with existing methods. In this work,
we present techniques to scale deep reinforcement learning to solve the cyber
security orchestration problem for large industrial control networks. We
propose a novel attention-based neural architecture with size complexity that
is invariant to the size of the network under protection. A pre-training
curriculum is presented to overcome early exploration difficulty. Experiments
show that the proposed approaches greatly improve both learning sample
complexity and converged policy performance over baseline methods in
simulation.
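The paper does not detail its architecture here, but the size-invariance property it claims can be sketched with a minimal attention-pooling encoder: all learned parameters are shaped by the per-node feature and embedding dimensions (hypothetical values below), never by the number of nodes, so one set of weights handles networks of any size.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID = 8, 16  # assumed per-node feature size and embedding size

# Shared weights: shapes depend only on feature dimensions,
# never on the number of nodes in the protected network.
W_embed = rng.normal(scale=0.1, size=(D_IN, D_HID))
w_score = rng.normal(scale=0.1, size=(D_HID,))

def encode(node_obs: np.ndarray) -> np.ndarray:
    """Attention-pool per-node observations (n_nodes, D_IN) -> (D_HID,)."""
    h = np.tanh(node_obs @ W_embed)      # per-node embeddings via shared weights
    scores = h @ w_score                 # one attention logit per node
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax over nodes
    return weights @ h                   # weighted sum: fixed-size output

# The same parameters process networks of different sizes.
small = encode(rng.normal(size=(5, D_IN)))
large = encode(rng.normal(size=(500, D_IN)))
assert small.shape == large.shape == (D_HID,)
```

This is only an illustration of the invariance idea; the paper's actual network presumably uses a richer attention mechanism and per-node action decoding.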

Author Of this post: <a href="http://arxiv.org/find/cs/1/au:+Mern_J/0/1/0/all/0/1">John Mern</a>, <a href="http://arxiv.org/find/cs/1/au:+Hatch_K/0/1/0/all/0/1">Kyle Hatch</a>, <a href="http://arxiv.org/find/cs/1/au:+Silva_R/0/1/0/all/0/1">Ryan Silva</a>, <a href="http://arxiv.org/find/cs/1/au:+Brush_J/0/1/0/all/0/1">Jeff Brush</a>, <a href="http://arxiv.org/find/cs/1/au:+Kochenderfer_M/0/1/0/all/0/1">Mykel J. Kochenderfer</a>
