Some options:

+ use efficient C code, as xarray does with [`rolling`](http://xarray.pydata.org/en/stable/computation.html#rolling-window-operations) and [bottleneck](https://github.com/kwgoodman/bottleneck/tree/master/bottleneck)
+ continue to use numpy strides, but this doesn't scale well. See [here](http://scikit-image.org/docs/0.10.x/api/skimage.util.html#view-as-windows) and [here](https://stackoverflow.com/questions/4936620/using-strides-for-an-efficient-moving-average-filter)
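
To make the trade-off concrete, here is a rough sketch (not code from this project) comparing the two styles using numpy alone. It assumes numpy >= 1.20 for `sliding_window_view`; the function names `rolling_mean_strided` and `rolling_mean_cumsum` are made up for illustration. The running-sum version only stands in for the kind of single-pass algorithm a compiled routine can use, and is not bottleneck's actual implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_mean_strided(values, window):
    """Rolling mean via a strided view: roughly O(n * window) work."""
    # The view has shape (n - window + 1, window) and shares memory with `values`.
    windows = sliding_window_view(values, window)
    return windows.mean(axis=-1)


def rolling_mean_cumsum(values, window):
    """Rolling mean via a running sum: O(n) work regardless of window size."""
    # Prepend a zero so that csum[i] is the sum of the first i elements.
    csum = np.cumsum(np.insert(np.asarray(values, dtype=float), 0, 0.0))
    return (csum[window:] - csum[:-window]) / window


data = np.arange(10, dtype=float)
strided = rolling_mean_strided(data, 3)
running = rolling_mean_cumsum(data, 3)
```

Both calls produce the same result on this small input. The strided version touches every element of every window, so its cost grows with the window size, which is probably what "doesn't scale well" refers to. Note that the running-sum approach can accumulate floating-point error on long inputs, which is one reason to prefer a tested compiled implementation.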