On the one hand, this feels like "cheating", because the approach is essentially the reverse of the process that generates the game's visual output. On the other hand, separating foreground entities from the background and predicting their movement is very similar to how biological vision systems work.
It's possible to be generally intelligent without having any visual sense, so this approach may not help to develop General AI, but it could still be useful for many machine learning tasks.
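To make the idea of separating foreground entities from the background and predicting movement a bit more concrete, here is a minimal sketch. It is not the method discussed above, just one simple way the idea could look. It assumes frames are small grayscale grids (lists of lists), the background is static, and there is a single moving entity. All function names (`estimate_background`, `foreground_centroid`, `predict_next`, `make_frame`) are hypothetical and chosen for illustration.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel median over time; assumes the background is static."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]


def foreground_centroid(frame, background, threshold=0):
    """Centroid (x, y) of pixels that differ from the background, or None.

    A threshold of 0 only suits noise-free synthetic frames (an assumption);
    real images would need a larger value.
    """
    points = [(x, y)
              for y, row in enumerate(frame)
              for x, value in enumerate(row)
              if abs(value - background[y][x]) > threshold]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def predict_next(prev_pos, curr_pos):
    """Constant-velocity extrapolation: next = curr + (curr - prev)."""
    return (2 * curr_pos[0] - prev_pos[0], 2 * curr_pos[1] - prev_pos[1])


def make_frame(x, y, width=8, height=4):
    """Synthetic frame: black background with one bright pixel at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    frame[y][x] = 255
    return frame


# One entity moving right by one pixel per frame along row 1.
frames = [make_frame(x, 1) for x in (1, 2, 3, 4, 5)]
background = estimate_background(frames)
positions = [foreground_centroid(f, background) for f in frames]
predicted = predict_next(positions[-2], positions[-1])
```

The per-pixel median works here because the entity covers any given pixel in only a minority of frames. A slow or stationary entity would be absorbed into the background estimate, which is one reason real systems use more elaborate models.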