Between 2007 and 2010, we invented the world’s first end-to-end machine learning algorithm for learning optimal control strategies in visual control tasks. Our Deep Fitted Q-Iteration (DFQ) algorithm interacts directly with the technical system and learns in the following steps:
- continuously collecting raw image data and the reward signal
- automatically analyzing the images for their content
- learning a suitable representation of the system's state from the image data
- learning a near-optimal control policy.
The AI improves continuously with interaction time: more collected data produces better representations, which in turn allow for even more fine-grained control strategies.
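
To make the collect-then-learn loop more concrete, here is a minimal sketch of plain fitted Q-iteration in Python. It is not the DFQ implementation: the image analysis and representation learning steps are left out entirely, the "technical system" is a hypothetical five-state chain, and the regression step is a simple lookup table of mean targets rather than a neural network. All names (`step`, `collect`, `fit`, `fitted_q_iteration`) are made up for this illustration.

```python
import random
from collections import defaultdict

N_STATES = 5          # hypothetical chain: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # move left / move right
GAMMA = 0.9           # discount factor (assumed value)

def step(state, action):
    """Toy stand-in for the technical system: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def collect(n_transitions, rng):
    """Collect transitions (s, a, r, s', done) by acting randomly."""
    data, state = [], 0
    for _ in range(n_transitions):
        action = rng.choice(ACTIONS)
        nxt, reward, done = step(state, action)
        data.append((state, action, reward, nxt, done))
        state = 0 if done else nxt   # restart the episode at the goal
    return data

def fit(targets):
    """Regression step: here simply the mean target per (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in targets:
        sums[key] += y
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def fitted_q_iteration(data, n_iters=50):
    """Repeatedly regress Q onto the targets r + gamma * max_b Q(s', b)."""
    q = {}
    for _ in range(n_iters):
        targets = []
        for s, a, r, s2, done in data:
            best_next = 0.0 if done else max(q.get((s2, b), 0.0) for b in ACTIONS)
            targets.append(((s, a), r + GAMMA * best_next))
        q = fit(targets)
    return q

def greedy_policy(q):
    """Pick the action with the highest estimated Q-value in each non-goal state."""
    return {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0))
            for s in range(N_STATES - 1)}

rng = random.Random(0)
data = collect(2000, rng)
q = fitted_q_iteration(data)
policy = greedy_policy(q)
print(policy)
```

In this toy setting the greedy policy should move right in every state, since that is the shortest path to the rewarded goal. In a visual control task, the integer state would be replaced by a learned low-dimensional encoding of the camera image, and the lookup table by a function approximator trained on that encoding.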