Reward Function Design via Human Knowledge Graph and Inverse Reinforcement Learning for Intelligent Driving
Motivated by the application of artificial intelligence to the automotive industry, reinforcement learning is becoming increasingly popular in the intelligent driving research community. The reward function is one of the critical factors affecting reinforcement learning, and its design depends strongly on the characteristics of the agent. The agent studied in this paper performs perception, decision-making, and motion control, and is intended to assist or replace human drivers in the near future. Therefore, this paper analyzes the characteristics of excellent human driving behavior based on the six-layer model of driving scenarios and organizes them into a human knowledge graph. Furthermore, for highway pilot driving, expert demonstration data is created, and the reward function is learned from it via inverse reinforcement learning. The proposed reward function design method has been verified in the Unity ML-Agents environment.