The decision-making logic of autonomous vehicles and that of other road users can be similar, so research on one can benefit the other. Many scholars have studied decision-making for autonomous vehicles\cite{Kiran_2021} and pedestrians\cite{Kooij_2016}. Among the available methods, learning-based approaches are promising and gaining popularity\cite{Kiran_2021}. Some studies show that human driving behaviors can be extracted with machine learning algorithms, such as deep learning\cite{Sama_2020,Huang_2020}, imitation learning\cite{christoph2017}, and inverse reinforcement learning\cite{d2016}. However, due to the inherent black-box nature of neural networks, the interpretability of learning-based methods is limited. Inspired by the game-like nature of the interactions among road users, more interpretable game-theoretic approaches have been investigated and are considered more reasonable and practical. Some studies formulate the decision process as a Stackelberg game\cite{Huang_2021,Yu_2018,Hang_2021,Hang_2020}, which imposes a strong assumption that the leader and its utility function are available during the game. Furthermore, because opponent vehicles may not always act as the formulated Stackelberg game expects, an online estimation algorithm using historical data has been proposed to improve game-based interactions\cite{Zhang_2020,Zhang_2020a}. Additionally, finding a Nash equilibrium poses a problem of its own: there may be more than one equilibrium, and the equilibria may conflict with one another\cite{Wang_2021,Spica_2020}.
Apart from the above-mentioned methods, the MOBIL-IDM model\cite{Kesting_2007} has also been widely used and dominates the field of traffic scene generation. It was originally designed to be collision-free. However, after modifying some of its key parameters, such as the politeness, the acceleration, and the grid distance estimation, the algorithm can generate adversarial behaviors for testing\cite{Feng2021,Lindorfer_2018}. Other methods, including the risk field\cite{Kolekar2020}, the artificial potential field\cite{Rasekhipour_2017,Gao_2019}, constrained Delaunay triangulation\cite{Huang_2021a,Huang_2021}, and scene prediction\cite{Lawitzky_2013}, are capable of modeling the cognitive states of human drivers while considering safety. In general, the aforementioned methods either make strong assumptions about the availability of data and information or fall short of fully representing human behaviors, resulting in non-ideal scenes for driving tests. Besides, whether adversarial or cautious, the generated behaviors of road users in simulation testing should be human-like, so as to improve the fidelity of the simulation testing environment. Moreover, both learning-based and game-based approaches currently require heavy computational resources, which limits the deployment of advanced scene generation algorithms.
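For reference, the two components of the MOBIL-IDM model mentioned above are usually stated as follows. This is the commonly cited general form, not the parameterization of any particular work cited here, and the symbols are the conventional ones rather than notation defined in this paper. The IDM acceleration of a vehicle with speed $v$, gap $s$ to its leader, and approach rate $\Delta v$ is
\[
\dot{v} = a\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^{*}(v,\Delta v)}{s}\right)^{2}\right],
\qquad
s^{*}(v,\Delta v) = s_0 + vT + \frac{v\,\Delta v}{2\sqrt{ab}},
\]
where $v_0$ is the desired speed, $s_0$ the minimum gap, $T$ the desired time headway, $a$ the maximum acceleration, $b$ the comfortable deceleration, and $\delta$ the acceleration exponent. MOBIL accepts a lane change when the incentive criterion
\[
\tilde{a}_c - a_c + p\left[(\tilde{a}_n - a_n) + (\tilde{a}_o - a_o)\right] > \Delta a_{\mathrm{th}}
\]
and the safety criterion
\[
\tilde{a}_n \geq -b_{\mathrm{safe}}
\]
both hold. Here the subscripts $c$, $n$, and $o$ denote the lane-changing vehicle, its new follower, and its old follower, a tilde marks the acceleration after the prospective lane change, $p$ is the politeness factor, $\Delta a_{\mathrm{th}}$ is the switching threshold, and $b_{\mathrm{safe}}$ is the largest deceleration that may be imposed on the new follower. Lowering $p$ reduces the weight given to the followers, which is one way in which the politeness parameter can shift the model toward more aggressive lane changes.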

The Estimation of Driving Aggressiveness

Aggressiveness is an important factor for vehicles competing for the right of way\cite{Huang_2021a}. It can be regarded as the result of a trade-off between driving safety and travel efficiency\cite{Zhang_2020}. Due to the complexity of the problem, there is no unified method for measuring aggressiveness\cite{Marina_Martinez_2018}. Intuitively, relative speed, acceleration, and the distance between vehicles can be used to quantify aggressiveness\cite{Huang_2021a,Colombo_2017,Li_2019,Wang_2017}. Nevertheless, measuring driving style with vehicle dynamics alone is not comprehensive. Thus, many studies have shifted to driver-behavior-oriented and scene-specific methods for aggressiveness estimation\cite{Solovey_2014,seon2020,Mole_2021}. In recent years, new elements have been introduced into the evaluation of aggressiveness. Many explorations have been conducted from the aspect of scenes, e.g. straight roads\cite{Kolekar_2020}, curves\cite{Kolekar2020}, and roundabouts\cite{Hang_2021a}, as well as from the aspect of human factors, including the hand\cite{Muhlbacher_Karrer_2017}, the eye\cite{Hu_2021a}, EEG\cite{Rupp_2019}, and so forth.
However, these methods have two main drawbacks. First, the aggressiveness of a driver may not be consistent, because scenarios and travel demands vary. Second, from the energy management perspective, although driving style recognition has been shown to benefit long-term strategy optimization\cite{Yang_2018}, obtaining the exact value of aggressiveness instantly may not be necessary. Therefore, instead of pursuing an accurate and continuous estimate, we maintain that identifying the relative competitiveness or classifying the aggressiveness is a feasible and more pragmatic approach for autonomous driving.

The Human-like Behaviors

One of the challenges that distinguish autonomous driving from other mobile platforms is traffic uncertainty. From this point of view, representing the human-like behaviors of road users is essential for scene generation. The results reported in\cite{Feng2021} indicate that generating rare cases, i.e. using a Markov model to randomly generate initial scenes and the IDM-MOBIL model to produce adversarial behavior, can shorten the overall testing time. But the drawback is also clear. The Markov model is used only while the vehicle is cruising, so there are no significant interactions between the ego vehicle and the surrounding vehicles. Besides, during interactions, the IDM-MOBIL model cannot completely represent human behaviors. Generating aggressive behaviors and possible accidents is essential, but this can hardly be achieved if only non-human-like behaviors and interaction movements are produced in the simulation environment. Learning-based approaches have also been exploited as approximators of human-like driving\cite{Li_2018,Zhang_2018}. However, these methods suffer from the black-box nature of neural networks, which can hardly be customized or interpreted for logic analysis. Learning from datasets is an interesting and promising methodology for realizing human-like driving\cite{Xu_2020}, but the diversity and uncertainty of human driver behaviors should be considered further. Therefore, it is difficult to formulate all human decisions as a unified optimization problem, especially one that must be solved for a global optimum. Moreover, because human drivers have limited and varied sensing and motor abilities, their control performance is imperfect.

Experimental Section/Methods

According to the above analysis, the formulated problem consists of two main elements: 1) Conflict. There will be a severe consequence if neither of the two interacting agents is willing to deviate from its originally expected choice. Thus, one of them has to yield eventually. 2) Alternatives. Each participant should have at least two options, i.e. fight or yield. A conflict is defined as the case in which the expected trajectories of multiple agents would cross. The expected motion trajectory is predicted from the current velocity and yaw angle. This is in line with the human-like concept, as a human does not make complicated and precise predictions (a detailed definition is given in Note S2, Supporting Information). In this work, we mainly focus on the alternatives.
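The conflict condition above can be written out under one simple reading. The following is a sketch for illustration only and does not reproduce the definition in Note S2: it assumes that each agent keeps its current speed $v_i$ and yaw angle $\psi_i$ over a prediction horizon $T_h$, and both this assumption and the symbols are introduced here. The expected position of agent $i$, currently at $(x_i, y_i)$, is then
\[
\mathbf{p}_i(t) = \left(x_i + v_i t\cos\psi_i,\; y_i + v_i t\sin\psi_i\right)^{\top}, \qquad t \in [0, T_h].
\]
The expected paths of agents $i$ and $j$ cross where $\mathbf{p}_i(t_i) = \mathbf{p}_j(t_j)$. With $\Delta x = x_j - x_i$ and $\Delta y = y_j - y_i$, and for non-parallel headings and non-zero speeds, this linear system has the unique solution
\[
t_i = \frac{\Delta y\cos\psi_j - \Delta x\sin\psi_j}{v_i\sin(\psi_i - \psi_j)},
\qquad
t_j = \frac{\Delta y\cos\psi_i - \Delta x\sin\psi_i}{v_j\sin(\psi_i - \psi_j)}.
\]
Under this reading, a conflict is flagged when $0 \leq t_i \leq T_h$ and $0 \leq t_j \leq T_h$. For example, an agent at the origin heading along $\psi_i = 0$ and an agent at $(1, -1)$ heading along $\psi_j = \pi/2$, both at unit speed, give $t_i = t_j = 1$, i.e. both reach the crossing point $(1, 0)$ after one second.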

Proposed Framework

The complete framework for interactive driving scene generation is shown in Figure \ref{832958}; it is explicit and inspired by the human decision-making process. Within the framework, the scheduler selects the players of interest. An agent that is not selected follows its own expected way-points, as its motion has no direct impact on the driving situation of the tested vehicle. When an agent is selected, however, the candidate trajectory generator and the algorithm in Figure \ref{918425} are activated. The trajectory generator can be RRT\cite{LaValle_2001}, semi-reactive trajectory generation\cite{Werling_2010}, or any other method, as long as it can generate multiple possible trajectories for the player. Then, using the presented algorithm, it is determined whether there is a conflict between the expected trajectory and the prediction. If there is no conflict, the best solution is selected from the candidates. If there is a conflict, the algorithm determines whether there is still room available to fight. If the ego player is not blocked, it estimates the aggressiveness of the surrounding agents using the proposed method. The aggressiveness is then updated for the decision-making module, which is based on Bayesian game theory. If the decision is explicit, i.e. to fight or to yield, the agent follows the decision. When the decision is not clear, however, the agent takes small probing steps, inspired by human behavior. A small-step action carries a low collision risk, but it is enough to demonstrate the agent's intention. The proposed framework is applicable to decision-making and interactions among multi-modal agents, including vehicles, cyclists, and pedestrians, but in this work we mainly focus on the interactions between vehicles.
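The decision step described above can be illustrated with a minimal Bayesian sketch. It is an assumption made for illustration and not the formulation used in this work: the type set $\Theta$, the likelihood $P(o_t \mid \theta)$, the utility $U$, the type-dependent opponent action $\sigma(\theta)$, and the margin $\varepsilon$ are all hypothetical. Let $\theta \in \Theta = \{\mathrm{aggressive}, \mathrm{cautious}\}$ be the type of the opponent and $b_t(\theta)$ the belief of the ego agent about that type. After observing the opponent's motion $o_t$, the belief is updated with Bayes' rule:
\[
b_{t+1}(\theta) = \frac{P(o_t \mid \theta)\, b_t(\theta)}{\sum_{\theta' \in \Theta} P(o_t \mid \theta')\, b_t(\theta')}.
\]
The expected utility of an ego action $a \in \{\mathrm{fight}, \mathrm{yield}\}$ under the updated belief is
\[
\bar{U}(a) = \sum_{\theta \in \Theta} b_{t+1}(\theta)\, U\bigl(a, \sigma(\theta)\bigr).
\]
In this sketch, the agent fights if $\bar{U}(\mathrm{fight}) - \bar{U}(\mathrm{yield}) > \varepsilon$, yields if the difference is below $-\varepsilon$, and otherwise takes a small probing step and repeats the update with the next observation. A discrete type set is chosen here because it matches the idea of classifying aggressiveness rather than estimating a continuous value.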