1 Introduction

Autonomous vehicles (AVs) are expected to improve driving safety compared with human-driven vehicles. The driving space of an AV is the reconstruction of the surrounding real driving environment, including the free drivable area, obstacles, and other relevant driving elements. Because it comprises all static and dynamic traffic elements in the surrounding space, it is a wider concept than the drivable area or drivable space, which denotes only free space. In this paper, only the local space that serves local driving decisions (rather than large-scale or road-level space) is discussed. Generating the driving space is the process of modeling the environment from sensor information and other driving constraints, such as traffic rules. Because it is generated from perception and forms the basis of decision-making, the driving space acts as a bridge (or interface) between these two key research areas in autonomous driving. The driving space is mainly relevant to intelligent vehicles of level 3 or higher on the SAE scale [1], at which the vehicle must be able to monitor the environment and drive autonomously.

In modeling the real driving environment, it is unrealistic to describe every detail because of the heavy computational burden. Simplification and abstraction are therefore necessary to understand the surrounding space efficiently. In existing research, the world is modeled with three approaches that define the simplified driving space [2]. The first is the grid space, built by discrete sampling of the entire driving space. The second is the feature space, built from continuous and sparse descriptions of the environment. The third is the topological space, a more abstract form whose nodes and links concentrate on key points or landmarks. In the grid space, the space is segmented into cells, and each cell is associated with an occupancy probability. The feature space describes only key elements (e.g., obstacles, traffic lanes) by their features within the continuous driving space instead of describing the whole space. The topological space is defined by nodes and links and focuses on the connections and relationships between key points in the feature space. The construction methods and decision methods (i.e., behavior planning, path planning, and control signal generation) differ correspondingly among the three types of space.

Different types of driving space have previously been summarized from the perspective of path planning in other reviews [3, 4]. In contrast, this study analyzes the fundamental properties of the different forms of driving space and systematically compares them in terms of construction methods and application in decision-making, which could help clarify the relationship between perception and decision-making.

The remainder of this article is structured as follows. The second section introduces the construction methods of the different kinds of driving space and comparatively analyzes their advantages and disadvantages. The third section reviews the application of the driving space from the perspective of AV driving decisions, including rule-based and learning-based methods. For rule-based decisions, the discussion is organized by the kind of space; for learning-based methods, end-to-end learning and reinforcement learning (RL) are considered separately. Finally, a brief conclusion and future research directions are given in the last section. The flowchart of the study is shown in Fig. 1.

Fig. 1 Flowchart of the study

2 Construction of the Driving Space

The driving space integrates the roles of perceiving the driving environment and providing the basis for decision-making. To give a complete understanding of the environment, the reconstructed driving space should contain the space boundary (usually the road boundary) and driving-relevant elements, such as traffic lanes and obstacles. The approaches for constructing the driving space fall into three categories: (1) the grid space, which is a discrete description covering the entire surrounding space; (2) the feature space, which is described in continuous coordinates and focuses on the position and shape of the space boundary and obstacles; and (3) the topological space, which is composed of nodes and links as an abstract representation of features. It should be noted that the topological space is widely used in robotic research but rarely used in research on autonomous vehicles.

2.1 Construction of the Grid Space

The concept of the grid space was first proposed in robotic research by Elfes in 1987 [5]. The space is first segmented into small cells, and the probability of occupation is then calculated for each cell according to sensor information.

In the 1980s, most driving space detection for robots was realized with sonar sensors [5,6,7]. The locations of obstacles and walls were found using sonar reflection. Moravec [8] went further and combined sonar sensors with stereo cameras to construct the grid space. Building on this idea, Marchese accounted for moving objects by introducing a time axis, constructing a series of grid spaces at future time steps by prediction from the speed of the moving objects [9]. These early studies laid the foundation of the grid space, and the basic ideas of space segmentation and occupation probability were widely applied in subsequent research.

The grid space is also widely used in research on AVs. LiDARs have replaced sonar sensors as distance sensors owing to their improved accuracy. The occupation probability is calculated from properties of the LiDAR point cloud in each cell, such as height and density. LiDAR detection of the grid space is divided into two categories: (1) 2D detection, usually performed by a LiDAR with 4 or fewer channels, and (2) 3D detection, usually performed by a LiDAR with 16 or more channels. In 2D detection, the LiDAR is installed at the front of the vehicle, and the point cloud is closely spaced in the vertical direction. The occupancy grid is then acquired by projecting the whole point cloud onto the ground [10,11,12], as shown in Fig. 2a [10]. In 3D detection, the LiDAR is installed on top of the vehicle; thus, the point cloud can directly reach the ground, as shown in Fig. 2b [13]. Bohren et al. [13] segmented the flat ground from the whole point cloud by plane extraction. Na et al. [14] considered the continuity of the ground to achieve detection on both even and uneven terrain. Points higher than the ground were marked as the space boundary. Moras et al. [11] went further and combined LiDAR detection with a high-precision map, refining the detection results with the lanes and road boundary on the map. The fundamental idea of LiDAR-based driving space construction for AVs follows the pioneering robotic research.

Fig. 2 LiDAR detection of the grid space [10, 13]
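The idea of deriving cell occupancy from the height and density of LiDAR returns can be illustrated with a minimal sketch. It is not the method of any cited work: the function name `build_height_grid`, the 0.2 m cell size, the height-spread threshold, and the minimum point count are all assumptions chosen for illustration, and the output is a binary flag rather than a probability.

```python
import math
from collections import defaultdict


def build_height_grid(points, cell_size=0.2, height_threshold=0.3, min_points=2):
    """Return the set of (ix, iy) cells flagged as occupied.

    points: iterable of (x, y, z) tuples in metres, in the vehicle frame.
    A cell is flagged when it holds at least `min_points` returns (density)
    and the spread between its highest and lowest return exceeds
    `height_threshold` (height). All parameter values are illustrative.
    """
    heights_per_cell = defaultdict(list)
    for x, y, z in points:
        # Project the 3D point onto the ground plane and bin it into a cell.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        heights_per_cell[key].append(z)

    occupied = set()
    for key, heights in heights_per_cell.items():
        dense_enough = len(heights) >= min_points
        tall_enough = max(heights) - min(heights) > height_threshold
        if dense_enough and tall_enough:
            occupied.add(key)
    return occupied
```

A real pipeline would typically replace the binary flag with a probability and remove ground points first, as in the plane-extraction approaches mentioned above.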

Cameras have also been applied in driving space construction for AVs. With the development of machine vision technology, it is now possible to segment the drivable area out of an image [15,16,17,18]. Some studies focused only on the pixel plane, others transformed the image into a grid map on the ground, and still others went further and fit the boundary into the feature space. Yao et al. [15] achieved drivable area detection on the image with a support vector machine (SVM). Hsu et al. [16] first segmented the drivable area on the image, then transformed it to the ground plane using vanishing point detection and inverse perspective transform, and finally obtained a grid map of the drivable area. Camera-based approaches focus on the free drivable area and operate in the image plane; however, the basic idea of grid space construction remains the same.

Sensor fusion in the grid space is achieved by fusing occupation probabilities, as shown by the example in Fig. 3. Each sensor calculates the probability independently, and the probabilities are then fused using the Bayes method [8, 19], the Dempster–Shafer method [11, 20, 21], or other fusion methods [22,23,24]. The open framework of occupation probability makes the grid space highly adaptable to different sensor layouts and fusion algorithms, which is an advantage of the grid driving space.

Fig. 3 Sensor fusion in the grid space
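As an illustration of Bayes-style fusion of per-sensor occupancy probabilities, the sketch below uses the common log-odds formulation. It assumes that the sensors are conditionally independent given the true cell state and that every probability lies strictly between 0 and 1; the function names and the prior of 0.5 are illustrative choices, not taken from the cited works.

```python
import math


def prob_to_log_odds(p):
    """Convert an occupancy probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def log_odds_to_prob(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def fuse_cell(probabilities, prior=0.5):
    """Fuse independent per-sensor occupancy estimates for one cell.

    Each sensor contributes its evidence relative to the shared prior,
    so adding a sensor that reports exactly the prior changes nothing.
    """
    prior_l = prob_to_log_odds(prior)
    fused_l = prior_l + sum(prob_to_log_odds(p) - prior_l for p in probabilities)
    return log_odds_to_prob(fused_l)


def fuse_grids(grids, prior=0.5):
    """Fuse several equally sized 2D grids (lists of rows) cell by cell."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [
        [fuse_cell([grid[r][c] for grid in grids], prior) for c in range(cols)]
        for r in range(rows)
    ]
```

Two sensors that each report 0.7 for a cell reinforce each other and yield a fused value above 0.7, whereas reports of 0.7 and 0.3 cancel out to the prior.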

The size and distribution of cells are two important factors in the definition of the grid space. In existing research using regular grids [20, 25, 26], the cell size is approximately 20 cm, which is smaller than the vehicles and pedestrians on the road and is thus suitable for structured road applications. Regarding grid distribution, the 2D grid space is more commonly used because of its simplicity, whereas the 3D grid space is rarely used because of its complexity. Plazaleiva et al. [27] used 3D LiDAR to perform space construction in a voxel grid, but the 3D grid map was then transformed into 2D for further application. In 2D grid space applications, a uniform distribution is commonly used because sensor fusion is simpler on cells of equal size. However, uniform grid space detection requires large amounts of computation and storage, so it is not economical to compute every cell in a large open area. For this reason, some researchers have used non-uniform grids. The most commonly used non-uniform layout is the quadtree [3, 28, 29]. In this layout, the cells are dense where there is an obstacle and sparse in free space; thus, a perception focus is formed, as shown in Fig. 4. The number of cells is reduced compared with a uniform distribution, which lowers the computation cost and storage consumption. However, non-uniform grids still have drawbacks. A path planned in non-uniform grids is less smooth than one planned in uniform grids [3], and it is difficult to unify the results of different sensors in multi-sensor fusion, since the focus of each sensor might be different.

Fig. 4 Non-uniform grids
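The way a quadtree keeps cells small near obstacles and large in free space can be sketched as follows. This is a minimal illustration under stated assumptions: obstacles are given as a list of points (a stand-in for real detections), and the function name `build_quadtree` and its parameters are hypothetical.

```python
def build_quadtree(x0, y0, size, obstacles, min_size):
    """Recursively subdivide a square cell until it is obstacle-free or minimal.

    x0, y0: lower-left corner of the cell; size: edge length.
    obstacles: list of (x, y) obstacle points.
    Returns a list of leaf cells as (x0, y0, size, occupied) tuples.
    """
    inside = [
        (x, y) for x, y in obstacles
        if x0 <= x < x0 + size and y0 <= y < y0 + size
    ]
    if not inside:
        # Free space stays as one large cell.
        return [(x0, y0, size, False)]
    if size <= min_size:
        # Smallest allowed cell that still contains an obstacle point.
        return [(x0, y0, size, True)]
    half = size / 2.0
    leaves = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            leaves.extend(build_quadtree(x0 + dx, y0 + dy, half, inside, min_size))
    return leaves
```

For an 8 m by 8 m area with a single obstacle point and a 1 m minimum cell, this yields 10 leaf cells, compared with 64 cells in a uniform 1 m grid.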

2.2 Construction of the Feature Space

In the construction of the feature space, obstacles are represented by their position coordinates and geometric shapes, while the space boundary is fit to an analytic formula. The whole space is described continuously and geometrically, in contrast to the discrete description of the grid space.

In robotic research, the feature space is described by geometric figures composed of angles, edges, and curves, and some researchers also consider the speed of the obstacles (see Fig. 5). This suits the indoor environment of a robot, where obstacles and boundaries have various unpredictable shapes. Correspondingly, these geometric elements are the targets of detection, which is typically performed by sonar or camera sensors. Ip et al. [30] detected geometric features in the space by clustering sensor information to find a collision-free path. Hardy et al. [31] used polygons to describe obstacles and construct a geometric feature space. Simultaneous localization and mapping (SLAM), in which a map is built in real time so that the robot can locate itself within an unknown environment, is an active topic in robotic research. The feature map is widely used in SLAM [32,33,34] and consists of points, edges, corners, etc. SLAM is also applied in AVs as an auxiliary means of positioning, especially when GNSS (global navigation satellite system) is unavailable [35, 36]. For indoor robot applications, the feature space offers a different idea from the grid space: by modeling the space with sparse geometric features, it is computationally cheaper and more intuitive.

Fig. 5 Feature driving space for a robot

On structured roads, space boundaries and on-road obstacles follow certain patterns. Therefore, the feature space for AVs is constructed from traffic elements such as the road boundary, lanes, vehicles, and pedestrians, as shown in Fig. 6. Traffic lights, traffic signs, and crossings are also important elements in the feature space. The detection task can be divided into several smaller tasks: road boundary detection, lane detection, object detection, etc. These are all important research topics in environment perception.

Fig. 6 Feature driving space construction for AVs

Feature space construction is composed of several subtasks. Road boundary detection is typically based on LiDAR and cameras, as in grid space construction, but in the feature space an analytic curve takes the place of the grid. For example, Loose et al. [37] used a Bézier curve to fit the road boundary. Lane detection is also important in the feature space, and numerous studies have been conducted on rule-based lane detection [38,39,40]. In recent years, some researchers have used deep learning in lane detection [41, 42]. With the development of sensor technologies and machine vision, object detection and tracking have also advanced quickly. Vehicle and pedestrian detection is usually carried out on images by machine learning [43,44,45,46,47]. Some researchers have realized multiple object tracking (MOT) by fusing radar, LiDAR, and cameras [48, 49]; other researchers have achieved traffic sign detection [50, 51] and traffic light detection [52, 53]. By combining these elements, a complete feature driving space can be constructed. However, since each detection task is completed independently, further research is needed to obtain a more systematic and integrated feature driving space.
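To make the structure of a feature driving space concrete, the following sketch stores the road boundaries as Bézier control points and the obstacles as simple records. It is an assumed, simplified container rather than the representation used in any cited work: the class and field names are hypothetical, and the code only evaluates a given Bézier curve (with De Casteljau's algorithm) and does not perform the fitting step.

```python
from dataclasses import dataclass, field


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] with De Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # Linearly interpolate between each pair of neighbouring points.
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


@dataclass
class Obstacle:
    kind: str                      # e.g. "vehicle" or "pedestrian"
    position: tuple                # (x, y) in metres
    size: tuple                    # (length, width) in metres
    velocity: tuple = (0.0, 0.0)   # (vx, vy) in metres per second


@dataclass
class FeatureSpace:
    left_boundary: list            # Bezier control points of the left boundary
    right_boundary: list           # Bezier control points of the right boundary
    obstacles: list = field(default_factory=list)

    def sample_boundary(self, side, n=5):
        """Return n >= 2 points along the "left" or "right" boundary."""
        control = self.left_boundary if side == "left" else self.right_boundary
        return [bezier_point(control, i / (n - 1)) for i in range(n)]
```

Compared with a grid, the whole scene here is described by a handful of control points and obstacle records, which reflects the sparse and semantic character of the feature space.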

2.3 Construction of the Topological Space

The topological space is another way to describe the environment besides the grid and feature space [2]. It represents landmarks and the connections between them. Distinct landmarks in the space, usually vertices of polygonal obstacles, corners, and doors, are set as nodes, and the links show the connections between the nodes. The nodes are depicted geometrically in the feature space; however, the topological space focuses on their connectivity rather than on actual distances and positions in world coordinates. Examples include the visibility graph [54] and the Voronoi diagram [55]. Omar et al. [3] gave an example (see Fig. 7) in which the vertices are connected by straight links.

Fig. 7 Visibility graph (a topological map) [3]

It is easy to find the shortest path in the visibility graph. Ryu et al. [2] pointed out that the topological space is more suitable than the grid and feature space for robot applications owing to its tolerance of sensing errors. The topological space has two main advantages in robotic research. First, the perception system only needs to find key points in the space rather than accurate geometric boundaries or occupation probabilities; thus, it has a great advantage when the sensors are not accurate enough. Second, robots can steer quickly and follow piecewise straight lines as the shortest path, so the topological space is well suited to searching for the shortest path on the landmark network.
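Shortest-path search on a topological map can be sketched with Dijkstra's algorithm. The sketch assumes that the visibility test between landmarks has already been done, so the links are given as input; the function name `shortest_path` and the node names in the usage note are hypothetical.

```python
import heapq
import math


def shortest_path(nodes, links, start, goal):
    """Dijkstra search on a landmark graph.

    nodes: dict mapping a node name to its (x, y) position.
    links: iterable of (a, b) pairs that are mutually visible.
    Returns (path, length), or (None, inf) if the goal is unreachable.
    """
    graph = {name: [] for name in nodes}
    for a, b in links:
        d = math.dist(nodes[a], nodes[b])  # straight-line link length
        graph[a].append((b, d))
        graph[b].append((a, d))

    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, d in graph[node]:
            new_cost = cost + d
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))

    if goal not in best:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]
```

For example, with nodes S (0, 0), A (3, 4), B (0, 10), and G (6, 8) and links S-A, A-G, S-B, and B-G, the search returns the route S, A, G with length 10. The result is a sequence of straight segments, which is exactly the kind of path a robot can follow directly.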

However, considering structured roads and vehicle dynamics, the topological space is less suitable for AV research. First, a shortest path made of piecewise straight lines cannot be executed by a vehicle because of vehicle dynamics constraints. To obtain a drivable path, more accurate space boundary and geometry information must be provided in addition to landmarks. Second, on structured roads, path planning is no longer restricted to finding a collision-free path but must also consider traffic rules and the behaviors of other traffic participants (e.g., vehicles and pedestrians). Such detailed information is difficult to describe with existing topology-based space models. In addition, the more accurate sensors on AVs make it possible to obtain more detailed information; thus, error endurance, an advantage of the topological space, is less important for AVs. For these reasons, the topological space is seldom applied in local driving space construction for AVs. However, some ideas of the topological space are embodied in grid-based path planning.

It should be noted that topology is applied more in other aspects of autonomous driving than in driving space construction. For example, topology is used in SLAM, an important technology in autonomous driving. SLAM can improve localization accuracy and is also applied in constructing high-precision maps. GraphSLAM is a kind of SLAM that applies topology, with nodes representing poses or map features, and links representing a motion event between two poses or a measurement of a map feature. When positioning and mapping are considered separately, the poses are expressed in a topological graph, but the map that is built is still a grid map or feature map [56, 57]. In other words, the topological graph is only an intermediate result showing the relationship between the poses and the map features, whereas the output space expression is still grid based or feature based. Topological graphs are also widely applied in macroscopic road-level or lane-level navigation maps [58, 59], showing the connectivity between roads, lanes, and intersections. However, the topological space is seldom used as the local dynamic driving space for local decision-making.

2.4 Comparative Analysis

Table 1 lists the characteristics of the different types of space for comparison. The following conclusions can be drawn from the comparison.

Table 1 Comparison of different types of space
(1) There are fundamental differences among the three types of driving space.

(2) The three space types have different mathematical characteristics. The grid space is discrete and emphasizes completeness, whereas the feature and topological space are continuous and sparse. Therefore, constructing the grid space requires substantial computational and storage resources. In contrast, the feature and topological space are sparse and intuitive and are thus computationally cheaper but less detailed than the grid space.

(3) Regarding detection targets, the grid space describes the space itself and therefore does not focus on semantic information. The feature space focuses on the geometric features of the boundary and obstacles rather than on the open space. The topological space focuses on the links between key points in the feature space rather than on position and distance, which works well for indoor robots but is not suitable for AVs.

(4) In robotic research, the grid space requires the calculation of occupancy probability on each cell; sensor fusion is then achieved by probability fusion. The feature space and topological space require the detection of geometric features, such as points, corners, and edges. The difference between the feature space and the topological space lies in the representation method. The topological space has advantages in shortest-path search and error endurance.

(5) For AV applications, the grid and feature space are commonly used, whereas the topological space is not. The grid space construction method is the same as in robot applications, although the sensor layout is typically different. There is usually no semantic information in the grid space. The feature space contains different traffic elements, such as roads, vehicles, and pedestrians. Therefore, it carries semantic information and benefits from object detection and tracking technologies. However, the detection methods for the different elements of the feature space are researched separately, so the construction of the feature driving space for AVs still needs systematic integration.

Considering the observations above, some researchers have combined grid and feature representations for AV applications. To take advantage of the grid representation of open space, the boundary can be fit to a continuous formula based on the grid space [60,61,62]. The position, shape, and speed of vehicles and pedestrians are acquired by object detection. These elements are then integrated in the feature space, as in the theses of Zhang [63] and Liu [64]. However, each detection task is still completed independently; therefore, this solution still lacks completeness.

In summary, the existing methods detect the driving space from sensor information and then reconstruct it by expressing it with grids, features, or topology. The preceding analysis shows that each of the three categories of space definition and detection has its own advantages and disadvantages.

3 Application of the Driving Space in Autonomous Driving Decisions

The driving space provides the constraints for behavior planning, path planning, and control signal generation in the decision layer. Existing AV decision methods can be divided into two categories: rule-based and learning-based methods.

3.1 Rule-Based Decision Methods

A rule-based AV decision can be realized in the grid or feature space. In the decision layer, behavior decision and path planning are achieved using the driving space constraints. Although the topological space is seldom used in driving space construction in autonomous driving, its concept is applied in the grid space decision.

3.1.1 Rule-Based Decision Methods in the Grid Space

The rule-based decision methods in the grid driving space can be divided into two categories: those that directly plan a path in the grid space, and those that plan a path using discrete lattices sampled from the driving space.

Direct decisions on the grid space use the occupancy probability [65, 66]. Hundelshausen et al. [65] chose a trajectory from a group of arcs by using the occupation probability of the traversed cells as the cost, as shown in Fig. 8. Similarly, Mouhagir et al. [20] chose a clothoid curve and further applied a Markov decision process to realize lane changing. These methods make good use of the probability information in the grid map. However, they are less intuitive and less simple than decision methods in the feature space. Moreover, a single obstacle is represented as several independent cells rather than as a whole, which causes unnecessary calculations.

Fig. 8 Path planning in the grid space [65]
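Choosing among candidate arcs by the occupancy probability of the cells they traverse can be sketched as below. This is a simplified illustration and not the algorithm of [65]: the arcs are constant-curvature curves starting at the vehicle origin and heading along +x, the cost is the plain sum of occupancy probabilities over the visited cells, and all names and parameter values are assumptions.

```python
import math


def arc_points(curvature, length=10.0, step=0.5):
    """Sample (x, y) points along a constant-curvature arc from the origin."""
    n = int(round(length / step))
    points = []
    for i in range(1, n + 1):
        s = i * step  # arc length travelled so far
        if abs(curvature) < 1e-9:
            points.append((s, 0.0))
        else:
            points.append((
                math.sin(curvature * s) / curvature,
                (1.0 - math.cos(curvature * s)) / curvature,
            ))
    return points


def arc_cost(grid, curvature, cell_size=1.0, default=0.0):
    """Sum the occupancy probability of every cell the arc passes through.

    grid: dict mapping (ix, iy) to occupancy probability; cells that are
    not listed are treated as having probability `default`.
    """
    visited = {
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in arc_points(curvature)
    }
    return sum(grid.get(cell, default) for cell in visited)


def choose_arc(grid, curvatures, cell_size=1.0):
    """Return the candidate curvature with the lowest occupancy cost."""
    return min(curvatures, key=lambda k: arc_cost(grid, k, cell_size))
```

Because the cost is accumulated cell by cell, an obstacle that spans several cells is counted several times, which illustrates the remark above that the grid space does not treat an obstacle as a whole.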

Decision-making using state lattices is another grid-based decision method. Discrete state lattices are generated by sampling within the driving space. The lattices are usually relatively coarse to reduce the computation cost. In this method, the goal is to find a collision-free drivable path in the free area; thus, graph search or sampling-based planning on the nodes can complete the task without a separate behavioral planning step (e.g., go straight, turn left, etc.). Figure 9 shows an example of a planned path, in which the nodes are connected to the destination by a piecewise-continuous curve.

Fig. 9 Rule-based path planning on state lattices

The nodes and links of state lattices are similar to those in the topological space. However, the nodes in state lattices are not geometrically distinct points but discrete cells sampled in advance; thus, the state lattice should still be regarded as a grid space. The similarity lies in the fact that decisions on state lattices are made by searching for a path on a graph, as in the topological space. There are many specific path planning methods on the state lattice [67,68,69,70,71,72,73]. This type of decision process has been studied extensively and is therefore considered reliable. However, sampling in space reduces accuracy and loses information from the constructed driving space. Moreover, the generated piecewise-continuous path is not as smooth as a single curve; therefore, the ride is not as comfortable as on a path driven by a human driver.
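Graph search on sampled lattice nodes can be sketched with A* on a regular lattice. The sketch makes strong simplifying assumptions that are not taken from the cited planners: nodes are connected only to their four neighbours, every move costs 1, and vehicle kinematics are ignored. The function name `plan_on_lattice` is hypothetical.

```python
import heapq


def plan_on_lattice(width, height, blocked, start, goal):
    """A* search over a width x height lattice with 4-connected moves.

    blocked: set of (x, y) nodes that lie on obstacles.
    Returns the list of nodes from start to goal, or None if no path exists.
    """
    def heuristic(node):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    cost_so_far = {start: 0}
    parent = {}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        if cost > cost_so_far[node]:
            continue  # stale heap entry
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in blocked:
                continue
            new_cost = cost + 1
            if new_cost < cost_so_far.get(nxt, float("inf")):
                cost_so_far[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

The returned path is a chain of lattice nodes joined by straight segments, which shows why such paths need smoothing before they feel comfortable to passengers.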

3.1.2 Rule-Based Decision Method in the Feature Space

In the feature space, the space boundary formula, position, shape, and speed of obstacles are the decision inputs.

Rule-based behavior planning is usually based on the feature space. Behavior planning finds the best behavior among a finite number of possible behaviors [74,75,76,77], such as vehicle following, lane changing, merging, and turning. In existing behavior planning research, the road boundary and lanes are the essential inputs; among objects, usually only vehicles are considered. In environment perception research, however, there are typically many types of objects in the feature space; for example, there are eight types of objects in the research of Prabhakar et al. [78]. Objects such as traffic signs and traffic lights, as well as specific vehicle classes (trucks, buses, etc.), are not well considered in current behavior planning research.

There is also much research on path planning in the feature space. Ziegler et al. [79] represented obstacles with polygons, predicted their moving paths X_pred,j(t), and finally planned the path X(t), as shown in Fig. 10. Brechtel et al. [80] searched among many continuous trajectories in the feature space, which is also a typical method for trajectory planning. In the feature space, the space boundary and obstacle information are the constraints of the path planning optimization problem. There are many path models in the feature space, such as the clothoid [81, 82], Bézier curve [83, 84], and spline [85, 86]. These methods focus on searching for a smooth, collision-free path. However, the semantic information provided by the feature space is usually not fully considered.

Fig. 10 Rule-based decision in the feature space [79]
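How polygonal obstacles and their predicted motion act as constraints on a candidate path can be illustrated with a simple collision check. The sketch is not the method of [79]: it assumes a constant-velocity prediction for each obstacle, treats the ego vehicle as a point, and uses a standard ray-casting point-in-polygon test. All function names are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def predict_polygon(polygon, velocity, t):
    """Shift a polygon by a constant velocity (vx, vy) over time t."""
    vx, vy = velocity
    return [(x + vx * t, y + vy * t) for x, y in polygon]


def path_is_collision_free(path, obstacles):
    """Check a timed path against moving polygonal obstacles.

    path: list of (t, x, y) samples of the ego position.
    obstacles: list of (polygon, velocity) pairs at t = 0.
    """
    for t, x, y in path:
        for polygon, velocity in obstacles:
            if point_in_polygon((x, y), predict_polygon(polygon, velocity, t)):
                return False
    return True
```

In an optimization-based planner, a check of this kind would appear as a constraint on the trajectory rather than as a pass or fail test, and the ego vehicle would have its own footprint.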

3.2 Learning-Based Decision Methods

Machine learning is widely used in the decision layer of AVs. Some researchers have used supervised learning to realize end-to-end driving, which is usually based on deep learning on images or LiDAR point clouds. Other researchers have used reinforcement learning to make driving decisions. In addition, combining learning-based and rule-based methods is also an important research area.

3.2.1 The Driving Space in the End-to-End Driving Decision

In 1989, Pomerleau [87] used a simple neural network with one hidden layer to realize the end-to-end prediction of steering angle. This was considered to be pioneering in end-to-end autonomous driving.

With the development of deep learning, convolutional neural networks (CNNs) were applied in end-to-end driving. A CNN performs better in feature extraction than a simple network and is therefore better suited to end-to-end driving. Bojarski et al. [88] realized end-to-end control from image input to steering angle control, which made end-to-end driving a new research focus [89, 90]. In end-to-end driving, control signals are directly predicted from sensor input, while environment perception and space description are implemented implicitly in the neural network. Bojarski et al. [88] visualized the CNN weights, as shown in Fig. 11. It can be seen that the CNN extracted the driving space (the road boundary in this example); however, the driving space is not explicit and is therefore difficult to optimize directly.

Fig. 11 Implicit driving space in end-to-end driving [88]

End-to-end driving has attracted researchers thanks to its novelty and simplicity. In this framework, there is no need to model the environment explicitly or to set complex control rules. However, problems arise with this simplicity. The system is highly integrated: unlike in rule-based decision methods, the driving space is not an intermediate result of environment perception and is therefore difficult to optimize directly. Moreover, since the neural network output is uncertain, there is no guarantee that the end-to-end output is reliable and safe under all conditions, especially in unfamiliar scenarios outside the training set. These problems may be alleviated with large-scale datasets and deeper neural networks, and future growth in computational ability will support this method better. With current computational ability and datasets, however, the combination of end-to-end driving and rule-based methods is more reliable.

3.2.2 The Driving Space in the Reinforcement Learning Decision

The basic idea of reinforcement learning is to generate a control policy by adjusting actions according to environment–reward feedback. This framework is used in AV research and many other research areas. The Markov decision process is the basis of reinforcement learning, in which actions are optimized according to environment input. Therefore, it fits the perception–decision framework of AV technologies.

In AV applications, reinforcement learning is similar to a human driver's decision method. First, the driving space is modeled in the feature space with the road boundary, traffic lanes, and the position, speed, and acceleration of other vehicles. Second, actions are defined as discrete behavior decisions [91, 92] (lane changing, going straight, turning, etc.) or continuous control signal outputs [93, 94] (steering angle and acceleration). Meanwhile, a reward is set according to the driving task, e.g., passing efficiency or collision. Finally, a training process of trial and error is carried out to optimize the driving policy. The grid space is not suitable for reinforcement learning unless it is combined with deep learning: because a typical Markov decision process needs to describe the environment with a finite set of states and to calculate the transition probabilities between the states, it is difficult to apply to the grid space with its large amount of data. The research shown in Fig. 12 [91] is based on a simulated feature space, with the road boundary and vehicles as input and behavior decisions as actions. Reinforcement learning relies on simulation since it needs to find the best driving policy by trial and error; therefore, it needs some adaptation for real-road tests.

Fig. 12 Reinforcement learning in the feature space [91]
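The trial-and-error loop over a small, finite set of feature-space states can be illustrated with tabular Q-learning. The environment below is a deliberately tiny, hypothetical model and is not taken from [91]: the ego vehicle is either "blocked" behind a slow vehicle or "free", the actions are "keep" and "change" (lane), and the rewards, learning rate, discount factor, and exploration rate are arbitrary illustrative values.

```python
import random

ACTIONS = ("keep", "change")
STATES = ("blocked", "free")


def step(state, action):
    """Toy transition model (hypothetical): returns (next_state, reward)."""
    if action == "change":
        return "free", -0.2        # small penalty for manoeuvring
    if state == "blocked":
        return "blocked", -1.0     # stuck behind a slow vehicle
    return "free", 1.0             # unobstructed progress


def train(episodes=500, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "blocked"
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                     # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward the reward plus discounted future value.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

With only two states the Q-table has four entries. A grid space with thousands of cells would make such a table intractable, which is the reason given above for pairing the grid space with deep learning.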

Deep reinforcement learning combines the perception ability of deep learning with the decision ability of reinforcement learning. Therefore, it can process higher-dimensional or larger amounts of data, e.g., the grid space or raw sensor information. Kashikara [95] used the grid space as CNN input to realize deep reinforcement learning. However, the applied grid space had a low resolution, with each car occupying only one cell. This differs from high-resolution perception results, but the idea of using the grid space is still important. Some researchers have used raw sensor input and deep reinforcement learning [96, 97] with images or point clouds as input [97]. Liu et al. [98] further combined deep reinforcement learning with supervised deep learning to make driving decisions. Deep reinforcement learning provides more options than traditional reinforcement learning and thus has greater potential in AV applications.

3.2.3 Combining Learning-Based and Rule-Based Decision Methods

Learning-based decision methods avoid the complexity of setting rules for various scenarios; however, the trained network is a black box, which makes the output uncertain and uncontrollable, especially in unfamiliar scenarios. To improve safety, some researchers have combined learning-based and rule-based methods to make decisions. Correspondingly, the driving space is also applied in a combined form.

Xiong et al. [99] combined reinforcement learning, lane keeping, and collision avoidance by weighting their steering and acceleration control signal outputs, as shown in Fig. 13. The reinforcement learning component directly used the sensor input; the other two tasks were carried out in the feature space, calculating control signals from the deviation from the lane center and the positions of other vehicles. Hubschneider et al. [100] corrected the trajectory produced by end-to-end learning with rule-based methods according to obstacle positions in the feature space. In both cases, the driving space is considered independently in the learning-based and rule-based components of the combined decision.

Fig. 13 Example of the combination of learning-based and rule-based decisions

3.3 Summary of Driving Space Application in the AV Decision Layer

In summary, there are various forms of driving space applications in AV decisions. Table 2 presents the relationship between decision-making categories and driving space categories.

Table 2 Driving space application in AV decision-making

Except for end-to-end and some deep reinforcement learning decisions, most decision methods are based on the driving space reconstructed by the perception layer. Both the grid space and the feature space are applied in rule-based and learning-based decision methods.

Regarding rule-based decision methods, in the grid space, path planning is based on the occupancy probability of the grid map or on searching for a path in the sampled lattices. In the feature space, the road boundary and object information are the decision constraints for behavior planning or path planning. However, perception research and decision research are still not well integrated. For the grid space, research on perception focuses on improving accuracy through sensor fusion, whereas the decision layer faces some problems that are rarely considered in perception research on the grid space. Firstly, an obstacle is not described as a whole, and the computational cost is high. Secondly, for path planning on the sampled lattices, the paths are only piecewise continuous and thus are not as accurate as a single curve. For the feature space, the decision layer is better suited to the perception results and is more similar to human decision-making; however, the semantic information is still not thoroughly considered. For both the grid space and the feature space, the decision layer is still not able to make proper requests to the perception layer regarding accuracy and safety, nor can it specify what needs to be detected or how it should be expressed, and thus it cannot guide the perception layer to meet the demands of decision-making.
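As a minimal sketch of path planning on an occupancy grid, the example below thresholds the occupancy probability of each cell and runs a breadth-first search over the free cells. It is not taken from any of the surveyed methods; the function name `plan_path`, the 4-connected neighborhood, and the threshold of 0.5 are assumptions for illustration.

```python
from collections import deque


def plan_path(grid, start, goal, occupied_threshold=0.5):
    """Breadth-first search over cells whose occupancy probability is
    below the threshold. Returns a list of (row, col) cells from start
    to goal, or None if no path exists."""
    rows, cols = len(grid), len(grid[0])

    def is_free(cell):
        r, c = cell
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] < occupied_threshold

    if not (is_free(start) and is_free(goal)):
        return None

    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if is_free(neighbor) and neighbor not in parents:
                parents[neighbor] = cell
                queue.append(neighbor)
    return None


# Occupancy probabilities; the middle column holds one obstacle
# that spans two cells.
occupancy = [
    [0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1],
]
path = plan_path(occupancy, start=(0, 0), goal=(0, 2))
```

The sketch also reflects the two problems noted above: the obstacle exists only as separate occupied cells that are each tested individually, and the resulting path is a sequence of cell-to-cell segments rather than one smooth curve.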

Regarding learning-based decision methods, end-to-end driving decisions directly use raw sensor input, and therefore the driving space is not constructed explicitly. However, this makes appropriate optimization difficult and introduces the problem of uncertainty. Therefore, combining end-to-end decisions with rule-based methods in the explicit driving space is more reliable at this stage. In reinforcement learning, the feature space is better suited for application due to its simple description of the space; however, the existing research relies on simplified simulation without making full use of the constructed feature space. The grid space is applied in some conceptual deep reinforcement learning studies, but further study is needed to adapt these methods to the grid space constructed by perception research. Among learning-based methods, some studies do not apply the explicit driving space, and some still depend on simplified simulations; thus, there is a gap between the decision module and the perception module. Learning-based decision methods need further development for real applications; however, with the development of deep learning and computational ability, learning-based methods are promising. Future studies should follow the development of learning-based methods and focus on the application of the driving space.

For the decision process combining rule-based methods and learning-based methods, each part uses the driving space independently.

In summary, neither rule-based nor learning-based decision methods are sufficiently consistent with the driving space construction results of the perception layer. Moreover, the wealth of information provided by driving space construction is not fully utilized by the decision-making process. This situation is exacerbated by the lack of requirements for driving space construction coming from the decision layer. Finally, the existing decision-making methods in the local driving space focus on local driving decisions yet lack integration with the global driving task and map information.

4 Conclusions and Future Direction

Existing research on the driving space forms a complete forward path, from its construction to its applications in decision-making. Based on different types of space definitions, driving space construction technology has been improved with new sensor technology and sensor fusion algorithms; the decision technology uses the constructed driving space as its driving environment input. However, this chain still lacks integrity: the perception layer aims to increase accuracy, while research on decision-making focuses on designing new decision policies but pays less attention to the characteristics of perception results. Therefore, the driving space construction results cannot solidly support the decision-making process. Among rule-based decision methods, those based on the grid space involve unnecessary and repeated calculations, while those that rely on the feature space cannot take full advantage of the wealth of information in the constructed driving space, especially semantic information. Learning-based methods are still at the stage of conceptual research; some of them do not need driving space construction, and others use a simplified driving space in simulations. This results in a gap in driving space construction technologies between simulation and the real world.

In future research, it will be important to combine the perception layer and the decision layer more systematically based on a deeper understanding of the driving space. By analyzing the demands of decision-making on accuracy and safety, it is important to determine what the driving space should contain and how it should be defined and constructed, so as to reduce the existing gap between perception and decision. This is of great significance for future research on the driving space.