Complex railway systems: capacity and utilisation of interconnected networks

Introduction Worldwide, the transport sector faces several issues related to rising traffic demand, such as congestion, energy consumption, noise, pollution and safety. To stem the problem, the European Commission is encouraging a modal shift towards rail, considered one of the key factors for the development of a more sustainable European transport system. The desired increase in the railway share of transport demand over the next decades and the attempt to open up the rail market (for freight, international and recently also local services) draw more attention to the capacity usage of the system. This contribution proposes a synthetic methodology for the capacity and utilisation analysis of complex interconnected rail networks; the procedure has a dual scope since it allows both a theoretically robust examination of suburban rail systems and a solid approach to be applied, with few additional and consistent assumptions, for feasibility or strategic analysis of wide networks (by efficiently exploiting Big Data and/or available Open Databases).

Method In particular, the approach proposes a schematisation of the typical elements of a rail network (stations and line segments) to be applied when more detailed data are lacking; in the authors' opinion, the strengths of the presented procedure stem from the flexibility of the applied synthetic methods and from the joint analysis of nodes and lines. The article, after building a quasi-automatic model to carry out several analyses by changing the boundary conditions or assumptions, also presents some general abacuses showing the variability of the capacity/utilisation of the network's elements as a function of basic parameters.

Results This has helped in both the presented case studies: one focuses on a detailed analysis of the Naples suburban node, while the other tries to broaden the horizon by examining the whole European rail network with a more specific zoom on the Belgian area. The first application shows how the procedure can be applied when fine-grained data are available and for metropolitan/regional analysis, allowing a precise detection of possible bottlenecks in the system and the identification of possible interventions to relieve the high usage rate of these elements. The second application represents an on-going attempt to provide a broad analysis of capacity and related parameters for the entire European railway system. It explores the potential of the approach and the possible exploitation of different 'Open and Big Data' sources, but the outcomes underline the need to rely on proper and adequate information; the accuracy of the results depends significantly on the design and precision of the input database. The presented case studies show that the method allows indicative evaluations of the use of the system and comparative analysis between different elementary components, providing a first identification of 'weak' links or nodes for which specific and detailed analyses should then be carried out, taking into account in more depth their actual configuration, the technical characteristics and the real composition of the traffic (i.e. other elements influencing the rail capacity, such as the adopted operating systems, the station traffic/route control and safety system, the elastic release of routes, the overlap of block sections, etc.).

Conclusion In conclusion, the proposed methodology aims to evaluate capacity and utilisation rates of rail systems at different geographical scales and according to data availability; the outcomes might provide valuable information to allow efficient exploitation and deployment of railway infrastructure, better supporting policy (e.g. investment prioritisation, rail infrastructure access charges) and helping to minimise costs for users.

Introduction
In recent years the European Union has devoted significant efforts to improving rail competitiveness at European scale; due to the low external and environmental costs, railways can be considered a key factor for the sustainable development of a more competitive and resource-efficient transport system (European Commission, White Paper (2011) [1]). Among the many issues identified by European policy makers, rail network bottlenecks are becoming a cause of concern particularly in certain corridors as increasing traffic eventually leads to congestion and degraded performance of the railway system.
Set against this background, and taking into account the forecasts of rail traffic across the whole of Europe from 2010 to 2030 or even to 2050 (see for example [2]), some relevant policy questions are inevitable: Is the current rail infrastructure really able to absorb the forecast traffic without significant impacts on the punctuality of services? Would the interventions already planned on the European railway infrastructure guarantee adequate available capacity and, consequently, adequate reliability and level of service? Will congestion on some parts of the network become an extremely limiting issue for passenger or freight trains? To what extent would the desired competition in an open railway market be influenced by capacity scarcity, mainly during peak hours or along the more profitable corridors?
Not surprisingly, EU Directive 2012/34 (related, among other things, to the difficult task of allocating infrastructure capacity) specifies that infrastructure managers should clearly indicate congested railways in their Railway Network Statements (NS); these are documents presenting in detail the physical and operational characteristics of the networks. As an example, the Italian NS ("Prospetto Informativo della Rete") for 2014 [3] indicates as congested the double-track lines with a daily flow higher than 200 trains/day in both running directions, assuming an average level of heterogeneity.
Clearly, an accurate capacity estimation of a rail network is the starting point for more efficient exploitation and deployment of railway infrastructure and for better supporting policies (e.g. investment prioritization); it requires a robust methodology and very detailed data (infrastructure, timetables, rolling stock, etc.). However, one of the main difficulties faced in defining a broad analysis of capacity and related parameters for the entire European railway system (i.e., travel times, reliability, connectivity, costs/benefits, access charges, accessibility, etc.) stems from the lack of available or usable data.
Although, for example, timetables are generally in the public domain, such data are still perceived as commercially sensitive information; hence the difficulty in identifying a harmonised, comprehensive and detailed European database. Various attempts to improve this situation are currently on-going, especially but not limited to infrastructure data (e.g. the International Union of Railways' ERIM Project [4] and RailTopoModel [5], the RailML initiative [6], the European Railway Agency's Register of Infrastructure [7], the MERITS database [8], etc.).
In this context, this contribution proposes a method designed for the capacity and utilisation analysis of complex interconnected rail networks, with a dual aim: on one side it allows an efficient and theoretically robust examination of suburban (small-scale) rail systems, and on the other side it provides a solid approach to be applied, with few additional and consistent assumptions, for feasibility or strategic analysis of wide (international, large-scale) networks by efficiently exploiting Big Data and/or available Open Databases.
To underline the importance of both levels of the problem, it is worth recalling that the European Commission, with the railway packages and the related directives, after having fully opened to competition the markets for rail freight services and for international passenger transport (long distance, i.e. large-scale), is currently focusing also on national markets for domestic passenger transport services (i.e. regional, small-scale), which remain largely closed and are still considered the bastions of national monopolies.
Clearly, the capacity of rail infrastructure is a complex issue depending upon several factors; the benefits of creating a transnational method for its assessment are highlighted also in the UIC Code 406 (2004) [9]. Indeed, in recent years the scientific literature has devoted great efforts to addressing this issue; many contributions provide an accurate distinction (synthetic, analytical, simulation models) and description of different methodologies (see [10][11][12][13] or [14]). Several approaches address the assessment of line capacity (as described in [15]); Landex et al. in [16], for example, focus on the application of the UIC Code 406, while [17] and [18] describe the Capacity Utilization Index (CUI) procedure applied in the UK. Other authors analyse the issue at station level: Malavasi et al. [19] provide a review of capacity methods for complex railway nodes and a detailed description of some synthetic approaches; Lindner in [20] tries to extend the applicability of the UIC Code 406 to stations as well. The UIC, in recent studies, also presents a clear distinction between line [21] and node [22] capacity, offering a comparative analysis of different synthetic or analytical methodologies for their evaluation. Finally, Watanabe et al. in [23] propose a different methodology for identifying 'bottleneck' stations based on passenger flows, volumes of transfer passengers through the stations and on multi-objective optimization with a generic algorithm.
Regarding the rail system as a whole, it is not straightforward to give a unique measure of capacity because of complexity and diversification of components (lines, stations or their subparts) but it is possible to estimate a global capacity value by referring to the lowest local values. Indeed several papers focus also on the issue of capacity at network level; for example [24] suggests an analytical approach while [25] proposes a queuing model for capacity assessment of a railway system.
In the authors' view, the strongest points of the approach presented in the next paragraphs stem from the flexibility of the synthetic methods (e.g. they are easily implemented in an automatic or semi-automatic way by means of spreadsheets or software like Matlab, and they are usable either with detailed data or, by making relevant assumptions, with more aggregate information) and from the joint analysis of nodes and lines (which allows bottlenecks to be identified among all the elements of the network). After all, it is quite intuitive to expect that for double-track lines the critical elements may be the stations, while for single-track sections the bottlenecks or the highest utilisation may correspond to the line.
Of course, all the above mentioned literature and also the proposed procedure refer to synthetic or analytic methodologies; it is a different matter when considering simulation models and algorithms (e.g. see chapter 10 of [10] or [26]). Several contributions have already provided simulation analyses, with practical matches and verifications, based on more or less small/local rail networks which are digitally represented through detailed descriptions of their track layouts, signalling systems, block sections, operating rolling stock etc. (see, for example, [27,28] or [29]). Several commercial software products are already available on the market (e.g. Opentrack, Railsys, etc.) allowing a very detailed simulation of the railway system and of its traffic, but requiring of course equally detailed input data.
This article and the presented methodology do not intend to undermine the great value of more complex procedures or of the detailed simulation software products already widely available and appreciated by the scientific and technical communities for analysing and representing the operation and the bottlenecks of a rail network. Rather, this contribution seeks to place itself on a different level and to evaluate the issue from a different perspective. While rail companies (e.g. Infrastructure Managers, Rail Undertakings) and sometimes Transport and Rail Authorities/Regulators have access to very detailed and fine-grained data for the rail system of their competence/interest (e.g. all the events in the stations and along the lines, such as block sections' occupations or releases, are recorded and stored instant by instant), other institutions or even research centres may rely on different sources of data, such as publicly available Open and Big Data (for a prospective analysis on using Big and/or Open Data in railways, see [30,31]). We may think, for example, of funding institutions (European Commission, Development Banks, etc.) which would benefit from rail network analyses for their policies or interventions on wide areas (e.g. whole countries). Basically, the presented methodology might be particularly valuable for feasibility studies (when the time, cost and complexity of more detailed approaches would be less appealing) or for analyses based on coarse data, when more detailed information is not available. Of course the results, as better explained in the next paragraphs, are less precise than the ones obtained with more accurate methodologies and should be handled with care. They may provide a first indication of the usage or of likely bottlenecks, but such indications should be verified with more detailed and localised analyses, by means of simulation or more comprehensive methods, before significant actions are taken.
In any case, the procedure allows the focus to be narrowed to particular areas/zones, for which other tools may then offer a better picture based on more detailed data.
Regarding the structure of this contribution, after this introduction the article describes the approach and the method, characterized by the differentiation in lines and nodes for the schematization of the network; moreover real applications to a small suburban network and to the European railway system are presented, for both testing and validating the applicability and the results of the proposed procedure.

Methodology
The proposed method aims to evaluate capacity and utilisation rates of complex interconnected rail networks at different geographical scales (coverage) and according to data availability, by analysing jointly all the components of the system (i.e. stations and lines) in order to identify critical or weak points (i.e. bottlenecks).
Indeed the evaluation of carrying capacity of complex railway networks is a typical problem to be examined in metropolitan areas where the same infrastructure is used for different services (metropolitan, regional, national, passenger, freight, etc.). The frequency of these services is usually fairly high, constant during specific periods of the day (basic interval schedules) and variable according to seasons and years (demand configuration). In these circumstances, the most common problems to be considered include the identification of the infrastructural critical elements as well as the definition of the most effective actions for the full exploitation of the carrying capacity.
With regard to the strategic analysis of large (e.g. European-wide) railway systems, the studies are often bound by lack or incompleteness of data; even if there are several attempts to create comprehensive and standardised databases, and even if the good practices of open access and analysis of big data are mitigating notably the issue, it is not yet easy to acquire detailed infrastructure, timetables and rolling stock data for each European country.
This contribution tries to address both these problems; it is worth noting that the main differences between the two described scenarios can be summarised in terms of distances and frequencies: high speeds, low frequencies and long distances between stops characterise long-distance services (wide network analysis), while low speeds, high frequencies and short distances between stops characterise suburban and metropolitan rail services.

Schematisation of the network
In order to analyse the infrastructure, this contribution considers four different basic components of the whole rail network: halt, passing or terminus stations and line segments between consecutive nodes. However, the proposed research does not yet focus on terminus stations, since they always deserve more detailed and specific analysis as a consequence of both their topological complexity (large number of switches and high variability of the track configuration, several lines converging in the node, etc.) and the particularity of the services (longer dwell times to allow the reversal of the running direction, organisation of the timetable and stop times to guarantee interchanges and connectivity between different services, high number of served trains with consequent high utilisation of tracks and platforms, etc.); a possible further research development may be a synthetic and standardised analysis for these elementary components of the railway system as well.

Standardised schemes
The definition of the typical/standardised schemes for the stations has taken into account both the rail traffic distancing system, based on the block sections, and the topological plan (e.g. by considering that the existence of switches and connections between parallel tracks determines a considerable extension of the entering and exit areas of the station).
Conventionally the study has defined the stations as nodes with a variable topology due to the presence of switches and where it is possible to provide passenger services. A further distinction has been applied between terminus and passing stations (see Fig. 1.b, c) based on configuration and type of offered services. Terminus nodes usually present longer dwell times, being characterised by a change in the running direction of trains and terminus services for some routes; entering and exiting switch areas are overlapped, with consequent higher utilisation of the same infrastructure and more conflicts between incoming and outgoing paths. Passing stations, instead, present a configuration with two distinct zones for the entrance and the exit of trains.
Moreover, the halt station is defined as a facility with a fixed configuration (i.e. only the main tracks/platforms, see Fig. 1.a) and allowing for passenger services.
Besides the described elements, the rail lines are divided into segments between consecutive nodes; the number of block sections for each of these segments depends on the spacing/signalling system adopted and on the distance between the stations. In particular:
- for double-track lines the analysis assumes an automatic block system, with the number of block sections (if calculated) given by the ratio of the distance between the consecutive rail stations along the considered route to the conventional length assumed for the block sections;
- for single-track lines, instead, the block section is represented by the whole segment between the consecutive stations/junctions.
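As a minimal sketch of the segment schematisation described above, the following Python function returns the number of block sections assumed for a segment. The function name, the rounding up to a whole number of sections and the 1.5 km example length are illustrative assumptions and not part of the original procedure.

```python
import math


def block_sections(distance_km: float, block_length_km: float,
                   double_track: bool = True) -> int:
    """Number of block sections assumed between two consecutive nodes."""
    if distance_km <= 0 or block_length_km <= 0:
        raise ValueError("distance and block length must be positive")
    if not double_track:
        # Single track: the whole segment is treated as one block section.
        return 1
    # Double track: ratio of station spacing to the conventional block length,
    # rounded up (assumption) so that the whole segment is covered.
    return max(1, math.ceil(distance_km / block_length_km))


if __name__ == "__main__":
    print(block_sections(10.0, 1.5))                      # double-track segment
    print(block_sections(10.0, 1.5, double_track=False))  # single-track segment
```

Rounding up is only one possible convention; a spreadsheet implementation could equally keep the fractional ratio when the block length is itself an average.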

Theoretical bases
To analyse a rail network, the suggested approach evaluates the capacity and utilisation of each element of the system. The capacity measure and utilisation rate of each line are thus determined by the most restrictive values calculated for its constituents (i.e. the smallest capacity). It is quite intuitive to expect that for double-track lines the critical elements may be stations or stops, while for single-track routes the bottlenecks or the highest utilisation may correspond to line segments.
It should be observed that the detected measures of capacity and 'congestion' are theoretical values depending not only on the infrastructure (topological) configuration, but also on the composition of the traffic flows. The presented macroscopic method allows indicative evaluations and comparative analysis of the use of the system, with a first identification of possible weaknesses and bottlenecks of the network, for which more specific and detailed investigations should then be carried out, taking into account in more depth their actual configuration, the technical characteristics and the real composition of the traffic.
Basically, the approach indicates elements which should be kept under observation, but the sections or stations identified as critical might turn out to be less problematic under a more accurate and complete analysis. The proposed procedure, in fact, considers mainly the topological configuration of the system (length of the line, distances between consecutive stations, number of block sections per segment, extension of the stations' areas and number of platforms, etc.), the composition of the rolling stock (suburban/metropolitan, regional, long-distance or freight trains) and the performance of the vehicles (speed, acceleration, deceleration), while neglecting other elements that influence rail capacity more or less directly, such as the adopted operating systems, the cyclic (or non-cyclic) timing of the services (which considerably influences regularity), the station equipment (traffic/route control and safety system), the elastic release of routes, the overlap of block sections, etc.
For this reason the procedure, rather than providing a precise and unique measure of capacity, shows a range of possibilities, leaving (if necessary) to a subsequent analysis, based on more detailed data, the identification of a unique value or the further narrowing of the variability interval obtained by applying this methodology.
Regarding the composition of the traffic flows, the number of passing or stopping trains in the stations and the percentage of trains stopping on main (track-side) or lateral (siding) platforms have been obtained from actual (station) timetables, as better explained below; an additional 10% of trains has been assumed to take into account the freight, out-of-service and/or deadhead movements not included in the timetable. The daily operating hours have been set to values smaller than 24 in order to account for the closure and/or maintenance of the infrastructure.
Of course, the structure and the automatic scheme of the model built for our analysis allow the evaluation and comparison of the results under different possible assignment and operating scenarios.

Line's analysis
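The capacity of a line segment between consecutive stations is estimated through the analytical method proposed in its first edition by the International Union of Railways (UIC) in leaflet 405R (see [15] or [32]). To summarise briefly the main characteristics of this approach, it is based on the following formula:

$$C = \frac{T}{t_{fm} + t_r + t_{zu}} \qquad (1)$$

where $C$ is the capacity of the segment (number of trains in the reference period), $T$ is the reference period, $t_{fm}$ is the average minimum headway, $t_r$ is the expansion margin and $t_{zu}$ is the additional time that the UIC method associates with the intermediate block sections.

The average minimum headway for each line is calculated by using a weighted average of the minimum headway between two consecutive trains of the same category:

$$t_{fm} = \alpha_L\, t_{fm,L} + \alpha_R\, t_{fm,R} + \alpha_M\, t_{fm,M} \qquad (2)$$

The procedure considers three different types of train: long-distance passenger trains (L), local/regional passenger trains (R) and freight trains (M; this last category also includes out-of-service and empty runs); the factors $\alpha_L$, $\alpha_R$ and $\alpha_M$ in the previous formula represent the shares of each category in the total number of trains.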
The expansion margin was introduced as a result of the experience of many European rail organisations (including the UIC) to account for the utilisation of the system. This margin is expressed as a rate of the average minimum headway between trains; for short periods of time (peak hour), common values of this rate vary between 0.3 and 0.4, while for longer periods (full day) values between 0.6 and 0.8 are usually adopted.
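The line capacity calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: capacity is taken as the reference period divided by the sum of the average minimum headway, the expansion margin (a rate times the average headway) and an optional extra time `t_zu_min` as in the usual UIC 405 form; the function names, the traffic shares and the headway values in the example are hypothetical.

```python
from typing import Dict


def average_min_headway(shares: Dict[str, float],
                        headways_min: Dict[str, float]) -> float:
    """Weighted average of the per-category minimum headways (minutes)."""
    if set(shares) != set(headways_min):
        raise ValueError("shares and headways must cover the same categories")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("category shares must sum to 1")
    return sum(shares[c] * headways_min[c] for c in shares)


def line_capacity(t_fm_min: float, period_min: float, margin_rate: float,
                  t_zu_min: float = 0.0) -> float:
    """Trains per reference period for one line segment and direction."""
    # Expansion margin expressed as a rate of the average minimum headway.
    t_r = margin_rate * t_fm_min
    return period_min / (t_fm_min + t_r + t_zu_min)


if __name__ == "__main__":
    # Hypothetical traffic mix: long-distance (L), regional (R), freight (M).
    shares = {"L": 0.2, "R": 0.6, "M": 0.2}
    headways = {"L": 3.0, "R": 4.0, "M": 5.0}  # minutes, illustrative only
    t_fm = average_min_headway(shares, headways)
    # 18 operating hours per day and a daily margin rate of 0.8.
    print(round(line_capacity(t_fm, 18 * 60, 0.8), 1))
```

Re-running the function with a margin rate of 0.3 to 0.4 and a 60-minute period gives the corresponding peak-hour estimate, which is the kind of quick sensitivity check the spreadsheet model is meant to support.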
It is worth recalling that the methodology can be applied in a largely automatic way (as we did in our applications) by simply using spreadsheets or other software like Matlab; this means that any change in the parameters and basic assumptions can be addressed easily and quickly.
Double-track lines
To calculate the minimum headways for the three categories of train and for each segment of double-track lines (and per direction), the procedure assumes that the line is provided with an automatic block signalling system with three aspects (an assumption consistent with our case studies and with the majority of the main European rail network, see for example Fig. 2). This means that the minimum spatial distance between two consecutive trains consists of a first block section to guarantee the braking distance of the train (and thus safety conditions), plus a second block section to guarantee undisturbed circulation (i.e. a running train should always find the approaching signal 'clear' to avoid unnecessary acceleration/deceleration phases and disturbed circulation), plus a distance for the sighting of the signal and the clearing of the section, and finally a distance equal to the train length for the release of the block system (the rear of the train must pass the clearing point).
In practice, the minimum headway for each category is calculated as (see also Fig. 3):

$$t_{fm,i} = \frac{2\, l_b + L}{V_i} + t_s, \quad i \in \{L, R, M\} \qquad (3)$$

where:
- $l_b$ represents the length of the block section (actual or assumed, as described in more detail in the case studies);
- $L$ is the length of the train;
- $V_{L,R,M}$ is the speed of the considered category;
- $t_s$ is the sum of the sighting and clearing times.
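The double-track headway just described can be sketched as follows. The sketch assumes that the headway is the time needed to cover two block sections plus the train length at the category speed, plus the sighting and clearing time; the function name, the units and the numerical values in the example are illustrative assumptions.

```python
def double_track_headway_s(block_length_m: float, train_length_m: float,
                           speed_kmh: float, sight_clear_s: float) -> float:
    """Minimum headway (seconds) under a three-aspect automatic block."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive")
    speed_ms = speed_kmh / 3.6
    # Two block sections (braking + undisturbed running) plus train length.
    distance_m = 2.0 * block_length_m + train_length_m
    return distance_m / speed_ms + sight_clear_s


if __name__ == "__main__":
    # Hypothetical regional train: 1.5 km blocks, 300 m long, 120 km/h.
    print(round(double_track_headway_s(1500.0, 300.0, 120.0, 21.0), 1))
```

Calling the function once per category (L, R, M) yields the per-category headways that enter the weighted average used for the line capacity.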

Single-track lines
For single-track lines, the approach assumes that each segment between two stations (and/or stops) can be occupied by only one train at a time, independently of its running direction. Although this assumption is reasonable for two trains running in opposite directions, in reality it might be possible to operate more trains in the same direction between two consecutive stations with appropriate equipment and safety rules; this is not always implemented, however, since the traffic of a single-track line is usually balanced in both directions. The special case of unbalanced traffic is left out of the proposed macro approach at this stage. The minimum headway (as a time) for each category can be calculated as:

$$t_{fm,i} = t_{V_i} + t_a + t_d + t_p, \quad i \in \{L, R, M\} \qquad (4)$$

where:
- $t_{V_{L,R,M}}$ represents the travel time at constant speed ($l_b$ is the length of the section between two consecutive stations/stops minus the accelerating and braking distances):

$$t_{V_i} = \frac{l_b}{V_i} \qquad (5)$$

- $t_a$ and $t_d$ represent the acceleration and deceleration times (the acceleration and deceleration values are indicated with $a$ and $d$):

$$t_a = \frac{V_i}{a}, \qquad t_d = \frac{V_i}{d} \qquad (6)$$

- $t_p$ represents an additional time for the preparation (electro-mechanical setting and locking) of the route.
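A minimal sketch of the single-track headway follows, assuming that it is the sum of the acceleration time, the travel time at constant speed over the remaining distance, the deceleration time and the route preparation time. Uniform acceleration and deceleration are assumed, and the function name and example values are hypothetical.

```python
def single_track_headway_s(segment_m: float, speed_kmh: float,
                           accel_ms2: float, decel_ms2: float,
                           prep_s: float) -> float:
    """Occupation time (seconds) of a single-track segment by one train."""
    v = speed_kmh / 3.6
    t_a = v / accel_ms2                     # time to reach line speed
    t_d = v / decel_ms2                     # time to brake to a stop
    dist_a = v * v / (2.0 * accel_ms2)      # distance covered accelerating
    dist_d = v * v / (2.0 * decel_ms2)      # distance covered braking
    l_b = segment_m - dist_a - dist_d       # distance run at constant speed
    if l_b < 0:
        raise ValueError("segment too short to reach the given speed")
    t_v = l_b / v
    return t_v + t_a + t_d + prep_s


if __name__ == "__main__":
    # Hypothetical 8 km segment, 90 km/h, 0.5 m/s^2, 60 s route preparation.
    print(round(single_track_headway_s(8000.0, 90.0, 0.5, 0.5, 60.0), 1))
```

The explicit check on the constant-speed distance makes the limits of the simplification visible: on very short segments the train never reaches line speed and a different running-time model would be needed.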

Capacity and utilization of stations
Halt stations
The halt station (see Fig. 1.a) is treated as part of a line, and is thus embedded in a block section; this means that we can calculate the capacity with (1), where in the expression of the minimum headway times for the trains stopping in the station we also consider the accelerating/braking times and the dwell times.
Passing stations
For the capacity evaluation of passing stations, the procedure relies on synthetic approaches, in particular the Potthoff method [34]. This method assumes that trains can arrive at any instant of an assigned time period (T) with the same probability; it does not require an assigned timetable because the methodology is based on a global quantitative analysis of the traffic in the period T. Its great advantage is its simplicity of application; for a more detailed description of the method see [19] or [33].
In practice, based on a fixed topological configuration of the station (see Fig. 1.c) and varying only the number of lateral (siding) platforms, we have analysed the incompatibilities (conflicts) among the possible routes and calculated the average number of compatible routes (a route is compatible with another route if they can be set at the same time, that is, if a train can pass through the first one while another train passes through the second one; incompatible routes, on the contrary, are never enabled at the same time):

$$n_{med} = \frac{N^2}{\sum n_i\, n_j} \qquad (7)$$

where:
- $N$: total number of movements ($N = \sum n_i = \sum n_j$);
- $n_i$: number of movements concerning route $i$;
- $n_j$: number of movements concerning route $j$;
- the summation in the denominator is extended to all the pairs of incompatible routes.
The percentages of services stopping at specific platforms can be obtained from the station timetables, which indicate the planned platform for each train. Besides the average number of compatible routes (based mainly on the topological configuration of the station and on the percentage of trains on each route), the method also requires the determination of the average interdiction time between incompatible routes, calculated again as a weighted average over the categories of trains:

$$t_{med} = \alpha_L\, t_{med,L} + \alpha_R\, t_{med,R} + \alpha_M\, t_{med,M} \qquad (8)$$

For each category, the average interdiction time is obtained by a weighted average of the interdiction times for each pair of incompatible routes:

$$t_{med,c} = \frac{\sum n_i\, n_j\, t_{ij}}{\sum n_i\, n_j}, \quad c \in \{L, R, M\} \qquad (9)$$

The interdiction times between routes are calculated based on the assumed topological configurations of the stations (see Fig. 1.c) and, for services stopping in the station, they are given by:

$$t_{ij} = t_V + t_a + t_d + t_{dw} + t_p \qquad (10)$$

where $t_V$ is the travel time at constant speed, $t_a$ and $t_d$ are the acceleration and deceleration times, $t_{dw}$ is the dwell time and $t_p$ is the extra time for the formation of the route. In reality, depending on the type of incompatibility between the two routes and on the assumed topology, the interdiction times have been assumed as the sum of either all the factors in (10) or only part of them; e.g. for passing services we have considered only the travel time at constant speed and the extra time for the formation of the route.

Fig. 3 Scheme for the calculation of the blocking time for double-track lines by [10]
Basically, the coefficient of utilisation of the station (namely U) is determined as a function of the total occupation time (indicated with B in the following formula) and the total operating time (T) by means of the equation:

$$U = \frac{B}{T}, \qquad B = \frac{N}{n_{med}}\, t_{med} \qquad (11)$$

where $N$ is the total number of movements, $n_{med}$ the average number of compatible routes and $t_{med}$ the average interdiction time. It is worth underlining here that both the Potthoff and the UIC 405 methods, besides their ease and speed of application, also offer the added value of allowing a rough estimate of the possible delays generated in each elementary component of the system as a function of the utilisation rate.
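The station calculation outlined in this section can be sketched as follows. The sketch assumes the usual form of the synthetic method: the average number of compatible routes is the squared number of movements divided by the sum of the products of movements over incompatible route pairs, the average interdiction time is the corresponding weighted average, and the occupation time is the number of movements divided by the average number of compatible routes, times the average interdiction time. The function name, the way incompatible pairs are passed (ordered pairs, including a route with itself) and the example figures are illustrative assumptions.

```python
from typing import Dict, Tuple


def station_utilisation(movements: Dict[str, float],
                        interdiction_min: Dict[Tuple[str, str], float],
                        period_min: float) -> Dict[str, float]:
    """Average compatible routes, interdiction time, occupation, utilisation."""
    n_total = sum(movements.values())
    if n_total <= 0 or period_min <= 0:
        raise ValueError("movements and reference period must be positive")
    weight_sum = 0.0     # sum of n_i * n_j over incompatible pairs
    weighted_time = 0.0  # sum of n_i * n_j * t_ij over incompatible pairs
    for (i, j), t_ij in interdiction_min.items():
        weight = movements[i] * movements[j]
        weight_sum += weight
        weighted_time += weight * t_ij
    if weight_sum == 0:
        raise ValueError("at least one incompatible pair is required")
    n_med = n_total ** 2 / weight_sum
    t_med = weighted_time / weight_sum
    occupation = n_total / n_med * t_med
    return {"n_med": n_med, "t_med": t_med,
            "B": occupation, "U": occupation / period_min}


if __name__ == "__main__":
    # Hypothetical station: two routes, daily movements and conflicts (min).
    moves = {"in": 60, "out": 40}
    conflicts = {("in", "in"): 3.0, ("out", "out"): 3.0, ("in", "out"): 1.5}
    result = station_utilisation(moves, conflicts, period_min=18 * 60)
    print({k: round(v, 3) for k, v in result.items()})
```

Keeping the incompatible pairs in a dictionary mirrors the route matrix of the method and makes it straightforward to test different station layouts by adding or removing conflicts.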

Simplified abacuses
To allow the examination of several lines and stations, and also a sensitivity analysis of the results by varying some basic assumptions or parameters, the proposed methodology has been implemented in a semi-automatic spreadsheet; however, for the analysis of wide networks (e.g. the European rail network presented as a case study), given the huge amount of data and components to be processed, it can be even more convenient to have some general abacuses for capacity or utilisation evaluations based on predefined parameters (variables of the problem).
In this paragraph, for example, we present the abacuses produced to help in the European case study. They are based on specific assumptions and they are particularly useful to understand how changes in one parameter or another could influence the capacity of a network's component; it is quite straightforward to modify the basic factors in order to obtain similar graphs based on different hypotheses (and according to the different needs and scenarios to be evaluated). In the following, first we present the abacuses produced for the capacity evaluation of railway lines, and then the ones concerning the stations.
With regard to the Europe-wide capacity analysis of railway lines, one of the main problems is the lack of detailed timetable and infrastructure data for all the segments; thanks to open-access databases and new data formats (e.g. General Transit Feed Specification-GTFS [34], RailML, etc.), detailed information is available for parts of, or entire, country networks. There is, however, still a strong need for a standardised and comprehensive database at European level providing such information. It is instead quite easy to collect some basic parameters for the whole European rail network, such as the average actual speed allowed on different segments, the number of tracks and the signalling system. The abacuses therefore assume variability in the missing information, in order to obtain at least a likely range of measures. Figure 4, for example, presents an abacus for the (daily) capacity evaluation of double-track lines, assuming different plausible lengths of the block sections along the analysed segment.
By applying the calculation presented in paragraph 2.3.1 for the capacity of lines equipped with a three-aspect automatic block signalling system, we have plotted the capacity curves as a function of speed for different lengths of the block sections; the yellow area represents a likely capacity range based on the following basic assumptions: (1)).
• Block section length at least equal to the maximum braking distance of the train at the speed of the line (for safety reasons, and according to best practice in railways); in particular, the previous figure reports the capacity curves for a block section length of 1, 2 or 3 times the maximum braking distance as a function of speed, considering a constant deceleration value (assumed in the range 0.5-0.6 m/s² in Fig. 4).
Basically the yellow area in Fig. 4 is enclosed between the curves corresponding to average block section's lengths of 1.5 km (station distance of 10 km; i.e. left border) and 4 km (station distance of 20 km; i.e. bottom border) or corresponding to a section's length equal at least to the braking distance (top-right border).
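The way such capacity curves depend on speed and block section length can be illustrated with a minimal Python sketch. It is not the authors' formula (1), which is not reproduced in this section. The braking distance v²/(2a) follows from constant deceleration; everything else is an assumption made for illustration: the headway model (two block sections plus the train length, run at constant speed), the 200 m train length, an 18 h operating day and a buffer time equal to 60% of the minimum headway, with any other added time ignored.

```python
def braking_distance_m(speed_kmh, decel_ms2):
    """Braking distance v^2 / (2a) at constant deceleration."""
    v = speed_kmh / 3.6  # km/h -> m/s
    return v * v / (2.0 * decel_ms2)


def min_headway_s(speed_kmh, block_len_m, train_len_m=200.0):
    """ASSUMED headway model for three-aspect block signalling:
    the following train stays two block sections plus one train
    length behind, travelling at constant speed."""
    v = speed_kmh / 3.6
    return (2.0 * block_len_m + train_len_m) / v


def daily_capacity(speed_kmh, block_len_m, operating_hours=18.0,
                   buffer_share=0.6, train_len_m=200.0):
    """Trains per day and direction: T / (t_fm + t_r), with the buffer
    t_r taken as a share of the minimum headway t_fm (assumption)."""
    t_fm = min_headway_s(speed_kmh, block_len_m, train_len_m)
    t_r = buffer_share * t_fm
    return operating_hours * 3600.0 / (t_fm + t_r)


# Block length equal to 1, 2 or 3 times the braking distance
for speed in (80, 120, 160):
    d = braking_distance_m(speed, decel_ms2=0.5)
    caps = [round(daily_capacity(speed, k * d)) for k in (1, 2, 3)]
    print(speed, "km/h:", caps, "trains/day per direction")
```

With a fixed block length the headway shrinks as speed grows, whereas with a block length tied to the braking distance (which grows with the square of speed) the headway eventually increases, which is why the two families of curves bound the capacity range differently.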
A similar approach, but with different basic parameters, has also been applied to single-track lines, as shown in Fig. 5. In this case the capacity curves are distinguished by the distance between consecutive stations, assumed to vary between 5 and 30 km; in our analysis, however, we restricted the focus to the range 8-20 km.
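The role of the station distance on single-track lines can be sketched as follows. This is a deliberately simplified illustration, not the procedure used for Fig. 5: it assumes that only one train at a time occupies the section between two consecutive stations, that the occupation time is the running time at constant speed plus a fixed crossing/route-setting time, and that the buffer is a share of the occupation time. The function name and all default values (120 s, 18 h, 60%) are hypothetical.

```python
def single_track_daily_capacity(station_distance_km, speed_kmh,
                                crossing_time_s=120.0,
                                operating_hours=18.0,
                                buffer_share=0.6):
    """Trains per day (both directions together) on a single-track
    section between two stations, under the assumptions above."""
    travel_s = station_distance_km / speed_kmh * 3600.0
    occupation_s = travel_s + crossing_time_s  # assumed occupation model
    return operating_hours * 3600.0 / (occupation_s * (1.0 + buffer_share))


# Station distances in the range analysed in the text (8-20 km)
for distance in (8, 12, 16, 20):
    cap = single_track_daily_capacity(distance, speed_kmh=100)
    print(distance, "km:", round(cap), "trains/day")
```

Under these assumptions the capacity falls roughly in inverse proportion to the station distance, so the longest section between two crossing stations governs the whole line.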
To present a wider and more complete picture, Fig. 6 reports also the variability of capacity as a function of the distance between consecutive stations assuming different values for speed (single-track lines) and for the length of the block sections (double-track lines).
Regarding the stations, the variable determining the utilisation rate (ratio of the number of train movements to the capacity) is the total number of movements in the station.
For each type of station described in paragraph 3.2, the method has assumed the following dimensions and characteristics:
• the passing station has been defined with a conventional total length of 2250 m, divided into three areas of 750 m each, i.e. the platforms and the entering and exit zones (switch areas); this configuration gives a length of 1.5 km for both incoming and outgoing paths. In particular beside the two platforms corresponding to the main

Sample of abacus for capacity evaluation of double-track line (per direction). Legend: section length equal to 1, 2 or 3 times the braking distance (deceleration between 0.5 and 0.6 m/s², stations' average distance 10 km); fixed section lengths of 1, 1.5, 2, 2.5, 3, 3.5 and 4 km (stations' average distance 10 km); fixed section length of 4 km (stations' average distance 20 km).

Figure 7 is related to the halt stations, identified as nodes provided only with the main (track-side) platforms (i.e. no lateral ones) and allowing passenger services. According to the analytical procedure and to the standard station's scheme previously described, the figure provides an estimation of the utilization rate as a function of the total number of served trains and with different assumptions on the percentage of stopping and passing services. The light-grey area corresponds to the range of variability we have focused on for our European case study, that is a minimum of 20 % and a maximum of 80 % of stopping trains.
Several abacuses for passing stations with 1, 2, 3 or 4 passing tracks have also been produced; Fig. 8, for example, reports the graphs for the hypothesis of 4 side platforms (i.e. 6 platforms in total) and a dwell time of 1 min, while Fig. 9 reports the abacus for 5 platforms and a dwell time of 3 min.
It is important to notice that these last two figures provide the utilisation rate (assuming 20 daily operating hours) as a

From all the described abacuses it is quite evident how the dwell time influences the utilization rate of the stations, together with the number of passing sidings used and the percentage of movements assigned to them (and thus to each itinerary). As expected, the higher these three factors are, the higher the utilisation rate of the station. Anyway also in this case, the main aim of the abacuses is to provide a likely range

Figure 11 reports the likely variability of capacity for all the considered elements of a rail network. In particular, we have considered the same daily operating time of 18 h for all four elementary components described in paragraph 3.2 (for comparison purposes) and we have calculated the practical capacities corresponding to a buffer time (expansion margin t_r in (1)) equal to 60 % of the average minimum headway for the lines and the halt stations (considered embedded in a block section) and to a maximum utilization rate of 60 % for the passing stations (i.e. the reported practical capacity is equal to 60 % of the theoretical one calculated by applying Potthoff).
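The two conventions used for the practical capacity are not numerically identical, as the following Python sketch shows. It assumes that the theoretical capacity of a line is the operating time divided by the average minimum headway, so that adding a buffer equal to a share of that headway divides the capacity by (1 + share); the function names and the input value of 400 trains/day are hypothetical.

```python
def practical_capacity_line(theoretical_capacity, buffer_share=0.6):
    """Lines and halt stations: buffer time t_r = buffer_share * t_fm.
    Assuming theoretical capacity = T / t_fm, the practical capacity is
    T / (t_fm + t_r) = theoretical / (1 + buffer_share)."""
    return theoretical_capacity / (1.0 + buffer_share)


def practical_capacity_passing_station(theoretical_capacity,
                                       max_utilisation=0.6):
    """Passing stations: practical capacity as a fixed share of the
    theoretical one (maximum utilisation rate)."""
    return theoretical_capacity * max_utilisation


theoretical = 400.0  # hypothetical trains/day
print(practical_capacity_line(theoretical))             # 62.5 % of theoretical
print(practical_capacity_passing_station(theoretical))  # 60 % of theoretical
```

A buffer of 60 % of the headway therefore corresponds to using 1/1.6 = 62.5 % of the theoretical line capacity, slightly more than the 60 % threshold applied to the passing stations.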

Utilization rate of typical passing stations (with 1, 2, 3 or 4 lateral tracks) as a function of the number of served trains (dwell time = 1 or 3 minutes). Legend (U20h, utilization rate over 20 operating hours): all station schemes with no trains using lateral tracks and 80% of trains on main tracks (1 and 2) stopping in the station; 3-, 4-, 5- and 6-track schemes with 50% of total trains using the lateral tracks (uniformly) and 50% of trains on main tracks (1 and 2), of which 80% (40% of total) stopping in the station; each case for a dwell time of 1 or 3 minutes.

European rail network case study
The application of the methodology to the European railway network has been based on the UNECE (United Nations Economic Commission for Europe) rail census data [35] and on the ETISPLUS dataset for 2005 [36]; the former provides information on length, traffic (annual and daily), number of tracks, etc. for the European main network at corridor level, and has been integrated with the speed values for each link taken from the latter database.
Of course both databases are quite wide and 'generic', meaning that they have not been designed and populated according to the needs of our procedure or for capacity evaluations. It follows that there are some limitations or 'approximations' in the outputs, such as:
• the UNECE database provides for each corridor only the total length of the segments with one or two tracks, if any, i.e. it is not possible to locate the individual single- or double-track sections; in our analysis, the capacity of the whole corridor is conditioned by the capacity of the single-track sections, if any (representing the critical elements of the line);
• the average maximum speed per corridor is a single value and refers only to long-distance passenger trains (no distinction among train categories);
• the train counts are available only as totals (no distinction between freight and passenger trains). Moreover, for some links they seem to be overestimated (see Fig. 12, in particular for Belgium);
• the available data relate only to lines. Stations are not treated or analysed.
To show the potential and the scope of our approach, and well aware of the described limitations, we first produced European maps based on this integrated database and then focused on a more specific and detailed geographic area (Belgium), to exploit the potential offered by Big Data, in particular by the 2016 GTFS timetable data provided by iRAIL [37] for Belgium. Figure 12 reports the results of the first analysis; it provides, at European level and for the main rail network, the maps of: number of tracks, number of trains, average maximum allowed speed and capacity utilization measures according to the lower and upper limits presented in Figs. 4 and 5.
Summarizing, based only on the number of trains and the maximum speed per corridor, assuming for the whole European network a classic automatic block signalling system with three aspects and utilizing the abacuses presented for double and single-track lines, it is quite straightforward to obtain a range of utilization measures for each link of the rail system.

Fig. 11 Variability of practical capacity for all the network elements
In particular, the two bottom maps in Fig. 12 allow different critical levels to be identified for the links:
• The rail segments with a utilization rate higher than 0.6 in the lower limits map (corresponding to the bottom border of the abacuses) but lower than 0.6 in the upper limits map (corresponding to the upper border of the abacuses) may be highly utilized links in the specific case that they have long block sections. They are thus possibly congested links to be analysed in more detail (especially where the utilization is around or higher than 1), but they are not necessarily bottlenecks of the system, or at least they could be bottlenecks that might be upgraded with infrastructural interventions (e.g. shortening the block sections by introducing additional signals);
• The links with utilization rates higher than 0.6 in the upper limits map (and thus congested even in the hypothesis of short/normal block sections) are more likely bottlenecks, and for these segments we suggest a detailed analysis. In particular, the links with values higher than 1 (over-congested) are expected to be particularly critical sections; a closer analysis of them, however, shows that their measures are affected by the above-described limitations of the adopted databases. They may represent 'weak' links, but the overestimated congestion is mainly due to approximations in the data.
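The screening logic of the two cases above can be expressed as a small Python function. It is a sketch, not part of the original procedure: the function name and the category labels are hypothetical, and the two inputs are the utilization of a link computed with the higher capacity (short/normal block sections) and with the lower capacity (long block sections).

```python
def classify_link(u_short_blocks, u_long_blocks, threshold=0.6):
    """Classify a link from its utilization range.

    u_short_blocks: utilization with the higher capacity (short blocks)
    u_long_blocks:  utilization with the lower capacity (long blocks)
    """
    if u_short_blocks > u_long_blocks:
        raise ValueError("utilization with short blocks cannot exceed "
                         "utilization with long blocks")
    if u_short_blocks > 1.0:
        return "over-congested: verify input data"
    if u_short_blocks > threshold:
        return "likely bottleneck"
    if u_long_blocks > threshold:
        return "critical only with long block sections"
    return "not critical"


# Hypothetical utilization pairs, for illustration only
for pair in [(0.2, 0.3), (0.4, 0.7), (0.7, 0.9), (1.2, 1.5)]:
    print(pair, "->", classify_link(*pair))
```

The explicit "verify input data" label reflects the remark in the text that values above 1 are often an effect of approximations in the databases rather than of real over-congestion.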
To better explain the last issue we have focused, as an example, on the Ancona-Foligno line (circled in red in the bottom-right map), downloading and analysing in more detail the timetable and the schematic plan [38] available on the website of RFI (Rete Ferroviaria Italiana, i.e. the Italian Infrastructure Manager). Although the corridor is characterized by single-track segments (bottlenecks) with distances between consecutive stations (e.g. Albacina, Genga and Serra S. Quirico) of around 7.5 km and low permissible speeds (max 95 km/h), the total number of trains indicated in the integrated ETISPLUS/UNECE database refers to the whole corridor. In reality, on the specific single-track segments the number of operating trains is significantly lower (around 50 from the analysis of the 2016 timetable); this shows that the corridor, although 'weak' due to its characteristics, is not over-congested.
Similarly, the analysis of the iRAIL 2016 timetable for Belgium (another bottleneck area in Fig. 12) shows that the total number of trains assumed in the integrated database for this country seems to be overestimated; as already mentioned, we have tried to exploit the iRAIL GTFS data further, mainly for the station analysis.
In particular, the great advantage of this timetable dataset is that the file provides for each stop the number of the planned platform assigned to the train. This means that, besides information on the number, composition and frequency of trains, etc., it is quite straightforward to obtain the number of platforms used in each Belgian station and the percentages of flows assigned to them. Figure 13 reports the station utilization analysis based on these data and carried out as described in the previous paragraphs; for ease of application, the analysis has focused only on (passing and halt) stations with up to six platforms, which are the great majority of the total. In particular, the map on the left shows the utilization rates assuming an operating time of 20 h, an average dwell time of 3 min for each station and each train, a percentage of trains assigned to the lateral platforms of 0% (min) or 50% (max) and a share of 20% of the trains using the main-track platforms (1 and 2) passing without stopping (i.e. no passenger service). The map on the right, instead, shows the utilization rates for the same Belgian stations assuming again a daily operating time of 20 h, but with the actual percentages of trains using side platforms, 20% of passing trains on the main platforms and dwell times equal to 1, 3 or 5 min.
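Deriving the platform shares from such timetable data amounts to a simple count per station and platform. The Python sketch below works on a plain list of (station, platform) records, one per scheduled stop; this record layout, the function names and the sample data are hypothetical and do not reproduce the actual structure of the iRAIL files. Platforms 1 and 2 are treated as the main-track platforms, as in the text.

```python
from collections import Counter, defaultdict


def platform_shares(stop_events):
    """Share of trains per platform for each station.

    stop_events: iterable of (station, platform) tuples, one per
    scheduled stop (assumed layout).
    """
    counts = defaultdict(Counter)
    for station, platform in stop_events:
        counts[station][platform] += 1
    shares = {}
    for station, per_platform in counts.items():
        total = sum(per_platform.values())
        shares[station] = {p: n / total for p, n in per_platform.items()}
    return shares


def side_platform_share(station_shares, main_platforms=("1", "2")):
    """Share of trains assigned to platforms other than the main ones."""
    return sum(s for p, s in station_shares.items()
               if p not in main_platforms)


# Hypothetical records, for illustration only
events = [("Station A", "1"), ("Station A", "1"),
          ("Station A", "2"), ("Station A", "3")]
shares = platform_shares(events)
print(shares["Station A"])
print(side_platform_share(shares["Station A"]))
```

The number of platforms used in a station is then simply the number of keys in its share dictionary, which allows the stations with up to six platforms to be selected.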
In both maps we have bordered in red the histograms with values higher than 0.5; the results show that high utilization rates can be expected with a high utilization of the side platforms (i.e. 50 % or more) and relatively long dwell times (≥ 3 min), as evident in the left map, or with the current share of movements among the different platforms but with even longer stopping times (≥ 5 min, i.e. right map). Even under these hypotheses, only 5 stations deserve attention and may be analysed in more detail:
• Wetteren and Aarschot present utilization rates lower than 60%. In particular, Wetteren appears to be highly utilized only in the left map;
• Lokeren and Bruxelles Schuman are characterized by utilization values slightly above 0.6 in both the described scenarios;
• Halle can in reality be considered mainly a terminus station and thus should not be included in this kind of analysis, deserving a more specific and detailed evaluation. Even for this node, however, the utilization rate is about 0.6 in the worst scenario considered, and so not particularly critical.

Naples' suburban rail network
To better explore the applicability and the potential of the proposed methodology, we have also carried out a more specific and detailed analysis of the Naples' suburban rail network (see Fig. 14). In particular we have analysed the lines Naples-Formia [39] and Naples-Battipaglia [40]. The former runs towards Rome and, before the opening of the High Speed line, was the main and fastest rail connection between the Italian capital and Naples; the latter is part of the main rail corridor connecting Naples with the south of Italy, in particular with Calabria and Sicily.
In particular, the Naples-Battipaglia line includes several parallel sections (see Fig. 14) with different characteristics, travelled by various types of passenger trains (High Speed, Intercity and Regional):
• the conventional line from Naples Central station to Salerno via Torre Annunziata is mainly used by regional trains and is further divided into two (double-track and electrified) lines between Nocera Inferiore and Salerno; in detail, the section via Cava dei Tirreni is a complementary line offering mostly local services;
• the High Speed/High Capacity (HS/HC) line from Naples to Salerno passes by P.C. Vesuvio and reconnects with the traditional line at Bivio Santa Lucia; High Speed trains run on it.
Since this HS/HC line is at the moment used by a limited number of trains and is not strictly part of the Neapolitan suburban network, we focused only on the more congested and critical traditional line.
Detailed data on both the infrastructure and the timetable [41] are available for all the Italian lines and are freely downloadable from the RFI (Rete Ferroviaria Italiana) and Trenitalia websites (Fig. 15); based on these data it is possible to obtain the block section lengths, the maximum permissible speed for each category of train (the operational plan of the line, namely the "Fascicolo Linee" [39], reports three categories of speed, A, B and C, relative to freight, regional and long-distance trains), the number of trains per segment and the number of trains stopping in each station. Moreover, from the station timetable it is possible to know the number of platforms used in each node and the percentage of trains assigned to them.
Fig. 14 Schematic layout of the Naples' rail network (left) and of the Naples-Battipaglia line (right) [40]

With all these figures, it is easy to proceed with the already described approach for both stations and lines; the only missing information is related to the number of freight trains. We have assumed an additional percentage of 10% of trains per link or station to take into account the freight, the out-of-service and/or the empty movements (not included in the timetables). Figure 16 reports for both the analysed lines and per direction the utilization rate per section; since the lengths of the block sections in this case are well known, the ranges in the graphs correspond to a buffer time (i.e. expansion margin, t_r, in formula (1)) equal to 60% (lower limit) or 80% (upper limit) of the average minimum headway (i.e. t_fm, in formula (1)).
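The utilization range per section described above can be sketched in Python as follows. The sketch assumes that the practical capacity is the operating time divided by the sum of the average minimum headway t_fm and the buffer time t_r, with t_r equal to 60% or 80% of t_fm, and that the timetabled trains are increased by 10% for freight, out-of-service and empty movements. The function name, the operating time of the line and the input values are hypothetical.

```python
def utilisation_range(timetabled_trains, t_fm_s, operating_hours,
                      extra_share=0.10, buffer_shares=(0.6, 0.8)):
    """Lower and upper utilization rate of a block section.

    timetabled_trains: trains per day and direction in the timetable
    t_fm_s:            average minimum headway in seconds
    operating_hours:   daily operating time (assumed input)
    extra_share:       allowance for trains not in the timetable
    buffer_shares:     buffer time as a share of t_fm (lower, upper)
    """
    trains = timetabled_trains * (1.0 + extra_share)
    period_s = operating_hours * 3600.0
    return tuple(trains * t_fm_s * (1.0 + b) / period_s
                 for b in buffer_shares)


# Hypothetical section: 100 timetabled trains, 4 min headway, 18 h
low, high = utilisation_range(100, t_fm_s=240.0, operating_hours=18.0)
print(round(low, 3), round(high, 3))
```

Since the block section lengths are known in this case study, the width of the range comes only from the choice of the buffer time, not from uncertainty about the infrastructure.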
The results show that the Naples-Formia line is not excessively utilized; it carries slightly more than 120 trains (both directions) between Naples and Aversa (versus the around 150 included in the "Trenitalia" timetable for 2003, thus before the completion of the High Speed line between Naples and Rome). Currently the main critical sections (44 in the direction Formia-Naples and 1 in the direction Naples-Formia) are related to the Naples Central station.
In reality, even if this station deserves a separate and specific analysis (out of the scope of this contribution), our procedure considers only the main tracks entering into the Naples' node, while it is quite clear that in approaching the station, the line branches into several tracks/platforms characterized by a lower utilization level. For the line Naples-Battipaglia, instead, the situation is different; the most critical sections (25 from Naples towards Battipaglia while 10 and 11 in the opposite direction) correspond to the segment between Bivio S. Lucia and Salerno, characterized by block section's lengths of around 5 km in both the directions and by high heterogeneity of services, being travelled by High Speed, InterCity, and part of the Regional trains (in addition to the 10% of the total assumed for freight, out-of-service and empty services). Moreover, Fig. 17 reports the utilization rates for the stations along both the Naples-Formia and the Naples-Battipaglia lines, with daily operating time of 20 h and dwell times of 1, 2 or 3 min; of course we have neglected Naples Central station, since it is a terminus station with a quite complex configuration and operating timetable, and thus it deserves a separate and specific analysis.
Looking at the results summarized in Fig. 17, the most utilized station on the Naples-Formia corridor is Aversa, with acceptable utilization rates under all the dwell time hypotheses. On the contrary, the station of Salerno on the Naples-Battipaglia line seems to be quite congested; in reality, besides the high number of trains circulating in the station, its configuration is quite particular, as shown in Fig. 18 (from OpenRailwayMap, i.e. http://www.openrailwaymap.org/). It is characterized by terminus tracks/services and different line segments (we have analysed the ones from Nocera Inferiore via Bivio Santa Lucia, from Nocera Inferiore via Cava dei Tirreni and towards Battipaglia, see also Fig. 14). Even if our procedure indicates a high utilization, this station (like Naples Central) should be kept out of the analysis and should be subjected to a specific and more detailed examination.

Conclusion
This contribution proposes a synthetic methodology for capacity and utilisation analysis of complex interconnected rail networks, and it has a dual scope since it allows both a theoretically robust examination of a suburban rail system and a solid approach to be applied, with few additional and consistent assumptions, for feasibility or strategic analysis of wide networks (by efficiently exploiting the use of Big Data and/or available Open Databases).

Fig. 15 Extracts of the schematic infrastructure plan [40] (left) by RFI and timetable [41] (right) for the Naples-Salerno line by Trenitalia
In particular, the approach proposes a schematization of typical components of a rail network (stations and line segments) to be applied when more detailed data are lacking; in the authors' opinion the strengths of the presented procedure stem from the flexibility of the synthetic methods and from the joint analysis of nodes and lines. The methodology does not aim to replace more complex procedures (e.g. simulation by specialist software) for analysing and representing the operation and the bottlenecks of a rail network; rather, it might be considered complementary to them. While rail companies have access to very detailed and fine-grained data for the rail system of their competence/interest, other institutions or even research centres have to rely on different sources of data (e.g. Open and Big Data). The presented methodology might be particularly valuable in the case of feasibility studies (when the time, cost and complexity of more comprehensive approaches would be less appealing) or in the case of analyses based on coarse data, when more detailed information is not available. The results are less precise than the ones obtained with more accurate procedures and should be handled with care; they might provide a first indication of the usage or of likely bottlenecks, but such indications should then be verified with more detailed and localized investigations, by means of simulation or more comprehensive methods, before significant actions are taken.
The contribution, after building a quasi-automatic model to carry out several analyses by changing the border conditions or assumptions, presents also some general abacuses showing the variability of capacity/utilization of the network's elements in function of basic parameters.
This has helped in both the presented case studies: one focuses on a detailed analysis of the Naples suburban node, while the other tries to broaden the horizon by examining the whole European rail network with a more specific zoom on the Belgium area.
Both applications show that the methodology allows indicative evaluations of the use of the system and comparative analyses between different elementary components, providing a first identification of 'weak' links or nodes. The procedure allows the focus to be narrowed to particular areas/zones, for which other tools may then offer a better picture based on more detailed data; specific and detailed analyses should be carried out, taking into account in more depth the actual configuration and the technical characteristics of these critical elements and the real composition of the traffic.
An interesting and feasible further development of this research could consider a capacity analysis carried out using a simulation approach (performed even on a small network for which detailed infrastructure and timetable information is available, e.g. the Naples node) for comparison, validation and/or for a sensitivity analysis of the results of the proposed synthetic and macro methodology (as proposed for example in [42]).
Disclaimer The views expressed are purely those of the authors and may not in any circumstances be regarded as stating an official position of the European Commission.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.