Accuracy analysis of software for the estimation and planning of photovoltaic installations

As the use of photovoltaics expands, with more and more commercial and residential users investing in solar energy systems around the globe, there is substantial demand among installers and architects for relatively simple, easy-to-use software packages for the planning and performance estimation of photovoltaic installations. In this paper, the calculative accuracy of TRNSYS, Archelios, Polysun, PVSyst, PV*SOL and PVGIS is examined against the actual electrical energy generated by a grid-connected 19.8 kWp photovoltaic installation. The assessment uses the climatic data recorded at the site of the photovoltaic (PV) park over the same calendar year. Our results show that the software packages tend to overestimate the global irradiation received by the PV modules yet still significantly underestimate the electrical energy generated by the installation.


Background
Although they are affected by problems such as intermittency and relatively low efficiency [1], photovoltaics (PV) are the most popular form of renewable energy for electricity generation, and their market penetration is expected to keep rising globally over the next few years [2]. The high demand has led to extensive research during the past few decades, improving the efficiency and other performance characteristics of PV panels while significantly reducing their cost [3]. Due to their inert nature and motionless operation, PV systems are ideal for use in or near populated areas [4], especially on commercial and residential buildings [5,6], making them a vital addition to smart grids [7]. Small, distributed PV installations are also considered a vital component of sustainable architecture [8].
The largest share of the PV market is held by private owners and small businesses investing in small to medium grid-connected installations or building-integrated systems [9,10]. Interest in small-scale residential applications is continually growing across the globe [11][12][13][14], especially in the EU after the 2010/31/EU Energy Performance of Buildings Directive [15], which suggests that buildings should require 'nearly zero energy' by 2020.
As a result, demand has emerged for software capable of performing energy and economic analyses quickly and accurately.
This commercial demand has led to the development of many PV analysis and planning software packages, designed mainly for PV installers and architects to prepare technical and economic reports, especially during the initial design phases of a project [16]. Despite the rapid growth of PV installations and the ever-growing number of commercial software packages available for their simulation and economic assessment, there has been little to no research on the calculative accuracy of these packages. Research studies have focused on developing methods that can be used to create new software packages, such as new simulation techniques [17,18], models for computing the effects of shading [19], methodologies for estimating solar irradiance on inclined surfaces [20], the modelling of PV panels [21] and the simulation of the power electronics involved in PV installations [22,23]. Undoubtedly, commercial PV simulation software packages use combinations of these models and methodologies, each of which is scientifically sound; however, their accuracy as complete software packages against real-world data has not been assessed. Still, commercial software packages are used as the basis of many scientific studies and papers [24][25][26][27][28]; therefore, their accuracy should be thoroughly examined against real-world data and under various conditions.
In this article, we compare the actual irradiance and energy data extracted from the monitoring station of a 19.8 kWp grid-connected PV installation with the results obtained from commercial PV simulation software packages into which the climatic data of the installation had been imported.

Photovoltaic installation
For the purposes of this study, long-term performance data have been extracted from an operational PV park in order to be compared with the simulated data produced by the software. This free-standing PV park is situated in an open field at the foothills of Rhodope, near the settlement of Dionis, in Thrace, Greece. The land plot measures nearly 2,000 m², with no obstructions that could cause shading near the installation. The geographical coordinates of the location are 40.98° N, 25.56° E. Table 1 displays the specifications of the 90 SILCIO SE220 photovoltaic panels used by the PV park, which have a total surface of 150 m². Steel mounts are used, placing the panels at a tilt angle of 30° and facing true south. The installed power of the PV park is 19.8 kWp, directly connected to the low-voltage (230 V AC) distribution grid. To connect to the grid, three SMA Sunny Mini Central 7000TL inverters have been used, each with a maximum power output of 7 kW and an efficiency of 97.7% at maximum output. Two strings of 15 panels each have been connected per inverter, so the combined inverter capacity (21 kW) exceeds the installed PV power by about 6%.
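The sizing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; the 220 Wp module rating is an assumption inferred from the SE220 model name and from 19.8 kWp divided by 90 panels, since Table 1 is not reproduced here.

```python
# Cross-check of the array and inverter sizing described in the text.
# ASSUMPTION: 220 Wp per module (inferred from the model name SE220 and
# from 19.8 kWp / 90 panels); all other figures are taken from the text.
MODULE_POWER_WP = 220
N_INVERTERS = 3
STRINGS_PER_INVERTER = 2
PANELS_PER_STRING = 15
INVERTER_MAX_KW = 7.0


def array_summary():
    """Return (panel count, PV power in kWp, inverter power in kW, oversizing in %)."""
    n_panels = N_INVERTERS * STRINGS_PER_INVERTER * PANELS_PER_STRING
    pv_kwp = n_panels * MODULE_POWER_WP / 1000.0
    inverter_kw = N_INVERTERS * INVERTER_MAX_KW
    # Inverter capacity relative to installed PV power, as a percentage surplus.
    oversize_pct = (inverter_kw / pv_kwp - 1.0) * 100.0
    return n_panels, pv_kwp, inverter_kw, oversize_pct


if __name__ == "__main__":
    panels, pv_kwp, inv_kw, oversize = array_summary()
    print(f"{panels} panels, {pv_kwp:.1f} kWp, {inv_kw:.1f} kW inverters, "
          f"{oversize:.1f}% oversized")
```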
The park has a Sunny Sensorbox weather station installed, which features Si module sensors for the global irradiation, PT100 sensors for the panel and ambient temperature, as well as a multi-directional anemometer for the wind speed. The solar irradiation sensors have been calibrated with the help of Kipp & Zonen CM 11 pyranometers [29].

Climatic data
The climatic data of the area were extracted from the PV park log files, all in hourly values and for a full calendar year (Additional file 1). These files contain the global irradiation received on the horizontal surface (H), the irradiation on the surface inclined at 30° (H_30), the ambient temperature (T_av), the module temperature (T_c) and the wind speed (U_w). These data, converted to average monthly values, are shown in Table 2.
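The reduction from hourly log values to monthly values can be sketched as follows. This is a minimal illustration, not the procedure actually used for Table 2: the record layout (a month number paired with a value) and both function names are hypothetical. Averaging suits quantities such as temperature and wind speed, while irradiation is often reported as a monthly total, so both reductions are shown.

```python
from collections import defaultdict


def monthly_means(hourly_records):
    """Average hourly values per month.

    hourly_records: iterable of (month, value) pairs with month in 1..12.
    (Hypothetical layout; the real log file format is not described here.)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for month, value in hourly_records:
        totals[month] += value
        counts[month] += 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}


def monthly_sums(hourly_records):
    """Sum hourly values per month, e.g. hourly irradiation to a monthly total."""
    totals = defaultdict(float)
    for month, value in hourly_records:
        totals[month] += value
    return {m: totals[m] for m in sorted(totals)}


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [(1, 10.0), (1, 20.0), (2, 5.0)]
    print(monthly_means(sample))
    print(monthly_sums(sample))
```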

The software
The climatic data extracted from the operational park have been entered into the five commercial software packages that were available to us, which are summarized alphabetically in Table 3. PVGIS is not commercial software and has been included in this study only in order to assess its accuracy as a quick online assessment tool.
TRNSYS is capable of receiving hourly climatic data; thus, the radiation on the inclined plane H_30, the ambient temperature T_av and the wind speed U_w have all been imported as is, in their hourly values.
Archelios is the only commercial software package that requires the global, diffuse and direct irradiation on the horizontal plane, as well as the ambient temperature and wind speed, in monthly values if the user seeks to generate a custom climatic data file. As such, the global irradiation on the horizontal plane has to be split into diffuse and direct components using either a mathematical model or the conversion ratio available for the area of the installation. As this study explores the accuracy of the simulation software assuming that it will be used as an everyday tool by users without a strong scientific background, the simplest option has been selected: using the diffuse-to-global ratio available for the area, which has been extracted from climatic maps. The results are shown in Table 4.
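The ratio-based split described above amounts to one multiplication and one subtraction per month. The sketch below assumes a single diffuse-to-global ratio per month; the function name and the sample numbers are hypothetical and are not the values of Table 4.

```python
def split_global_irradiation(h_global, diffuse_ratio):
    """Split horizontal global irradiation into (diffuse, direct) components.

    h_global:      monthly global irradiation on the horizontal plane
    diffuse_ratio: diffuse-to-global ratio for that month, between 0 and 1
    The direct component is whatever remains after removing the diffuse part.
    """
    if not 0.0 <= diffuse_ratio <= 1.0:
        raise ValueError("diffuse_ratio must lie between 0 and 1")
    h_diffuse = h_global * diffuse_ratio
    h_direct = h_global - h_diffuse
    return h_diffuse, h_direct


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper's tables.
    diffuse, direct = split_global_irradiation(100.0, 0.4)
    print(diffuse, direct)
```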
Polysun requires the entry of the monthly global irradiation on the horizontal plane, ambient temperature and wind speed. PVSyst has similar requirements, but the wind speed is treated as optional. The monthly values of these three data sets have been entered into both of these software packages.
PV*SOL requires the entry of the monthly global irradiation on the horizontal plane and ambient temperature. Even though the algorithm of the software does take wind speed into account, there is no option to manually enter it into the software while creating a custom climatic data set and the software appears to be randomly generating wind profiles based on the location of the installation.
PVGIS does not support the manual input of any climatic data, so its integrated Climate-SAF database has been used. The estimated total system losses have been set to 7%.

Methods
In all of the software used for this study, the same three types of losses are taken into account: cable losses, calculated to be 1%; module mismatch losses, estimated to be 2%; and system quality losses, found to be 1.7%. Inverter losses are calculated by the model included in each software package. As PVGIS does not calculate inverter losses, additional losses of 2.3% have been added for it, bringing its total estimated losses to 7%. All other losses that each commercial software package may be able to calculate have been disregarded. The PV park is treated as a free-standing installation with good ventilation. We should note that the aforementioned software packages are assessed only for their calculative accuracy. Other criteria, such as the user interface, features and support, are beyond the scope of this study. The final selection of a commercial software package should be a process based on multiple criteria.
To compare the goodness of fit between the data calculated by the simulation software and the data acquired from the logs of the installation, we used the following statistical parameters: root mean square error (RMSE), mean absolute deviation (MAD), mean absolute percentage error (MAPE) and model efficiency (EF).
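The loss budget above can be tallied in a few lines. The additive total reproduces the 7% figure entered into PVGIS. The compounded total is shown only for comparison and is an assumption of this sketch, not something the study states: it treats each loss as a successive derating factor, which gives a slightly smaller combined figure.

```python
# Loss percentages taken from the text.
COMMON_LOSSES_PCT = [1.0, 2.0, 1.7]   # cable, module mismatch, system quality
PVGIS_INVERTER_PCT = 2.3              # extra allowance added for PVGIS only


def additive_total(losses_pct):
    """Plain sum of the loss percentages, as described in the text."""
    return sum(losses_pct)


def compounded_total(losses_pct):
    """Combined loss if each percentage is applied as a successive factor.

    ASSUMPTION: shown for comparison only; the study sums the losses.
    """
    remaining = 1.0
    for pct in losses_pct:
        remaining *= 1.0 - pct / 100.0
    return (1.0 - remaining) * 100.0


if __name__ == "__main__":
    pvgis_losses = COMMON_LOSSES_PCT + [PVGIS_INVERTER_PCT]
    print(f"additive:   {additive_total(pvgis_losses):.2f}%")
    print(f"compounded: {compounded_total(pvgis_losses):.2f}%")
```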
The root mean square error is given by Equation 1:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(H_t - F_t\right)^2} \qquad (1)$$

where H_t is the recorded value, F_t is the value derived from the simulation software and n is the number of periods (months). RMSE gives the deviation between the experimental and the calculated values, and it should be as close to zero as possible.

The MAD parameter is used to measure statistical dispersion and is given by Equation 2:

$$\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n}\left|H_t - F_t\right| \qquad (2)$$

The MAPE parameter, also known as mean absolute percentage deviation or MAPD, is used to assess the accuracy of the model and is given by Equation 3:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{H_t - F_t}{H_t}\right| \qquad (3)$$

The model efficiency test displays the goodness of fit between the calculated and experimental values; the highest value it can attain is 1. EF is given by Equation 4:

$$\mathrm{EF} = 1 - \frac{\sum_{t=1}^{n}\left(H_t - F_t\right)^2}{\sum_{t=1}^{n}\left(H_t - z\right)^2} \qquad (4)$$

where z is the mean value of the experimental data.
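The four statistics can be computed directly from two monthly series. The sketch below assumes their standard textbook definitions, with `recorded` holding the H_t values and `simulated` the F_t values; the function names and the sample numbers are illustrative and are not taken from the study's data.

```python
import math


def rmse(recorded, simulated):
    """Root mean square error between recorded (H_t) and simulated (F_t) values."""
    n = len(recorded)
    return math.sqrt(sum((h - f) ** 2 for h, f in zip(recorded, simulated)) / n)


def mad(recorded, simulated):
    """Mean absolute deviation between recorded and simulated values."""
    n = len(recorded)
    return sum(abs(h - f) for h, f in zip(recorded, simulated)) / n


def mape(recorded, simulated):
    """Mean absolute percentage error, in percent of the recorded values."""
    n = len(recorded)
    return 100.0 / n * sum(abs((h - f) / h) for h, f in zip(recorded, simulated))


def model_efficiency(recorded, simulated):
    """Model efficiency EF; 1 means a perfect fit."""
    z = sum(recorded) / len(recorded)  # mean of the experimental data
    residual = sum((h - f) ** 2 for h, f in zip(recorded, simulated))
    spread = sum((h - z) ** 2 for h in recorded)
    return 1.0 - residual / spread


if __name__ == "__main__":
    # Made-up three-month example, for illustration only.
    recorded = [100.0, 200.0, 300.0]
    simulated = [90.0, 210.0, 300.0]
    print(rmse(recorded, simulated), mad(recorded, simulated))
    print(mape(recorded, simulated), model_efficiency(recorded, simulated))
```

For a series of 12 monthly values, each function is called once per software package and per quantity (energy or irradiation).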

Results and discussion
Energy and solar irradiation results
Figure 1 displays the energy generation of the real PV park alongside the figures calculated by each software package for a full calendar year. TRNSYS, which uses the irradiation data recorded on the inclined plane, displays nearly perfect accuracy.
As can be seen from Figure 2, which displays the deviation of the calculated results from the real production of the PV park, the four commercial PV analysis and planning applications underestimate the electrical energy generation of the installation in every month of the year. PVGIS, which bases its calculations on its own climatic database, is the sole exception and displays large fluctuations from month to month. Figure 3 displays the real global irradiation received by the PV modules per month in comparison with that calculated by each software package. Even though the commercial software packages underestimate the energy generation of the PV installation, they calculate the global irradiation received by the panels of the PV park to be higher than the real global irradiation on the inclined plane, as can be seen in Figure 4. This is an apparent paradox: the PV planning and analysis applications overestimate the solar irradiation received by the PV panels yet underestimate the electricity generated. It is also interesting to note that even though PVGIS greatly overestimates the global irradiation received by the PV panels, its electricity generation results are significantly lower than the real output of the PV park.
As can also be seen from Figure 4, in certain months the global irradiation on the PV modules calculated by a software package coincides with the global irradiation recorded by the sensors of the park, yet the calculated electricity generation is still significantly underestimated. Specifically, while the calculated global irradiation of Archelios for February, Polysun for June, PVSyst for November and PV*SOL for October coincides with the actual global irradiation received by the modules, the calculated electricity generation during those months is 7.6%, 7.5%, 10.2% and 6.1% lower, respectively, than that of the real PV park. Because the energy is underestimated even when the irradiation input is correct, the main source of the electricity generation error must be the PV cell model used by each software package, regardless of the method each application uses to convert the global irradiation on the horizontal plane into the global irradiation on the plane of the PV modules. Table 5 displays the annual electrical energy generation and annual global irradiation received on the inclined plane.
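The monthly deviations quoted above are signed percentage differences relative to the measured value. A minimal sketch of that calculation follows; the function name is hypothetical and the sample energies are made up to reproduce a 7.6% underestimate, not taken from the park's logs.

```python
def deviation_pct(simulated, measured):
    """Signed deviation of a simulated value from the measured one, in percent.

    Negative results mean the software underestimates the measured value.
    """
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return (simulated - measured) / measured * 100.0


if __name__ == "__main__":
    # Illustrative energies in kWh; not values from the study.
    print(f"{deviation_pct(924.0, 1000.0):.1f}%")
```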

Error analysis
According to the statistical parameters used, the data calculated by the simulation software better approximate the experimental data when the values of RMSE, MAD and MAPE are close to zero and the values of EF approach unity.
As TRNSYS uses the recorded global irradiation on the inclined plane directly and suffers no irradiation conversion errors, the accuracy of its energy generation simulation was expected to be very high. As can be seen from Table 6, the energy results from TRNSYS display a model efficiency of 99.7%. Leaving TRNSYS aside and moving to the PV planning and analysis software, Archelios delivers the most accurate energy generation results over the course of a year, followed by Polysun. Even so, the error is significant: the MAPE of the Archelios results exceeds 5.1%. PVSyst and PV*SOL display similar error results in all four tests, with a MAPE of about 9.1% and a model efficiency of about 92.5%.
Although PVSyst and PV*SOL display similar errors in their energy generation figures, Table 7 shows that the latter is significantly more accurate than the former in the calculated solar irradiation on the inclined plane. PV*SOL also displays the most accurate calculation of the received global irradiation and the lowest irradiation overestimation of all the tested applications. Archelios, despite displaying the most accurate energy results, significantly overestimated the irradiation received by the PV installation according to the error analysis.
Combining the accuracy of the calculated energy generation with that of the calculated received global irradiation, Polysun appears to be the most accurate software amongst the PV estimation and planning applications. Polysun displays the second most accurate energy generation results and, at the same time, the second most accurate received global irradiation results. Archelios, on the other hand, displayed the most accurate energy generation results but the least accurate received global irradiation calculation, significantly overestimating the irradiation received by the system. PV*SOL displayed the most accurate calculation of the global irradiation received by the system, but the accuracy of its energy generation results was the second worst, excluding PVGIS.

Conclusions
The calculative accuracy of five commercial applications has been tested and evaluated through a set of four statistical parameters, namely RMSE, MAD, MAPE and EF. The statistical parameters have been used as indicators of the goodness of the estimation of the electrical energy and the global solar irradiation on the inclined surface.
The PV planning and analysis applications generally overestimate the irradiation received by the PV panels but, at the same time, underestimate the energy generation.
Software that is based on or depends on the PVGIS irradiation database may be significantly inaccurate, especially if the study is performed for specific months or short time periods.
Even though the conversion of the global irradiation on the horizontal plane to the global irradiation on the plane of the modules is a source of calculation error, the main error source is the model of the PV cell. Finally, regardless of the deviations that the commercial software packages displayed, their results can be considered generally useful, and their features and ease of use make them a vital tool for planning and quickly assessing the performance of a PV installation.
Future studies may assess which parts of the complete simulation model used by each software package are significant sources of error. By evaluating how each simulation parameter affects accuracy, the simulation models may be improved accordingly to provide more accurate results. As this paper indicates that the main source of error is the model of the PV cell, future research may evaluate the PV simulation model used by each software package against real-world performance data and improve it to better reflect real-world results. Research on the error introduced by each package's conversion of the irradiation data from the horizontal to the inclined plane should also be performed, so that these conversion models can be improved.