International Journal of Sustainable Energy Planning and Management Vol. 32 2021 79–94

*Corresponding author - email: dapi@dtu.dk

ABSTRACT

Simulating energy systems is vital for energy planning to understand the effects of fluctuating renewable energy sources and integration of multiple energy sectors. Capacity expansion is a powerful tool for energy analysts and consists of simulating energy systems with the option of investing in new energy sources. In this paper, we apply clustering-based aggregation techniques from the literature to very different real-life sector coupled energy systems. The purpose is to provide a robust comparison of methods to complement the literature, in which methods are either not compared or compared on very similar energy systems. We systematically compare the aggregation techniques with respect to solution quality and simulation time. Furthermore, we propose two new clustering approaches with promising results. We show that the aggregation techniques result in solution time savings between 75% and 90% with generally very good solution quality. To the best of our knowledge, we are the first to analyze and conclude that a weighted representation of clusters is beneficial. Furthermore, to the best of our knowledge, we are the first to recommend a clustering technique with good performance across very different energy systems: k-means with Euclidean distance measure, clustering days and with weighted selection, where the median, maximum and minimum elements from clusters are selected. A deeper analysis of the results reveals that the aggregation techniques excel when the investment decisions correlate well with the overall behavior of the energy system. We propose future research directions to remedy the cases where they do not. Finally, we believe that to further strengthen the research area, a library of benchmark instances should be developed.
Time Aggregation Techniques Applied to a Capacity Expansion Model for Real-Life Sector Coupled Energy Systems

Mette Gamst (a), Stefanie Buchholz (b), David Pisinger (b)*

a Energinet, Tonne Kjærsvej 65, 7000 Fredericia, Denmark
b DTU Management, Akademivej 358, 2800 Kgs. Lyngby, Denmark

September 14, 2021

Keywords: Capacity expansion; energy system models; time aggregation; clustering; solution time reduction

http://doi.org/10.5278/ijsepm.6400

1. Introduction

Simulating energy systems is vital for energy planning. The green transition demands increasing introduction of fluctuating renewable energy sources and integration of multiple energy sectors. Simulations are necessary to understand the behavior in such sector coupled energy systems. Capacity expansion consists of simulating energy systems with the option of investing in energy sources.

Much work in the literature considers solution methods for the capacity expansion problem but focuses on single methods or specific energy systems, see e.g. [1–3]. In this paper, a capacity expansion model consists of a year in one-hour resolution, i.e. of 8760 hours. The underlying energy system may be large and consist of many areas (e.g. geographical areas or bidding zones) and energy types (e.g. power, district heating, gas). Solving the NP-hard capacity expansion problem is thus time consuming and often intractable. Aggregating time steps is a common method to reach tractability. The literature suggests a wide variety of aggregation techniques. Most studies, however, consider specific systems [3] and only few contributions compare their results with the literature [2].

The novelty of our work lies in analyzing the effect of time aggregation methods on real-life sector coupled energy systems. To the best of our knowledge, we are the first to apply and compare multiple time aggregation methods on significantly different energy systems. This provides much insight into the potential of time aggregation techniques without the risk of overfitting the methods to specific energy systems.

Furthermore, we propose two new aggregation methods with promising results. We analyze methods for selecting cluster representatives and conclude that weighted selection has superior performance. Finally, we provide a deeper analysis of the achieved results to highlight interesting future research areas in time aggregation techniques. To summarize, the paper addresses gaps in the current literature of time aggregation techniques applied to capacity expansion models:
• comparison of methods on very different, real-life sector coupled energy systems
• analysis of selection strategies in the clustering methods
• recommendation on a method with overall good performance across very different energy systems

The paper is structured as follows. The literature review in Section 2 considers work from the literature to motivate the contributions of this paper. The implemented aggregation techniques are presented in Section 3 and the real-life energy systems in Section 4. The clustering methods are evaluated on the energy systems in Section 5. The evaluation leads to the proposal of two new aggregation techniques in Section 5.4. Section 6 contains future work and conclusions are drawn in Section 7.

2. Literature review

The NP-hard Unit Commitment Problem (UC) simulates an energy system, where demand must be met every hour [4]. The Capacity Expansion Problem (CEP) extends the UC with investment decisions. This enables analyses of the introduction of new technologies, the energy mix in case of rapid technology development, flexibility needs, etc. [2]. A general mathematical model for the CEP is summarized as:

\begin{align}
\min \quad & \text{total system costs} + \text{investment costs} \tag{1} \\
\text{s.t.} \quad & \text{production} + \text{import} + \text{storage discharge} = \text{demand} + \text{export} + \text{storage charge} \tag{2} \\
& \text{physical constraints on production units} \tag{3} \\
& \text{available RES} \tag{4} \\
& \text{storage and electric vehicle constraints} \tag{5} \\
& \text{capacity on interconnection lines} \tag{6} \\
& \text{utilization} \le \text{investment} \tag{7} \\
& \text{min investment} \le \text{investment} \le \text{max investment} \tag{8}
\end{align}

The objective function (1) ensures that an investment takes place when the savings of utilizing the investment exceed the investment cost. The total system costs include fuel costs, emission costs, import and export costs, operation and management costs, startup costs etc. Balance constraints (2) ensure that supply and demand meet in every area and in every hour. Physical constraints on production units (3) include efficiency, technical production limits, production technology (condensation, back pressure, etc.) and ramping. Constraints (4) consider available renewable energy sources (RES) subject to curtailment options. Storage and electric vehicle constraints (5) include capacities, losses, charge and discharge rates. Constraints (6) ensure that capacities on interconnection lines are satisfied.

“Utilization” in constraints (7) represents how the investment is utilized in terms of production (for production units and RES), inventory level (for storage units) or import and export (for interconnection lines). The constraints say that the investment must be large enough to facilitate the desired utilization. Finally, bounds (8) ensure that investments are within the user defined bounds.

The CEP is widely used for optimizing the configuration of future energy systems. Examples of applications are non-trivial power systems [5], integration of renewable energy [6], and large energy systems with sector coupling such as power, gas, transport and heating [1, 7, 8].

Incorporating the UC in the CEP is essential to analyze the need for flexibility capacities and the integration of RES [9, 10]. This is particularly the case in systems with much RES [11] or when alternative flexibility sources are analyzed [12]. The hourly time resolution in the UC limits the CEP investments to the day ahead market, instead of considering balancing needs in an intra day or operational setting, which both consist of finer time resolutions.

Recall that the UC is NP-hard. Several studies have considered simplified approaches to include the UC in the CEP [9, 13], e.g., by only considering a subset of the UC constraints [14, 15]. Other approaches to handle the complexity are sophisticated solution approaches such as Benders decomposition [16, 17] and Dantzig-Wolfe decomposition [18]. The resulting problem, however, remains very difficult to solve.

A popular approach is time aggregation, where a subset of the 8760 hours of the year is solved. This reduces the size of the problem to make it more tractable, but at the cost of losing precision. Several literature studies apply time aggregation and conclude that the quality is satisfying [5, 19, 20]. The next section further elaborates on time selection methods.

2.1. Time aggregation techniques for the Capacity Expansion Problem

Many different time aggregation techniques exist, spanning from simple heuristic selections [21] to optimization methods [20]. Heuristic approaches may be too simple and are at times associated with insufficient capture of variability [22] while the optimization approaches suffer from high computational efforts [20].
A compromise between quality and computational tractability is achieved by using clustering procedures [23]. A survey along with a proposed classification and tabular overview of time aggregation methods can be found in [2].

This literature study reviews literature on clustering methods, on selecting elements from clusters, and on comparing methods. Finally, we discuss how this paper contributes to closing gaps in the literature.

2.1.1. Clustering techniques

The aim of all clustering approaches is to minimize the similarity between clusters while maximizing the similarity within each cluster [24]. Clustering approaches differ in how they group elements into clusters and how they select elements from each cluster. An example of a clustering technique can be found in [25], which clusters days according to the hierarchical clustering procedure and where the day closest to each cluster centroid is chosen. Other popular approaches are k-means [26] and fuzzy clustering [27].

Clustering techniques can be categorized into either exclusive (each element is assigned to only one cluster) or overlapping cluster techniques (each element is assigned to all clusters with a degree of membership) [2]. In relation to time aggregation, most clustering approaches belong to the exclusive category, although the use of some overlapping clustering techniques, such as fuzzy clustering, also exists [2, 28, 29]. The most common exclusive techniques are hierarchical clustering and k-means clustering; the former builds a hierarchy of clusters through a sequence of nested partitions, while the latter initializes a grouping which is then iteratively improved [30].

2.1.2. Selecting elements from clusters

Elements must be selected from each cluster to represent the full time horizon. [31] provides an overview of selection strategies including cluster average, element closest to the cluster average and random element selection. Cluster average is criticized for smoothing the profiles [19, 32] which underestimates the need for storage capacity and storage technologies [33]. Random selection shows good results in [2] compared to average, minimum and maximum element selection. Comparisons of element selections are also seen in [25] and [28].

After the selection, the elements are weighted such that the aggregation reflects the relative importance of the elements in the original problem. Typically, fixed weighting is applied, assuming each cluster element to be equally important [29]. The weighting could also choose only to represent a partition of the clusters [34]. To our knowledge, there is no clear conclusion regarding the best selection criterion nor the best weighting strategy.

2.1.3. Comparison of aggregation methods

[35] compare clustering procedures selecting days. They compare k-means clustering, fuzzy c-means clustering and hierarchical clustering with varying linkage criteria. Selection strategies are the mean and the median element selection. The selected element is weighted (repeated) according to the number of elements in its cluster. The paper analyzes electricity demand only, and it only considers how well the original data is represented by the clustering, not the quality of the investment results. They conclude that k-means clustering using the median representative outperforms the other clustering procedures, independently of the number of clusters.

[29] also compare different clustering methods selecting days. They compare hierarchical clustering with minmax linkage criterion and dynamic time warping; a double clustering strategy with k-means clustering and the mentioned hierarchical clustering; and a pure k-means clustering.
The mean element is selected from each cluster, and selected elements are weighted (repeated) according to the number of elements in the clusters. The comparison is based on a single dataset covering three regions. The aggregation methods are compared on the resulting investments. They conclude that the hierarchical clustering and the double clustering have the best performance.

In [36], five clustering approaches selecting days are compared: k-means and k-medoids clustering, each based on a Euclidean distance metric with cluster centers as representative elements; dynamic time warping barycenter averaging clustering; k-shape clustering with a shape-based distance metric; and hierarchical clustering with a Euclidean distance metric. The comparison is based on two different mathematical models representing different types of investments: one based on a battery and one based on a gas turbine. They conclude that the centroid-based clusterings replicate the operational part well.

In [37], three approaches to selecting days are analyzed in a sector-coupled energy system with storage. All three approaches use clustering to select days and weigh (repeat) the selected days to form the full system. The three approaches differ in how storage levels are maintained across the selected days, from no coupling to full detailed coupling. They analyze the approaches on a small synthetic energy system and conclude that more coupling and more selected days give better performance but also longer run times.

Different approaches to reducing the problem size are analyzed in [38]: down-sampling from hourly time resolution to e.g. six-hourly time resolution; clustering days; and heuristically selecting days. The approaches are tested on a representation of the power system in Great Britain with three different settings for renewable energy and storages and on 25 climate years.
They conclude that the best performance is achieved by combining clustering with heuristic selection of extreme days.

Other contributions compare methods on quite small energy systems. [39] applies a mixed integer formulation clustering model on two small energy systems: the University of Parma and a single building. They compare their method with k-means and k-medoids clustering and conclude that their model has better performance. In [40] six aggregation methods are compared on an energy system representing a single building. The clustering considers demand days and they conclude that k-medoids has the best performance.

In [2], three clustering procedures are compared to four non-clustering aggregation techniques. Also, three new aggregation techniques are proposed: one based on dynamically blocking days; one based on optimizing the statistical representation of selected days; and one based on double clustering including correlation as distance measure. All methods are compared on three instances inspired by the Danish power system. The study concludes that the double clustering has the best performance, but that several approaches perform almost as well.

2.2. Hypothesis and contribution of this paper

The common approach in the literature is to solve specific energy systems to perfection, which makes it difficult to compare the results. [35] conclude that k-means has the best performance, [29] hierarchical and double clustering, in [36] medoid based selection is best for investment decisions and [2] show that simple heuristic based aggregation approaches perform well. [36] compares across different energy systems, but these energy systems are based on different underlying models which makes it difficult to draw conclusions for the CEP.

A similar pattern can be seen for element selection strategy.
[28] considers median and mean representatives with median as best option, [25] considers selection of centroid or historical day representation with centroid as best option, and [2] considers minimum, maximum, mean and random element selections with random as best option. To our knowledge, selecting multiple elements from each cluster has not been addressed before. In [28] and [29], a cluster representative is weighted by repeating it, instead of selecting multiple elements from each cluster.

A different approach is not to solve each problem to perfection, but to find a technique which provides overall good performance across different problems [41]. This is the approach we investigate in our work.

The main contribution of this paper is a detailed and structured comparison of different time aggregation approaches on four very different energy systems, based on the same mathematical formulation. With the energy systems being realistic in size and detail, the conclusions are widely applicable.

Since the aggregation technique comparison also includes a very simple approach, this paper furthermore illuminates the relation between aggregation technique complexity and performance. The paper further contributes by comparing different selection strategies when elements are to be selected from each cluster and by considering both single and multiple selections from each cluster. Also, to the best of our knowledge, this paper is the first to illustrate the benefits of considering clustering weightings in the selection.

This paper focuses on the energy system capacity expansion model, Sifre. The full mathematical formulation of Sifre is available at [42] and the capacity expansion module of Sifre is available in Appendix A in [43].
Investment decisions are supported for production units, renewable units, storage, electric vehicles and interconnection lines.

When solving the investment problem, a full year is simulated. Sifre LP relaxes the problem to limit the simulation solution time. The integer variables in the unit commitment problem are LP relaxed and investments are linear instead of discrete (e.g. invest between 0 and 500 MW in a production unit, instead of investing in zero, one or two production units, each of size 150 MW). Still, solving the problem may take many hours because of the problem instance size.

This paper implements the aggregation techniques as part of Sifre, but we still consider the results and analyses valid for other capacity expansion models such as TIMES, Balmorel, EnergyPLAN and energyPRO [35, 44–46].

3. Solution methods

Numerous aggregation approaches are suggested in the literature. Buchholz et al. [2] survey the many approaches and computationally compare aggregation strategies from the literature. According to their study, the following approaches show superior performance:
• Dummy Selection, where every 13th element is selected from the residual load curve
• Statistical Representation, which draws 10000 random samples and from these selects the sample that best represents the means and standard deviation of the original data
• Optimized Selection, which has the same objective as Statistical Representation. Instead of investigating 10000 random samples, this approach finds the optimal sample with respect to means and standard deviation of the original data
• k-means Clustering with squared Euclidean distance measurement
• Cluster Clustering, which first applies k-means clustering with squared Euclidean distances. Each resulting cluster is re-clustered using hierarchical agglomerative clustering with dynamic time warp distance measure and complete linkage criterion (minimizes the maximum distance between two elements; one in each cluster)
• Level Correlation Clustering, which first applies fuzzy clustering with squared Euclidean distances. Then it applies hierarchical agglomerative clustering according to element correlations

To scope the work in this paper, we decide to focus on the clustering methods (the three last methods). We also include Dummy Selection due to its simplicity. As the sector coupled energy systems consist of many timeseries (and not just the residual load), we select every 13th element from each timeseries in Dummy Selection.

3.1. Configuration of the clustering approaches

The survey in [2] shows promising results when clustering days into 28 clusters. We thus apply this configuration. Both the Cluster Clustering and the Level Correlation Clustering generate 7 outer clusters, each of which is re-clustered into 4 sub clusters.

The k-means clustering and fuzzy clustering algorithms depend on an initial clustering. We divide the simulation period evenly into the number of desired clusters. E.g. consider a year of 365 days, where the number of desired clusters is 28 and where days are clustered. Then the first 13 days are assigned to the first cluster, the next 13 days to the next cluster etc.

3.2. Data dimensions

The proposed methods are extended to handle complex sector coupled energy systems, by making them consider all fluctuating timeseries data. This means that the methods consider demand of all energy types (e.g. power, district heating, gas), RES production, import/export prices, fuel prices and availability profiles for production units and interconnection lines.

The clustering approaches consider every fluctuating timeseries separately (instead of summing them into e.g.
a residual load curve) and we also maintain the chronological order (in contrast to duration curves). Demand is negated to make the selection of cluster elements more intuitively understandable. The minimum sum element in a cluster represents a day with low production and high demand. Similarly, the maximum sum element represents a day with high production and little demand.

The clustering approaches must calculate the distance between two days. This is done for each matching pair of timeseries for each hour (e.g. the RES production by offshore wind park Horns Rev 1 for each of the two days). All differences are summed across hours and timeseries to produce the final distance between the two days.

3.3. Selecting days from clusters

Two approaches can be considered for deciding the number of days to select from each cluster. Either one day from each cluster (denoted non-weighted or fixed weighted), or a weighted number of days from each cluster. The benefit of the latter is that typical days and outliers in the full dataset remain (somewhat) typical and outlying in the aggregated dataset. The weight is set according to the cluster size:

\begin{equation}
\text{frequency} = \frac{\text{total number of days in simulation}}{\text{number of clusters}} \tag{9}
\end{equation}

\begin{equation}
\text{weight} = \max\left(1, \operatorname{round}\left(\frac{\text{cluster size}}{\text{frequency}}\right)\right) \tag{10}
\end{equation}

The total number of selected days may exceed the number of clusters for weighted selection. Weighted selection in the literature consists of selecting a single element from each cluster and then repeating this element a number of times [25, 28, 29]. We propose to instead select a weighted number of elements from each cluster. The benefit of this is that time chronology is maintained, i.e., once selection of elements has finished, the original order of the selected elements is applied.
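As a minimal sketch of the weighting in Equations (9) and (10), the Python snippet below computes how many days to pick from each cluster, picks that many existing days per cluster (here the days with the smallest daily sums, one of several possible strategies), and restores chronological order afterwards. The function names, the toy data and the round-half-up tie rule are illustrative assumptions and are not taken from Sifre.

```python
from typing import List, Sequence


def selection_weights(cluster_sizes: Sequence[int], total_days: int) -> List[int]:
    """Days to select per cluster, following Eqs. (9) and (10)."""
    frequency = total_days / len(cluster_sizes)  # Eq. (9)
    # Eq. (10). Assumption: ties are rounded half up; the text states no tie rule.
    return [max(1, int(size / frequency + 0.5)) for size in cluster_sizes]


def select_days(clusters: Sequence[Sequence[int]],
                daily_sums: Sequence[float]) -> List[int]:
    """Pick a weighted number of existing days per cluster (smallest sums first)."""
    total_days = sum(len(days) for days in clusters)
    weights = selection_weights([len(days) for days in clusters], total_days)
    selected: List[int] = []
    for days, weight in zip(clusters, weights):
        ranked = sorted(days, key=lambda day: daily_sums[day])
        selected.extend(ranked[:weight])
    return sorted(selected)  # restore the original chronological order


# Toy data (hypothetical): 12 days grouped into 3 clusters of sizes 8, 3 and 1.
clusters = [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10], [11]]
daily_sums = [5, 3, 9, 1, 7, 8, 2, 6, 4, 0, 10, 11]
weights = selection_weights([len(c) for c in clusters], total_days=12)
chosen = select_days(clusters, daily_sums)
print(weights, chosen)
```

In this toy case the frequency is 12 / 3 = 4, so the weights are 2, 1 and 1, and four days are selected, one more than the number of clusters.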
Also, selecting existing elements instead of generating new ones should represent the original data better.

Several strategies are investigated for deciding which days to select from each cluster: Minimum sum, i.e. the day(s) with smallest sum; Maximum sum, i.e. the day(s) with largest sum; Median sum, i.e. the day(s) with median sum; Closest to Cluster Mean, i.e. the day(s) with shortest distance to the cluster mean; and Random, i.e. randomly chosen day(s). Closest to Cluster Mean is calculated as follows: The mean of a day is calculated for every hour. The distance from an element to the mean is the total Euclidean distance in the 24-dimensional space.

3.4. Test setup

The time aggregation techniques are compared to the optimal solution of each data instance. Since only part of the problem is solved by the time aggregation techniques, the objective function values cannot be compared out of the box. It is, however, possible to generate two full year simulations with fixed investments: one simulation with optimal investments and another simulation with investments from using a time aggregation technique.

The objective function values of these two simulations can then be compared. But the objective function values will not include investment costs and will thus be difficult to understand in relation to investment decisions. Also, the objective function value is of very little interest in the analyses in Energinet, where focus is on the energy mix, the flows, etc. For this reason, we decide to only compare the investment decisions. The performance measure hence becomes:

\begin{equation}
\frac{\sum_i \left| \text{Investment}_i^{\text{aggregated}} - \text{Investment}_i^{\text{optimal}} \right|}{\sum_i \text{Investment}_i^{\text{optimal}}} \tag{11}
\end{equation}

where $i$ is an index for the investments, $\text{Investment}_i^{\text{aggregated}}$ is the investment decision made by the aggregation technique and $\text{Investment}_i^{\text{optimal}}$ the investment decision from the optimal solution.

4. Test instances

The aggregation techniques are tested on four significantly different sector coupled energy systems, all stemming from analyses in Energinet and where data is based on overall assessments. The energy systems are different instances of the LP model summarized in Section 2. An energy system consists of the following components:
• Areas, which represent an energy type and a geographical region, possibly with an attached energy demand
• External areas represent an energy type and a geographical region. They only have a price per MWh for each hour attached. They can only be connected to the rest of the system via an interconnection line
• Production units (or Conversion units or Generation units) convert energy types; examples are CHPs, CCGTs and compressors
• Renewable units (RES) produce energy based on a production profile
• Storages are any types of storages, e.g., batteries or water tanks. Storages can also be used to model line pack in gas systems
• Electric vehicles, which must be charged before the requested driving time; however, the time of charging is flexible

The sector coupled energy systems are described in Table 1.

Table 1: Detailed description of test instances

Energy system   Areas   External areas   Production units   RES   Storages   Interconnectors   Electric vehicles   Demands
DK classic         74                6                294    60         36                 8                   2        66
DK detailed       211                9                396    88         54               109                  16        94
Gas                74                2                 70     7          6                11                   0        15
PtX                27                8                 31     1         16                10                   0         3

The DK classic instance consists of a representation of the Danish power and district heating system in 2020, see Figure 1. The investment decisions focus on heat production and consist of two CHPs, three heat boilers and three heat pumps: a total of 8 investments.

The DK detailed instance proposes a Danish power and district heating system in 2050, see Figure 2.
The electricity areas are split into eight areas to represent possible future grid bottlenecks. Also, the production system includes PtX technologies (Power to X technologies), hence fuels are represented in greater detail than in DK classic and include parts of the transportation sector. The investment decisions focus on seven PtX plants, modelled through fourteen condensing power plants, seven heat pumps and one storage: a total of 22 investments.

The Gas instance consists of a subpart of the Danish gas transmission and distribution systems in 2020, see Figure 3. The instance introduces large amounts of biogas and investigates investments in two compressors from gas distribution systems to the gas transmission system and one investment in connecting distribution systems directly: a total of 3 investments.

The PtX instance models a Power to X cluster as illustrated in Figure 4. The investment possibilities decide how to dimension the PtX cluster and consist of 19 production units, one heat pump and one interconnection line: a total of 21 investments.

Figure 1: An overview of the DK classic instance. The red circles represent electricity areas, the blue lines interconnection lines to neighboring electricity areas. The blue dots show district heating areas in Denmark, which are modelled as 59 district heating areas in the dataset.

Figure 2: An overview of the DK detailed instance. The electricity areas are highlighted. District heating is modelled as 59 areas, as for DK classic.

Figure 4: An overview of the PtX instance. The system integrates many energy types.

Figure 3: An overview of the Gas instance. The left figure illustrates the overall system. A total of 10 distribution systems are modelled in varying detail. The investments are colored red.
The right figure illustrates an example of a modelled gas distribution system, where NG is short for natural gas, 40B is 40 bar and 4B is 4 bar.

5. Results

The computational evaluation is conducted on a 10-core 2.4 GHz machine with 128 GB RAM, using Gurobi 8.1 as solver. The following abbreviations are used in the remainder of this section: Clustering methods: k for k-means, cc for Cluster Clustering, and lc for Level Correlation clustering. Selection strategies: min for minimum sum, max for maximum sum, median for median sum, and cmean for closest to cluster mean. Finally, we have w for weighted selection, n for non-weighted (fixed weighted) selection, and 28 to represent the 28 generated clusters.

Run time results are seen in Table 2 and solution quality gaps in Table 3. Results are analyzed in the following sections. For more details and deeper discussions, the interested reader is referred to [43].

5.1. Time usage

Time reductions are plotted in Figure 5. Note that the solution times also include pre- and postprocessing of the data instances and not only time for solving the linear program.

The time usage savings are consistent across the time aggregation techniques. The average time saving is 90%, which is very satisfying. The time savings are slightly smaller for the DK classic and Gas instances, which could indicate that these instances spend relatively more time on pre- and postprocessing data than the DK detailed and PtX instances.

Table 2: Run times in minutes.
Method                    DK-classic DK-detailed         Gas         PtX
Full test instance             36.45      536.06       11.71       14.69
Dummy selection                 4.38       18.29        1.48        1.47
k,min,w,28                      4.21       17.42        1.34        1.54
k,max,w,28                      5.83       22.65        1.32        1.45
k,median,w,28                   3.99       24.64        1.38        1.49
k,cmean,w,28                    4.25       15.42        1.46        1.65
k,random,w,28                   4.10       22.32        1.39        1.45
cc,min,w,28                     5.74       43.48        1.63        1.07
cc,max,w,28                     5.58       25.04        1.66        1.49
cc,median,w,28                  6.16       48.25        1.62        1.06
cc,cmean,w,28                   6.04       33.26        1.67        1.51
cc,random,w,28                  5.66       31.73        1.71        1.21
lc,min,w,28                     6.21       35.76        1.54        2.15
lc,max,w,28                     8.20       27.42        1.93        1.53
lc,median,w,28                  7.01       21.86        1.90        1.68
lc,cmean,w,28                   6.70       35.30        1.90        1.07
lc,random,w,28                  7.34       34.78        1.91        1.82
k,min,n,28                      5.95       10.19        1.26        1.27
k,max,n,28                      3.89       16.82        1.25        0.88
k,median,n,28                   5.35       16.72        1.24        0.86
k,cmean,n,28                    5.72       17.40        1.25        0.86
k,random,n,28                   4.21       15.35        1.26        1.17
cc,min,n,28                     6.56       21.20        1.36        1.15
cc,max,n,28                     6.48       19.24        1.44        1.01
cc,median,n,28                  5.08       18.80        1.45        0.86
cc,cmean,n,28                   5.19       15.89        1.41        0.76
cc,random,n,28                  5.24       25.55        1.40        0.77
lc,min,n,28                     7.01       21.67        1.71        1.14
lc,max,n,28                     4.83       17.72        1.75        1.05
lc,median,n,28                  6.26       24.05        1.72        1.11
lc,cmean,n,28                   6.41       35.26        1.74        0.93
lc,random,n,28                  6.54       21.10        1.74        1.00

Table 3: Solution quality: the lower the percentage, the better the performance.
Method                    DK-classic DK-detailed         Gas         PtX
Dummy selection                   8%          5%         25%          4%
k,min,w,28                      101%          3%         47%          2%
k,max,w,28                       65%         18%         25%          3%
k,median,w,28                    17%          4%         24%          0%
k,cmean,w,28                     19%          6%         39%          2%
k,random,w,28                    16%          8%         40%          2%
cc,min,w,28                     105%          4%         38%          1%
cc,max,w,28                      30%         23%         12%          7%
cc,median,w,28                   17%          2%         37%          2%
cc,cmean,w,28                    13%          5%         35%          1%
cc,random,w,28                   17%          5%         28%          2%
lc,min,w,28                      23%          4%         61%          1%
lc,max,w,28                      90%         23%         17%          5%
lc,median,w,28                   31%          6%         32%          1%
lc,cmean,w,28                    42%          9%         28%          0%
lc,random,w,28                    6%          2%         23%          0%
k,min,n,28                      102%          4%         51%          1%
k,max,n,28                       66%          7%         40%          6%
k,median,n,28                     9%         11%         23%          1%
k,cmean,n,28                     13%          4%         41%          3%
k,random,n,28                     5%          2%         25%          3%
cc,min,n,28                      75%          7%         51%          2%
cc,max,n,28                      74%         17%         49%          9%
cc,median,n,28                   45%          2%         59%          4%
cc,cmean,n,28                    23%          3%         57%          4%
cc,random,n,28                   37%          6%         55%          8%
lc,min,n,28                      33%          4%         68%          1%
lc,max,n,28                      57%         20%         49%         66%
lc,median,n,28                   40%          6%         35%          2%
lc,cmean,n,28                    47%          8%         29%          2%
lc,random,n,28                   49%          7%         33%          3%

Generally, the time savings are slightly smaller for the weighted selection algorithms. Recall the weighting from Section 3.3: rounding the number of elements to select from a cluster may increase the total number of selected days. Indeed, the weighted selections result in more than 28 selected days, see Table 4.

5.2. Weighted vs. non-weighted selection

Weighted selection performs better than non-weighted selection with respect to solution quality in 62% of the time aggregated simulations. The results are illustrated in Figure 6. In 37 of 60 cases, the investment gap decreases with weighted selection. If gaps are averaged across instances, the gap decreases with weighted selection in 11 out of 15 cases. The average of all gaps is 21% for weighted selection and 26% for non-weighted selection. This confirms that weighted selection better represents the full dataset and that outliers are balanced well against the rest of the dataset. The improved quality may partly be due to the increased number of selected days, see Table 4. It is possible to increase the number of selected days for the non-weighted algorithms and compare the results. This would, however, require that the non-weighted algorithms generate more clusters, which in turn would make comparison more difficult. Instead, we continue to compare the algorithms with 28 clusters. The interested reader is referred to Appendix B in [43] for results for non-weighted selection with more clusters.

Figure 5: Time usage reductions in percent.

Figure 6: Quality gap percentages averaged across the four instances.

Table 4: The number of selected days.

Method                    DK classic DK detailed         Gas         PtX
Dummy selection                   28          28          28          28
k,n,28                            28          28          28          28
cc,n,28                           28          28          28          28
lc,n,28                           28          28          28          28
k,w,28                            31          34          34          36
cc,w,28                           37          38          39          39
lc,w,28                           38          37          36          38

5.3. Selection strategy

The strategies for selecting elements in each cluster perform differently across the instances. Results averaged across the four instances are illustrated in Figure 7. Clearly, the minimum sum and maximum sum selections perform worst. Random and median selection vary slightly, while closest to cluster mean gives consistent results.
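The aggregate figures quoted in Sections 5.2 and 5.3 can be roughly recomputed from the percentages printed in Table 3. The following is a minimal sketch, not the authors' evaluation code: the dictionary layout and the helper names `average_by` and `weighted_wins` are illustrative assumptions, and because the table entries are already rounded, the recomputed averages can deviate slightly from the reported ones.

```python
# Gap percentages as printed in Table 3 (already rounded in the source).
# Key: (clustering method, selection strategy, weighting).
# Values: gaps for DK-classic, DK-detailed, Gas, PtX.
GAPS = {
    ("k", "min", "w"): (101, 3, 47, 2),
    ("k", "max", "w"): (65, 18, 25, 3),
    ("k", "median", "w"): (17, 4, 24, 0),
    ("k", "cmean", "w"): (19, 6, 39, 2),
    ("k", "random", "w"): (16, 8, 40, 2),
    ("cc", "min", "w"): (105, 4, 38, 1),
    ("cc", "max", "w"): (30, 23, 12, 7),
    ("cc", "median", "w"): (17, 2, 37, 2),
    ("cc", "cmean", "w"): (13, 5, 35, 1),
    ("cc", "random", "w"): (17, 5, 28, 2),
    ("lc", "min", "w"): (23, 4, 61, 1),
    ("lc", "max", "w"): (90, 23, 17, 5),
    ("lc", "median", "w"): (31, 6, 32, 1),
    ("lc", "cmean", "w"): (42, 9, 28, 0),
    ("lc", "random", "w"): (6, 2, 23, 0),
    ("k", "min", "n"): (102, 4, 51, 1),
    ("k", "max", "n"): (66, 7, 40, 6),
    ("k", "median", "n"): (9, 11, 23, 1),
    ("k", "cmean", "n"): (13, 4, 41, 3),
    ("k", "random", "n"): (5, 2, 25, 3),
    ("cc", "min", "n"): (75, 7, 51, 2),
    ("cc", "max", "n"): (74, 17, 49, 9),
    ("cc", "median", "n"): (45, 2, 59, 4),
    ("cc", "cmean", "n"): (23, 3, 57, 4),
    ("cc", "random", "n"): (37, 6, 55, 8),
    ("lc", "min", "n"): (33, 4, 68, 1),
    ("lc", "max", "n"): (57, 20, 49, 66),
    ("lc", "median", "n"): (40, 6, 35, 2),
    ("lc", "cmean", "n"): (47, 8, 29, 2),
    ("lc", "random", "n"): (49, 7, 33, 3),
}


def average_by(gaps, key_index):
    """Average gap in percent, grouped by one component of the row key."""
    groups = {}
    for key, row in gaps.items():
        groups.setdefault(key[key_index], []).extend(row)
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


def weighted_wins(gaps):
    """Count (method, selection) pairs whose instance-averaged gap is
    lower with weighted selection than with non-weighted selection."""
    wins = 0
    total = 0
    for (method, selection, weighting), row in gaps.items():
        if weighting != "w":
            continue
        total += 1
        if sum(row) < sum(gaps[(method, selection, "n")]):
            wins += 1
    return wins, total


if __name__ == "__main__":
    print("average gap by weighting:", average_by(GAPS, 2))
    print("average gap by selection:", average_by(GAPS, 1))
    print("weighted wins (instance-averaged):", weighted_wins(GAPS))
```

With the rounded table entries, the averages come out at roughly 20.5% for weighted and 26% for non-weighted selection, and weighted selection has the lower instance-averaged gap in 11 of 15 configurations, in line with the figures reported above. The per-case count (37 of 60) is not recomputed here, since rounding introduces ties that the printed table cannot resolve.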
The same pattern is seen when considering results for weighted selection only, see Figure 8. Selecting only the minimum or maximum sum elements represents the clusters less well. Random selection performs well, which indicates that always selecting the median or closest to cluster mean elements may be too strict.

Figure 7: Quality gap percentages averaged across instances and weighted and non-weighted selection.

5.4. New approaches to promote diversification in selected days

Random selection performs well but, due to its random nature, results are not consistently good. To eliminate the randomness, we instead seek to mimic differentiated selection. We propose the MedianMaxMin selection. The approach is only relevant in weighted selection, where more than one element may be selected from each cluster. First the median element is selected. If more elements are to be selected from the cluster, the maximum element is selected. Again, if more elements are to be selected, the minimum element is selected. If even more elements are to be selected from the cluster, the selection order repeats.

We also propose the kk-means clustering approach (in short kk). The outer clustering is k-means with squared Euclidean distances, where the initial clusters are generated as explained in Section 3.1. The inner clustering is also a k-means with squared Euclidean distances, but this time the initial clusters are formed around the median, maximum sum and minimum sum elements (in the outer cluster).

The two approaches are tested. Run times are seen in Table 5 and solution gaps in Table 6. Run times are consistent with the remaining aggregation approaches. Solution gaps are illustrated in Figure 9. MedianMaxMin selection generally performs better than the other selection strategies. Good results are especially achieved together with k-means, cluster clustering and kk-means.
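The MedianMaxMin order described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: days are ranked by the sum of their time series values (following the "minimum sum", "maximum sum" and "median sum" naming), selection is without replacement, the lower median is used for even-sized clusters, and `weighted_counts` uses a hypothetical proportional rounding rule because the actual weighting of Section 3.3 is not reproduced in this section. The names `weighted_counts` and `median_max_min` are illustrative.

```python
def weighted_counts(cluster_sizes, n_days):
    """Number of days to pick per cluster, proportional to cluster size.

    Assumed rule: round the proportional share and pick at least one day
    per cluster. Rounding can push the total above n_days.
    """
    total = sum(cluster_sizes)
    return [max(1, round(n_days * size / total)) for size in cluster_sizes]


def median_max_min(day_sums, n_select):
    """Pick n_select days in the order median, maximum, minimum, repeating.

    day_sums maps a day identifier to the sum of its time series values.
    Each picked day is removed before the next pick (assumption).
    """
    remaining = sorted(day_sums, key=day_sums.get)  # ascending by sum
    order = ("median", "max", "min")
    picks = []
    step = 0
    while remaining and len(picks) < n_select:
        rule = order[step % len(order)]
        if rule == "median":
            idx = (len(remaining) - 1) // 2  # lower median for even sizes
        elif rule == "max":
            idx = len(remaining) - 1
        else:
            idx = 0
        picks.append(remaining.pop(idx))
        step += 1
    return picks


if __name__ == "__main__":
    # Four hypothetical clusters covering 365 days, aiming for 4 days.
    print(weighted_counts([200, 100, 50, 15], 4))
    # One hypothetical cluster of five days with their value sums.
    cluster = {"d1": 10, "d2": 40, "d3": 25, "d4": 5, "d5": 30}
    print(median_max_min(cluster, 4))
```

In the first call the rounded shares add up to five days although only four were targeted, which illustrates why weighted selection can return more than 28 days in total.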
The kk-means approach performs well overall, however, without outperforming the other clustering approaches. It gives consistent results except for min selection, which generally performs poorly regardless of clustering approach.

Figure 8: Quality gap percentages averaged across instances for weighted selection only.

Table 5: Run times in minutes.

Method                    DK-classic DK-detailed         Gas         PtX
k,MedianMaxMin,w,28             5.67       28.39        1.30        1.52
cc,MedianMaxMin,w,28            5.45       21.55        1.65        2.26
lc,MedianMaxMin,w,28            5.27       27.24        1.57        1.28
kk,min,w,28                     3.86       23.35        1.48        1.20
kk,max,w,28                     7.35       21.78        1.48        1.36
kk,median,w,28                  5.56       13.88        1.50        1.45
kk,cmean,w,28                   5.56       26.14        1.61        1.70
kk,random,w,28                  4.07       24.19        1.54        1.26
kk,MedianMaxMin,w,28            6.33       23.10        1.51        1.35

6. Further analysis

The results reveal different quality across time aggregation techniques and data instances. The data instances are analyzed to better understand the results; specifically, we analyze the behavior of the investments in the optimal solution for the full test instances and what this means for the clustering approaches. Average gaps for the instances are seen in Figure 10.

DK-classic: The investments are mainly utilized in the winter period. They are driven by district heating demand. The data in the instance, however, also contain many other fluctuating timeseries, especially connected to the electricity system: demand, RES production, electricity prices in neighboring countries and capacity restrictions on interconnection lines. The clustering methods end up generating clusters and selecting days that are not relevant to the investments.
Gas: The investments are utilized in the summer period, where demand is low. In these hours, excess biogas is either moved between distribution systems or sold to the transmission system. Gas demand is, however, not the only varying data in the instance. Electricity prices (considered by gas-fueled CHPs) vary throughout the year. Line pack is modelled as storage space with highest value in the spring. Gas demand varies more outside the summer period (the higher the demand, the higher the absolute variation). Hence the clustering approaches end up generating clusters and selecting days from seasons other than the summer, and the solution quality suffers.

DK-detailed: The investments follow RES production and electricity demand. Most timeseries in the dataset are related to RES production and electricity, which explains the good solution quality.

PtX: The fluctuations of the timeseries correspond well to the entire production system, including the optimal investments. All time aggregation methods thus perform well.

Figure 10: Average solution gap for each instance across the clustering methods.

6.1. Overall best aggregation method

Given the analysis in this section and the results in the previous one, we investigate which aggregation method shows the most promising results.

Dummy selection performs well, but the method is not robust towards investments that are utilized in only part(s) of the simulated year. This is the case for the Gas instance, where Dummy selection ends up with a 25% gap. For this reason, it may not generally be the best approach. It does, however, benefit from being very simple to implement and to understand from the analyst’s point of view.

Table 6: Solution quality: the lower the percentage, the better the performance.
Method                    DK-classic DK-detailed         Gas         PtX
k,MedianMaxMin,w,28               4%          4%         15%          1%
cc,MedianMaxMin,w,28             16%          2%         21%          2%
lc,MedianMaxMin,w,28             29%          7%         31%          1%
kk,min,w,28                     104%          3%         43%          3%
kk,max,w,28                      41%          6%          7%          9%
kk,median,w,28                   29%          9%         20%          2%
kk,cmean,w,28                    14%          2%         37%          3%
kk,random,w,28                   12%          2%         48%          3%
kk,MedianMaxMin,w,28              7%          9%         23%          6%

Figure 9: Quality gap percentages averaged across instances for weighted selection and dummy selection.

For the clustering approaches, we have already concluded that the best performance is achieved with weighted selection and with selection methods other than min and max. The overall best performing method in our survey is k-means clustering with weighted selection and with the MedianMaxMin selection strategy. The k-means is simple to implement, and the MedianMaxMin strategy diversifies selection without introducing the uncertainty of randomness.

We recommend this method but are also aware that this is a close call. This could indicate that the performance bottleneck no longer lies in the clustering or selection itself. This is investigated in the next section on future work.

Comparing our results to the literature presented in Section 2, we lean towards the same conclusions as in [35] and in [38] but with a different selection and weighting strategy. The benefits of MedianMaxMin selection are similar to [38], which supplements clustering with selecting extreme days to achieve diversification in the selected days. The similar performance of the clustering methods across very different energy systems indicates why literature studies have concluded on different approaches: the strength of a method may depend on the specific energy system.
This topic is also discussed further in the next section on future work.

6.2. Future work

The clustering methods suffer from generating clusters based on data fluctuations irrelevant to the investment decision. We have identified four ideas to investigate this further.

Future work could focus on methods to better represent data. One method could be to normalize data to take on values between, e.g., -1 and 1. This could lead to a fairer comparison of data stemming from different sources, e.g. comparing capacities with prices. This would, however, also erase the absolute amounts and thus treat e.g. large demands equally to small demands. Fluctuations in small timeseries may cause unimportant days to be selected and thus negatively affect the clustering approach.

Future work could also focus on dimensionality reduction, e.g. by considering the subset of data needed to represent the statistical behavior of each day, or by considering the subset of data which correlates with the investment decisions.

Future work could focus on constructing and sharing benchmark instances representing challenging and different sector coupled energy systems with more investment decisions and where the utilization of the investment decisions is not correlated. The instances could be extended to include seasonal storages to further investigate the methods in [37]. This would allow better comparisons of clustering methods and could result in clearer recommendations.

Finally, future work could be to use other energy simulation models to evaluate the clustering approaches, for example the TIMES, Balmorel, EnergyPLAN and energyPRO models [35, 44–46].

7. Conclusion

In this paper, we have investigated the performance of clustering techniques across very different energy systems to give a recommendation of a method with overall good performance.
This contrasts with the current approach of developing clustering techniques that perform well on specific energy systems and thus contributes to closing a gap in the research literature.

The clustering techniques all select a subset of days from the datasets, which cover a full year. The applied methods are k-means, hierarchical clustering and a double clustering procedure applying a fuzzy clustering, followed by a hierarchical clustering considering element correlations. Also, we proposed a new method consisting of double k-means clustering.

The methods cluster days and then select a number of days from each cluster. We have tested several selection strategies from the literature: min, max, median, closest to cluster mean and random. We have also proposed a new selection strategy, MedianMaxMin, which selects elements in the named order. Finally, we have investigated the effect of selecting a single element from each cluster or a weighted number of elements from each cluster.

All in all, this resulted in a comparison of 41 aggregation techniques, and the results were benchmarked against the full datasets. The comparison is evaluated on how well the investment decisions are matched. The methods were tested on four very different energy systems to investigate performance consistency and to analyze if certain energy system aspects are more difficult to replicate through aggregation.

The tests showed that all aggregation techniques resulted in significant time reductions between 78% and 97%. The tests also revealed that weighted selection outperformed selecting exactly one element from each cluster. To the best of our knowledge, this has not been analyzed or concluded previously in the literature.
Selecting minimum or maximum elements from each cluster was generally not a good strategy. The new selection method, MedianMaxMin, and clustering method, kk, both performed consistently well. Especially k-means with MedianMaxMin selection showed very good performance, and this is also the clustering approach we recommend.

We also tested Dummy selection, which simply selects every 13th day. Overall, it performed surprisingly well. Considering its simplicity, it could be a good alternative to the more complex clustering methods as it is easy to implement and understand.

Future work could focus on how data is considered when clustering. In this paper, all timeseries are considered. A closer analysis of the test instances revealed that this may not be the best approach, as data irrelevant to the investments caused the aggregation techniques to select days that were also irrelevant to the investment decisions.

Future work could also focus on constructing a library of benchmark instances. This would strengthen the research area of time aggregation techniques applied to capacity expansion models, as this would better allow for systematic comparison of methods.

Combining qualitative methods with quantitative methods as proposed in [47] is also an interesting path of future work.

References

[1] T. Brown, D. Schlachtberger, A. Kies, S. Schramm and M. Greiner, “Synergies of sector coupling and transmission extension in a cost-optimised, highly renewable European energy system”. Energy 160, 2018. https://doi.org/10.1016/j.energy.2018.06.222.
[2] S. Buchholz, M. Gamst and D. Pisinger, “A Comparative Study of Aggregation Techniques in relation to Capacity Expansion Energy System Modeling”. TOP, vol. 27, no. 3, pp. 353-405, 2019. https://doi.org/10.1007/s11750-019-00519-z.
[3] O. M. Babatunde, J. L. Munda and Y. Hamam, “A comprehensive state-of-the-art survey on power generation expansion planning with intermittent renewable energy source and energy storage”. International Journal of Energy Research, 2019. https://doi.org/10.1002/er.4388.
[4] C. Baldwin, K. Dale, and R. Dittrich, “A Study of the Economic Shutdown of Generating Units in Daily Dispatch”. Power Apparatus and Systems, Part III, Transactions of the American Institute of Electrical Engineers 78, pp. 1272-1282, 1960. https://doi.org/10.1109/AIEEPAS.1959.4500539.
[5] N. Koltsaklis and M. Georgiadis, “A multi-period, multi-regional generation expansion planning model incorporating unit commitment constraints”. Applied Energy 158, pp. 310-331, 2015. https://doi.org/10.1016/j.apenergy.2015.08.054.
[6] V. Oree, S. Z. Sayed Hassen and P. Fleming, “Generation expansion planning optimisation with renewable energy integration: A review”. Renewable and Sustainable Energy Reviews 69, pp. 790-803, 2017. https://doi.org/10.1016/j.rser.2016.11.120.
[7] J.C. Osorio-Aravena, A. Aghahosseini, D.B.U. Caldera, E. Muñoz-Cerón, and C. Breyer, “Transition toward a fully renewable-based energy system in Chile by 2050 across power, heat, transport and desalination sectors”. International Journal of Sustainable Energy Planning and Management 25, pp. 77–94, 2020. https://doi.org/10.5278/ijsepm.3385.
[8] M.G. Prina, D. Moser, R. Vaccaro, and W. Sparber, “EPLANopt optimization model based on EnergyPLAN applied at regional level: the future competition on excess electricity production from renewables”. International Journal of Sustainable Energy Planning and Management 27, pp. 35–50, 2020. https://doi.org/10.5278/ijsepm.3504.
[9] B. Hua, R. Baldick and J. Wang, “Representing Operational Flexibility in Generation Expansion Planning Through Convex Relaxation of Unit Commitment”. IEEE Transactions on Power Systems 33, pp. 2272-2281, 2017. https://doi.org/10.1109/TPWRS.2017.2735026.
[10] J. P. Deane, A. Chiodi, M. Gargiulo and B. O’Gallachoir, “Soft-linking of a power systems model to an energy systems model”. Energy 42, pp. 303-312, 2012. https://doi.org/10.1016/j.energy.2012.03.052.
[11] B. Palmintier and M. Webster, “Impact of Operational Flexibility on Electricity Generation Planning With Renewable and Carbon Targets”. IEEE Transactions on Sustainable Energy, 2015. https://doi.org/10.1109/TSTE.2015.2498640.
[12] K. Poncelet, E. Delarue, and W. D’haeseleer, “Unit commitment constraints in long-term planning models: Relevance, pitfalls and the role of assumptions on flexibility”. Applied Energy 258, 113843, 2019. https://doi.org/10.1016/j.apenergy.2019.113843.
[13] A. Viana and J. P. Pedroso, “A new MILP-based approach for unit commitment in power production planning”. International Journal of Electrical Power & Energy Systems 44 (1), pp. 997-1005, 2013. https://doi.org/10.1016/j.ijepes.2012.08.046.
[14] M. Welsch, P. Deane, M. Howells, B. O’Gallachóir, F. Rogan, M. Bazilian and H. Rogner, “Incorporating flexibility requirements into long-term energy system models – A case study on high levels of renewable electricity penetration in Ireland”. Applied Energy 135, pp. 600-615, 2014. https://doi.org/10.1016/j.apenergy.2014.08.072.
[15] H. K. Ringkjøb, P. Haugan, and I. Solbrekke, “A review of modelling tools for energy and electricity systems with large shares of variable renewables”. Renewable and Sustainable Energy Reviews 96, pp. 440-459, 2018. https://doi.org/10.1016/j.rser.2018.08.002.
[16] A. Schwele, J. Kazempour and P. Pinson, “Do unit commitment constraints affect generation expansion planning? A scalable stochastic model”. Energy Systems, 2018. https://doi.org/10.1007/s12667-018-00321-z.
[17] C. L. Lara, D. S. Mallapragada, D. J. Papageorgiou, A. Venkatesh and I. E. Grossmann, “Deterministic electric power infrastructure planning: Mixed-integer programming model and nested decomposition algorithm”. European Journal of Operational Research 271(3), pp. 1037-1054, 2018. https://doi.org/10.1016/j.ejor.2018.05.039.
[18] A. Flores-Quiroz, R. Palma-Behnke, G. Zakeri and R. Moreno, “A column generation approach for solving generation expansion planning problems with high renewable energy penetration”. Electric Power Systems Research 136, pp. 232-241, 2016. https://doi.org/10.1016/j.epsr.2016.02.011.
[19] K. Poncelet, E. Delarue, D. Six, J. Duerinck and W. D’haeseleer, “Impact of the level of temporal and operational detail in energy-system planning models”. Applied Energy, vol. 162, no. 58, pp. 631-643, 2016. https://doi.org/10.1016/j.apenergy.2015.10.100.
[20] K. Poncelet, H. Hoschle, E. Delarue, A. Virag and W. D’haeseleer, “Selecting representative days for capturing the implications of integrating intermittent renewables in generation expansion problems”. IEEE Transactions on Power Systems, 2016. https://doi.org/10.1109/TPWRS.2016.2596803.
[21] M. Fripp, “Switch: A Planning Tool for Power Systems with Large Shares of Intermittent Renewable Energy”. Environmental Science & Technology 46, pp. 6371–6378, 2012. https://doi.org/10.1021/es204645c.
[22] J. H. Merrick, “On representation of temporal variability in electricity capacity planning models”. Energy Economics 59, pp. 261-274, 2016. https://doi.org/10.1016/j.eneco.2016.08.001.
[23] L. Kotzur, P. Markewitz, M. Robinius and D. Stolten, “Impact of different time series aggregation methods on optimal energy system design”. Renewable Energy 117, 2017. https://doi.org/10.1016/j.renene.2017.10.017.
[24] W. Fisher, “On Grouping for Maximum Homogeneity”. Journal of The American Statistical Association 53, pp. 789-798, 1958. https://doi.org/10.1080/01621459.1958.10501479.
[25] P. Nahmmacher, E. Schmid, L. Hirth and B. Knopf, “Carpe Diem: A Novel Approach to Select Representative Days for Long-Term Power System Models with High Shares of Renewable Energy Sources”. Energy, vol. 112, pp. 430-442, 2016. https://doi.org/10.1016/j.energy.2016.06.081.
[26] J. B. MacQueen, “Some Methods for classification and Analysis of Multivariate Observations”. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability 1, pp. 281–297, 1967.
[27] J. C. Dunn, “A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters”. Journal of Cybernetics 3, pp. 32–57, 1973. https://doi.org/10.1080/01969727308546046.
[28] M. ElNozahy, M. Salama and R. Seethapathy, “A probabilistic load modelling approach using clustering algorithms”. IEEE Power and Energy Society General Meeting, pp. 1-5, 2016. https://doi.org/10.1109/PESMG.2013.6672073.
[29] Y. Liu, R. Sioshansi and A. J. Conejo, “Hierarchical clustering to find representative operating periods for capacity-expansion modeling”. IEEE Transactions on Power Systems, vol. 33, no. 3, pp. 3029-3039, 2017.
[30] J. Han, M. Kamber and J. Pei, “10 – Cluster Analysis: Basic Concepts and Methods”. In Data Mining (Third Edition), Morgan Kaufmann, 2012, pp. 443-495.
[31] R. Green, I. Staffell and N. Vasilakos, “Divide and Conquer? K-means Clustering of Demand Data Allows Rapid and Accurate Simulations of the British Electricity System”. IEEE Transactions on Engineering Management, vol. 61, no. 2, pp. 251-260, 2014. https://doi.org/10.1109/TEM.2013.2284386.
[32] M. Nicolosi, A. Mills and R. Wiser, “The importance of high temporal resolution in modeling renewable energy penetration scenarios”. 9th Conference on Applied Infrastructure Research, 2011.
[33] IRENA (2017), “Planning for the Renewable Future: Long-term modelling and tools to expand variable renewable power in emerging economies”. International Renewable Energy Agency, Abu Dhabi, 2017.
[34] D. Rogers, R. Plante, R. Wong and J. Evans, “Aggregation and disaggregation Techniques and Methodology in Optimization”. Operations Research, vol. 39, no. 4, pp. 553-582, 1991. https://doi.org/10.1287/opre.39.4.553.
[35] F. Wiese, R. Bramstoft, H. Koduvere, A. Alonso, O. Balyk, J. Kirkerud, Å. Tveten, T. Bolkesjø, M. Münster and H. Ravn, “Balmorel open source energy system model”. Energy Strategy Reviews, vol. 20, pp. 26-34, 2018. https://doi.org/10.1016/j.esr.2018.01.003.
[36] H. Teichgraeber and A. Brandt, “Clustering methods to find representative periods for the optimization of energy systems: An initial framework and comparison”. Applied Energy, vol. 239, pp. 1283-1293, 2019. https://doi.org/10.1016/j.apenergy.2019.02.012.
[37] P. Gabrielli, M. Gazzani, E. Martelli and M. Mazzotti, “Optimal design of multienergy systems with seasonal storage”. Applied Energy 219, pp. 408-424, 2018. https://doi.org/10.1016/j.apenergy.2017.07.142.
[38] S. Pfenninger, “Dealing with multiple decades of hourly wind and PV time series in energy models: A comparison of methods to reduce time resolution and the planning implications of inter-annual variability”. Applied Energy 197, pp. 1-13, 2017. https://doi.org/10.1016/j.apenergy.2017.03.051.
[39] M. Zatti, M. Gabba, M. Freschini, M. Rossi, A. Gambarotta, M. Morini and E. Martelli, “k-MILP: A novel clustering approach to select typical and extreme days for multi-energy systems design optimization”. Energy 181, pp. 1051-1063, 2019. https://doi.org/10.1016/j.energy.2019.05.044.
[40] T. Schütz, M. H. Schraven, M. Fuchs, P. Remmen and D. Müller, “Comparison of clustering algorithms for the selection of typical demand days for energy system synthesis”. Renewable Energy 129 Part A, pp. 570-582, 2018. https://doi.org/10.1016/j.renene.2018.06.028.
[41] R. Anand, D. Aggarwal and V. Chahar, “A Comparative Analysis of Optimization Solvers”. Journal of Statistics and Management Systems 20, 2017. https://doi.org/10.1080/09720510.2017.1395182.
[42] “Sifre – Simulation of Flexible and Renewable Energy Systems”. [Online]. Available: https://energinet.dk/-/media/0C7AA9C78EBE428580CAB85E120129CB.pdf.
[43] M. Gamst, S. Buchholz and D. Pisinger, “Time Aggregation Techniques Applied to a Capacity Expansion Model for Real-Life Sector Coupled Energy Systems”. arXiv, 2020. arXiv:2012.10244.
[44] R. Loulou, U. Remne, A. Kanudia, A. Lehtila and G. Goldstein, “Documentation for the MARKAL Family of Models – PART 1”. [Online]. Available: https://iea-etsap.org/MrklDoc-I_StdMARKAL.pdf.
[45] H. Lund, J. Z. Thellufsen, P. A. Østergaard, P. Sorknæs, I. R. Skov, and B. V. Mathiesen, “EnergyPLAN – Advanced analysis of smart energy systems”. Smart Energy, vol. 1, 2021. https://doi.org/10.1016/j.segy.2021.100007.
[46] A. N. Andersen and S. Frandsen, “Development of a computer-based tool energyPRO, for simulation and optimisation of operational strategy for CHP biomass-fired plants”. [Online]. Available: https://www.osti.gov/etdeweb/biblio/20235617.
[47] P.F. Borowski, “New Technologies and Innovative Solutions in the Development Strategies of Energy Enterprises”. HighTech and Innovation Journal, vol. 1, pp. 39-58, 2020.