Comparison of machine learning techniques for SoC and SoH evaluation from impedance data of an aged lithium ion battery


ACTA IMEKO 
ISSN: 2221-870X 
June 2021, Volume 10, Number 2, 80 - 87 

 
ACTA IMEKO | www.imeko.org June 2021 | Volume 10 | Number 2 | 80 

Davide Aloisio1, Giuseppe Campobello2, Salvatore Gianluca Leonardi1, Francesco Sergi1, 
Giovanni Brunaccini1, Marco Ferraro1, Vincenzo Antonucci1, Antonino Segreto2, Nicola Donato2 

1 Institute of Advanced Energy Technologies “Nicola Giordano”, National Research Council of Italy, Salita S. Lucia sopra Contesse, 5 - 98126,  
 Messina, Italy  
2 University of Messina, Department of Engineering, C.da di Dio, Vill. S.Agata, 98166 Messina, Italy  

 
Section: RESEARCH PAPER  

Keywords: Machine Learning; Electrochemical impedance spectroscopy EIS; Lithium-ion battery; State of Charge; State of Health 

Citation: Davide Aloisio, Giuseppe Campobello, Salvatore Gianluca Leonardi, Francesco Sergi, Giovanni Brunaccini, Marco Ferraro, Vincenzo Antonucci, 
Antonino Segreto, Nicola Donato, Comparison of machine learning techniques for SoC and SoH evaluation from impedance data of an aged lithium ion 
battery, Acta IMEKO, vol. 10, no. 2, article 12, June 2021, identifier: IMEKO-ACTA-10 (2021)-02-12 

Section Editor: Ciro Spataro, University of Palermo, Italy 

Received January 18, 2021; In final form April 29, 2021; Published June 2021 

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This work was funded by the Italian Ministry of Economic Development under the programme “Ricerca di Sistema”, project electrochemical storage 

Corresponding author: Davide Aloisio, e-mail: aloisio@itae.cnr.it   

 
1. INTRODUCTION 

As is well known, Machine Learning (ML) is a subfield of
computing, an Artificial Intelligence (AI) technique that provides
machines with the ability to learn from field data without
explicit programming [1]. In particular, ML is useful
in applications that try to extract information or unknown
properties (‘features’) from a dataset (usually called the ‘training
set’) coming from data warehouses or data lakes. Information
extracted from this kind of data analysis can be used to develop
prediction models of system behaviour (subject to certain
operating conditions and under some constraints). Battery
behaviour, in particular, is difficult to describe through
analytical models, mainly because many parameters contribute to
the ageing evolution (e.g. charge and discharge current rates,
operating temperature, depth of discharge (DoD) reached, state of
charge (SoC) during rest periods, and so on). This is particularly
evident for Li-Ion batteries, whose electrochemical processes are
harder to describe with analytical equations because of the
nonlinearities present in their behaviour. In addition to input
data on the actual working conditions (current, temperature, etc.),
analytical models require the knowledge of many parameters
(geometry, density and porosity of materials, etc.). These data are
not always available or easily measurable and can vary over time
(e.g. due to ageing). Therefore, analytical models can be
affected by inaccuracy.

ABSTRACT 
State of charge estimation and ageing evolution of lithium ion (Li-Ion) batteries are key issues for their large-scale deployment in the
market. However, battery behaviour is very complex to understand because many parameters contribute to the ageing
evolution. Therefore, traditional analytical models employed for this purpose are often affected by inaccuracy. In this context, machine
learning techniques can provide a viable alternative to traditional models and a useful tool to characterise battery behaviour.
In this work, different machine learning techniques were applied to model the impedance evolution over time of an aged cobalt-based
Li-Ion battery, cycled under a stationary frequency regulation profile for grid application. The different ML techniques were compared
in terms of their accuracy in determining the state of charge and the state of health as the battery ages. Experimental results
showed that ML based on the Random Forest algorithm can be profitably used for this purpose.


In this context, ML techniques represent a viable alternative 
and a useful tool for modelling the battery behaviour. ML 
algorithms learn directly from experimental data, reducing the 
complexity of modelling, usually due to the high number of 
parameters and empirical adjustments needed. In addition, 
according to the recent literature, the application of ML
techniques to the prediction of the ageing of Li-Ion batteries
yields errors in the range between 0.5% and 5.5% [2]-[4]. This range
of accuracy is considered a good compromise among algorithm 
complexity, effort spent on model development and reliability of 
results. 

Many physical and electrical parameters are characteristic of 
the chemical reactions inside Li-ion batteries. Therefore, these 
relations could be used as tools for the battery state modelling 
[5], [6]. Typically, these features are derived from charging and 
discharging curves, since typical battery management systems 
(BMS) are able to collect current-voltage data. Hence these are 
the most commonly used parameters for real time battery 
monitoring [7], [8]. Various approaches based on the use of 
different parameters were proposed in the literature to train 
machine learning models. Among battery parameters, single-point
terminal voltage, current, temperature, charge/discharge
profiles [2], [9], [10] or their geometrical characteristics [11] were
employed for this purpose. However, much more information
about the status of the battery can be extracted from the 
impedance spectra recorded by means of electrochemical 
impedance spectroscopy (EIS) [12]. Indeed, the impedance 
spectrum of a lithium cell contains rich information on all 
materials properties, interfacial phenomena and electrochemical 
reactions.  

From a practical point of view, much of this information can be
extracted from the Nyquist diagram, in which the negative imaginary
part of the impedance is plotted against the real part for each
investigated excitation frequency. In the case of Li-Ion
batteries, the Nyquist diagram consists of four distinct regions,
typically within the frequency range from 10 mHz to
10 kHz [13]. In the low frequency region, an almost linear trend
in the Nyquist plot is representative of the solid diffusion of 
lithium ions through the electrodes material. In the medium-high 
frequencies range, one or more semicircles usually represent the 
impedance of either charge transfer phenomena or passivation 
layers on the electrodes surface (solid electrolyte interphase-SEI). 
The intersection of the impedance spectrum with the real axis 
(pure ohmic impedance) represents the cell internal resistance. 
Finally, the high frequency region is representative of inductive 
phenomena. Since each of these phenomena is strictly
related to temperature, to SoC and to the state of health (SoH)
of the cell, the analysis of the impedance data can be used
to monitor the status of the battery [14]. However, due to the
large number of data involved in a single EIS spectrum and the 
amount of information it can contain, the use of conventional 
data analysis methods may be difficult. Also, because of the 
difficulties in measuring the impedance while the cells are active, 
EIS is not widely used [15]. To overcome this drawback, 
increasing attention is paid to the implementation of ML 
approaches, either to aid the fitting of the parameters of 
equivalent circuits able to describe the battery impedance [16], or 
by directly modelling the entire impedance spectrum [17]. 
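
As a rough illustration of how a semicircle arises in the Nyquist plane, the sketch below evaluates a resistance in series with a single parallel RC branch. The circuit topology and the component values (r0, r1, c1) are assumptions chosen only for this example; they are not fitted to the tested cell.

```python
import math

def rc_impedance(freq_hz, r0=0.04, r1=0.03, c1=0.5):
    """Impedance of R0 in series with (R1 parallel C1); values are illustrative."""
    omega = 2.0 * math.pi * freq_hz
    return r0 + r1 / (1.0 + 1j * omega * r1 * c1)

def nyquist_point(z):
    """Return (Re(Z), -Im(Z)), the coordinates used in a Nyquist diagram."""
    return z.real, -z.imag

# Frequency at which the semicircle reaches its apex: f = 1 / (2*pi*R1*C1)
f_apex = 1.0 / (2.0 * math.pi * 0.03 * 0.5)
re_apex, neg_im_apex = nyquist_point(rc_impedance(f_apex))

# Ten points per decade from 10 mHz to 10 kHz, as in a typical EIS sweep
sweep = [nyquist_point(rc_impedance(10.0 ** (-2 + k / 10))) for k in range(61)]
```

In this simplified model the low-frequency end of the sweep approaches R0 + R1 on the real axis and the high-frequency end approaches R0, so the width of the semicircle gives R1. A real cell adds diffusion and inductive contributions that this sketch ignores.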

In this paper, ML is used to identify possible
methodologies to estimate the SoC and SoH of a Li-Ion battery
from EIS data, mainly aiming at developing a feasible model that
is easy to integrate in a battery management system (BMS).
The implementation in BMSs of techniques able to extend the
useful life of batteries and to estimate their possible replacement
time (estimation of Remaining Useful Life, or RUL) is considered
a key research activity in the field [18].

In Section 2, some state-of-the-art ML techniques applied
to SoH and RUL estimation are reviewed. Section 3 describes the
experimental procedures employed to age the Li-Ion cell; the 
main parameters extrapolated to create the dataset for the 
algorithm; and the methods for their collection. Section 4 
describes the methodology used to carry out the first selection of 
ML algorithm and the validation of the model. Section 5 presents 
the main results related to the use of different classifiers to model 
both SoC and capacity loss of the Li-Ion cell. Finally, in 
Section 6, the main observations are summarised. 

2. ML ALGORITHMS FOR STATE OF HEALTH (SOH) AND 
REMAINING USEFUL LIFE (RUL) EVALUATION: A BRIEF 
REVIEW 

Thanks to the remarkable computational capabilities of 
today’s systems, learning algorithms applied to large quantities of 
data have often become the preferred approach in the search and 
identification of complex system behaviour, and therefore 
represent a valid tool for SoH estimation of batteries. In these
techniques, a large amount of data, consisting of the main battery
parameters, is collected continuously up to the end of the battery life.
The dataset analysis of the battery life, performed by learning 
algorithms, allows extracting non-linear relationships among the 
various parameters. The knowledge derived from this kind of 
information can allow a careful management of the battery, 
helping to extend the useful life and giving reliable prediction on 
possible replacement times, with obvious positive impact on 
costs and investments.  

ML techniques such as Fuzzy Logic (FL), Support Vector 
Machine (SVM) and Artificial Neural Networks (ANN) have 
extensively been applied for the estimation of the health of 
batteries, and a brief review can be found in [3]. In most cases,
SoH is estimated by determining battery capacity and internal
resistance, parameters strictly related to SoH, from the analysis
of the behaviour of input variables (current, temperature, voltage, etc.).
An application of Fuzzy Logic with a potential use in portable
devices is reported in [19], where the Electrochemical Impedance
Spectroscopy (EIS) technique was used for the dataset creation.
However, improper hypotheses in the Fuzzy rules [3] and a
reduced set of observations can lead to substantial errors. The
Support Vector Machine is a regression algorithm which
maps nonlinearities in a lower-dimensional space to a linear
model developed in a higher-dimensional one [20]. Examples of
application of this technique to SoH estimation are reported in [21]-[25].
In particular, in [25] an online method for SoH estimation
was developed, determining support vectors by means of portions
of charging curves. An SoH error of less than 2% was achieved in 80% of
the cases for commercial NMC Li-ion batteries.
The accuracy of the results is strongly dependent on the noise 
and operational conditions; hence, other data manipulation 
techniques (particle filter, Bayesian technique) have to be used in 
conjunction with SVM to increase the robustness of the 
estimation [26], thus increasing the complexity of 
implementation. The Relevance Vector Machine (RVM) is suggested
as a possible improvement of this approach in [20]. Artificial
Neural Networks (ANNs), inspired by the biological functioning of
the human brain, are probably the most widely used approach for
modelling nonlinear systems. SoH estimation using an
independently recurrent neural network (IndRNN) was realised in
[3]. Here SoH was predicted accurately, with a root mean square
error (RMSE) of 1.33% and a mean absolute error (MAE) of
1.14%. The main limitation is the need for a detailed analysis of the
experimental dataset. Different chemistries can require a precise
identification and understanding of input parameters.
In [27], an improved neural network method based on the
combination of Long Short-Term Memory (LSTM) and Particle Swarm
Optimisation (PSO) was developed. The methodology
proposed there uses additional techniques in each part of
the learning process, such as PSO for the optimisation of the
weights, dynamic incremental learning for SoH model updating, and
the CEEMDAN method to denoise raw data, with the aim of
increasing the accuracy of the model [27].
Another hybrid approach can be found in [28], where the false
nearest neighbour method was used in conjunction with a mixed
LSTM and convolutional neural network (CNN) as a solution for
unreliable sliding window sizes, a problem commonly present in
data-driven RUL evaluation approaches. Given their complexity and
topology, the ANNs used in these works are classified as
deep learning, an evolution of the machine learning concept
coined for neural networks, which builds on the multi-layer
perceptron (MLP). A comparison between deep learning and
other common techniques, showing the potential and
advantages of data-driven approaches, was presented in [4]. The
outcomes showed the good performance of deep neural networks
(DNN), which are suitable when high accuracy is needed.
However, this technique is also not easy to implement, owing
to its higher computational complexity and resource requirements [4].

Many other techniques and approaches can be found in the
literature. Although a full review is out of the scope of the present
work, the goal of a possible implementation in a BMS suggests the
choice of low-complexity approaches, which reduce the computational
resources needed and thus lead to lower energy consumption [29]. A
possible alternative is given by Random Forest algorithms. They
generally use reduced computational resources, and can thus be
preferable to the other techniques analysed, based on
SVM and NN. In general, the response of linear regressors or Random
Forests is faster than that of more complex models and is easily
interpreted. However, it has to be underlined that the accuracy of Random
Forest models is related to the number and size of trees and
therefore to the availability of memory [1], [30].

3. DATASET COLLECTION AND CREATION 

The present work was aimed at the development of a method
to identify the degradation level induced by the use of Li-Ion
batteries in a primary frequency regulation (FR) service. More
precisely, the activity was focused on the identification of the
main parameters indicating the state of battery degradation. For
this purpose, cylindrical-type 18650 Li-Ion cells (Table 1)
were cycle aged according to a test profile derived from the
standard IEC 61427-2 [31].

The standard profile requires that the storage system is able 
to provide symmetrical charging and discharging phases at 
constant power of 500 kW and 1000 kW, respectively, with a 
voltage range of 400–600 V. Therefore, the profile was adapted 
to the single cell characteristics. Moreover, in order to enhance 
the degradation of the cell (thus limiting the overall duration 
required for data collection) FR ageing tests were accelerated by 
operating at an ambient temperature of 45 °C. In fact, the 
degradation processes of Li-Ion batteries are accelerated by 
temperature increase [32]. 

The ageing tests were performed by a dual-channel Bitrode 
FTV1 battery cycler. In addition, the cell was tested under 
temperature-controlled atmosphere in an Angelantoni Discovery 
DM 340 BT climatic chamber.  

The FR ageing profile with the actual power steps imposed on the
cell is shown in Figure 1. The full ageing protocol consisted of a 
first charge of the cell up to 100% SoC and then an execution of 
the FR profile. Once the cell reached the lower voltage cut-off 
threshold (discharged), a recharge up to 100% SoC was 
performed and then the cycle was restarted. 

The ageing level was defined in terms of the residual capacity
retained by the cell. This information was obtained from periodic
check-ups carried out on the cell approximately every 10 days of
operation. Each parametric check-up consisted of the
measurement of the residual capacity and of the impedance by
means of the EIS technique. Both analyses were carried out with
a high-reliability Autolab 302N potentiostat/galvanostat (whose
potential accuracy and current accuracy are both ±0.2% of the
full-scale value). It is worth noting that, given the instrument
calibration and performance, the measurements were considered
reliable and their uncertainty was assumed to have no impact on
the model. The robustness of the model will be investigated in
future work.

Capacity tests, consisting of a galvanostatic discharge at the
nominal C-rate and room temperature, allowed the extraction of the
characteristic parameters indicative of the SoH of the cell. The
recorded discharge curves at the beginning of life (BoL) and at different
SoH levels are reported in Figure 2a. In particular, Residual
capacity (Cd) and Residual energy (Ed) were collected and used
as output variables of the database. The value of Cd was obtained
by integrating the actual current (Id) between the beginning of discharge
(t0) and the end of discharge (tf), within the upper and lower voltage
cut-off limits:

$$C_d = \int_{t_0}^{t_f} I_d(t)\,\mathrm{d}t . \qquad (1)$$

Table 1. Main characteristics of the tested Li-Ion cell.

Description              Value
Nominal voltage          3.7 V
Nominal capacity         1.1 Ah
Max charge current       4 A
Max discharge current    10 A
Maximum voltage          4.2 V
Minimum voltage          2.5 V
Discharge temperature    -30 to 60 °C
Charge temperature       0 to 60 °C
Chemistry                LiCoO2-LiNiCoMnO2/Graphite

 
Figure 1. Power profile used to age the battery according to a frequency 
regulation profile extrapolated from the international standard IEC 61427-2. 



The quantity Ed was obtained by integrating the actual power
(Pd) between the beginning of discharge (t0) and the end of discharge
(tf), within the upper and lower voltage cut-off limits:

$$E_d = \int_{t_0}^{t_f} P_d(t)\,\mathrm{d}t , \qquad (2)$$

where $P_d(t) = V(t) \cdot I_d(t)$, with $V(t)$ and $I_d(t)$ representing
the instantaneous values of voltage and current, respectively.
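
With sampled cycler data, the integrals in (1) and (2) are evaluated numerically. The following minimal sketch uses the trapezoidal rule; the function names and the synthetic one-hour discharge are assumptions made for illustration and do not reproduce the processing actually used for the check-ups.

```python
def trapezoid(t, y):
    """Trapezoidal integral of samples y taken at times t."""
    return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k]) for k in range(len(t) - 1))

def residual_capacity_ah(t_s, i_a):
    """Eq. (1): integrate the discharge current (A) over time (s), result in Ah."""
    return trapezoid(t_s, i_a) / 3600.0

def residual_energy_wh(t_s, v_v, i_a):
    """Eq. (2): integrate P_d(t) = V(t) * I_d(t) over time (s), result in Wh."""
    power_w = [v * i for v, i in zip(v_v, i_a)]
    return trapezoid(t_s, power_w) / 3600.0

# Synthetic 1 h discharge at 1.1 A, voltage falling linearly from 4.2 V to 2.5 V
t_s = [float(k) for k in range(0, 3601, 60)]
i_a = [1.1] * len(t_s)
v_v = [4.2 - (4.2 - 2.5) * t / 3600.0 for t in t_s]

cd = residual_capacity_ah(t_s, i_a)      # 1.1 Ah for this synthetic profile
ed = residual_energy_wh(t_s, v_v, i_a)   # 3.685 Wh for this synthetic profile
```

A real discharge curve is not linear in voltage, so the sampling interval should be short enough for the trapezoidal rule to follow the knee near the lower cut-off.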

SoH levels were defined as the capacity loss of the cell
identified during each parametric check-up.

As input variables of the algorithm, the complex impedance
values were collected at different frequencies and SoH levels of
the cell. This information came from the EIS analysis carried out
during the parametric check-ups. To create the database,
the impedance of the cell was recorded at different SoCs (100%,
75%, 50%, 25%, 0%) at BoL and every ten days of operation
under the FR cycle, until a loss of capacity (Closs) of about 8% was
reached. The loss of capacity caused by ageing was used as the
indicative parameter of the cell SoH. Nyquist
plots of the impedance recorded for different SoCs at BoL and
five different SoH levels are reported in Figure 2b.

The impedance was recorded in the frequency range from
10 mHz to 10 kHz with ten points per decade, which leads to 61
values for each SoC (six decades at ten points per decade, plus
the end point). Overall, the dataset used for the case study
consists of 1830 impedance measurements. Table 2 contains
some statistical information on the dataset used.
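
The dataset size can be cross-checked with a few lines of arithmetic. In the sketch below the logarithmic spacing of the sweep and the number of check-ups (BoL plus five aged levels) are assumptions; they are consistent with the 61 points per spectrum, the 1830 measurements and the mean frequency of about 797 Hz reported in Table 2.

```python
# Ten points per decade between 10 mHz and 10 kHz (six decades, end points included)
frequencies_hz = [10.0 ** (-2 + k / 10) for k in range(61)]

soc_levels = [100, 75, 50, 25, 0]   # SoC values stated in the text (%)
n_checkups = 6                      # assumption: BoL plus five aged check-ups

n_measurements = len(frequencies_hz) * len(soc_levels) * n_checkups
mean_frequency_hz = sum(frequencies_hz) / len(frequencies_hz)
```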

4. METHODOLOGY 

The above-mentioned dataset was used to test various
classification and regression techniques through Scikit-learn
[33], an open-source library for machine learning developed
in Python. Among them, K-nearest Neighbors (KNN), Linear
Discriminant Analysis (LDA), Gaussian Naive Bayes (GNB),
Support Vector Classification (SVC), Decision Tree (DT), Linear
Regressor, Lasso, Ridge and Random Forest were considered for
performance comparison.

To prevent the results from depending on a particular
partitioning of the data, a cross-validation technique was also used.
In particular, in this phase the original dataset was partitioned
into 5 subsets (folds) used for testing and training. In the case of
regressors, the MAE and the coefficient of determination
(R2) were calculated for each round. Similarly, the
accuracy (ACC) was measured for the classifiers.

The models were then compared on the basis of the average 
values of the aforementioned metrics obtained in the 5 validation 
rounds. The standard deviation (STD) was also determined from 
the same metrics, which provides information on the robustness 
of the model (in fact, lower values of STD generally correspond 
to more robust models). 
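
The validation procedure described above can be sketched with Scikit-learn as follows. The synthetic features and target, the shuffling of the folds and the forest size are assumptions made for the example; only the 5-fold scheme and the MAE/R2 metrics come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate

rng = np.random.default_rng(0)

# Synthetic stand-in for the dataset: two features, target loosely SoC-like (0-100)
X = rng.uniform(size=(200, 2))
y = 100.0 * X[:, 1] + rng.normal(scale=2.0, size=200)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(
    RandomForestRegressor(n_estimators=20, random_state=0),
    X,
    y,
    cv=cv,
    scoring={"mae": "neg_mean_absolute_error", "r2": "r2"},
)

mae = -scores["test_mae"]   # Scikit-learn reports the negated MAE
r2 = scores["test_r2"]

# Mean and standard deviation over the 5 rounds, as used to compare the models
summary = {"MAE": (mae.mean(), mae.std()), "R2": (r2.mean(), r2.std())}
```

For classifiers the same call can be used with scoring="accuracy".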

5. RESULTS 

5.1. Data analysis 

First, correlation coefficients were analysed to investigate 
relationships among impedance measurements and the 
corresponding SoC and Closs values. 

Correlation coefficients of SoC and Closs, specifically
obtained for the FR cycle, are summarised in Table 3 for both
rectangular and polar forms of the impedance. The analysis of
the correlation coefficients shows that the highest correlation
is between the Closs measurements and the real part of the
impedance (Re(Z)), for which a correlation coefficient of 0.471
was obtained. The correlation
coefficient obtained between Closs and the impedance modulus
(Abs(Z)) is only slightly smaller (0.456). The similarity between
these two correlation coefficients suggests that, for the purpose
of Closs modelling, either the modulus or the real part
of the impedance can be used.

In the case of SoC, the strongest correlation is obtained
with the impedance phase values (Arg(Z)), for which a correlation
coefficient of -0.337 was obtained. The rectangular coordinate
values, on the other hand, are only weakly correlated with SoC
(-0.166 for Re(Z) and -0.103 for Im(Z)). Therefore, it can be
assumed that, for SoC modelling, the phase values of the impedance
are the most useful, at least for this set of data. As a consequence,
machine learning algorithms are expected to perform
better with impedance values represented in polar
coordinates rather than rectangular ones.
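
The conversion between the two representations and the correlation analysis can be sketched as below. The five impedance values are hypothetical and were constructed so that only the phase follows SoC; the helper names are likewise illustrative.

```python
import cmath
import math

def to_polar(re_z, im_z):
    """Convert rectangular impedance (Re, Im) to (modulus, phase in radians)."""
    return cmath.polar(complex(re_z, im_z))

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical measurements: constant modulus, phase decreasing linearly with SoC
soc = [0, 25, 50, 75, 100]
phase_rad = [-0.10 - 0.004 * s for s in soc]
z = [cmath.rect(0.08, p) for p in phase_rad]

re_z = [v.real for v in z]
im_z = [v.imag for v in z]
modulus, phase = zip(*(to_polar(a, b) for a, b in zip(re_z, im_z)))

r_phase = pearson(phase, soc)   # close to -1 for this constructed example
modulus_is_constant = max(modulus) - min(modulus) < 1e-12
```

On measured data the coefficients are of course far from this idealised value, as Table 3 shows.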

Table 2. Statistical data of the dataset used.

                   f (Hz)          Re{Z} (Ω)        Im{Z} (Ω)          SoC (%)     C_loss (%)
Range (min-max)    0.01 to 10000   0.041 to 0.207   -0.072 to 0.067    0 to 100    0 to 8.27
Mean               797             0.0762           0.0019             50.00       4.54
STD                1951.6          0.0212           0.0175             35.36       2.69

 
Figure 2. a) Discharge curves for extrapolation of output variables; b) Nyquist 
plot of the impedance used as input variables of the database. 



The above analysis was repeated considering only EIS
impedance data corresponding to frequency values lower than
350 Hz. Hereafter, we will refer to this dataset
as filtered data. Indeed, as also reported in [34], where a similar
lithium cell was used, the most important features induced by
ageing on physico-chemical processes were observed only for the
negative imaginary part of the impedance, which, in our case,
corresponds to the selected filtered frequency range.

Also, it is well known that EIS at moderate and high 
frequency is strongly dependent on the experimental setup and 
cables, thus leading to measurement errors and scattered data 
[35].  

Accordingly, the comparison of correlation coefficients
reported in Table 3 and Table 4 (for original and filtered data,
respectively) reveals a marked improvement in SoC correlation
when only low frequency measurements are considered. In particular,
in the case of filtered data, a correlation coefficient of -0.706 was
obtained between the SoC and the impedance phase, which is
considerably stronger than the value of -0.337 obtained for
the original dataset. As a consequence, machine learning
algorithms are expected to perform better
when trained with the filtered dataset.
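
The filtering step amounts to discarding the points above the cut-off frequency. The sketch below assumes a logarithmic sweep with ten points per decade and uses placeholder impedance values; only the 350 Hz threshold is taken from the text.

```python
F_MAX_HZ = 350.0  # cut-off stated in the text

# Assumed sweep: ten points per decade from 10 mHz to 10 kHz
frequencies_hz = [10.0 ** (-2 + k / 10) for k in range(61)]

# Each record is (frequency, Re(Z), Im(Z)); the impedance values are placeholders
records = [(f, 0.05, -0.01) for f in frequencies_hz]

filtered = [rec for rec in records if rec[0] < F_MAX_HZ]
kept_fraction = len(filtered) / len(records)
```

Under these assumptions 46 of the 61 points of each spectrum are retained, so the filtered dataset is roughly three quarters of the original size.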

5.2. Comparison of machine learning algorithms 

The performance of several machine learning classifiers and
regressors was evaluated and compared. Among them, K-nearest
Neighbors (KNN), Linear Discriminant Analysis (LDA),
Gaussian Naive Bayes (GNB), Support Vector Classification
(SVC) and Decision Tree (DT) were considered as
representative classification algorithms. These
algorithms were compared in terms of the accuracy achieved on both
the original dataset and the filtered dataset, using 5-fold
cross-validation.
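
A comparison of this kind can be set up in a few lines with Scikit-learn. In the sketch below the data are synthetic (five SoC classes with modulus and phase as features) and the fixed random seeds are assumptions added for repeatability; the classifiers are otherwise left at their default settings.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the EIS data: 5 SoC classes, features = (|Z|, Arg(Z))
soc = np.repeat([0, 25, 50, 75, 100], 60)
modulus = 0.08 + rng.normal(scale=0.01, size=soc.size)
phase = -0.10 - 0.004 * soc + rng.normal(scale=0.03, size=soc.size)
X = np.column_stack([modulus, phase])

classifiers = {
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "GNB": GaussianNB(),
    "SVC": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
}

# Mean and standard deviation of the accuracy over 5 folds for each classifier
results = {}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, soc, cv=5, scoring="accuracy")
    results[name] = (acc.mean(), acc.std())
```

The ranking obtained on synthetic data says nothing about the measured dataset; the sketch only shows the mechanics of the comparison.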

Table 5 shows the average values and the standard deviation 
of the accuracy obtained for the above classifiers in the case of 
SoC prediction obtained by training the algorithms with the 
original dataset. 

The use of the polar representation
leads to an improvement in accuracy for all classifiers, with a
relative increase between about 42% (LDA) and 178% (DT).
For both representations (rectangular/polar), the Decision Tree
(DT) exhibits the best performance, reaching an average
accuracy of 0.915 with the polar representation.
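
The relative gains implied by Table 5 can be recomputed directly from the tabulated mean accuracies, as in the short sketch below (the variable names are illustrative).

```python
# Mean accuracies from Table 5: (rectangular, polar)
table5 = {
    "LDA": (0.234, 0.333),
    "GNB": (0.196, 0.370),
    "SVC": (0.192, 0.380),
    "KNN": (0.222, 0.383),
    "DT": (0.329, 0.915),
}

# Relative increase in accuracy when moving from rectangular to polar form
relative_gain_pct = {
    name: 100.0 * (polar - rect) / rect for name, (rect, polar) in table5.items()
}
smallest = min(relative_gain_pct, key=relative_gain_pct.get)
largest = max(relative_gain_pct, key=relative_gain_pct.get)
```

The smallest gain is obtained for LDA (about 42%) and the largest for DT (about 178%).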

This analysis was repeated considering the filtered dataset, i.e. 
removing high frequency points from the original dataset. As can 
be seen from Table 6, filtering improves accuracy of almost all 
classifiers (for the sake of clarity, in Table 6 only results for polar 
coordinates are reported). 

For better comparison, the box plot in Figure 3 shows the
median value (orange line), quartiles, and range of accuracy
values (minimum and maximum values) for the algorithms
trained on the original (Figure 3a) and filtered (Figure 3b)
datasets. From the comparison between Figure 3a and Figure 3b
it can be observed that, for the LDA classifier, the use of filtered
values leads to a marked improvement in performance.
Moreover, in the case of DT, in addition to an increase of the
average accuracy value, there is also a significant reduction of
data dispersion, which is reflected in the lower standard
deviation reported in Table 6 for the filtered data.

Similar considerations apply when the F1 metric [36] is used for
comparison. In fact, as shown in
Figure 4, which reports the macro-averaged F1 score obtained for the
same classifiers (for both filtered and unfiltered datasets),
the DT classifier achieves the best results even when
the macro-averaged F1 metric is considered instead of the accuracy.
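
For reference, both metrics can be derived from a confusion matrix. The sketch below uses a small hypothetical three-class matrix (rows are true classes, columns are predicted classes); it is not the matrix of Figure 5.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical confusion matrix for three classes
cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

acc = accuracy(cm)
f1 = macro_f1(cm)
```

Because the macro average weights every class equally, it penalises a classifier that performs well only on the most frequent classes, which accuracy alone would not reveal.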

Finally, Figure 5 shows the confusion matrix obtained for the
DT classifier in the case of the filtered dataset for an 80/20
split, i.e. with 80% of the data used for training and 20%
for testing. A total of 272 SoC values were tested and only 16 of
them were wrongly classified, thus obtaining an accuracy on the
specific test set of 94.31%. Therefore, the achieved classification
can be effectively used to evaluate the state of charge of the
battery starting from the impedance and, in particular, to predict
when the state of charge is below 50%. It is worth mentioning
that the choice of using classifiers instead of regressors is related
to the specific application. In some cases, in fact, classifiers able
to simply detect discrete values of SoC can be useful for detecting
when specific critical threshold levels have been reached, e.g. the
20% capacity reduction commonly used for automotive
applications.

Table 3. Correlation matrix for impedance measures evaluated on
original/unfiltered data.

         Re{Z} (Ω)   Im{Z} (Ω)   Abs{Z} (Ω)   Arg{Z}
Closs    +0.471      +0.044      +0.456       -0.002
SoC      -0.166      -0.103      -0.170       -0.337

Table 4. Correlation matrix for impedance measures evaluated only on
low-frequency data.

         Re{Z} (Ω)   Im{Z} (Ω)   Abs{Z} (Ω)   Arg{Z}
Closs    +0.477      +0.119      +0.458       -0.001
SoC      -0.213      -0.1239     -0.215       -0.706

Table 5. Accuracy of some classifiers used for modelling the SoC starting from
unfiltered data in rectangular and polar representation.

Classifier   Representation of Z   Mean    STD
LDA          Rectangular           0.234   0.027
LDA          Polar                 0.333   0.045
GNB          Rectangular           0.196   0.020
GNB          Polar                 0.370   0.053
SVC          Rectangular           0.192   0.014
SVC          Polar                 0.380   0.068
KNN          Rectangular           0.222   0.072
KNN          Polar                 0.383   0.088
DT           Rectangular           0.329   0.020
DT           Polar                 0.915   0.047

Table 6. Accuracy of some classifiers used for modelling the SoC in polar
representation for filtered and unfiltered data.

             Original data      Filtered data
Classifier   Mean    STD        Mean    STD
LDA          0.333   0.045      0.602   0.192
GNB          0.370   0.053      0.397   0.027
SVC          0.380   0.068      0.374   0.066
KNN          0.383   0.088      0.392   0.084
DT           0.915   0.047      0.938   0.024

It is worth noting that, in the previous analysis, the default
values of Scikit-learn were used for all classifiers, i.e. all
classifiers were applied without any optimisation. This fact
partially explains why most classifiers exhibit poor performance.
In addition, it is well known that LDA, like other linear
classifiers and regressors such as Ridge and Lasso, is well suited
to linear relationships, whereas the dependence of SoC on the
impedance curves is not linear. Nevertheless, it is generally
worth testing and comparing them because of their lower
computational complexity.

Therefore, a similar analysis was carried out for the Linear, Lasso,
Elastic Net, Ridge, Gradient Boosting, AdaBoost and Random
Forest regressors, with the main difference that, in the case of
regressors, performance was measured in terms of the MAE and the
coefficient of determination (R2).

The regressor with the best performance in terms of both R2 
and MAE was the Random Forest.  
The distributions of the values predicted by the Random Forest
regressor, when trained with the filtered data, are reported in
Figure 6a and Figure 6b for the modelling of SoC and Closs,
respectively. Figure 6 also reports the average values and
standard deviations of R2 and MAE. In particular, in the case of 
SoC, an average value of R2 of 0.98 was achieved (see Figure 6a). 
In comparison to [37], which considered unfiltered data, a 
significant reduction was obtained in the MAE, from 2.65 to 
1.87. In addition, as discussed in the following subsection, the 
use of filtered data leads to models with lower complexity. 

5.3. Analysis of the Random Forest parameters 

Different trade-offs between performance and complexity of
machine learning algorithms can be obtained by properly tuning
the related parameters. In the specific case of the Random Forest,
the most important parameters that affect both performance
and overall complexity are the number of trees (n_estimators)
and the maximum depth of trees (max_depth). Generally,
increasing one or both of these parameters improves performance
at the cost of greater complexity and estimation time.

Table 7 shows the R2 and MAE metrics obtained with 
Random Forest for some combinations of n_estimators and 
max_depth considering the original set, i.e. unfiltered data. It is 
possible to observe that R2 and MAE metrics are most affected 
by the max_depth parameter. In particular, a maximum value of 
R2 equal to 0.97 is achieved by setting max_depth = 30. The use 
of higher values increases computational complexity without 
significant performance advantages.  

As regards the other parameter investigated (i.e., 
n_estimators), with max_depth fixed at 30 there is no substantial 
difference in the values of R2 and MAE when n_estimators is 
increased beyond 100. This analysis leads to the conclusion 
that, in the case of unfiltered data, the values of the Random 
Forest parameters that maximise performance are 
max_depth = 30 and n_estimators = 100, which are the 
parameters used in [37]. 
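
A parameter sweep of this kind can be sketched with scikit-learn [33] as follows. This is only an illustration under stated assumptions: the data are synthetic stand-ins for the impedance features and the SoC target, and the 5-fold cross-validation used to obtain a mean and standard deviation for each metric is an assumed evaluation scheme, not necessarily the one adopted in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in for impedance features and SoC target (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = 50.0 + 40.0 * np.sin(2.0 * np.pi * X[:, 0]) * X[:, 1] + 10.0 * X[:, 2]

# A subset of the (n_estimators, max_depth) pairs listed in Table 7.
grid = [(10, 5), (10, 30), (100, 5), (100, 30)]
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for n_estimators, max_depth in grid:
    model = RandomForestRegressor(
        n_estimators=n_estimators, max_depth=max_depth, random_state=0
    )
    scores = cross_validate(
        model, X, y, cv=cv, scoring=("r2", "neg_mean_absolute_error")
    )
    r2 = scores["test_r2"]
    # scikit-learn reports the negated MAE, so the sign is flipped here.
    mae = -scores["test_neg_mean_absolute_error"]
    results[(n_estimators, max_depth)] = (r2.mean(), r2.std(), mae.mean(), mae.std())

# Print one row per combination in a "mean (std)" layout.
for (n_estimators, max_depth), (r2_m, r2_s, mae_m, mae_s) in results.items():
    print(f"{n_estimators:>5} {max_depth:>3}  "
          f"R2 = {r2_m:.2f} ({r2_s:.2f})  MAE = {mae_m:.2f} ({mae_s:.2f})")
```

The numbers printed by this sketch depend entirely on the synthetic data and are not comparable with those in Table 7.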

Figure 3. Accuracy of machine learning algorithms on the SoC estimation for 
a) unfiltered and b) filtered data. 

Figure 4. F1 metric results for a) unfiltered and b) filtered data. 

Figure 5. Confusion matrix of DT classifier. 

The same analysis was conducted for filtered data, and the 
related results are summarised in Table 8. In this case, better 
results are achieved even with lower values of the parameters. 
For instance, the performance obtained using filtered data with 
max_depth=10 and n_estimators=10 is better than when using 
unfiltered data for max_depth=30 and n_estimators=100. 

Thus, by training the algorithm with filtered data we obtained 
models with better performance and lower complexity. 
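
One way to make the notion of model complexity concrete is to count the nodes of all trees in a fitted forest. The sketch below does this with scikit-learn [33] for the two parameter settings compared above. Note the assumptions: the total node count is used here as a complexity proxy chosen for illustration, the helper `total_nodes` is hypothetical, and the data are synthetic rather than the impedance measurements.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def total_nodes(forest):
    """Total number of nodes over all trees of a fitted forest."""
    return sum(tree.tree_.node_count for tree in forest.estimators_)


# Synthetic stand-in for impedance features and SoC target (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = 50.0 + 40.0 * np.sin(2.0 * np.pi * X[:, 0]) * X[:, 1] + 10.0 * X[:, 2]

# The two settings compared in the text: small forest vs. large forest.
small = RandomForestRegressor(n_estimators=10, max_depth=10, random_state=0).fit(X, y)
large = RandomForestRegressor(n_estimators=100, max_depth=30, random_state=0).fit(X, y)

nodes_small = total_nodes(small)
nodes_large = total_nodes(large)
print(f"n_estimators=10,  max_depth=10: {nodes_small} nodes")
print(f"n_estimators=100, max_depth=30: {nodes_large} nodes")
```

Since every prediction traverses one path in each tree, fewer and shallower trees also reduce estimation time, in addition to the memory needed to store the model.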

6. CONCLUSIONS 

Starting from impedance measurements, different machine 
learning techniques were analysed as predictors of the state of 
charge and the loss of capacity of a lithium battery, subjected to 
a frequency regulation profile for grid applications. According to 
the results, the following conclusions can be drawn: 

- for the training of machine learning techniques, the use 
of impedance values expressed in polar form is to be 
preferred; 

- Decision Trees and Random Forest provided superior 
performance compared to the other machine learning 
techniques analysed; 

- using low frequency data for training Random Forest 
regressor improved performance in terms of R2 and 
MAE for both state of charge and capacity loss 
prediction and largely reduced overall complexity. 
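
For reference, the polar form mentioned in the first conclusion can be obtained from the real and imaginary parts of the impedance as in the minimal sketch below. The helper `to_polar` and the sample values are hypothetical and serve only to illustrate the conversion.

```python
import cmath
import math


def to_polar(z_real, z_imag):
    """Return (magnitude, phase in degrees) of the impedance z_real + j*z_imag."""
    magnitude, phase_rad = cmath.polar(complex(z_real, z_imag))
    return magnitude, math.degrees(phase_rad)


# Hypothetical impedance samples in ohm: (real part, imaginary part).
spectrum = [(0.030, -0.004), (0.034, -0.006), (0.041, -0.003)]

# Each spectrum point becomes a (|Z|, phase) pair usable as a feature pair.
polar_features = [to_polar(re, im) for re, im in spectrum]
for magnitude, phase_deg in polar_features:
    print(f"|Z| = {magnitude:.4f} ohm, phase = {phase_deg:.2f} deg")
```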

ACKNOWLEDGEMENT 

Special thanks to the Italian Ministry of Economic 
Development for funding this activity. 

REFERENCES 

[1] G. Hackeling, Mastering Machine Learning with scikit-learn, 
Packt Publishing, 2014. 

[2] L. Ren, L. Zhao, S. Hong, S. Zhao, H. Wang, L. Zhang, Remaining 
useful life prediction for lithium-ion battery: A deep learning 
approach, IEEE Access 6 (2018), pp. 50587-50598.  
DOI: 10.1109/ACCESS.2018.2858856 

[3]  P. Venugopal, State-of-health estimation of Li-ion batteries in 
electric vehicle using IndRNN under variable load condition, 
Energies 12(22) (2019), art. 4338.  
DOI: 10.3390/en12224338 

[4] P. Khumprom, N. Yodo, A data-driven predictive prognostic 
model for lithium-ion batteries based on a deep learning algorithm, 
Energies 12(4) (2019), art. 660.  
DOI: 10.3390/en12040660 

[5] J. Meng, G. Luo, M. Ricco, M. Swierczynski, D.I. Stroe, R. 
Teodorescu, Overview of lithium-ion battery modeling methods 
for state-of-charge estimation in electrical vehicles, Applied 
Sciences 8(5) (2018), art. 659.  
DOI: 10.3390/app8050659 

[6] C. Lin, A. Tang, W. Wang, A review of SOH estimation methods 
in Lithium-ion batteries for electric vehicle applications, Energy 
Procedia 75 (2015), pp. 1920-1925.  
DOI: 10.1016/j.egypro.2015.07.199 

[7] C. Weng, Y. Cui, J. Sun, H. Peng, On-board state of health 
monitoring of lithium-ion batteries using incremental capacity 
analysis with support vector regression, Journal of Power Sources 
235 (2013), pp. 36-44.  
DOI: 10.1016/j.jpowsour.2013.02.012 

[8] R. R. Richardson, C. R. Birkl, M. A. Osborne, D. A. Howey, 
Gaussian process regression for in situ capacity estimation of 
lithium-ion batteries, IEEE Transactions on Industrial 
Informatics 15(1) (2019), pp. 127-138.  
DOI: 10.1109/TII.2018.2794997 

[9] X. Xu, N. Chen, A state-space-based prognostics model for 
lithium-ion battery degradation, Reliability Engineering and 
System Safety 159 (2017), pp. 47-57.   
DOI: 10.1016/j.ress.2016.10.026 

[10] M. A. Patil, P. Tagade, K. S. Hariharan, S. M. Kolake, T. Song, T. 
Yeo, S. Doo, A novel multistage support vector machine based 
approach for Li ion battery remaining useful life estimation, 
Applied Energy 159 (2015), pp. 285-297.  
DOI: 10.1016/j.apenergy.2015.08.119 

Table 7. R2 and MAE values obtained by Random Forest technique varying 
parameters n_estimators and max_depth (on unfiltered data). 

n_estimators  max_depth  R2           MAE 
10            5          0.79 (0.01)  10.47 (0.29) 
10            10         0.93 (0.01)  4.49 (0.38) 
10            30         0.96 (0.01)  3.32 (0.50) 
10            50         0.96 (0.01)  3.34 (0.26) 
25            30         0.97 (0.01)  3.11 (0.24) 
50            50         0.97 (0.01)  3.02 (0.34) 
100           5          0.80 (0.01)  10.43 (0.24) 
100           10         0.93 (0.01)  4.43 (0.29) 
100           30         0.97 (0.01)  3.02 (0.36) 
100           50         0.97 (0.01)  3.03 (0.34) 
1000          30         0.97 (0.01)  2.99 (0.30) 

 
Table 8. R2 and MAE values obtained by Random Forest technique varying 
parameters n_estimators and max_depth (on filtered data). 

n_estimators  max_depth  R2           MAE 
10            5          0.95 (0.01)  4.62 (0.30) 
10            10         0.98 (0.00)  1.94 (0.22) 
10            30         0.98 (0.00)  1.94 (0.22) 
10            50         0.98 (0.00)  1.84 (0.29) 
25            30         0.98 (0.00)  1.89 (0.15) 
50            50         0.98 (0.00)  1.91 (0.18) 
100           5          0.95 (0.01)  4.50 (0.20) 
100           10         0.98 (0.00)  1.89 (0.13) 
100           30         0.98 (0.00)  1.8 (0.21) 
100           50         0.98 (0.00)  1.87 (0.22) 
1000          30         0.98 (0.00)  1.83 (0.18) 

Figure 6. Random Forest distribution on filtered data for a) SoC and b) 
capacity loss. 


[11] C. Lu, L. Tao, H. Fan, Li-ion battery capacity estimation: A 
geometrical approach, Journal of Power Sources 261 (2014), 
pp. 141-147.  
DOI: 10.1016/j.jpowsour.2014.03.058 

[12] D. I. Stroe, M. Swierczynski, A. I. Stan, V. Knap, R. Teodorescu, 
S. J. Andreasen, Diagnosis of lithium-ion batteries state-of-health 
based on electrochemical impedance spectroscopy technique, 
2014 IEEE Energy Conversion Congress and Exposition 
(ECCE), Pittsburgh, PA, 14-18 September 2014, pp. 4576-4582. 
DOI: 10.1109/ECCE.2014.6954027 

[13] D. Andre, M. Meiler, K. Steiner, C. Wimmer, T. Soczka-Guth, D. 
U. Sauer, Characterization of high-power lithium-ion batteries by 
electrochemical impedance spectroscopy. I. Experimental 
investigation, Journal of Power Sources 196(12) (2011), pp. 5334-
5341.  
DOI: 10.1016/j.jpowsour.2010.12.102 

[14] F. Huet, A review of impedance measurements for determination 
of the state-of-charge or state-of-health of secondary batteries, 
Journal of Power Sources 70(1) (1998), pp. 59-69.  
DOI: 10.1016/S0378-7753(97)02665-7 

[15] I. Masmitjà Rusinyol, J. González, G. Masmitjà, S. Gomáriz, J. del-
Río-Fernández, Power system of the Guanay II AUV, Acta 
IMEKO 4(1) (2015), pp. 35-43.  
DOI: 10.21014/acta_imeko.v4i1.161 

[16] S. Buteau, J. R. Dahn, Analysis of thousands of Electrochemical 
impedance spectra of lithium-ion cells through a machine learning 
inverse model, Journal of the Electrochemical Society 166(8) 
(2019), art. A1611.  
DOI: 10.1149/2.1051908jes 

[17] Y. Zhang, Q. Tang, Y. Zhang, J. Wang, U. Stimming, A. A. Lee, 
Identifying degradation patterns of lithium ion batteries from 
impedance spectroscopy using machine learning, Nature 
Communications 11 (2020), art. 1706.  
DOI: 10.1038/s41467-020-15235-7 

[18] F. Liu, X. Liu, W. Su, H. Lin, H. Chen, M. He, An online state of 
health estimation method based on battery management system 
monitoring data, International Journal of Energy Research 44(8) 
(2020), pp. 6338-6349.  
DOI: 10.1002/er.5351 

[19] P. Singh, R. Vinjamuri, X. Wang, D. Reisner, Design and 
implementation of a fuzzy logic-based state-of-charge meter for 
Li-ion batteries used in portable defibrillators, Journal of Power 
Sources 162(2) (2006), pp. 829-836.  
DOI: 10.1016/j.jpowsour.2005.04.039 

[20] S. B. Sarmah, P. Kalita, A. Garg, X.-d. Niu, X.-W. Zhang, X. Peng, 
D. Bhattacharjee, A review of state of health estimation of energy 
storage systems: Challenges and possible solutions for futuristic 
applications of Li-Ion battery packs in electric vehicles, Journal of 
Electrochemical Energy Conversion and Storage 16(4) (2019), art. 
040801.  
DOI: 10.1115/1.4042987 

[21] A. Nuhic, T. Terzimehic, T. Soczka-Guth, M. Buchholz, K. 
Dietmayer, Health diagnosis and remaining useful life prognostics 
of lithium-ion batteries using data-driven methods, Journal of 
Power Sources 239 (2013), pp. 680-688.  
DOI: 10.1016/j.jpowsour.2012.11.146 

[22] Z. Chen, M. Sun, X. Shu, R. Xiao, J. Shen, Online state of health 
estimation for lithium-ion batteries based on support vector 
machine, Applied Sciences 8(6) (2018), art. 925.  
DOI: 10.3390/app8060925 

[23] V. Klass, M. Behm, G. Lindbergh, A support vector machine-
based state-of-health estimation method for lithium-ion batteries 
under electric vehicle operation, Journal of Power Sources 
270 (2015), pp. 262-272.  
DOI: 10.1016/j.jpowsour.2014.07.116 

[24] J. Meng, L. Cai, G. Luo, D.-I. Stroe, R. Teodorescu, Lithium-ion 
battery state of health estimation with short-term current pulse test 
and support vector machine, Microelectronics Reliability 88-90 
(2018), pp. 1216-1220.  
DOI: 10.1016/j.microrel.2018.07.025 

[25] X. Feng, C. Weng, X. He, X. Han, L. Lu, D. Ren, M. Ouyang, 
Online state-of-health estimation for Li-Ion battery using partial 
charging segment based on support vector machine, IEEE 
Transactions on Vehicular Technology 68(9) (2019), pp. 8583-
8592.  
DOI: 10.1109/TVT.2019.2927120 

[26]  M. Berecibar, I. Gandiaga, I. Villarreal, N. Omar, J. Van Mierlo, 
P. Van den Bossche, Critical review of state of health estimation 
methods of Li-ion batteries for real applications, Renewable and 
Sustainable Energy Reviews 56 (2016), pp. 572-587.  
DOI: 10.1016/j.rser.2015.11.042 

[27]  J. Qu, F. Liu, Y. Ma, J. Fan, A neural-network-based method for 
RUL prediction and SOH monitoring of lithium-ion battery, 
IEEE Access 7 (2019), pp. 87178-87191.  
DOI: 10.1109/ACCESS.2019.2925468 

[28] G. Ma, Y. Zhang, C. Cheng, B. Zhou, P. Hu, Y. Yuan, Remaining 
useful life prediction of lithium-ion batteries based on false nearest 
neighbors and a hybrid neural network, Applied Energy 253 
(2019), art. 113626.  
DOI: 10.1016/j.apenergy.2019.113626 

[29] R. La Rosa, A. Y. S. Pandiyan, C. Trigona, B. Andò, S. Baglio, An 
integrated circuit to null standby using energy provided by MEMS 
sensors, Acta IMEKO 9(4) (2020), pp. 144-150.  
DOI: 10.21014/acta_imeko.v9i4.741 

[30] G. Campobello, D. Dell’Aquila, M. Russo, A. Segreto, Neuro-
genetic programming for multigenre classification of music 
content, Applied Soft Computing 94 (2020), art. 106488. 
DOI: 10.1016/j.asoc.2020.106488 

[31] International Standard IEC 61427-2: Secondary cells and batteries 
for renewable energy storage - General requirements and methods 
of test - Part 2: On-grid applications, 2015. 

[32] S. Ma, M. Jiang, P. Tao, C. Song, J. Wu, J. Wang, T. Deng, W. 
Shang, Temperature effect and thermal impact in lithium-ion 
batteries: A review, Progress in Natural Science: Materials 
International 28(6) (2018), pp. 653-666.  
DOI: 10.1016/j.pnsc.2018.11.002 

[33] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, 
O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, 
Scikit-learn: Machine learning in Python, Journal of Machine 
Learning Research 12 (2011), pp. 2825-2830. Online [Accessed 09 
June 2021]  
http://jmlr.org/papers/v12/pedregosa11a.html 

[34] V. J. Ovejas, Impedance Characterization of an LCO-
NMC/Graphite Cell: Ohmic Conduction, SEI Transport and 
Charge-Transfer Phenomenon, Batteries 4(3) (2018), art. 43. 
DOI: 10.3390/batteries4030043 

[35] T. F. Landinger, G. Schwarzberger, A. Jossen, A novel method for 
high frequency battery impedance measurements, IEEE 
International Symposium on Electromagnetic Compatibility, 
Signal & Power Integrity (EMC+SIPI), New Orleans, LA, USA, 
22-26 July 2019, pp. 106-110.   
DOI: 10.1109/ISEMC.2019.8825315 

[36] M. L. Zhang, Z. H. Zhou, A review on multi-label learning 
algorithms, IEEE Transactions on Knowledge and Data 
Engineering 26(8) (2014), pp. 1819–1837.  
DOI: 10.1109/TKDE.2013.39 

[37] D. Aloisio, G. Campobello, S. G. Leonardi, A. Segreto, 
N. Donato, A machine learning approach for evaluation of battery 
state of health, 24th IMEKO TC4 International Symposium and 
22nd International Workshop on ADC and DAC Modelling and 
Testing, Palermo, Italy, 14-16 September 2020, pp. 129-134. 
Online [Accessed 09 June 2021]  
https://www.imeko.org/publications/tc4-2020/IMEKO-TC4-2020-25.pdf 

 