BLACK BOX EFFICIENCY MODELLING OF AN ELECTRIC DRIVE UNIT UTILIZING METHODS OF MACHINE LEARNING

The increasing electrification of powertrains places greater demands on the test technology used to ensure the required functions. For conventional test rigs in particular, knowledge of the test technology's capabilities is needed before it can be applied in practical testing. Modelling provides early knowledge of the test rig's dynamic capabilities and of the feasibility of planned testing scenarios. This paper describes the experimental modelling of complex subsystems with artificial neural networks, taking transmission efficiency as an example. The experimental design and execution used for data generation are described. The generated data is pre-processed with suitable methods and prepared for the neural networks. Modelling is carried out with different variants of the inputs as well as different training algorithms, and the variants are compared with each other. The most suitable variant is validated using statistical methods and other adequate techniques. The result represents reality well and enables a realistic investigation of the performance of the test systems.


INTRODUCTION
The steadily advancing climate change requires a reduction of greenhouse gases. With annual emissions of 160,000 kilotons of CO2 equivalent, the transportation sector in Germany has great potential for savings (German Environment Agency, 2020). As a result, there is a demand for and promotion of climate-friendly solutions for mobility. The electrification of vehicles is a key tool for the reduction of CO2 emissions (Hoekstra, 2019). Currently, a corresponding increase in electric drives is discernible. Still, the potential for climate-neutral transportation also brings challenges in vehicle technology. The functionality of electric powertrains must be validated on test rigs, and the increased dynamics of electric powertrains must be considered. Knowledge of the physical limits of the test rig, such as its dynamics, is not always available. Especially with conventional test rigs, the required dynamics may exceed the available ones. In the best case, the planned test setup can then only be carried out after an adaptation; in the worst case, it cannot be carried out at all.
The use of testing technology models enables an early statement about the feasibility of the tests. In doing so, the test rig can be modelled in its entirety. The model includes the electrics, the mechanics as well as the control system. By means of white-box modelling, the model is represented theoretically and in idealized form. Complex processes, such as friction, can hardly be represented by theoretical modelling. They can be represented using the conventional approaches of experimental modelling, such as the description by characteristic maps and curves. The calculation is performed using measured data at specified operating points. Between the measured operating points, the data is interpolated. Patterns in the measured data are therefore captured only to a limited extent. Due to the necessary intermediate steps for the calculation, modelling with maps also has a relatively low computational efficiency.
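The conventional map-based approach mentioned above can be illustrated with a minimal sketch of bilinear interpolation between measured operating points. The function name `interpolate_map`, the grid axes and the efficiency values are hypothetical and serve only as an illustration; they are not taken from the test rig.

```python
from bisect import bisect_right


def interpolate_map(speeds, torques, eta_grid, n, m):
    """Bilinear interpolation in a characteristic map.

    speeds, torques: ascending grid axes of the measured operating points.
    eta_grid[i][j]:  efficiency measured at (speeds[i], torques[j]).
    n, m:            query speed and torque (must lie inside the grid).
    """
    def bracket(axis, x):
        # Find the grid cell that contains x and the relative position in it.
        if not axis[0] <= x <= axis[-1]:
            raise ValueError("query lies outside the measured map")
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        t = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, t

    i, ts = bracket(speeds, n)
    j, tt = bracket(torques, m)
    # Interpolate along the torque axis first, then along the speed axis.
    low = eta_grid[i][j] * (1 - tt) + eta_grid[i][j + 1] * tt
    high = eta_grid[i + 1][j] * (1 - tt) + eta_grid[i + 1][j + 1] * tt
    return low * (1 - ts) + high * ts


# Hypothetical 2 x 2 map, values chosen for illustration only.
example_eta = interpolate_map([0.0, 1000.0], [0.0, 100.0],
                              [[0.5, 0.7], [0.6, 0.8]], 500.0, 50.0)
```

The lookup and the two interpolation stages are the intermediate steps that a single trained function avoids.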
By using innovative approaches for modelling, the available information content of the measured data as well as the computational efficiency can be increased (Bauer, Beck, Stütz & Kley, 2021). The aim of this work is to provide an experimental model of the efficiency of an electrical machine. The result is a function that continuously describes the efficiency based on the mechanical inputs speed and torque. For this purpose, the mechanical, electrical and thermal influences on the system are investigated in detail. The procedure for modelling the function is described in detail. Suitable experiments are planned and carried out to generate the data. The measured data is pre-processed by suitable methods. The model is built using artificial neural networks (ANN). To optimize the model, different inputs as well as different training algorithms are compared. The result is validated by different approaches, such as a statistical evaluation of the error.

STATE OF THE ART
The product creation process is commonly described with the help of the V-Model (see Fig. 1). Simulations can already be carried out during component specification. These enable tests without a physical Device Under Test (DUT) (Stütz, Bauer & Kley, 2019). As the product development process advances, the number of available physical DUTs increases. This enables an increasing share of real testing during component integration. As the number of available DUTs increases, the scope of the tested systems grows, for example from individual components to the complete powertrain. Finally, the complete vehicle is tested on the road. Any necessary changes are implemented iteratively in optimisation loops (Dohmen, Pfeiffer & Schyr, 2009). A minimum number of kilometres must be driven to fully validate product features. Typical values for the example of autonomous driving range from one million kilometres for the validation of a function to one billion kilometres for the complete validation of the system. Safeguarding on this scale by driving on the road is hardly feasible in the available development time. Testing on the test rig is therefore a procedure of increasing relevance (Beine & Rasche, 2018).
Early validation on the test rig or through simulation also keeps product development costs low (Albers, Behrendt, Klingler & Matros, 2016).

Efficiency Mapping
Test rig experiments for efficiency testing provide knowledge about the usable energy share of a system. During power difference measurement, the input power Pin and the output power Pout are determined. From this, the efficiency η is derived according to formula (1):
$$\eta = \frac{P_{out}}{P_{in}} \tag{1}$$
The power of the drive unit can be determined, for example, by torque Mout and speed nout sensors on the output shaft or current I and voltage U sensors in the armature of the electric machine. With the power difference measurement, the efficiency in the motor operation can be calculated according to formula (2):
$$\eta = \frac{2\pi \, n_{out} \, M_{out}}{U \, I} \tag{2}$$
where Mout is the output torque, nout the output speed (in revolutions per second, so that 2π·nout is the angular velocity), U the armature voltage and I the armature current.
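The power difference measurement can be sketched in a few lines. The sketch assumes that the speed is given in revolutions per minute and is converted to an angular velocity; the unit and the function names are assumptions made for illustration and are not stated in the text.

```python
import math


def mechanical_power(torque_nm, speed_rpm):
    """Output power in W from torque in Nm and speed in rpm (unit assumed)."""
    angular_velocity = 2.0 * math.pi * speed_rpm / 60.0  # rad/s
    return torque_nm * angular_velocity


def electrical_power(voltage_v, current_a):
    """Input power in W from armature voltage in V and armature current in A."""
    return voltage_v * current_a


def motor_efficiency(torque_nm, speed_rpm, voltage_v, current_a):
    """Efficiency in motor operation: mechanical output over electrical input."""
    p_in = electrical_power(voltage_v, current_a)
    if p_in <= 0.0:
        raise ValueError("motor operation requires a positive input power")
    return mechanical_power(torque_nm, speed_rpm) / p_in


# Hypothetical operating point, values chosen for illustration only.
eta_example = motor_efficiency(100.0, 3000.0, 400.0, 90.0)
```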

Modelling
Currently, there is also an increasing trend to shift testing assignments from road and test rig applications to simulation software. Nevertheless, it can be assumed that physical testing will be necessary for the foreseeable future to ensure the required product properties (Dismon, 2017). Physical testing is not to be seen as a competitor to simulation, but as an extension to be used synergetically in order to achieve optimized results (Guggenmos, Rückert, Thalmair & Wagner, 2015). Willmerding & Häckh (2017), for example, describe the combination of vehicle simulation and test rig control for mapping highly dynamic driving cycles on test rigs. The winEVA tool used there enables more realistic results through driving-maneuver-based test scenarios. The validation on the test rig can be optimized by using such numerical tools.
The simulation of scenarios on the computer requires models that describe the planned or real system. Isermann (2008) distinguishes in principle between theoretical and experimental modelling. The main kinds of modelling as well as the intermediate grey-box model are displayed in Fig. 2.

Fig. 2. Modelling concepts
Theoretical modelling of white-box models requires comprehensive knowledge of the system to be modelled. The overall system is usually subdivided into smaller subsystems. The processes and relationships are described physically. The theoretical modelling for the efficiency of a transmission using a simulation is described, for example, by Li et al. (2014) and Ratov & Lyfar (2020).
In experimental modelling of black-box models, also referred to as parameter identification, the processes in the system are not known. The system is described exclusively by its input and output variables and its transfer function. There are various ways to establish the relationship between the variables, such as the classical methods of using characteristic diagrams or calculating polynomials. In addition to the classical methods, modelling with artificial neural networks is increasingly used. Machrowska et al. (2020) compare mathematical modelling with polynomials to modelling with ANNs. The ANNs lead to superior results. Creating a transfer function that recognizes and depicts correlations between input and output signals is one of the basic ideas of ANNs. Therefore, they are suitable for the modelling of complex systems. There is a wide range of possible applications in modelling. Khan et al. (2020) describe the creation of an efficiency function of an electrical machine by means of ANNs. The training data originates from numerical calculations. Çelik et al. (2017) describe the creation of an efficiency and power function of an electric machine. The training data results from measurements. Yadav & Yadava (2017) describe the modelling of an ANN for an EAHM process. A parameter study was carried out for optimization. Sufficiently good results were achieved with the scaled-conjugate-gradient algorithm. Payal et al. (2013) compare the Levenberg-Marquardt algorithm and the Bayesian-Regularization algorithm. The algorithms are applied to efficient localization in wireless sensor networks. It is concluded that the Bayesian-Regularization algorithm produces more accurate results than the Levenberg-Marquardt algorithm at the expense of a higher training time. Based on this, Jazayeri et al. (2016) compare the Levenberg-Marquardt algorithm and the Bayesian-Regularization algorithm for power estimation of photovoltaic modules. The better accuracy of the Bayesian-Regularization algorithm at the expense of a higher training time is confirmed for that use case as well. In conclusion, both papers recommend the Levenberg-Marquardt algorithm for time-critical applications and the Bayesian-Regularization algorithm otherwise. Building on this, this paper investigates the suitability of the described algorithms for modelling the efficiency.

PARAMETER IDENTIFICATION
The investigation is carried out using a drive test rig with a power rating of 300 kW. The test rig consists of a prime mover, which simulates the vehicle drive controlled by its speed and a load machine, which simulates the driving resistances controlled by its torque. The electrical machines are externally excited DC motors. In order to achieve a sufficiently high speed, transmissions are connected to the motors. The input uses a non-switchable planetary gear with a ratio of 3.2. The output uses a switchable planetary gear with ratios i1 = 1 and i2 = 3.47. Drive train components such as transmissions can be connected and tested between the two motors.
As a basis for the project, a digital twin of the described test rig was developed by . This primarily represents the dynamic behavior of the given test rig. This allows planned test scenarios to be examined in advance for their feasibility and optimized for the given test technology. Up to now, the modelling has been done as a white-box model. This is an ideal, loss-free representation of the system. Factors such as power loss are not considered. For a more realistic simulation of the real system, the aim is to model the complex efficiency. This is to be done by functions from ANNs. Training data that can be processed as the basis of the ANNs are necessary for model building. For this purpose, an experimental design is carried out and measurement data is recorded through experiments. The resulting raw data is preprocessed for training the ANN. With the help of existing algorithms, the networks are trained on the given data and can be integrated into simulations. The results are validated in particular by comparing them with the real system. The procedure is described in detail below.

Experimental design and data generation
To generate data, appropriate tests are run on the real test rig. Due to its inferior performance, the investigation is limited to the drive for the time being. For data generation, the two machines are connected directly via a constant velocity drive shaft and operated without a DUT. This avoids disturbing influences of the DUT.
For data acquisition, a large number of measured values are recorded at different measuring points. These are differentiated into mechanical, electrical and thermal.
The distribution of the measuring points T1, T2, T3 and T4 for the temperature is based on infrared images of the machine in operation. This allows points to be identified which react particularly quickly to the heating of the machine. The test setup with the measuring points and the comparison with the infrared images is shown in Fig. 3.

Fig. 3. Schematic illustration of the measurement locations and IR-illustration of the heating after: a) 0 min, b) 50 min, c) 100 min and d) 150 min
The aim of the experiments is to generate data in which an ANN can recognize correlations with the efficiency. These correlations are to be described by a function from a trained network. The collective used for this consists of quasi-randomly distributed points in the machine's characteristic diagram. This is to ensure that no patterns are imposed on the network and that it recognizes any existing patterns itself. In addition, the measuring points are approached in random order to take the temperature influence into account. As a result, the machine warms up independently of the operating point. The principle is shown in Fig. 4. The start-up time of the individual operating points is determined as a function of the speed difference. A maximum speed ramp is specified. A time buffer of 5 seconds is also added. After start-up, the points are held for 30 seconds. This results in a basic division of the measurement data into the dynamic start-up and the static holding of the points. The collective is run for about five hours. At a measurement frequency of 10 Hz, about 190,000 data samples are recorded. A correlation analysis between the efficiency and each individual temperature channel reveals no significant influence of the temperature on the overall efficiency. It can be assumed that as the temperature increases, the transmission efficiency increases, but the motor efficiency decreases. It is assumed that a superposition of the individual efficiency changes keeps the overall influence of the temperature low.
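The construction of such a collective can be sketched as follows. The sketch makes several assumptions that are not taken from the text: a Halton sequence serves as the quasi-random generator, the start-up time is computed as the speed difference divided by the maximum ramp plus the buffer, and the limits passed in the example are hypothetical.

```python
import random


def halton(index, base):
    """Radical inverse of index in the given base (Halton sequence element)."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def build_schedule(n_points, n_max, m_max, max_ramp,
                   buffer_s=5.0, hold_s=30.0, seed=0):
    """Quasi-random operating points, visited in random order.

    max_ramp is the permitted speed change per second (assumed form of the
    ramp limit); buffer_s and hold_s follow the 5 s and 30 s from the text.
    """
    points = [(halton(i, 2) * n_max, halton(i, 3) * m_max)
              for i in range(1, n_points + 1)]
    random.Random(seed).shuffle(points)  # random order decouples temperature

    schedule, previous_speed = [], 0.0
    for speed, torque in points:
        startup_s = abs(speed - previous_speed) / max_ramp + buffer_s
        schedule.append({"speed": speed, "torque": torque,
                         "startup_s": startup_s, "hold_s": hold_s})
        previous_speed = speed
    return schedule


# Hypothetical limits: 3000 rpm, 500 Nm, ramp of 200 rpm per second.
schedule = build_schedule(50, 3000.0, 500.0, 200.0)
total_time_s = sum(p["startup_s"] + p["hold_s"] for p in schedule)
```

Summing start-up and holding times gives the expected test duration, which can be checked against the available rig time before the run.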

Data pre-processing
For data pre-processing, relevant measured values are selected in advance for the input. These are in particular mechanical values such as the speed and torque. In addition to the main values, the influence of derived values is also examined. These are in particular the torque gradient and the speed gradient.
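The derived inputs can be obtained from the sampled signals by numerical differentiation. The following minimal sketch assumes central finite differences and the measurement frequency of 10 Hz; the text does not state which differentiation scheme was used, and the function name is hypothetical.

```python
def gradient(signal, sample_rate_hz=10.0):
    """Time derivative of a uniformly sampled signal.

    Central differences are used inside the signal and one-sided
    differences at both ends (assumed scheme).
    """
    if len(signal) < 2:
        raise ValueError("at least two samples are required")
    dt = 1.0 / sample_rate_hz
    grad = [(signal[1] - signal[0]) / dt]
    for k in range(1, len(signal) - 1):
        grad.append((signal[k + 1] - signal[k - 1]) / (2.0 * dt))
    grad.append((signal[-1] - signal[-2]) / dt)
    return grad


# Hypothetical speed ramp: 10 rpm per sample at 10 Hz gives 100 rpm/s.
speed_gradient = gradient([0.0, 10.0, 20.0, 30.0])
```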
The efficiency of the machine is used for the ANN output. This is determined on the basis of the measured values. The measurement data is limited to motor forward operation. It can be assumed that the results can be mirrored to other operating modes.
Some of the data contain information that apparently has no plausible information value. For example, in the area of the dynamic start-up of the measuring points, there is a strong fluctuation of the current around the expected value with a constant period of T = 0.5 s (see Fig. 5).

Fig. 5. Fluctuations in the current and raw and filtered current signal
The fluctuations are transferred to all values derived from the current, but not to the mechanical values. Detailed investigations into the cause of the discrepancy, such as an analysis of the frequency response of the mechanical oscillations, did not yield any usable outcome. To avoid a resulting degradation of the results, the signal is smoothed with a filter. According to a correlation analysis, a moving average with a window of 3*T = 1.5 s provides the best results. In operation without field weakening, the resulting values run approximately proportional to the measured torque.
The smoothing in the current is transferred to all values derived from it. The resulting signals, for example the target signal efficiency, are evaluated as plausible.
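A moving average of this kind can be sketched as follows. At 10 Hz, a window of 1.5 s corresponds to 15 samples. The centred window and the shortened window at the signal edges are assumptions; the text only specifies the window length.

```python
def moving_average(signal, window_s=1.5, sample_rate_hz=10.0):
    """Centred moving average; the window shrinks at the signal edges."""
    window = max(1, round(window_s * sample_rate_hz))  # 15 samples here
    half = window // 2
    smoothed = []
    for k in range(len(signal)):
        lo = max(0, k - half)
        hi = min(len(signal), k + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical current signal: constant 100 A plus a ripple that repeats
# every 5 samples, i.e. with a period of 0.5 s at 10 Hz.
ripple = [0.0, 10.0, 5.0, -5.0, -10.0]
raw_current = [100.0 + ripple[k % 5] for k in range(100)]
filtered_current = moving_average(raw_current)
```

Because the window spans exactly three periods of the fluctuation, the periodic component averages out away from the signal edges, which is consistent with choosing a window that is a whole multiple of T.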
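To avoid an unequal effect of the inputs, the selected data for input and output are normalized by their maximum values. Thus, all values lie in the numerical range [0,1]. The normalization is performed according to formulas (4) to (7):
$$n_{norm} = \frac{n}{n_{max}} \tag{4}$$
$$M_{norm} = \frac{M}{M_{max}} \tag{5}$$
$$U_{norm} = \frac{U}{U_{max}} \tag{6}$$
$$I_{norm} = \frac{I}{I_{max}} \tag{7}$$
where n_norm is the normalized rotational speed, M_norm the normalized torque, U_norm the normalized voltage and I_norm the normalized current.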
The output signal efficiency cannot exceed the range [0,1] by definition. A normalization is therefore not needed.
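The normalization by the maximum value and its inversion can be sketched in a few lines. The function names and the sample values are hypothetical; the sketch assumes non-negative signals, as is the case for motor forward operation.

```python
def normalize_by_max(values):
    """Scale a non-negative signal to [0, 1] by its maximum value."""
    peak = max(values)
    if peak <= 0.0:
        raise ValueError("signal needs a positive maximum")
    return [v / peak for v in values], peak


def denormalize(normalized, peak):
    """Convert normalized values back by multiplying by the stored maximum."""
    return [v * peak for v in normalized]


# Hypothetical speed samples in rpm.
speed_norm, speed_max = normalize_by_max([0.0, 1500.0, 3000.0])
speed_restored = denormalize(speed_norm, speed_max)
```

The maximum has to be stored together with the trained network, since the same scaling must be applied to every later input.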

Training of the artificial neural network
The network is a fully-connected multilayer perceptron. It consists of an input layer, two hidden layers and an output layer. The input layer has four neurons. The first hidden layer has 6 neurons, the second hidden layer has 4 neurons. The number of neurons in the hidden layers was determined empirically. The output layer has one neuron. It provides the output in the form of the efficiency. The network is shown in Fig. 6. The calculations, training and testing are done using Matlab. The training data is automatically and randomly divided into training data (70%), validation data (15%) and testing data (15%). The validation data is used to measure network generalization and halt the training when generalization stops improving while the testing data is used to measure the network performance during and after the training (The MathWorks, 2020).
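The described topology and data division can be illustrated with a minimal sketch in plain Python. It is not the Matlab implementation used in this work: the tanh activation in the hidden layers, the linear output neuron and the random initial weights are assumptions, and no training is performed.

```python
import math
import random

LAYER_SIZES = [4, 6, 4, 1]  # input, hidden 1, hidden 2, output


def init_network(sizes, seed=0):
    """Random weights and biases for a fully connected perceptron."""
    rng = random.Random(seed)
    return [{"w": [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)],
             "b": [rng.uniform(-1.0, 1.0) for _ in range(n_out)]}
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(network, inputs):
    """Forward pass: tanh in hidden layers, linear output (assumed)."""
    activation = list(inputs)
    for idx, layer in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(layer["w"], layer["b"])]
        is_output = idx == len(network) - 1
        activation = z if is_output else [math.tanh(v) for v in z]
    return activation[0]


def count_parameters(network):
    """Number of trainable weights and biases."""
    return sum(len(layer["b"]) + sum(len(row) for row in layer["w"])
               for layer in network)


def split_indices(n_samples, seed=0):
    """Random division into 70 % training, 15 % validation, 15 % testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 70 // 100
    n_val = n_samples * 15 // 100
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


network = init_network(LAYER_SIZES)
# Inputs: normalized speed, torque, speed gradient, torque gradient.
eta_untrained = forward(network, [0.5, 0.5, 0.0, 0.0])
train_idx, val_idx, test_idx = split_indices(1000)
```

With this topology the network has 63 trainable parameters, which is small compared with the number of recorded samples.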
Different versions of the network are created. The distinction is made in two categories. For the first category, the inputs are varied to determine the most suitable combination. In the second category, the training algorithms are varied. This is to determine the most suitable algorithm for the application.
The input variations are different combinations of speed, torque as well as their corresponding gradients. The variants have the following inputs:
- Variant 1: speed, torque.
- Variant 2: speed, torque, speed gradient.
- Variant 3: speed, torque, torque gradient.
- Variant 4: speed, torque, speed gradient, torque gradient.
The Levenberg-Marquardt algorithm, which is the default setting and is frequently recommended, is used to determine the most suitable input variant (Jazayeri, Jazayeri & Uysal, 2016; Payal, Rai & Reddy, 2013). The training is partly random and therefore does not deliver uniform results. For a meaningful outcome, the training is performed 20 times per input variant. The characteristic values are presented in Fig. 7.

Fig. 7. Performance metrics with varying inputs
Analogous to Jazayeri et al. (2016), the resulting functions are evaluated on the basis of the performance metrics achieved. A relatively small deviation can be seen in the data. In terms of MSE performance, variant 3 and variant 4 achieve the lowest and thus the best values. For the R² value, variant 4 and variant 2 achieve the highest and thus the best values. Thus, variant 4 shows very good values in the essential criteria for the quality of results.
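The two performance metrics and their aggregation over repeated training runs can be sketched as follows. The sketch assumes the common definition of R² as the coefficient of determination; the text does not state which definition the evaluation uses, and all sample values are hypothetical.

```python
import statistics


def mean_squared_error(targets, predictions):
    """Mean of the squared deviations between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def r_squared(targets, predictions):
    """Coefficient of determination (assumed definition of the R² value)."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


def summarize_runs(metric_values):
    """Mean and sample standard deviation of a metric over repeated runs."""
    return statistics.fmean(metric_values), statistics.stdev(metric_values)


# Hypothetical efficiencies: measured targets and network predictions.
targets = [0.80, 0.85, 0.90]
predictions = [0.82, 0.85, 0.88]
mse_example = mean_squared_error(targets, predictions)
r2_example = r_squared(targets, predictions)
```

Reporting the spread over the 20 runs alongside the mean makes it possible to judge whether a difference between two variants exceeds the run-to-run scatter.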
A summary of further averaged parameters is shown in Tab. 1. The values highlighted in green are the best. The values highlighted in red are the worst. In addition to the quality of the results, a good calculation time is recognizable for variant 4. Therefore, variant 4 is chosen as standard input for the further process.

Tab. 1. Performance metrics of varying inputs
A suitable training algorithm for the ANNs is determined by comparing different approaches. The following algorithms are used:
- Levenberg-Marquardt algorithm (LM),
- Bayesian regularization algorithm (BR),
- Scaled-conjugate-gradient algorithm (SCG).
Each algorithm is trained 20 times. The results are compared with each other based on their performance metrics. The characteristic values are presented in Fig. 8. It can be seen that the BR algorithm gives the best results. The LM algorithm leads to slightly inferior results. The SCG algorithm leads to the worst results. Furthermore, it has a comparatively high dispersion.

Tab. 2. Performance metrics of varying algorithms
It is observed that the BR algorithm is superior to the competing algorithms in terms of performance. Regarding the R² value, the LM algorithm is equivalent. In terms of computation time and number of training epochs, the LM algorithm is superior. The SCG algorithm is inferior regarding the performance and the R² value. Its computation time and number of training epochs are only intermediate.
From the investigation the recommendation can be derived to use the BR algorithm for high demands on the result quality. For time-critical applications, the use of the LM algorithm is recommended. No recommendation can be derived for the SCG algorithm.

RESULTS AND VALIDATION
The results are displayed using an ANN-based efficiency map (see Fig. 9). By multiplying by the original maximum values, the normalized values can be converted back to the initial values. A high efficiency gradient can be seen in the lower speed and torque areas. Due to the relatively low resolution of the measurement data in this range, the information content is not sufficient to describe the efficiency there. Accordingly, the areas from 0 % to 5 % of the torque and speed are not mapped.
Figure 10 shows a practical way of validation. For this purpose, Fig. 10 a shows the measured data in a specified time window. Fig. 10 b compares the time signal of the calculated and the measured efficiency. A high level of agreement is evident. However, the calculated value shows a smoother course than the measured value. This is attributable to residual oscillations in the measured current signal. For a quantitative validation of the calculated values, it is advisable to define a performance indicator. Therefore, the value xη is introduced. It is calculated as the quotient of the efficiency from the measured data ηCalc and the efficiency from the ANN ηNN according to formula (8):
$$x_\eta = \frac{\eta_{Calc}}{\eta_{NN}} \tag{8}$$
The percentage deviation between the efficiencies can be derived from the indicator. A value of xη = 1 describes a perfectly fitted efficiency. A value of xη < 1 means that the efficiency calculated by the ANN is too high. A value of xη > 1 means that the efficiency calculated by the ANN is too low.
A representation of the indicator in the form of a histogram is shown in Fig. 11. It is evident that the calculated values are close to the measured values. For a numerical statement, a statistical investigation is carried out. For this purpose, key parameters of the distribution, such as the standard deviation and the mean value, are calculated. The standard deviation is σ = 3.86 %, so that two standard deviations correspond to 2*σ = 7.72 %. About 90 % of the determined values for xη lie within one standard deviation and about 95 % within two standard deviations. Thus, the dispersion of the values is classified as sufficiently low.
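The statistical evaluation of the indicator can be sketched as follows. The sketch assumes that the intervals are centred on the mean value of xη and that the population standard deviation is used; neither choice is stated in the text, and the efficiency values in the example are hypothetical.

```python
import statistics


def indicator(eta_calc, eta_nn):
    """Quotient of the efficiency from measured data and from the ANN."""
    return [calc / nn for calc, nn in zip(eta_calc, eta_nn)]


def dispersion_summary(x_eta):
    """Mean, standard deviation and share of values within 1 and 2 sigma."""
    mean = statistics.fmean(x_eta)
    sigma = statistics.pstdev(x_eta)  # population form (assumed)

    def share_within(k):
        # Fraction of values whose distance from the mean is at most k*sigma.
        return sum(abs(x - mean) <= k * sigma for x in x_eta) / len(x_eta)

    return {"mean": mean, "sigma": sigma,
            "within_1_sigma": share_within(1),
            "within_2_sigma": share_within(2)}


# Hypothetical efficiencies from measurement and from the network.
x_eta = indicator([0.80, 0.90, 0.85, 0.88], [0.80, 0.90, 0.85, 0.80])
summary = dispersion_summary(x_eta)
```

For a normal distribution, about 68 % of the values would lie within one standard deviation, so a share of about 90 % points to a distribution that is more concentrated around its centre, with a few larger deviations.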

CONCLUSION
With the use of ANNs, the efficiency of a drive unit consisting of a DC motor and a planetary transmission could be described mathematically. The resulting function enables a variety of applications, such as the observation of the efficiency curve in real and simulated data or the determination and representation of characteristic efficiency maps. The application to preexisting models by integration as a subsystem enables an approximation of the models to reality. The informative value of the prediction increases. The time-efficient calculation by the determined function enables the integration into real-time simulation applications. The application to real measurement data enables the early detection of optimal operating points. This allows the operating strategy to be optimized.
The modelling approach specifically for efficiency can be transferred to other systems, such as separately considered transmissions, without a drive unit. The experimental modelling by ANNs can be transferred to other subsystems, such as the control system, in addition to the simulation of the efficiency.
The comprehensive description of how the function is created also covers the design of experiments, data generation and data pre-processing. In particular, effective and efficient approaches to solving problems in data pre-processing are presented.
The limitations of the approach are, for example, the inability to reliably extrapolate outside the measured data. In addition, the internal processes are not known due to the use of the black-box model.
Potential for further optimization was identified. For example, the quality of the training data can be improved by refining the experimental design, in particular by an intelligent distribution of the measurement point density analogous to Martini et al. (2003).