3.1. Data Acquisition and Processing
The energy consumption of a ship is affected by a series of parameters. These mainly describe the ship's operating status, such as speed, heading, shaft power, and shaft speed, together with the voyage environment. During the voyage, the data acquisition system uses sensors to capture the corresponding data, which are then sent to and stored in the onboard database and the shore-based database. The onboard energy management system and the shore-based management system can display and monitor this information in real time for energy management during the voyage. The data acquisition process is shown in Figure 3.
Data from two voyages were selected for this study. Case 1 uses data from the route from Caofeidian to St. Louis. Case 2 uses data from the ship's voyage from 1 August 2016 to 31 August 2016. The bulk carrier studied in this paper and the routes are shown schematically in Figure 4 and Figure 5, respectively.
Table 2 details the ship's parameters, engine specifications, and navigation parameters. During the voyage, data on the sailing speed, heading, shaft power, shaft rotational speed, and fuel consumption of the ship's main engine were collected using onboard sensors. The instruments used to acquire the data are shown in Table 3. The shaft power sensor is mounted on the main shaft to obtain the ship's shaft power and shaft speed. The sailing position and speed are obtained from the GPS and the speed log on the bridge, and the fuel consumption is obtained from the fuel consumption sensor in the fuel lines. Additionally, the navigational environment information was obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF), and the real wind speed and real wind direction were derived through vector synthesis. Throughout both voyages, the ship was fully loaded and carried no ballast water.
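The vector synthesis step can be illustrated with a minimal sketch. It assumes that the ECMWF inputs are the 10 m eastward (u) and northward (v) wind components and that the wind direction follows the meteorological convention (the bearing the wind blows from, clockwise from north). Neither the convention nor the function name `wind_from_components` is taken from this paper.

```python
import numpy as np

def wind_from_components(u10, v10):
    """Combine eastward (u) and northward (v) 10 m wind components.

    Returns (speed, direction_deg). The direction is the bearing the wind
    blows FROM, clockwise from north (assumed meteorological convention).
    """
    u10 = np.asarray(u10, dtype=float)
    v10 = np.asarray(v10, dtype=float)
    speed = np.hypot(u10, v10)  # magnitude of the wind vector
    # Negating both components turns "blowing towards" into "coming from".
    direction_deg = np.degrees(np.arctan2(-u10, -v10)) % 360.0
    return speed, direction_deg

# Illustrative values only, not measurements from the voyages.
speed, direction = wind_from_components([3.0, 0.0, 5.0], [4.0, -5.0, 0.0])
```

A wind with a purely southward component (u = 0, v < 0) comes from the north under this convention, which is a quick sanity check on the sign handling.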
By establishing a predictive model for ship fuel consumption, the relationship between fuel usage and its influencing factors can be examined, enabling effective prediction and evaluation of fuel usage. Because the collected fuel consumption data and the navigational environment data obtained from the meteorological center have different time scales, data preprocessing is necessary. First, the data collected every 10 min starting from 00:00 each day were converted into hourly ship fuel consumption data. Concurrently, the meteorological and sea state data, based on GPS and ECMWF data, were aligned in frequency with the data collected on board using a cubic B-spline interpolation algorithm. Additionally, the collected data were cleaned to address outliers and noise. This included handling missing data and anomalies caused by abnormal navigational environments, so that they do not degrade the accuracy of the predictive model.
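The two resampling steps can be sketched as follows, reading the B-spline as a cubic (degree 3) one. This is an illustration under stated assumptions, not the implementation used in the study: the hourly value is taken as the mean of six 10-min samples, the weather series is assumed to arrive every 6 h, all data are synthetic, and the helper names `to_hourly` and `resample_cubic_bspline` are hypothetical.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

def to_hourly(samples_10min):
    """Average consecutive groups of six 10-minute samples into hourly values."""
    samples = np.asarray(samples_10min, dtype=float)
    n_hours = samples.size // 6  # a trailing incomplete hour is dropped
    return samples[: n_hours * 6].reshape(n_hours, 6).mean(axis=1)

def resample_cubic_bspline(t_src, values, t_target):
    """Interpolate values given at t_src onto t_target with a cubic B-spline."""
    spline = make_interp_spline(t_src, values, k=3)  # k=3 selects a cubic spline
    return spline(t_target)

# Synthetic fuel readings: 24 h of 10-minute samples (assumed values).
fuel_10min = np.tile([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 24)
fuel_hourly = to_hourly(fuel_10min)

# Synthetic weather series at 0, 6, 12, 18, and 24 h (assumed spacing).
t_met = np.arange(0.0, 25.0, 6.0)
wave_height = 0.1 * t_met + 1.0
t_hourly = np.arange(0.0, 24.0, 1.0)
wave_hourly = resample_cubic_bspline(t_met, wave_height, t_hourly)
```

Whether the hourly fuel value should be a mean or a sum depends on whether the sensor reports a rate or a cumulative quantity, which the text does not specify.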
After data acquisition, a total of 12 features describing ship operation and the navigation environment were obtained. The features and their abbreviations are shown in Table 4. To eliminate the influence of the different magnitudes of the feature variables, the data were normalized. The histograms of the feature probability density distributions after data cleaning are shown in Figure 6 and Figure 7.
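The normalization method is not specified in the text. As a minimal sketch, assuming min-max scaling of each feature column to the interval [0, 1] (the function name `min_max_normalize` is hypothetical), the step could look like this:

```python
import numpy as np

def min_max_normalize(features):
    """Scale each column of a (samples x features) array to [0, 1]."""
    X = np.asarray(features, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0.0] = 1.0  # constant columns map to 0 instead of NaN
    return (X - col_min) / col_range

# Illustrative values only: two features with very different magnitudes.
X_raw = np.array([[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]])
X_norm = min_max_normalize(X_raw)
```

After scaling, both columns span the same range, so no feature dominates the model purely because of its units.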
3.3. Correlation Analysis and Feature Selection
During a voyage, a ship is subject to wind resistance and water resistance. The water resistance is divided into two parts: static resistance and wave-adding resistance.
The static resistance and the wave-adding resistance can be obtained, respectively, by Equations (16) and (17):
where $C_S$ is the static resistance, $C_F$ is the frictional resistance, $k$ is the viscous resistance factor, $C_A$ is the appendage resistance, $C_W$ is the wave-making resistance, $C_B$ is the bulbous bow additional resistance, and $C_{Si}$ and $C_R$ are the stern immersion additional resistance and the relevant resistance, respectively. $C_{wave}$ is the wave-adding resistance, $\zeta_c$ is the characteristic wave height, $B$ is the breadth of the ship, $k_2$ is the block coefficient, $\rho$ is the density of the sea water, and $L$ denotes the length of the ship.
Wind resistance can be obtained by Equation (18):
where $C_{wind}$ is the wind resistance, $k_1$ is the air resistance coefficient, $\rho_a$ is the density of air, $V_{wind}$ is the wind speed, and $A_s$ is the frontal projected area of the ship above the waterline.
The total resistance includes static resistance, wave-adding resistance, and wind resistance:
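Read as a sum of the three components named above, the relation can be written as follows. The symbol $C_{total}$ for the total resistance and the symbol $C_{wind}$ for the wind resistance are notational assumptions made here for illustration.

```latex
C_{total} = C_S + C_{wave} + C_{wind}
```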
For a ship to keep sailing at a given speed, the main engine must consume fuel to deliver the power that drives the propeller and generates thrust. The effective thrust of the propeller must balance the hull resistance. Because this resistance depends on wind and wave conditions, the fuel consumption of a ship is affected by environmental factors, and further analysis of this effect can improve the accuracy of the predictions.
Figure 10a–c and Figure 11a–c show the distribution of the environmental characteristics during ship operation. Figure 10d and Figure 11d show the distribution of instantaneous fuel consumption at different moments during the ship's voyage. Both the environmental variables and the fuel consumption exhibit time series characteristics, and the environmental factors have a visible influence on the ship's fuel consumption. In addition, these parameters differ significantly between times and locations.
To further investigate the relationship between the characteristic parameters and ship energy consumption during voyages, to screen the input variables, and to improve prediction accuracy, a correlation analysis was carried out between the 12 characteristic variables and the ship fuel consumption rate. The Pearson correlation coefficient and the maximal information coefficient (MIC) were selected as the correlation evaluation indexes. The calculation of the MIC is given in Equations (14) and (15), and the results of the correlation analyses are shown in Figure 12, Figure 13, Figure 14 and Figure 15. In Figure 12 and Figure 14, values larger than 0.5 are shown in white; in Figure 13 and Figure 15, values larger than 0.8 are shown in white.
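The Pearson part of this analysis can be sketched in a few lines. The example below uses synthetic data, and the function name `pearson_with_target` is hypothetical. The MIC requires a dedicated estimator and is not reproduced here.

```python
import numpy as np

def pearson_with_target(features, target):
    """Pearson correlation between each feature column and the target."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(target, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    yc = y - y.mean()        # center the target
    return (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

# Synthetic illustration: the target depends linearly on the first column.
x = np.linspace(0.0, 10.0, 50)
features = np.column_stack([x, -x, np.cos(x)])
fuel = 2.0 * x + 1.0
r = pearson_with_target(features, fuel)
```

Values near +1 or -1 indicate a strong linear relationship. A value near 0 rules out a linear relationship only, which is why a non-linear measure such as the MIC is used alongside it.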
The correlation analysis shows that the Pearson correlation coefficients of SS and SP with FC are larger than those of the other features. This indicates a strong linear correlation between these parameters, so the shaft speed and shaft power directly reflect changes in fuel consumption. WH, WD, and WS are negatively correlated with SOG: the SOG decreases when the wave height and wind speed increase and the wind direction changes. The navigational environmental factors therefore influence the ship speed, which in turn affects the ship's fuel consumption, and the relationship between these characteristics and the fuel consumption is partly non-linear. The correlation analysis of ship fuel usage and its influencing parameters is therefore important for establishing a prediction model of ship fuel usage.
In addition, the actual wind speed is obtained by vector synthesis from the ECMWF data, so a mapping relationship exists between them, which would introduce an endogeneity problem into the model. Furthermore, Figure 10 shows that longitude and latitude are relatively strongly correlated with the other characteristic variables. Strongly correlated input variables cause multicollinearity, which can easily lead to overfitting of the model. The four characteristic variables longitude, latitude, the 10 m u component of wind, and the 10 m v component of wind are therefore eliminated in this paper. The remaining seven variables are selected as feature inputs to predict the ship's fuel consumption.
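One common way to quantify redundancy among input variables is the variance inflation factor (VIF). The paper does not state that VIFs were computed, so the sketch below is only an illustrative diagnostic on synthetic data, and the function name `variance_inflation_factors` is hypothetical.

```python
import numpy as np

def variance_inflation_factors(features):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others."""
    X = np.asarray(features, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept plus other columns
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return vifs

# Synthetic illustration: column c is almost a copy of column a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.01 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
```

A variable that is a near-deterministic function of others, as the synthesized wind speed is of the u and v components, receives a very large VIF and is a natural candidate for removal.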