1. Introduction
Water is the main component of organisms and of most biological and chemical systems, and it plays an important role in life and in biological and chemical processes. Although the structures and functions of water have been widely studied, water remains mysterious because of the complexity and flexibility of its structures, which are linked by different types of hydrogen bonding [
1,
2]. The hydrogen bonding of water may change under perturbations, such as variations in composition, temperature, pH, and concentration. Therefore, changes in water structures can reflect the composition and properties of the system being analyzed. The heterogeneity of the silica surface was confirmed by imaging water rearrangement on the surface using second harmonic generation, which shows that water can be used as a ‘window’ to reveal the structural information of surfaces [
3]. The OH group due to the displacement of interfacial water by graphene oxide can be identified by infrared (IR) spectroscopy. Based on the change of interfacial water, the interaction forces between graphene oxide and lipid membrane were determined by surface-enhanced IR absorption spectroscopy [
4].
Chemosensing has experienced rapid growth in the fields of disease diagnosis [
5], environment monitoring [
6], and toxicity analysis [
7] in recent years. The development of chemosensors drives the need for efficient probes that allow for an in-depth understanding of the relationships between the presence of chemical or biological markers and their biological implications. Molecular probes are generally chemical substances or materials that can recognize target analytes, and they can be divided into different types depending on the measurable signal they yield. Among them, spectroscopic probes have found wide application in fields such as molecular recognition, biological imaging, and medical diagnosis owing to their specificity, selectivity, and high sensitivity [
8,
9,
10,
11]. Organic molecules, proteins, nucleic acids, metal ion complexes, functionalized nanoparticles, and quantum dots are common materials for spectroscopic probes [
12,
13,
14,
15,
16]. The characteristic spectral features, the ‘on/off’ property, and even the spectral change of a probe with its environment can be used to show the existence, quantity, or structure of the target analyte in samples. However, a molecule or material must meet several requirements to serve as a spectroscopic probe in practical applications, particularly in biological or medical analysis. A good spectroscopic probe should perform well in terms of optical and chemical stability, biocompatibility, low toxicity, targeting, and tissue penetration [17]. More importantly, almost all probes are exogenous and may affect the system being analyzed by changing the original state of the analyte or even participating in life activities. Therefore, water, which is ubiquitous and nontoxic, can be a good choice as a probe.
Near-infrared (NIR) spectroscopy has been widely used to study the structures of water and the interactions of water and solutes in aqueous systems [
18,
19,
20]. The spectral features of water structures can be distinguished by analyzing the spectra and the spectral variation induced by perturbations, such as temperature, solutes, and pH, and therefore, the structural changes and interactions can be observed [
21]. For example, based on the temperature effect, temperature-dependent NIR spectroscopy was developed to measure the NIR spectra of samples at different temperatures. The dependence of NIR spectra on the temperature should have a corresponding change when the composition varies. Quantitative and structure analysis has been achieved by the technique. NIR spectra of glucose solution and human serum samples with different concentrations and measured at different temperatures were analyzed by multilevel simultaneous component analysis (MSCA) [
22]. Through the variation of water structures with perturbation, models of the quantitative relationships between the spectral changes of water and the perturbation were established, providing clear evidence that water can be a probe for the quantitative determination of analytes in aqueous solutions by NIR spectroscopy. For biological samples with high water content, non-invasive, real-time, and dynamic analysis was conducted through NIR spectroscopy. However, the spectrum of water consists of severely overlapping absorption bands and contains a lot of redundant information, which makes the spectrum difficult to analyze [
23].
Chemometric methods to improve spectral resolution and extract spectral variations are necessary for the analysis of water structures from the NIR spectrum [
24]. Great efforts have been made to extract the spectra of the components from the overlapping spectra of mixtures and to enhance the degree of separation between spectral peaks. Principal component analysis (PCA) is a common analytical technique for resolving NIR spectra. The method was applied to investigate the structures of water [
25]. The spectral features at 1412 and 1491 nm were found to account for more than 99% of the spectral variation, representing the water species with weaker and stronger hydrogen bonds. Further, a spectral feature of water located at 1438 nm was found, indicating the existence of more water structures. Multivariate curve resolution-alternating least squares (MCR-ALS) was used to analyze complex structural changes in liquid water monitored by mid-infrared and NIR spectra at different temperatures [
26]. Three spectral components were obtained: the vibrations of the free OH group, the asymmetrically bonded OH group, and the four-coordinated species. According to the concentration variation of the three components with temperature, it was revealed that, with the increase in temperature, the water structure changes from a four-coordinated structure into a free OH structure through the asymmetrically bonded OH structure. Wavelet transform (WT) is one of the powerful methods to improve NIR spectral resolution and resolve overlapping spectra [
24]. Owing to its double localization in the position (wavelength or wavenumber) and frequency domains, WT can separate the components of an analytical signal at different frequencies without changing their positions. The separation of peaks can be improved greatly, and the spectral features of different water structures can be obtained from overlapping and broad spectral peaks.
2. Wavelet Transform (WT)
WT is a mathematical technique that transforms a signal into its components by means of a series of wavelet functions $\psi_{a,b}(t)$, which are generated from a base function $\psi(t)$ by dilation and translation according to Equation (1):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right), \quad a, b \in \mathbb{R},\ a \neq 0 \tag{1}$$

where $t$ is the variable of the function, which can be time or wavenumber depending on the specific signal in the application, $a$ and $b$ are the scale and translation parameters that control the dilation and translation, respectively, and $\psi(t)$ is a function with localization characteristics. A number of wavelet functions have been proposed, such as Haar, Morlet, Symlets, and Daubechies, which provide good flexibility for processing signals of different shapes.
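WT is defined as:

$$W_f(a,b) = \langle f, \psi_{a,b} \rangle = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)\mathrm{d}t \tag{2}$$

where $f(t)$ is the analyzed signal and $\psi^{*}$ denotes the complex conjugate of $\psi$. Because the values of $a$ and $b$ are continuous, it is called continuous wavelet transform (CWT).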
Because analytical signals are generally discrete in practice, it is difficult to calculate the WT directly with Equation (2). Therefore, a discrete form of the wavelet functions and of the transform was proposed, together with an algorithm for the calculation. The parameters $a$ and $b$ can be discretized as the $m$-th power of $a_0$ and $n$ times $b_0$, i.e., $a = a_0^{m}$ and $b = n b_0 a_0^{m}$. A binary (dyadic) wavelet, with $a_0 = 2$ and $b_0 = 1$, is generally adopted in signal processing. Then, the discrete form of Equations (1) and (2) can be described by Equations (3) and (4), and the method is called discrete wavelet transform (DWT).

$$\psi_{m,n}(t) = 2^{-m/2}\,\psi\!\left(2^{-m}t - n\right) \tag{3}$$

$$W_f(m,n) = 2^{-m/2} \int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(2^{-m}t - n\right)\mathrm{d}t \tag{4}$$
The multiresolution signal decomposition (MRSD) algorithm proposed by Mallat [27,28] is generally used for the calculation of DWT, as described by Equations (5) and (6):

$$C_j(n) = \sum_{k} h(k-2n)\,C_{j-1}(k) \tag{5}$$

$$D_j(n) = \sum_{k} g(k-2n)\,C_{j-1}(k) \tag{6}$$

where $n$ is the index of the data point, and $h$ and $g$ are the low-pass and high-pass filters corresponding to the wavelet function. If $C_0$ is the analytical data, $C_j$ and $D_j$ ($j = 1 \ldots J$) are named discrete approximations and discrete details, which represent the lower- and higher-frequency parts of the data, and $j$ is the scale parameter or resolution level. The decomposition diagram of an analytical signal by MRSD is shown in Figure 1A. At each scale, the discrete approximation $C_{j-1}$ is decomposed into two parts, $C_j$ and $D_j$, containing the low- and high-frequency parts of the signal. Therefore, the signal $C_0$ can be decomposed into a series of components, $D_j$ ($j = 1 \ldots J$) and $C_J$, in order from high to low frequency.
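The MRSD recursion described above can be illustrated with a minimal sketch. It assumes the Haar wavelet (chosen only because its two filters have two taps each), a signal whose length is divisible by 2 to the power of the number of scales, and hypothetical function names (`mrsd_step`, `mrsd`); it is not the implementation used in the cited works.

```python
import math

# Haar analysis filters (an illustrative choice of wavelet)
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # low-pass filter
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # high-pass filter

def mrsd_step(c_prev):
    """One scale: split C_{j-1} into approximation C_j and detail D_j."""
    c_next, d_next = [], []
    for n in range(len(c_prev) // 2):
        pair = c_prev[2 * n: 2 * n + 2]          # samples 2n and 2n+1
        c_next.append(sum(h * x for h, x in zip(H, pair)))
        d_next.append(sum(g * x for g, x in zip(G, pair)))
    return c_next, d_next

def mrsd(c0, levels):
    """Return ([D_1, ..., D_J], C_J); only the approximation is split again."""
    details, c = [], list(c0)
    for _ in range(levels):
        c, d = mrsd_step(c)
        details.append(d)
    return details, c

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, approx = mrsd(signal, levels=3)
```

Each scale halves the number of coefficients, so an 8-point signal yields details of length 4, 2, and 1 plus a single approximation coefficient. Because the Haar filters are orthonormal, the total energy of the coefficients equals that of the signal.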
To further enhance the resolution in frequency, wavelet packet transform (WPT) was developed. The MRSD algorithm can also be employed for the calculation. In this case, however, as shown in Figure 1B, both branches are decomposed, i.e., both the discrete approximation and the discrete detail are further decomposed into two parts at each scale. Therefore, the signal can be decomposed into $2^{J}$ components for a $J$-scale decomposition, and the components are ordered from low to high frequency.
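The difference from the DWT, namely that both branches are split at every scale, can be seen in a short sketch. As before, the Haar filters and the function names (`split`, `wpt`) are assumptions made for illustration, and the components are simply listed in the order in which the recursion produces them.

```python
import math

S = 1 / math.sqrt(2)

def split(x):
    """Haar low-pass and high-pass halves of x (length must be even)."""
    low = [S * (x[2 * n] + x[2 * n + 1]) for n in range(len(x) // 2)]
    high = [S * (x[2 * n] - x[2 * n + 1]) for n in range(len(x) // 2)]
    return low, high

def wpt(signal, levels):
    """Return one list per scale; scale j holds 2**j components."""
    tree = [[list(signal)]]
    for _ in range(levels):
        next_level = []
        for node in tree[-1]:            # every node is split, not only the approximation
            low, high = split(node)
            next_level.extend([low, high])
        tree.append(next_level)
    return tree[1:]                      # drop the original signal

signal = [float(i % 5) for i in range(16)]
tree = wpt(signal, levels=3)
counts = [len(level) for level in tree]  # components per scale
```

For a three-scale decomposition, `counts` holds the number of components at scales 1, 2, and 3, which doubles at each scale.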
Due to the double localization characteristics, WT has been extensively adopted in processing analytical signals, including data compression, denoising, smoothing, baseline or background correction, and resolution enhancement [
29,
30,
31,
32]. In the field of analytical chemistry, WT has been used for processing analytical signals of ultraviolet-visible (UV-VIS) spectroscopy, IR and NIR spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, flow injection analysis (FIA), high-performance liquid chromatography (HPLC), capillary electrophoresis (CE), mass spectroscopy, electroanalytical chemistry, and X-ray diffraction. Data compression can be achieved by eliminating the small coefficients in discrete details, and denoising and smoothing can be achieved by removing or suppressing the details representing the components of high frequency. The main information in an NMR spectrum of a biological molecule with 32,768 data points can be represented by 190 wavelet coefficients [
33]. No difference can be seen between the experimental spectrum and those reconstructed from 2048, 1024, and 512 wavelet coefficients, respectively. On the other hand, removing the low-frequency components can be used to separate the baseline from chromatograms or from extended X-ray absorption fine structure (EXAFS) spectra [
34]. If WPT is used, the efficiency can be further improved [
35]. If the wavelet coefficients of the middle frequencies are used, high-resolution information about the signal can be obtained. Chromatographic signals of benzene, methylbenzene, and ethylbenzene can be extracted from the overlapping chromatograms of their mixture by DWT, and the linearity of the decomposed signals is retained [
36]. Multiplying the middle-frequency component by an enhancing factor can be used to improve the resolution of the nuclear magnetic resonance (NMR) spectrum [
33]. In addition, CWT was widely used as a technique to improve the resolution of overlapping signals [
37,
38]. CWT is known as a method to calculate the approximate derivative. A smooth derivative can be obtained, and high-order derivative can be calculated by using an appropriate wavelet filter [
39]. In NIR spectroscopy, CWT has been extensively adopted to enhance the resolution of the spectrum [
24]. New peaks can be observed in the transformed spectrum to analyze the structures of OH and CH with different intermolecular interactions [
40]. Therefore, with the help of WT, structural information of different water structures can be obtained from the NIR spectrum of water. This may make water a good probe for sensing the quantity, structure, interactions, and even properties of the molecules in aqueous systems.
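The resolution enhancement attributed to CWT above can be demonstrated on synthetic data. The sketch below is a simplified illustration under stated assumptions: it uses the Mexican hat wavelet (the negative second derivative of a Gaussian), evaluates the transform at a single scale by direct summation with zero-padded edges, and applies it to two overlapping Gaussian bands whose positions and widths are invented for the example. The function names are hypothetical.

```python
import math

def mexican_hat(u):
    """Mexican hat wavelet: negative second derivative of a Gaussian."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)

def cwt(signal, scale):
    """CWT coefficients at one scale by direct summation (zero-padded edges)."""
    half_width = int(5 * scale)          # the wavelet is negligible beyond 5 scales
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half_width, half_width + 1)]
    out = []
    for b in range(len(signal)):
        total = 0.0
        for i, w in enumerate(kernel):
            t = b + i - half_width
            if 0 <= t < len(signal):
                total += signal[t] * w
        out.append(total)
    return out

def peak_positions(y, min_fraction=0.1):
    """Indices of strict local maxima higher than min_fraction of the maximum."""
    floor = min_fraction * max(y)
    return [i for i in range(1, len(y) - 1)
            if y[i] > floor and y[i] > y[i - 1] and y[i] > y[i + 1]]

def two_bands(x, c1=45.0, c2=55.0, sigma=5.5):
    """Two equal Gaussian bands closer together than twice their width."""
    return (math.exp(-0.5 * ((x - c1) / sigma) ** 2)
            + math.exp(-0.5 * ((x - c2) / sigma) ** 2))

signal = [two_bands(float(x)) for x in range(100)]
raw_peaks = peak_positions(signal)                   # one merged maximum
cwt_peaks = peak_positions(cwt(signal, scale=2.0))   # two resolved maxima
```

The raw signal shows a single maximum midway between the two bands, whereas the transformed signal shows two maxima near the band centers. The scale is a tuning choice: larger scales smooth more and eventually merge the bands again.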
3. Complexity of Water Structures
Water is a simple molecule consisting of two hydrogen atoms and one oxygen atom. However, its hydrogen and oxygen atoms readily form hydrogen bonds as proton donors and acceptors, respectively. This property makes water molecules interact easily with one another and with surrounding polar compounds. Furthermore, the hydrogen bonds can rapidly rearrange in response to changing conditions or environments, for example, in the solutions of different solutes [
41,
42] or bio-systems [
43,
44]. Hydrogen bond network structure and dynamics determine the chemical and physical properties of water [
45].
Studies of the hydrogen bond network of water have been conducted using various experimental and theoretical methods for decades [
46,
47,
48,
49]. Different models have been established to describe the structures of water. The models can be divided into two types, i.e., the continuum model and the mixture model. The former was proposed in 1965 [
50]. In this model, the structure of water is described as a network of hydrogen-bonded water molecules with a continuous distribution of distances, angles, lengths, and energies. The model was further supported by the observation of only one intense absorption maximum with a simple shape in the IR spectrum of high-density water at all temperatures [51]. Raman spectroscopy and Monte Carlo simulation were also applied to analyze hydrogen bond geometries and energies in liquid water at different temperatures [
52]. The results indicate that the features of Raman spectrum actually result from a continuous distribution of water. In the mixture model, liquid water is thought to be composed of well-defined structures, and the liquid properties are calculated as the concentration-weighted averages of the contributions from each structure [
50]. A two-state structural model for water was suggested in early studies, in which water structures were roughly in two major species with weaker and stronger hydrogen bonds [
25]. The content of weakly hydrogen-bonded water increases when the temperature rises, but the other decreases. A model of three OH states based on the cooperativity of hydrogen bonds was proposed [
53]. The cooperativity of hydrogen bonds in water and alcohol was studied by IR spectroscopy. In addition to weak and strong cooperative hydrogen-bonded OH, non-hydrogen-bonded OH was found to explain the anomalous viscosity of water-alcohol mixtures.
A more detailed model describes water by five structures, i.e., water molecules with none (S0), one (S1), two (S2), three (S3), and four (S4) hydrogen bonds, respectively [54]. NIR spectra of water measured at different temperatures were analyzed by the model. Through the change with temperature of the spectral intensity corresponding to the five water structures, it was found that the water structure S0 increases, while the other water structures decrease. In our studies, curve fitting was conducted to fit the NIR band around 6900 cm−1 [55]. The spectral peaks for each water species were obtained by fitting the spectrum with five Gaussian peaks. The variation in the water species with temperature shows that hydrogen bonds break as the temperature increases and that water structures with more hydrogen bonds turn into those with fewer hydrogen bonds. Further, the model was also used for understanding the structure change of water during the gelation of globular proteins [56]. It was found that all the water species change with the structure change of the protein: S1 water is an indicator of denaturation, and S2 water is an indicator of the three stages of the structure transition during gelation, from the native structure to the molten globule and then to the gel.
A model to describe the complex water structures by the number of hydrogen bonds on oxygen (the proton acceptor) and hydrogen (the proton donor) was proposed [57,58]. Nine structures were defined by the number of hydrogen bonds on the acceptor and donor, denoted by AmDn, that is, A0D0, A0D1, A1D0, A0D2, A1D1, A2D0, A1D2, A2D1 and A2D2 [57], as shown in Figure 2A. Among them, the first and the last correspond to the structures S0 and S4, respectively, but there are two structures corresponding to each of S1 and S3, and three structures corresponding to S2. Figure 2B shows the result of Gaussian fitting of the NIR spectrum in the spectral range of 7600–6100 cm−1. Ten peaks simulated by the Gaussian function are used in the fitting, including a spectral peak of the rotation vibration of A0D0 water (Sr) and nine spectral peaks of the water structures. In this work, a knowledge-based genetic algorithm was developed for Gaussian fitting of the NIR spectra of water measured at different temperatures, and CWT was used to improve the resolution and remove the background in the spectra. Through the variation of the relative contents of the water structures with temperature, the dissociation of larger clusters into smaller ones was observed. Furthermore, an enhancement of the ordered (A2D2) water structure by glucose molecules was observed, indicating that glucose makes the water structures less sensitive to temperature.
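The idea of describing a broad band as a sum of Gaussian peaks can be sketched in a much simpler setting than the one used in the cited work. The example below does not reproduce the knowledge-based genetic algorithm; it assumes that the peak centers and a shared width are already known, so that only the amplitudes remain and can be found by linear least squares. The centers are taken from the three CWT peaks reported for water in this review, while the width, the amplitudes, and the function names are hypothetical.

```python
import math

def gaussian(x, center, sigma):
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_amplitudes(xs, ys, centers, sigma):
    """Least-squares amplitudes for Gaussians with fixed centers and a shared width."""
    basis = [[gaussian(x, c, sigma) for x in xs] for c in centers]
    ata = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    aty = [sum(p * y for p, y in zip(bi, ys)) for bi in basis]
    return solve(ata, aty)               # normal equations

centers = [7111.0, 6966.0, 6841.0]       # cm-1, from the CWT peaks in the text
sigma = 60.0                             # hypothetical shared width, cm-1
true_amplitudes = [0.2, 0.5, 0.3]        # hypothetical, used to build test data
xs = [6100.0 + 5.0 * i for i in range(301)]   # 6100 to 7600 cm-1
ys = [sum(a * gaussian(x, c, sigma) for a, c in zip(true_amplitudes, centers))
      for x in xs]
fitted = fit_amplitudes(xs, ys, centers, sigma)
```

On this noise-free synthetic band the fitted amplitudes match the ones used to generate it. With real spectra the centers and widths are also unknown, which makes the problem nonlinear and is one reason a search method such as a genetic algorithm may be preferred.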
4. Understanding Water Structures by Near-Infrared Spectroscopy
NIR spectroscopy has been recognized as a powerful technique to study the structure of water. Due to the strong absorption, the spectral information of the water structures with different hydrogen bonds can be measured. However, the low resolution of the NIR spectrum and the overlapping of its peaks are obstacles to obtaining the spectral information needed to distinguish similar water structures. Therefore, great efforts have been made to enhance the resolution of NIR spectra and extract the spectral features of water structures. Temperature-dependent NIR spectroscopy (TD-NIRS) [
22,
55,
56,
59], aquaphotomics [
60,
61,
62], and various chemometric methods [
18,
62] were developed. In the analysis of TD-NIRS, the spectra of water are measured at different temperatures. The dependence of the spectra on the variation of temperature can be obtained. Because the spectral change is determined by the structure change of water and the structure of water is related to the surrounding molecules, the structure and even the quantity of the analyte can be obtained through spectral dependence on the temperature. With the help of chemometrics, including the WT techniques, the high-resolution spectrum of water was obtained and various water structures were identified [
18,
20,
24]. In the studies of aquaphotomics, the spectral change with various perturbations was analyzed by water spectral pattern defined with water matrix coordinates (WAMACS) and represented by aquagram. Through the change of the WAMACS in the aquagram with the perturbations, the structure change of water can be observed and the quantity and properties of the analyzing system can be shown.
CWT has been proven to be an effective method to resolve overlapping NIR spectral peaks. With the help of the method, the resolution of the NIR spectrum can be significantly improved [
18,
24,
40,
55,
56]. Furthermore, combining CWT with different chemometric methods, such as PCA [
63], independent component analysis (ICA) [
64], and Gaussian fitting [
55,
56,
57], the spectral features of water structures with different hydrogen bonds were obtained.
Figure 3A shows an NIR spectrum of water in the spectral range of 8000–6000 cm−1. There is only a broad peak around 6900 cm−1, which corresponds to the first overtone of OH stretching. It is difficult to obtain information about the structures of water directly from the spectrum. To enhance the resolution of the spectra, CWT can be adopted. Figure 3B shows the spectrum of water transformed by CWT. Three peaks at 7111, 6966, and 6841 cm−1 were obtained, which can be assigned to the vibrations of non-hydrogen-bonded (NHB), weakly hydrogen-bonded (WHB), and strongly hydrogen-bonded (SHB) OH in a water molecule, respectively.
To obtain more detailed information on water structures from the NIR spectrum, a chemometric method based on the rotation of the loadings in PCA was proposed [
65]. The method uses CWT to enhance the resolution of the spectra, and further analyzes the transformed spectrum by PCA. When PCA is performed on the spectral data of a binary water–ethanol mixture, for example, the loadings obtained by PCA are combinations of the spectra of the two components. Therefore, through the rotation of the loadings, the spectrum of water in the mixture can be obtained, which is more reliable to reflect the structure of water in the mixture. Furthermore, the difference spectrum between the spectra of pure water and the calculated one can be used to analyze the structure change of water when it is mixed with another compound.
Figure 4A shows the transformed NIR spectrum of pure water and the calculated spectrum from that of water–ethanol solution, and Figure 4B shows the difference between the two spectra. Interestingly, seven peaks were found, including negative peaks at 7246, 7050, and 6961 cm−1 and positive peaks at 7134, 6864, 6768, and 6691 cm−1. These peaks, in order from higher to lower wavenumber, may correspond to the rotation vibration (Sr) of free OH and the OH in water structures with hydrogen bonds of different numbers or strengths. More interestingly, three negative peaks emerge in the higher wavenumber range of the spectrum, whereas three positive peaks emerge in the lower wavenumber range. The result indicates that water structures with fewer or weaker hydrogen bonds turn into those with more or stronger hydrogen bonds when ethanol is added to water.
A method combining CWT with Monte–Carlo uninformative variable elimination (MC-UVE) was proposed for investigating the temperature-dependent or temperature-sensitive variables (wavenumbers) in the NIR spectra of water and aqueous solutions [
66].
Figure 5A shows the NIR spectra of water measured at different temperatures. Using CWT with different scale parameters, spectral information with different frequencies can be obtained, as shown in
Figure 5B. In the calculation, scale parameters from 1 to 60 were used, thus each line in
Figure 5B shows the result at one scale. In the figure, the overlapping peak is clearly separated. Then, a quantitative model for predicting temperature from the spectra can be established, and the importance of the variables in the high-resolution spectra can be obtained by MC-UVE, which is a chemometric method for variable selection in NIR spectral analysis [
67]. The variables selected according to their importance can be known as the wavenumbers that change significantly with temperature. Because the spectral change is determined by the structure change, the changing structures of water can be determined by the selected variables. It is interesting that seven variables were found to change significantly with temperature, implying that there may be seven varying structures when the temperature changes. It is more interesting that the seven variables selected from the spectra of water and different solutions are located at similar but not identical wavenumbers, as shown in
Figure 5C. The results demonstrated that the temperature-dependent NIR spectrum of water can be used to show the structure change of water and to discriminate aqueous solutions with different composition.
DWT and WPT have been proven to be powerful in processing experimental signals by decomposing a signal into components with different frequencies, from which high-resolution components can be obtained [
29,
30,
31,
32,
33,
34,
35,
36,
68]. To extract high-resolution spectral components from the NIR spectrum of water, a six-level WPT decomposition was performed on the NIR spectra of water and H2O–D2O mixtures [69]. A total of 126 spectral components were obtained by the six-scale WPT decomposition. It was found that most of the components are of high frequency, representing the noise, and a small number of the components are of low frequency, representing the overlapping spectral information. There are several components, however, whose frequencies are comparatively high but lower than that of the noise. Such components correspond to the spectral information of high resolution. For example, Figure 6A,B show two of the spectral components obtained by WPT from the NIR spectra of mixtures of water and heavy water with different mole ratios. Interestingly, in Figure 6A, the spectral peaks can be attributed to the spectral responses of OD and OH in HDO and D2O. This conclusion can be confirmed by the variation of the peak intensity with the mole ratio of D2O in the mixtures. The result in Figure 6B is also interesting, as the structures of OH with different hydrogen bonds can be identified from it. With a further analysis of the intensity variation with the mole ratio, the four labelled peaks in Figure 6B can be assigned to the OH with different numbers of hydrogen bonds, i.e., A0D1, A1D0, A1D1, and A2D1, respectively.
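One way to arrive at the total of 126 components, offered here as an assumption rather than something the source spells out, is to count the components from all six scales together, given that a WPT produces twice as many components at each successive scale:

```latex
\sum_{j=1}^{6} 2^{j} = 2 + 4 + 8 + 16 + 32 + 64 = 2^{7} - 2 = 126
```

Under this reading, the sixth scale alone contributes 64 of the 126 components.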
5. Water as a Spectroscopic Probe for Quantitative and Structure Analysis
Water structures change with the temperature, pH, and concentration of the analyte, and the change can be captured by NIR spectroscopy with the help of chemometric methods. Thus, water may be a promising spectroscopic probe for sensing the quantity and the structure of the analyte in solutions and bio- and chemical systems. A quantitative spectra-temperature relationship (QSTR) model between NIR spectra of aqueous solution and temperature was established by using partial least squares (PLS) regression [
70,
71]. It was found that well-fitted QSTR models can be obtained for commonly used solvents, such as water, methanol, and n-hexane, as well as their mixtures. The difference between the models was found to be related to the difference in composition between the samples. Quantitative analysis was also achieved by combining CWT and high-dimensional chemometric methods, such as high-order principal component analysis, alternating trilinear decomposition (ATLD), and parallel factor analysis (PARAFAC) [72]. In addition, multilevel simultaneous component analysis (MSCA) was employed in analyzing temperature-dependent NIR spectra; it performs PCA on the unfolded data by aligning the data blocks along the concentration and temperature directions [22]. A new algorithm, named mutual factor analysis (MFA), was proposed, in which PCA is also employed to deal with the unfolded data [73]. Taking advantage of CWT, the resolution of the spectra was enhanced, and more information was provided for the analysis of the variations induced by temperature and concentration.
The interaction of water and solutes in solutions and the role of water in biological and chemical processes were also investigated with the help of WT and chemometric methods. For example, the interaction of water and dimethyl sulfoxide (DMSO) was studied to understand the mechanism by which DMSO lowers the freezing point of water [
74,
75]. Through the analysis of the NIR spectra of water–DMSO mixtures at different temperatures, it was found that the DW2 structure, i.e., the structure composed of one DMSO molecule and two water molecules interacting through hydrogen bonds, inhibits the formation of the ice-like (tetrahedral) water structure at low temperature. To further understand the interactions when protein exists in the system, ternary mixtures of DMSO−water−formamide (FA) were studied. A peak at 6437 cm−1 was observed in the spectrum transformed by CWT, depicting the interaction of DMSO and water through hydrogen bonding (S=O…H−O). When FA exists, however, the intensity of the peak decreases with the increase in FA content, suggesting that FA may replace water to form the hydrogen bond between S=O and H−N.
For understanding the role of water in the aggregation of protein and polymer, the variation of water structures during the aggregation or phase transition was analyzed by temperature-dependent NIR spectroscopy and chemometrics. The change of water structures during the gelation of ovalbumin (OVA) was investigated by analyzing the NIR spectra [
56]. With the help of CWT, the spectral features of the protein (α-helix and β-sheet) and water structures were obtained and were used to investigate the relationship with the structural changes of the protein. It was found that the three phases (native, molten globule, and gel) of the gelation process of OVA can be shown by the variation of the spectral features. With the help of 2D correlation NIR spectroscopy and Gaussian fitting, it was shown that the water species S2, i.e., the water molecule with two hydrogen bonds, changes in the same phases as OVA, demonstrating that water is a good probe to monitor the structure change of protein. Similarly, the structure change of water in the phase separation of temperature-sensitive polymers was studied for understanding the role of water and the driving factor in the aggregation [63]. CWT was employed to correct the baseline and enhance the resolution of the spectra, and the transformed spectra were analyzed by using high-order principal component analysis. Through the variation of water structures with temperature, it was found that S2 water changes significantly with the structure change of the polymer. Therefore, S2 may be a key structure that stabilizes the coil state of the polymer and plays an important role in the formation of micelles from the coil state.
To study water structures in a confined environment, water in hydrogels and reverse micelles (RMs) was analyzed by NIR spectroscopy [
76,
77]. CWT was also adopted to enhance the resolution. In the hydrogel of poly-N,N-dimethylacrylamide (PDMAA), the spectral features of water species with free OH (S0) and hydrogen-bonded OH (S1 and S2) were observed, as well as the spectral information of the water molecules hydrogen-bonded to NH groups in the frame of the hydrogel, denoted as S1NH. This structure forms very early when the hydrogel absorbs water, and disappears very late when the hydrogel releases water. The result suggests that S1NH water may be the key structure in the shape recovery of the hydrogel. In the RM system of sodium bis(2-ethylhexyl) sulfosuccinate (AOT) and isooctane (IO), bovine serum albumin (BSA), human serum albumin (HSA), and OVA were found to be stabilized compared with the situation in aqueous solution, and the bridging water connecting NH in the protein and S=O on the inner surface of the RM was found to be a key structure that stabilizes the protein confined in the RM.