network developed in the "learning" process represents a pattern detected in the data.
Thus, in principle, ANN methods can be applied to many research issues such as
those in coastal engineering and oceanography. Theoretically, as long as the training
data set covers the maximum range of the forecasting boundary data, a short-term
data set can be used to train an ANN model for long-term predictions. A trained
neural network can provide a much faster simulation for forecasting long-term events
than traditional hydrodynamic models since its calculation requires no computational
iteration. The implementation of an ANN model is similar to calculating a multiple-variable linear regression function: Output Y(t) = ANN[w1∗X1(t), w2∗X2(t), ..., wn∗Xn(t)], where wi (i = 1, ..., n) are the weights of the ANN network, Xi (i = 1, ..., n) are the input signals, and Y(t) is the output signal.
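To make the analogy concrete, the sketch below (written in Python/NumPy rather than the Matlab code used in the study) evaluates a single forward pass of a three-layer network: the weighted inputs wi∗Xi(t) feed a log-sigmoid hidden layer, whose responses are combined linearly into one output Y(t). The weight values, the three-input layout, and the function names are hypothetical and chosen only for illustration.

import numpy as np

def logsig(z):
    # Log-sigmoid transfer function used in the hidden layer
    return 1.0 / (1.0 + np.exp(-z))

def ann_forward(x, W1, b1, w2, b2):
    # Weighted inputs wi*Xi(t) feed the hidden layer; the output layer
    # combines the hidden responses linearly into Y(t).  No iteration is
    # needed, which is why a trained network forecasts so quickly.
    hidden = logsig(W1 @ x + b1)
    return float(w2 @ hidden + b2)

# Illustrative call with arbitrary (hypothetical) weights
rng = np.random.default_rng(0)
x = rng.normal(size=3)                      # three input signals X1(t)..X3(t)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)
w2, b2 = rng.normal(size=5), 0.0
y = ann_forward(x, W1, b1, w2, b2)          # scalar output Y(t)

Once the weights are fixed by training, every forecast is just this single pass through the network, with no iterative solution of governing equations.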
3.5. ANN optimization and improvement
The standard gradient-descent training method sometimes suffers from slow convergence due to the presence of one or more local minima. This is generally a characteristic of the particular error surface, which is often composed of several flat and steep regions. There are, however, several optimization methods that can be used to improve the convergence speed and the performance of network training. Foo's (2002) study shows that training speed increases by almost three times when the conjugate gradient optimization technique is used.
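As a rough illustration of why a better optimizer helps, the following sketch trains the weights of a small three-layer log-sigmoid network by minimizing the mean-squared training error with SciPy's conjugate gradient routine; plain gradient descent would work on the same error surface but typically needs many more iterations. The synthetic data, network size, and helper names are assumptions made for this example, not part of the original study.

import numpy as np
from scipy.optimize import minimize

def logsig(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, n_in, n_hid):
    # Split the flat parameter vector into layer weights and biases
    k = n_hid * n_in
    W1 = theta[:k].reshape(n_hid, n_in)
    b1 = theta[k:k + n_hid]
    w2 = theta[k + n_hid:k + 2 * n_hid]
    b2 = theta[-1]
    return W1, b1, w2, b2

def mse(theta, X, y, n_in, n_hid):
    # Mean-squared training error that the optimizer drives down
    W1, b1, w2, b2 = unpack(theta, n_in, n_hid)
    pred = logsig(X @ W1.T + b1) @ w2 + b2
    return np.mean((pred - y) ** 2)

# Synthetic stand-in data; a real application would use observed water levels
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.1 * X[:, 1]
n_in, n_hid = 3, 5
theta0 = rng.normal(scale=0.1, size=n_hid * n_in + 2 * n_hid + 1)

# method='CG' selects a conjugate gradient search, which usually reaches a
# low training error in far fewer iterations than simple steepest descent
result = minimize(mse, theta0, args=(X, y, n_in, n_hid), method='CG')
print(result.fun, result.nit)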
Overfitting is another problem that may occur during neural network training. The
error on the training set is driven to a very small value, but when new data is
presented to the network, the error is large. In this case, the network has memorized
the training examples, but has not learned to generalize to new situations. One useful
approach for improving network generalization is to use a network that is just large enough to provide an adequate fit. The larger a network is, the more complex the functions it can create, which may lead to overfitting. If a sufficiently small network is used, it will not have enough power to overfit the data, and overfitting can thus be prevented. However, it is difficult to know beforehand just how large
a network should be for a specific application. In general, the optimal network size
to prevent overfitting can be determined through model sensitivity experiments.
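A minimal version of such a sensitivity experiment is sketched below, assuming synthetic data and scikit-learn's multilayer perceptron in place of the Matlab toolbox used in the study: the hidden-layer size is varied, and the error on a held-out validation set indicates when the network becomes large enough to start memorizing the training data.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data; the real experiments would use observed water levels
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.2 * X[:, 1] + 0.05 * rng.normal(size=500)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Vary the hidden-layer size: the training error keeps falling as the network
# grows, but the validation error levels off (or rises) once the network is
# large enough to memorize the training set
for n_hidden in (2, 4, 8, 16, 32):
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation='logistic',
                       max_iter=3000, random_state=0)
    net.fit(X_tr, y_tr)
    print(n_hidden,
          round(1.0 - net.score(X_tr, y_tr), 3),    # training error (1 - R^2)
          round(1.0 - net.score(X_val, y_val), 3))  # validation error (1 - R^2)

The smallest hidden-layer size beyond which the validation error no longer improves is a reasonable choice of network size for the application.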
4. RNN-WL model design
In this study, the standard three-layer feed-forward backpropagation network
(Haykin, 1999) with a nonlinear differentiable log-sigmoid transfer function in the
hidden layer (Fig. 5) was employed. The network programming was done using the
Matlab software (MathWorks, 1999). Huang and Foo's (2002) study indicates that using an optimized conjugate gradient training method results in improvement of
both training speed and accuracy. In general, network training with the conjugate gradient method is about three times faster than with the standard gradient descent method.