Problem Statement:

Predict the wind speed for the next 6 hours from the historical wind speed data in the Atlantic hurricane database, using the time series forecasting models LSTM and GRU.

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. LSTM networks are well-suited to classifying, processing and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series.

A Gated Recurrent Unit (GRU), as its name suggests, is a variant of the RNN architecture, and uses gating mechanisms to control and manage the flow of information between cells in the neural network.

RNNs are good at processing sequence data for prediction but suffer from short-term memory. LSTMs and GRUs were created to mitigate this short-term memory problem using mechanisms called gates.

Import all of the functions and classes we intend to use. This assumes a working SciPy environment with the Keras deep learning library installed.

It is a good idea to fix the random number seed to ensure our results are reproducible.
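The two steps above might look like the following sketch, assuming a TensorFlow 2.x installation of Keras alongside NumPy, pandas, and scikit-learn (the exact imports in the original script may differ):

```python
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, GRU

# Fix the random number seeds so results are reproducible across runs.
np.random.seed(7)
tf.random.set_seed(7)
```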

Extract the NumPy array from the DataFrame and convert the integer values to floating point values, which are more suitable for modeling with a neural network.

LSTMs are sensitive to the scale of the input data, specifically when the sigmoid (default) or tanh activation functions are used. It can be a good practice to rescale the data to the range of 0-to-1, also called normalizing.
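A minimal sketch of the extraction and normalization steps, using a hypothetical `wind_speed` column as a stand-in for the real hurricane data (the actual column name may differ):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical DataFrame standing in for the Atlantic hurricane data.
df = pd.DataFrame({'wind_speed': [25, 30, 35, 45, 50, 60, 65, 70, 65, 55]})

# Extract the NumPy array and cast the integer values to float32.
dataset = df['wind_speed'].values.astype('float32').reshape(-1, 1)

# Normalize to [0, 1]; keep the scaler so predictions can be inverted later.
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

print(dataset.min(), dataset.max())  # 0.0 1.0
```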

Now we can define a function to create a new dataset, as described above.

The function takes two arguments: the dataset, a NumPy array that we want to convert into a supervised-learning dataset, and look_back, the number of previous time steps to use as input variables to predict the next time period — in this case defaulted to 1.

This default will create a dataset where X is the wind speed at a given time (t) and Y is the wind speed at the next time (t + 1).
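One possible implementation of such a function (a sketch; the original script's version may differ in minor details):

```python
import numpy as np

def create_dataset(dataset, look_back=1):
    """Convert an array of values into X = values at t..t-look_back+1
    and Y = value at t+1."""
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back):
        dataX.append(dataset[i:(i + look_back), 0])
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

# With look_back=1, each X is the value at t and each Y the value at t+1.
data = np.arange(10, dtype='float32').reshape(-1, 1)
X, Y = create_dataset(data, look_back=1)
# X starts [0, 1, 2], Y starts [1, 2, 3]
```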

With time series data, the sequence of values is important. A simple method that we can use is to split the ordered dataset into train and test datasets. The code below calculates the index of the split point and separates the data into a training set with 70% of the observations that we can use to train our model, leaving the remaining 30% for testing the model.
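The ordered 70/30 split can be sketched as follows (using a stand-in array for the scaled wind-speed series):

```python
import numpy as np

dataset = np.arange(100, dtype='float32').reshape(-1, 1)  # stand-in series

# Ordered split — no shuffling, since the sequence of values matters.
train_size = int(len(dataset) * 0.70)
train, test = dataset[:train_size], dataset[train_size:]

print(len(train), len(test))  # 70 30
```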

Creating LSTM & GRU Network

After data preparation we will design and fit our LSTM and GRU network for this problem.

The network has a visible layer with 1 input, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction. The default tanh activation function is used for the LSTM blocks (with sigmoid activations on the gates). The network is trained for 100 epochs and a batch size of 1 is used.

After creating the model, we fit the network to our problem.
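The build-and-fit step might look like the following sketch. The tiny random series stands in for the prepared wind-speed arrays, and the epoch count is reduced so the snippet runs quickly; the full experiment described above uses 100 epochs:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

np.random.seed(7)

# Stand-in data shaped [samples, time steps, features], as Keras expects.
trainX = np.random.rand(20, 1, 1).astype('float32')
trainY = np.random.rand(20).astype('float32')

# 1 input -> 4 LSTM units -> 1 output, matching the architecture above.
model = Sequential()
model.add(LSTM(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

# The full experiment uses epochs=100; reduced here for brevity.
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=0)
```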

After the model is fit, we estimate its performance on the train and test datasets. This gives us a point of comparison for new models.

The predictions were inverted before calculating error scores to ensure that performance is reported in the same units as the original data.
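The inversion-then-scoring step can be sketched as below. The "predictions" here are fabricated stand-ins (scaled values shifted by a constant) purely to show the mechanics; in the real script they come from `model.predict`:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

scaler = MinMaxScaler(feature_range=(0, 1))
raw = np.array([[20.0], [40.0], [60.0], [80.0]], dtype='float32')
scaled = scaler.fit_transform(raw)

trainY = scaled[:, 0]
trainPredict = scaled + 0.05  # pretend model output, still on the [0, 1] scale

# Invert the scaling so the error is reported in the original units.
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])

trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:, 0]))
print('Train Score: %.2f RMSE' % trainScore)
```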

GRU Network Model
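The GRU network uses the same topology, with the recurrent layer swapped out — a sketch under the same assumptions as the LSTM model above:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GRU

# 1 input -> 4 GRU units -> 1 output; only the recurrent layer differs.
model = Sequential()
model.add(GRU(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
```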

Comparing the graphs and numerical wind speed values predicted by the LSTM and GRU shows that the LSTM works better in this case: its validation loss is noticeably lower.

Visualizing and Comparing LSTM & GRU Network Predictions

Now we will make predictions using the model for both the train and test dataset to get a visual indication of the skill of the model.

Because of how the dataset was prepared, we must shift the predictions so that they align on the x-axis with the original dataset. Once prepared, the data is plotted, showing the original dataset in blue, the predictions for the training dataset in green, and the predictions on the unseen test dataset in red.
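The alignment-and-plot step can be sketched as follows, with a small stand-in series and fabricated predictions (in the real script these are the inverted model outputs); `np.nan` padding hides the shifted regions on the plot:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

look_back = 1
dataset = np.arange(10, dtype='float32').reshape(-1, 1)  # stand-in series
trainPredict = dataset[look_back:7] + 0.1                # pretend outputs
testPredict = dataset[7 + look_back:] + 0.2

# Shift train predictions right by look_back to align with the x-axis.
trainPredictPlot = np.full_like(dataset, np.nan)
trainPredictPlot[look_back:len(trainPredict) + look_back, :] = trainPredict

# Shift test predictions to start after the training segment.
testPredictPlot = np.full_like(dataset, np.nan)
testPredictPlot[len(trainPredict) + 2 * look_back:, :] = testPredict

plt.plot(dataset, 'b', label='original')
plt.plot(trainPredictPlot, 'g', label='train predictions')
plt.plot(testPredictPlot, 'r', label='test predictions')
plt.legend()
plt.savefig('predictions.png')
```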

Conclusion:

We see that, on our dataset, the LSTM works better than the GRU. Although both are RNN variants, the key difference between them is that a GRU has two gates, reset and update, while an LSTM has three: input, output, and forget. The GRU is less complex than the LSTM because it has fewer gates. If the dataset is small, a GRU is often preferred; for larger datasets, an LSTM.