# What is the hidden state in a recurrent neural network?

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

I found this good example of how the function in a neural net can take a hidden state and return a hidden state, which then becomes the input hidden state of the next iteration. However, it doesn’t show clearly how the hidden state is used to encode additional information:

Now, instead of the above sequences, try to teach the following sequences to the same MLP.

$$X = [a, a, b, b, c, c, \cdots, y, z, z]$$

$$Y = [a, b, c, \cdots, z, a, b, c, \cdots, y, z]$$

More than likely, this MLP will not be able to learn the relationship between $X$ and $Y$. This is because a simple MLP can’t learn and understand the relationship between the previous and current characters.

Now, we use the same sequences to train an RNN. In an RNN, we take two inputs, one for our input and one for the previous hidden values, and two outputs, one for the output and one for the next hidden values.
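Before turning to the RNN's signature, here is one way to make the MLP's difficulty concrete. The sketch below counts how many input characters are paired with more than one target in X and Y. It assumes the elided middle of each sequence simply continues through the alphabet, so that X is every letter doubled and Y is the alphabet written twice. The names `targets` and `ambiguous` are illustrative and do not come from the quoted passage.

```python
import string
from collections import defaultdict

letters = string.ascii_lowercase

# Assumption: X repeats each letter twice: a, a, b, b, ..., z, z (52 items).
X = [ch for ch in letters for _ in range(2)]
# Assumption: Y runs through the alphabet twice: a, ..., z, a, ..., z (52 items).
Y = list(letters) * 2

# Collect every target that each input character is paired with.
targets = defaultdict(set)
for x, y in zip(X, Y):
    targets[x].add(y)

# Inputs paired with more than one target cannot be fit by any
# function that looks at the current character alone.
ambiguous = {x: sorted(ys) for x, ys in targets.items() if len(ys) > 1}
print(len(ambiguous), ambiguous["a"])  # prints: 26 ['a', 'b']
```

Under that assumption every one of the 26 letters appears with two different targets (for example, `a` must map to both `a` and `b`), so no function of the current character alone can reproduce Y.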

$$f(x, h_t) \rightarrow (y, h_{t+1})$$

Important: here $h_{t+1}$ represents the next hidden value.

We will execute some sequences of this RNN model. We initialize the hidden value to zero.

x = a and h = 0

(a,next_hidden) <- f(x,h)

prev_hidden = next_hidden

x = a and h = prev_hidden

(b,next_hidden) <- f(x,h)

prev_hidden = next_hidden

x = b and h = prev_hidden

(c,next_hidden) <- f(x,h)

prev_hidden = next_hidden

and so on
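The quoted passage treats f as a black box. As a rough sketch only, one common concrete choice is a simple Elman-style cell, where the next hidden value is the tanh of a weighted sum of the input and the previous hidden value. The weight names (`W_xh`, `W_hh`, `W_hy`), the sizes, the one-hot encoding, and the random untrained weights are all assumptions made for illustration, so the outputs are not meaningful predictions. The point is only how the hidden value is threaded from one call to the next.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def f(x, h, W_xh, W_hh, W_hy):
    """One step f(x, h_t) -> (y, h_{t+1}) of an assumed Elman-style cell."""
    pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
    next_hidden = [math.tanh(p) for p in pre]
    y = matvec(W_hy, next_hidden)
    return y, next_hidden

def random_matrix(rows, cols, rng):
    """Small random weights; untrained, for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_chars, n_hidden = 3, 4
W_xh = random_matrix(n_hidden, n_chars, rng)
W_hh = random_matrix(n_hidden, n_hidden, rng)
W_hy = random_matrix(n_chars, n_hidden, rng)

# Hypothetical one-hot encoding of a three-letter alphabet.
one_hot = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.0, 0.0, 1.0]}

prev_hidden = [0.0] * n_hidden   # hidden value initialized to zero
history = []
for ch in ["a", "a", "b"]:       # the first three inputs from the walkthrough
    y, next_hidden = f(one_hot[ch], prev_hidden, W_xh, W_hh, W_hy)
    history.append((ch, y, next_hidden))
    prev_hidden = next_hidden    # carry the hidden value into the next step
```

The first two calls receive the same input `a`, yet they return different outputs, because the second call sees a non-zero hidden value left behind by the first.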

If we look at the above process, we can see that we take the previous hidden state values to compute the next hidden state. As we iterate through this process, each assignment prev_hidden = next_hidden carries forward some information about the sequence seen so far, which helps in predicting the next character.
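To illustrate what "encodes some information about our sequence" could mean in practice, here is a hand-written stand-in for f rather than a trained RNN. In this sketch the hidden value is simply the previous input character, which is enough to tell the first occurrence of a letter from its repeat. The rule inside `f` and the assumption that the elided parts of the sequences continue through the alphabet are mine, not the quoted author's; a real RNN would have to learn some comparable encoding in its hidden values.

```python
import string

letters = string.ascii_lowercase

def f(x, h):
    """Hand-written stand-in for a trained RNN step: returns (y, next_hidden).

    The hidden value is the previous input character (0 at the start),
    which distinguishes a first occurrence from a repeat.
    """
    repeated = 1 if h == x else 0
    y = letters[(2 * letters.index(x) + repeated) % 26]
    return y, x

# Assumption: X is every letter doubled: a, a, b, b, ..., z, z.
X = [ch for ch in letters for _ in range(2)]

prev_hidden = 0          # hidden value initialized to zero
outputs = []
for x in X:
    y, next_hidden = f(x, prev_hidden)
    prev_hidden = next_hidden
    outputs.append(y)

print("".join(outputs))  # prints the alphabet twice in a row
```

The first three steps match the walkthrough: `f("a", 0)` yields `a`, `f("a", "a")` yields `b`, and `f("b", "a")` yields `c`. The same input produces different outputs only because the hidden value differs.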

