The cost function we use with RNNs is cross-entropy loss. Its implementation is no different for an RNN than for a simple classification task: we compare two probability distributions, the predicted one and the expected one. We compute the error at each time step and sum those errors over the sequence.
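As a minimal sketch of this idea (the helper names and the toy data below are illustrative, not from the original text), the per-time-step cross-entropy between a one-hot target and a predicted distribution can be summed over the sequence like so:

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and a predicted distribution.

    eps guards against log(0) when a predicted probability is exactly zero.
    """
    return -np.sum(y_true * np.log(y_pred + eps))

def sequence_loss(targets, predictions):
    """Total loss for a sequence: cross-entropy at each time step, summed."""
    return sum(cross_entropy(t, p) for t, p in zip(targets, predictions))

# Toy sequence of 3 time steps over a vocabulary of 4 symbols (made up here).
targets = [np.eye(4)[i] for i in [0, 2, 1]]        # one-hot expected outputs
predictions = [np.array([0.7, 0.1, 0.1, 0.1]),     # softmax-like RNN outputs
               np.array([0.2, 0.1, 0.6, 0.1]),
               np.array([0.1, 0.8, 0.05, 0.05])]

loss = sequence_loss(targets, predictions)
```

Because each target is one-hot, the loss at a step reduces to the negative log probability assigned to the correct symbol, so the total here is -ln(0.7) - ln(0.6) - ln(0.8).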