Normally, we would split the dataset ourselves, for example using 70% for training and 30% for testing. In this case, however, that isn't necessary: the dataset provided by NASA already contains four files with the train_ prefix and four files with the test_ prefix. We can use the train files for training and the test files for validation, as suggested by NASA.
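
As a rough sketch of what this prefix-based split could look like in Python, the helper below collects the files in a directory by filename prefix. The function name `split_by_prefix` and the directory layout are assumptions made for illustration; only the train_ and test_ prefixes come from the dataset description above.

```python
from pathlib import Path


def split_by_prefix(data_dir):
    """Return (train_files, test_files) as sorted lists of paths in data_dir.

    Hypothetical helper: files are assigned purely by filename prefix,
    so nothing is shuffled or resampled.
    """
    data_dir = Path(data_dir)
    files = [p for p in data_dir.iterdir() if p.is_file()]
    # Files starting with "train_" are used for training.
    train_files = sorted(p for p in files if p.name.startswith("train_"))
    # Files starting with "test_" are held out for validation.
    test_files = sorted(p for p in files if p.name.startswith("test_"))
    return train_files, test_files
```

Calling `split_by_prefix` on the folder that holds the downloaded files should return two lists of four paths each, assuming no other files there share those prefixes. Files with any other name are ignored.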