Data can also be categorized into labelled and unlabelled data. Data that has been manually tagged with headers and meaning is called labelled. Data that has not been tagged is called unlabelled data. Both labelled and unlabelled data are fed to the preceding machine learning phases. In the training phase, the ratio of labelled to unlabelled is 60-40 and 40-60 in the testing phase. Unlabelled data is transformed to labelled data in the testing phase, as shown in the following diagram: