There are three datasets for testing your algorithms. Use the first two for the regular part of the project and the third for testing your extra-credit work. The first dataset, DSCURE-10K, is synthetic and contains approximately 10K points; its clusters were created with a data generator. The second dataset, DS1NE-5K, is a real dataset, so its clusters are not as clearly separated as those in the synthetic one. The third dataset, DSCURE-50K, is like the first one but contains 50K points. Its larger size makes it a good dataset for measuring how much the R-tree improves the algorithm.
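One way to see why the larger dataset matters is to time neighborhood queries with and without a spatial index. The sketch below is only an illustration under stated assumptions: it uses uniformly random 2-D points rather than the actual datasets (whose file format is not described here), and it uses a simple uniform grid as a stand-in for the R-tree, not an R-tree itself. The function names and the `eps` radius are hypothetical.

```python
import random
import time
from collections import defaultdict


def brute_force_neighbors(points, q, eps):
    """Indices of all points within distance eps of q, found by a full scan."""
    qx, qy = q
    return [i for i, (x, y) in enumerate(points)
            if (x - qx) ** 2 + (y - qy) ** 2 <= eps * eps]


def build_grid(points, eps):
    """Bucket point indices into square cells of side eps (a stand-in index)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(int(x // eps), int(y // eps))].append(i)
    return grid


def grid_neighbors(points, grid, q, eps):
    """Same result as the full scan, but only the 3x3 block of cells around q is checked."""
    qx, qy = q
    cx, cy = int(qx // eps), int(qy // eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for i in grid.get((cx + dx, cy + dy), ()):
                x, y = points[i]
                if (x - qx) ** 2 + (y - qy) ** 2 <= eps * eps:
                    found.append(i)
    return sorted(found)


def compare(n, eps=0.01, queries=200, seed=0):
    """Return (scan_seconds, indexed_seconds) for `queries` lookups over n random points."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    t0 = time.perf_counter()
    slow = [brute_force_neighbors(points, p, eps) for p in points[:queries]]
    t1 = time.perf_counter()
    grid = build_grid(points, eps)  # index construction is included in the indexed time
    fast = [grid_neighbors(points, grid, p, eps) for p in points[:queries]]
    t2 = time.perf_counter()
    assert slow == fast  # both methods must agree on every query
    return t1 - t0, t2 - t1
```

Calling `compare(10_000)` and `compare(50_000)` mirrors the sizes of the first and third datasets. The cost of the full scan grows with the number of points per query, while the indexed lookup depends mostly on local density, so the gap should be easier to observe at 50K points. Actual timings depend on the machine and on the real data distribution.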