Problem
Write short code in Python that will perform a ten-fold cross validation test on a dataset in DataFrame format (df_dataset) while clustering around column 'HELLO_WORLD'. Split the test/train proportion at 80/20. Convert the dataframe format to numpy.