Perform a recursive directory traversal. While walking the file tree, you will be looking for duplicate files and creating symbolic links to them.
To accomplish the directory traversal, you should write a recursive function using the dir family of functions: opendir(), readdir(), chdir() and closedir(). A prototype may look something like this:
void find_unique_files(const char*, char**);
The function should take a character array representing a filename and an array of strings. If it is called on a directory, it should move into that directory and continue its traversal. If it is called on a text file, it should generate a hash for the file and check the list of hashes for it. If the hash is not in the list, insert it. If the hash is already in the list, you have found a duplicate.
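The traversal and the duplicate check described above could be sketched roughly as follows. This is only one possible shape, not a reference solution. The names hash_file, check_and_insert, MAX_HASHES and dup_count are hypothetical helpers introduced for illustration, and hash_file uses a simple FNV-1a placeholder rather than SHA1, so it would need to be replaced with a call into the SHA1 library. The sketch also uses lstat(), which is not one of the listed dir functions, to tell directories from regular files, and it assumes the hash list is a zero-initialized array of MAX_HASHES pointers.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_HASHES 1024  /* hypothetical fixed capacity of the hash list */

static int dup_count = 0;  /* hypothetical counter, only here for illustration */

/* Placeholder digest (64-bit FNV-1a), NOT SHA1: swap in the SHA1 library here.
 * Writes a 16-character hex string into out. Returns 0 on success, -1 on error. */
static int hash_file(const char *name, char out[17]) {
    FILE *fp = fopen(name, "rb");
    if (!fp) return -1;
    uint64_t h = 14695981039346656037ULL;
    int c;
    while ((c = fgetc(fp)) != EOF) {
        h ^= (unsigned char)c;
        h *= 1099511628211ULL;
    }
    fclose(fp);
    snprintf(out, 17, "%016llx", (unsigned long long)h);
    return 0;
}

/* hashes is a NULL-terminated list stored in an array of MAX_HASHES slots.
 * Returns 1 if hex is already present, 0 if it was inserted, -1 on failure. */
static int check_and_insert(char **hashes, const char *hex) {
    size_t i = 0;
    for (; hashes[i] != NULL; i++)
        if (strcmp(hashes[i], hex) == 0) return 1;
    if (i + 1 >= MAX_HASHES) return -1;   /* keep room for the terminator */
    hashes[i] = strdup(hex);
    if (!hashes[i]) return -1;
    hashes[i + 1] = NULL;
    return 0;
}

/* name should be a single path component (or "."), because the function
 * returns from a directory with chdir(".."). */
void find_unique_files(const char *name, char **hashes) {
    struct stat st;
    if (lstat(name, &st) != 0) return;   /* lstat: do not follow symlinks */

    if (S_ISDIR(st.st_mode)) {
        DIR *dir = opendir(name);
        if (!dir) return;
        if (chdir(name) != 0) { closedir(dir); return; }
        struct dirent *ent;
        while ((ent = readdir(dir)) != NULL) {
            /* skip the self and parent entries to avoid infinite recursion */
            if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
                continue;
            find_unique_files(ent->d_name, hashes);
        }
        closedir(dir);
        if (chdir("..") != 0) perror("chdir");
    } else if (S_ISREG(st.st_mode)) {
        char hex[17];
        if (hash_file(name, hex) == 0 && check_and_insert(hashes, hex) == 1) {
            dup_count++;                   /* a duplicate was found */
            printf("duplicate: %s\n", name);
        }
    }
}
```

A caller would allocate the list with something like calloc(MAX_HASHES, sizeof(char *)) and then call find_unique_files(".", hashes) from the directory to be scanned. Note that the top-level call leaves the working directory one level above where it started, which a real solution may want to handle differently.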
The file's hash will be calculated using the SHA1 hashing algorithm. A library for calculating this hash can be found in .
When a duplicate file is discovered, you will create a symbolic link to the duplicate (the one that cannot be added to the list) in the /dups directory using the symlink() function.
Symlink: a file that stores the pathname of another file. Its content is, in effect, just a string.
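As a rough illustration of the linking step, the sketch below wraps symlink() in a hypothetical helper named link_duplicate. The parameters dups_dir and link_name are assumptions made for the example (the assignment itself fixes the directory as /dups). Because a symlink only stores a pathname string, a relative target would be resolved relative to the directory containing the link, so the sketch first converts the duplicate's path to an absolute one with realpath().

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates the link <dups_dir>/<link_name> pointing at the file dup_path.
 * Returns 0 on success, -1 on failure (for example if dup_path does not
 * exist, the link path is too long, or the link already exists). */
int link_duplicate(const char *dup_path, const char *dups_dir,
                   const char *link_name) {
    /* realpath() with a NULL buffer allocates the absolute path for us */
    char *target = realpath(dup_path, NULL);
    if (!target) return -1;

    char linkpath[PATH_MAX];
    int n = snprintf(linkpath, sizeof linkpath, "%s/%s", dups_dir, link_name);

    int rc = -1;
    if (n > 0 && (size_t)n < sizeof linkpath)
        rc = symlink(target, linkpath);   /* symlink(target, linkpath) */

    free(target);
    return rc;
}
```

For example, link_duplicate("b.txt", "/dups", "b.txt") would try to create /dups/b.txt pointing at the absolute path of b.txt. Two duplicates with the same file name in different directories would collide under this naming scheme, so a real solution may need a different way to choose link names.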