Here's that puzzle:
Given a directory containing say a few thousand files, please output a list of all the names of the files in the directory that are exactly the same, i.e. have the same contents.
func(a_directory_name) output -> {"matches": [[fn1, fn2 ...], [fn3, fn4 ...] ... ]}
e.g. func("/home/my/files") where the directory /home/my/files might contain foo.txt, foo.iso, foo.jpeg, bar.txt, bar.doc, baz.csv, baz.ppt etc. and say the file foo.txt is the same as bar.doc and foo.iso is the same as baz.csv and baz.ppt then the output would be:
{
"matches": [
[
"foo.txt",
"bar.doc"
],
[
"foo.iso",
"baz.csv",
"baz.ppt"
]
]
}
You may code the response in any programming language you like, however our primary DevOps programming languages are:
- Bash
- Perl
- Python
- Groovy
Please provide a written discussion of your code, what are its strengths and weaknesses, what are the boundary conditions where it will fail and where could it be improved