Control behavior of rename and copy detection
These options mostly mimic parameters that can be passed to git-diff.
Combination of git_diff_find_t values (default GIT_DIFF_FIND_BY_CONFIG). NOTE: if you don't explicitly set this, diff.renames could be set to false, resulting in git_diff_find_similar doing nothing.
Threshold above which similar files will be considered renames. This is equivalent to the -M option. Defaults to 50.
Threshold below which similar files will be eligible to be a rename source. This is equivalent to the first part of the -B option. Defaults to 50.
Threshold above which similar files will be considered copies. This is equivalent to the -C option. Defaults to 50.
Threshold below which similar files will be split into a delete/add pair. This is equivalent to the last part of the -B option. Defaults to 60.
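As a rough illustration of how the four thresholds relate, the sketch below classifies percentage scores using the defaults listed above. The struct and function names are hypothetical stand-ins, not the library's API. It assumes scores run from 0 to 100, that the two -B style thresholds are compared against a modified file's similarity to its own previous contents, and that a score exactly equal to a threshold counts as "above" it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the four threshold options described above;
 * this is not the library's own struct. */
typedef struct {
    uint16_t rename_threshold;              /* like -M */
    uint16_t rename_from_rewrite_threshold; /* like the first part of -B */
    uint16_t copy_threshold;                /* like -C */
    uint16_t break_rewrite_threshold;       /* like the last part of -B */
} find_thresholds;

/* Defaults as documented above: 50, 50, 50, 60. */
static const find_thresholds DEFAULT_THRESHOLDS = { 50, 50, 50, 60 };

/* similarity: score (0..100) between a deleted file and an added file.
 * Assumption: a score equal to the threshold is accepted. */
static bool is_rename(const find_thresholds *t, unsigned similarity)
{
    return similarity >= t->rename_threshold;
}

/* similarity: score (0..100) between an existing file and an added file. */
static bool is_copy(const find_thresholds *t, unsigned similarity)
{
    return similarity >= t->copy_threshold;
}

/* self_similarity: score (0..100) between a modified file's old and new
 * contents. Below the threshold, the change is split into delete + add. */
static bool should_break_rewrite(const find_thresholds *t, unsigned self_similarity)
{
    return self_similarity < t->break_rewrite_threshold;
}

/* Below this threshold, the old side of a rewritten file may also serve
 * as the source of a rename. */
static bool is_rename_source_after_rewrite(const find_thresholds *t, unsigned self_similarity)
{
    return self_similarity < t->rename_from_rewrite_threshold;
}
```

Under these assumptions and the defaults, a modified file scoring 55 against its old contents would be split into a delete/add pair (55 is below 60) but would not be eligible as a rename source (55 is not below 50).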
Maximum number of matches to consider for a particular file.
This is a little different from the -l option from Git because we will still process up to this many matches before abandoning the search. Defaults to 200.
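The "process up to this many matches, then stop" behavior can be pictured with the following sketch. It is only an illustration of the idea, not the library's search code: the function name, the flat array of precomputed scores, and the tie-breaking rule are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper: examine at most `limit` candidate scores and return
 * the index of the best one that reaches `threshold`, or -1 if none does.
 * Candidates past the limit are never looked at, but the ones before it
 * are still fully considered. */
static int best_match_within_limit(const unsigned *scores, size_t count,
                                   size_t limit, unsigned threshold)
{
    int best = -1;
    size_t examined = count < limit ? count : limit;

    for (size_t i = 0; i < examined; i++) {
        if (scores[i] < threshold)
            continue; /* not similar enough to count as a match */
        /* Keep the first candidate with the highest score. */
        if (best < 0 || scores[i] > scores[(size_t)best])
            best = (int)i;
    }
    return best;
}
```

With scores {10, 70, 90}, a threshold of 50, and a limit of 2, this returns index 1: the third candidate scores higher but lies beyond the limit, so the search stops with the best match found so far.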
The metric option allows you to plug in a custom similarity metric.
Set it to NULL to use the default internal metric.
The default metric is based on sampling hashes of ranges of data in the file. This approximates similarity well for both text and binary data while remaining fast and using a fixed amount of memory.
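To make the idea of "hashes of ranges of data" with fixed memory overhead concrete, here is a deliberately simplified sketch. It is not the library's internal metric: the fixed 16-byte ranges, the FNV-1a hash, the 256-bucket table, and every name below are assumptions chosen for brevity. Each file is reduced to a fixed-size table of hash counts, and two tables are compared by how many hashed ranges they share.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIG_BUCKETS 256 /* fixed table size: memory does not grow with file size */
#define SIG_CHUNK   16  /* bytes per hashed range (an arbitrary choice) */

/* Hypothetical signature: a histogram of range hashes. */
typedef struct {
    uint32_t counts[SIG_BUCKETS];
    uint32_t total; /* number of ranges hashed */
} signature;

/* 32-bit FNV-1a, used here only as a small, well-known hash. */
static uint32_t fnv1a(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash each consecutive range of the buffer into the fixed-size table. */
static void signature_build(signature *sig, const void *data, size_t len)
{
    const unsigned char *p = data;

    memset(sig, 0, sizeof(*sig));
    for (size_t off = 0; off < len; off += SIG_CHUNK) {
        size_t n = len - off < SIG_CHUNK ? len - off : SIG_CHUNK;
        sig->counts[fnv1a(p + off, n) % SIG_BUCKETS]++;
        sig->total++;
    }
}

/* Score 0..100: twice the shared range count over the combined range count.
 * Two empty inputs are treated as identical. */
static unsigned signature_similarity(const signature *a, const signature *b)
{
    uint32_t shared = 0;

    if (a->total + b->total == 0)
        return 100;
    for (size_t i = 0; i < SIG_BUCKETS; i++)
        shared += a->counts[i] < b->counts[i] ? a->counts[i] : b->counts[i];
    return (unsigned)((200ull * shared) / (a->total + b->total));
}

/* Convenience wrapper: build both signatures and compare them. */
static unsigned similarity_percent(const void *a, size_t alen,
                                   const void *b, size_t blen)
{
    signature sa, sb;

    signature_build(&sa, a, alen);
    signature_build(&sb, b, blen);
    return signature_similarity(&sa, &sb);
}
```

Because this sketch uses fixed offsets, inserting a single byte near the start of a file shifts every later range and lowers the score sharply, and bucket collisions can overstate similarity. A production metric would need to handle both cases more carefully.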