From: Junio C Hamano <gitster@pobox•com>
To: Jeff King <peff@peff•net>
Cc: Elliot Wolk <elliot.wolk@gmail•com>,
Robin Rosenberg <robin.rosenberg@dewire•com>,
git@vger•kernel.org
Subject: Re: move detection doesnt take filename into account
Date: Wed, 09 Jul 2014 08:51:07 -0700
Message-ID: <xmqqegxu7cpg.fsf@gitster.dls.corp.google.com>
In-Reply-To: <20140709064521.GA14682@sigill.intra.peff.net> (Jeff King's message of "Wed, 9 Jul 2014 02:45:21 -0400")

Jeff King <peff@peff•net> writes:

> On Tue, Jul 01, 2014 at 10:08:15AM -0700, Junio C Hamano wrote:
>
>> I didn't think it through but my gut feeling is that we could change
>> the name similarity score to be the length of the tail part that
>> matches (e.g. 1.a to a/2.a that has the same two bytes at the tail
>> is a better match than to a/2.b that does not share any tail, and to
>> a/1.a that shares the three bytes at the tail is an even better
>> match).
>
> The delta heuristics in pack-objects use pack_name_hash, which claims:
>
>     /*
>      * This effectively just creates a sortable number from the
>      * last sixteen non-whitespace characters. Last characters
>      * count "most", so things that end in ".c" sort together.
>      */
>
> which might be another option (and seems like a superset of the basename
> check, short of basenames that are longer than 16 characters).

Perhaps.

I am, however, not sure whether the code that computes the similarity
score is as tolerant of false positives (i.e. dissimilar names that
happen to hash together and get clumped into the same bin or into
close bins) as the existing callers of pack_name_hash() are.
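
To make the two candidate name scores discussed above concrete, here is a
minimal C sketch. It is an illustration under stated assumptions, not
code from git: common_tail_len() and name_hash_sketch() are hypothetical
names, and the hash loop is only modeled on the behaviour the quoted
comment describes for pack_name_hash() (skip whitespace, let later
characters weigh most); the shift constants are an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: number of bytes shared at the tail of two
 * path names, i.e. the "length of the tail part that matches".
 */
static size_t common_tail_len(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b), n = 0;

	while (n < la && n < lb && a[la - 1 - n] == b[lb - 1 - n])
		n++;
	return n;
}

/*
 * Hypothetical sketch of a name hash in the spirit of the quoted
 * comment: whitespace is skipped, and each new character is added
 * in the high bits while older ones are shifted toward the low
 * bits, so the end of the name dominates the sort order.
 * The constants (>> 2, << 24) are an assumption of this sketch.
 */
static uint32_t name_hash_sketch(const char *name)
{
	uint32_t hash = 0;
	unsigned char c;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + ((uint32_t)c << 24);
	}
	return hash;
}

/* Self-check using the path names from the example in this thread. */
static void check_examples(void)
{
	assert(common_tail_len("1.a", "a/2.a") == 2); /* shares ".a"  */
	assert(common_tail_len("1.a", "a/2.b") == 0); /* no shared tail */
	assert(common_tail_len("1.a", "a/1.a") == 3); /* shares "1.a" */

	/* Different names, same hash: whitespace is ignored. */
	assert(name_hash_sketch("foo bar.c") == name_hash_sketch("foobar.c"));
}
```

Calling check_examples() from a main() exercises both helpers. The tail
length is exact: two names get a high score only when they really share
a suffix. The hash is lossy by design, so distinct names can collide or
land close together. That is harmless when the value only orders delta
candidates, but it would show up as a false positive if the same value
were used as a similarity score.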