From: Toon Claes <toon@iotcl•com>
To: Taylor Blau <me@ttaylorr•com>
Cc: git@vger•kernel.org,
"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail•com>,
"Derrick Stolee" <stolee@gmail•com>,
"Junio C Hamano" <gitster@pobox•com>, "Jeff King" <peff@peff•net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail•com>
Subject: Re: [PATCH v5 1/6] last-modified: new subcommand to show when files were last modified
Date: Tue, 22 Jul 2025 17:50:49 +0200
Message-ID: <877c00kyw6.fsf@iotcl.com>
In-Reply-To: <aHmPHcNQYlhGo8JB@nand.local>
Taylor Blau <me@ttaylorr•com> writes:
> On Wed, Jul 16, 2025 at 03:35:13PM +0200, Toon Claes wrote:
>> 11 files changed, 549 insertions(+)
>> create mode 100644 Documentation/git-last-modified.adoc
>> create mode 100644 builtin/last-modified.c
>> create mode 100755 t/t8020-last-modified.sh
>
> I'm admittedly not entirely sure what the best way to review this patch
> is given its size and my previous exposure to (similar) code.
Yeah, I wasn't sure how to approach this. I didn't want to come in with
a big bang with the final version, but rather give the reviewers the
chance to see the improvements (and complexity) come in gradually.
>> diff --git a/builtin/last-modified.c b/builtin/last-modified.c
>> new file mode 100644
>> index 0000000000..63993bc1c9
>> --- /dev/null
>> +++ b/builtin/last-modified.c
>> @@ -0,0 +1,289 @@
>> +#include "git-compat-util.h"
>> +#include "builtin.h"
>> +#include "commit.h"
>> +#include "config.h"
>> +#include "diff.h"
>> +#include "diffcore.h"
>> +#include "hashmap.h"
>> +#include "hex.h"
>> +#include "log-tree.h"
>> +#include "object-name.h"
>> +#include "object.h"
>> +#include "parse-options.h"
>> +#include "quote.h"
>> +#include "repository.h"
>> +#include "revision.h"
>> +
>> +struct last_modified_entry {
>> + struct hashmap_entry hashent;
>> + struct object_id oid;
>> + const char path[FLEX_ARRAY];
>> +};
>
> As a general comment on this patch, I am a little sad to see that many
> of the implementation details have been moved back into the builtin
> itself and not in their own last-modified.[ch] file(s).
>
> Apologies if this was already discussed earlier in the thread and I
> simply missed it, but can you comment on why the last-modified internals
> were moved into the builtin?
Wasn't discussed yet, and this only happened in this last version.
Basically my idea was: there's no one else using this, why put it at the
root level anyway? Also, it relies heavily on `setup_revisions()`. In my
first iterations `argc` and `argv` from the builtin were passed on
directly to the root-level `last-modified.[ch]` subsystem. This is a
little awkward, putting so much raw user-input handling in the
subsystem.
> Even in the earliest version of 'blame-tree' that I could find (from
> 26999d045b (add blame-tree command, 2012-10-20) in my fork) many of the
> internals were written in blame-tree.c instead of builtin/blame-tree.c.
>
>> +static int last_modified_entry_hashcmp(const void *unused UNUSED,
>> + const struct hashmap_entry *hent1,
>> + const struct hashmap_entry *hent2,
>> + const void *path)
>> +{
>> + const struct last_modified_entry *ent1 =
>> + container_of(hent1, const struct last_modified_entry, hashent);
>> + const struct last_modified_entry *ent2 =
>> + container_of(hent2, const struct last_modified_entry, hashent);
>> + return strcmp(ent1->path, path ? path : ent2->path);
>> +}
>> +
>> +struct last_modified {
>> + struct hashmap paths;
>> + struct rev_info rev;
>> + int recursive, tree_in_recursive;
>
> Can we either make these two part of a bitfield, or at least declare
> them separately?
>
>> +};
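For reference, the bitfield variant could look roughly like the sketch
below. `struct last_modified_flags` is a hypothetical stand-in for
`struct last_modified`; the `paths` and `rev` members are left out so
that it compiles on its own.

```c
/*
 * Hypothetical stand-in for "struct last_modified". The real struct
 * also carries "struct hashmap paths" and "struct rev_info rev".
 */
struct last_modified_flags {
	unsigned int recursive : 1;
	unsigned int tree_in_recursive : 1;
};
```

One caveat: C does not allow taking the address of a bitfield member,
so any code that needs a pointer to one of these flags would have to
keep a plain `int` and declare the two members on separate lines
instead.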
>> +
>> +static void last_modified_release(struct last_modified *lm)
>> +{
>> + hashmap_clear_and_free(&lm->paths, struct last_modified_entry, hashent);
>> + release_revisions(&lm->rev);
>> +}
>> +
>> +typedef void (*last_modified_callback)(const char *path,
>> + const struct commit *commit, void *data);
>> +
>> +struct last_modified_callback_data {
>> + struct commit *commit;
>> + struct hashmap *paths;
>> +
>> + last_modified_callback callback;
>> + void *callback_data;
>> +};
>
> I can't quite tell what the purpose of this struct is in conjunction
> with the last_modified_callback type above.
Yeah, this is kind of a remnant of when there was a last-modified
subsystem. In the current implementation, where all code lives in the
builtin, there's no good reason to keep this callback struct.
> The last_modified_callback type makes sense as a generic callback
> function that callers can pass to get <path, commit> pairs, along with
> an arbitrary "data" pointer.
>
> But then you define a last_modified_callback_data struct, which
> made me think that it would be used as the data type passed to the
> callback. In other words, given the existence of this struct, I would
> have expected the function pointer above to be defined like:
>
> typedef void (*last_modified_callback)(const char *path,
> const struct commit *commit,
> struct last_modified_callback_data *data);
>
> But the fact that the _data struct contains a last_modified_callback
> function pointer gives us a hint at what's going on here. It seems like
> last_modified_callback_data is used to store some bookkeeping
> information and dispatch calls to the "callback" function pointer.
>
> I think that the fact the struct's name ends with "_data" is what is
> confusing to me. I think this would be a little clearer if you renamed
> this "struct last_modified_callback" and the function pointer to
> "last_modified_callback_fn" or similar.
>
> (The irony is not lost on me that these comments would be applicable to
> GitHub's version of this code, too :-s).
Hey, that's no excuse to keep it like this. I think keeping the callback
infrastructure depends on whether we bring back the last-modified
subsystem. In that case, I will address your comments. If not, I think
we can get rid of it completely.
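If the subsystem does come back, I imagine the rename would look
roughly like the untested sketch below. The `struct commit` and
`struct hashmap` definitions are stand-ins so that it compiles on its
own, and `last_modified_dispatch()` and `count_path()` are hypothetical
helpers that are not part of the patch.

```c
#include <stddef.h>

/* Stand-ins for git's real types, only so this sketch compiles alone. */
struct commit { int dummy; };
struct hashmap { int dummy; };

/* The function pointer type, renamed with an "_fn" suffix. */
typedef void (*last_modified_callback_fn)(const char *path,
					  const struct commit *commit,
					  void *data);

/* Bookkeeping plus dispatch target; was "last_modified_callback_data". */
struct last_modified_callback {
	struct commit *commit;
	struct hashmap *paths;

	last_modified_callback_fn callback;
	void *callback_data;
};

/* Hypothetical helper: hand one <path, commit> pair to the caller. */
static void last_modified_dispatch(struct last_modified_callback *cb,
				   const char *path)
{
	if (cb->callback)
		cb->callback(path, cb->commit, cb->callback_data);
}

/* Example caller-side callback: count the paths that get reported. */
static void count_path(const char *path, const struct commit *commit,
		       void *data)
{
	int *count = data;

	(void)path;
	(void)commit;
	(*count)++;
}
```

With these names the struct reads as "the callback plus its state", and
the caller's own pointer stays an opaque `void *`.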
>> +static int populate_paths_from_revs(struct last_modified *lm)
>> +{
>> + int num_interesting = 0;
>> + struct diff_options diffopt;
>> +
>> + memcpy(&diffopt, &lm->rev.diffopt, sizeof(diffopt));
>> + copy_pathspec(&diffopt.pathspec, &lm->rev.diffopt.pathspec);
>> + /*
>> + * Use a callback to populate the paths from revs
>> + */
>> + diffopt.output_format = DIFF_FORMAT_CALLBACK;
>> + diffopt.format_callback = add_path_from_diff;
>> + diffopt.format_callback_data = lm;
>> +
>> + for (size_t i = 0; i < lm->rev.pending.nr; i++) {
>> + struct object_array_entry *obj = lm->rev.pending.objects + i;
>> +
>> + if (obj->item->flags & UNINTERESTING)
>> + continue;
>> +
>> + if (num_interesting++)
>> + return error(_("can only get last-modified one tree at a time"));
>
> This error text is a little difficult to parse, but I'm not sure that I
> have a great suggestion for improving it. The equivalent from GitHub's
> fork is "can only blame one tree at a time", and I think the difficulty
> in parsing is that "last-modified" isn't a verb.
Oh yeah, I've been struggling with that myself as well. I'm open to a
rename, if you've got a better name?
--
Cheers,
Toon