public inbox for git@vger.kernel.org 
From: Toon Claes <toon@iotcl•com>
To: git@vger•kernel.org
Cc: Toon Claes <toon@iotcl•com>, Junio C Hamano <gitster@pobox•com>,
	Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail•com>,
	"Derrick Stolee" <stolee@gmail•com>,
	"Taylor Blau" <me@ttaylorr•com>
Subject: [PATCH v4 0/3] Introduce git-last-modified(1) command
Date: Wed,  9 Jul 2025 17:26:25 +0200
Message-ID: <20250709152628.1644521-1-toon@iotcl.com>
In-Reply-To: <20250630-toon-new-blame-tree-v3-0-3516025dc3bc@iotcl.com>

This series adds the git-last-modified(1) subcommand. In the past the
subcommand was proposed[1] to be named git-blame-tree(1). This version
is based on the patches shared by the kind people at GitHub[2].

What is different from the series shared by GitHub:

* Renamed the subcommand from `blame-tree` to `last-modified`. There was
  some consensus[5] this name works better, so let's give it a try and
  see how this name feels.

* Patches for --max-depth are excluded. I think it's a separate topic to
  discuss and I'm not sure it needs to be part of this series anyway. The
  main patch was submitted in the previous attempt[3] and if people
  consider it valuable, I'm happy to discuss that in a separate patch
  series.

* The patches in 'tb/blame-tree' at Taylor's fork[4] implement a
  caching layer. This feature reads/writes cached results in
  `.git/blame-tree/<hash>.btc`. To keep this series to a reviewable
  size, that feature is excluded from this series. I think it's better
  to submit this as a separate series.

* Squashed various commits together. For example, the original patches
  introduced a `--go-faster` flag, which later became the default and
  only implementation. That story is now wrapped up in a single commit.

* Dropped the patches that attempt to increase performance for tree
  entries that have not been updated in a long time. In my testing I've
  seen both performance improvements *and* degradation with these
  changes:

  Test                                        HEAD~             HEAD
  ------------------------------------------------------------------------------------
  8020.1: top-level last-modified             4.52(4.38+0.11)   2.03(1.93+0.08) -55.1%
  8020.2: top-level recursive last-modified   5.79(5.64+0.11)   8.34(8.17+0.11) +44.0%
  8020.3: subdir last-modified                0.15(0.09+0.06)   0.19(0.14+0.06) +26.7%

  Before we include these patches, I want to make sure these changes
  have a positive impact in all/most scenarios. This can happen in a
  separate series.

* The last-modified command isn't recursive by default. If you want to
  recurse into subtrees, you need to pass `-r`.

* Fixed all memory leaks, and removed the use of
  USE_THE_REPOSITORY_VARIABLE.
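
For a feel of what the command reports without building this series,
the result can be roughly approximated with existing plumbing. The
sketch below is not how the series implements it: it runs one
`git log` per path, which is much slower than walking history once.
`emulate_last_modified` is a hypothetical helper name used only for
illustration. It assumes it is run from the top level of the worktree,
it prints `<oid>\t<path>` lines without the `^` boundary marker, and
its answers may differ from the real command around merges.

```shell
# Hypothetical helper, not part of this series.
# Usage: emulate_last_modified [-r] [<rev>]
emulate_last_modified () {
	recurse=
	if test "$1" = "-r"
	then
		recurse=-r
		shift
	fi
	rev=${1:-HEAD}

	# Without -r, ls-tree lists only the top-level entries (trees
	# included); with -r it descends and lists blobs only.
	git ls-tree $recurse --name-only "$rev" |
	while IFS= read -r path
	do
		# Newest commit reachable from <rev> that touched <path>.
		oid=$(git log -1 --format=%H "$rev" -- "$path") &&
		printf '%s\t%s\n' "$oid" "$path"
	done
}
```

Calling `emulate_last_modified` lists top-level entries only, mirroring
the non-recursive default; `emulate_last_modified -r` descends into
subtrees, similar to passing `-r` to the new command.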

I've set myself as the author and added Based-on-patch-by trailers to
credit the original authors. Let me know if you disagree.

Again thanks to Taylor and the people at GitHub for sharing these
patches. I hope we can work together to get this upstreamed.

[1]: https://lore.kernel.org/git/patch-1.1-0ea849d900b-20230205T204104Z-avarab@gmail.com/
[2]: https://lore.kernel.org/git/Z+XJ+1L3PnC9Dyba@nand.local/
[3]: https://lore.kernel.org/git/20250326-toon-blame-tree-v1-3-4173133f3786@iotcl.com/
[4]: git@github•com:ttaylorr/git.git
[5]: https://lore.kernel.org/git/aCbBKj7O9LjO3SMK@pks.im/

--
Cheers,
Toon

Signed-off-by: Toon Claes <toon@iotcl•com>
---
Changes in v4:
- Removed root-level `last-modified.[ch]` library code and moved code to
  `builtin/last-modified.c`. Historically we've had library code (also because it
  was used in testtool), but we no longer need that separation. I'm sorry this
  makes the range-diff hard to read.
- Added the use of parse_options() to get better usage messages.
- Formatting fixes after conversation in
  https://lore.kernel.org/git/xmqqh5zvk5h0.fsf@gitster.g/
- Link to v3: https://lore.kernel.org/git/20250630-toon-new-blame-tree-v3-0-3516025dc3bc@iotcl.com/

Changes in v3:
- Updated benchmarks in commit messages.
- Removed the patches that attempt to increase performance for tree
  entries that have not been updated in a long time. (see above)
- Moved handling of failure in `last_modified_init()` to the caller.
- Sorted #include clauses lexicographically.
- Removed unneeded `commit` in `struct last_modified_entry`.
- Renamed some functions/variables and added some comments to make it
  easier to understand.
- Removed unnecessary checking of the commit-graph generation number.
- Link to v2: https://lore.kernel.org/r/20250523-toon-new-blame-tree-v2-0-101e4ca4c1c9@iotcl.com

Changes in v2:
- The subcommand is renamed from `blame-tree` to `last-modified`
- Documentation is added. Here we mark the command as experimental.
- Some test cases are added related to merges.
- Link to v1: https://lore.kernel.org/r/20250422-toon-new-blame-tree-v1-0-fdb51b8a394a@iotcl.com

Toon Claes (3):
  last-modified: new subcommand to show when files were last modified
  t/perf: add last-modified perf script
  last-modified: use Bloom filters when available

 .gitignore                           |   1 +
 Documentation/git-last-modified.adoc |  49 ++++
 Documentation/meson.build            |   1 +
 Makefile                             |   1 +
 builtin.h                            |   1 +
 builtin/last-modified.c              | 334 +++++++++++++++++++++++++++
 command-list.txt                     |   1 +
 git.c                                |   1 +
 meson.build                          |   1 +
 t/meson.build                        |   2 +
 t/perf/p8020-last-modified.sh        |  21 ++
 t/t8020-last-modified.sh             | 204 ++++++++++++++++
 12 files changed, 617 insertions(+)
 create mode 100644 Documentation/git-last-modified.adoc
 create mode 100644 builtin/last-modified.c
 create mode 100755 t/perf/p8020-last-modified.sh
 create mode 100755 t/t8020-last-modified.sh

Range-diff against v3:
1:  26a2d9b5e0 ! 1:  0cc625f3f5 last-modified: new subcommand to show when files were last modified
    @@ Documentation/git-last-modified.adoc (new)
     +SYNOPSIS
     +--------
     +[synopsis]
    -+git last-modified [-r] [<revision-range>] [[--] <path>...]
    ++git last-modified [-r] [-t] [<revision-range>] [[--] <path>...]
     +
     +DESCRIPTION
     +-----------
    @@ Documentation/git-last-modified.adoc (new)
     +[--] <path>...::
     +	For each _<path>_ given, the commit which last modified it is returned.
     +	Without an optional path parameter, all files and subdirectories
    -+	of the current working directory are included in the
    ++	in path traversal the are included in the output.
     +
     +SEE ALSO
     +--------
    @@ Documentation/meson.build: manpages = {
        'git-ls-remote.adoc' : 1,

      ## Makefile ##
    -@@ Makefile: LIB_OBJS += hook.o
    - LIB_OBJS += ident.o
    - LIB_OBJS += json-writer.o
    - LIB_OBJS += kwset.o
    -+LIB_OBJS += last-modified.o
    - LIB_OBJS += levenshtein.o
    - LIB_OBJS += line-log.o
    - LIB_OBJS += line-range.o
     @@ Makefile: BUILTIN_OBJS += builtin/hook.o
      BUILTIN_OBJS += builtin/index-pack.o
      BUILTIN_OBJS += builtin/init-db.o
    @@ builtin.h: int cmd_hook(int argc, const char **argv, const char *prefix, struct
      ## builtin/last-modified.c (new) ##
     @@
     +#include "git-compat-util.h"
    -+#include "last-modified.h"
    -+#include "hex.h"
    -+#include "quote.h"
    -+#include "config.h"
    -+#include "object-name.h"
    -+#include "parse-options.h"
     +#include "builtin.h"
    -+
    -+static void show_entry(const char *path, const struct commit *commit, void *d)
    -+{
    -+	struct last_modified *lm = d;
    -+
    -+	if (commit->object.flags & BOUNDARY)
    -+		putchar('^');
    -+	printf("%s\t", oid_to_hex(&commit->object.oid));
    -+
    -+	if (lm->rev.diffopt.line_termination)
    -+		write_name_quoted(path, stdout, '\n');
    -+	else
    -+		printf("%s%c", path, '\0');
    -+
    -+	fflush(stdout);
    -+}
    -+
    -+int cmd_last_modified(int argc,
    -+		   const char **argv,
    -+		   const char *prefix,
    -+		   struct repository *repo)
    -+{
    -+	struct last_modified lm;
    -+
    -+	repo_config(repo, git_default_config, NULL);
    -+
    -+	if (last_modified_init(&lm, repo, prefix, argc, argv))
    -+		die(_("error setting up last-modified traversal"));
    -+
    -+	if (last_modified_run(&lm, show_entry, &lm) < 0)
    -+		die(_("error running last-modified traversal"));
    -+
    -+	last_modified_release(&lm);
    -+
    -+	return 0;
    -+}
    -
    - ## command-list.txt ##
    -@@ command-list.txt: git-index-pack                          plumbingmanipulators
    - git-init                                mainporcelain           init
    - git-instaweb                            ancillaryinterrogators          complete
    - git-interpret-trailers                  purehelpers
    -+git-last-modified                       plumbinginterrogators
    - git-log                                 mainporcelain           info
    - git-ls-files                            plumbinginterrogators
    - git-ls-remote                           plumbinginterrogators
    -
    - ## git.c ##
    -@@ git.c: static struct cmd_struct commands[] = {
    - 	{ "init", cmd_init_db },
    - 	{ "init-db", cmd_init_db },
    - 	{ "interpret-trailers", cmd_interpret_trailers, RUN_SETUP_GENTLY },
    -+	{ "last-modified", cmd_last_modified, RUN_SETUP },
    - 	{ "log", cmd_log, RUN_SETUP },
    - 	{ "ls-files", cmd_ls_files, RUN_SETUP },
    - 	{ "ls-remote", cmd_ls_remote, RUN_SETUP_GENTLY },
    -
    - ## last-modified.c (new) ##
    -@@
    -+#include "git-compat-util.h"
     +#include "commit.h"
    ++#include "config.h"
     +#include "diff.h"
     +#include "diffcore.h"
    -+#include "last-modified.h"
    ++#include "hashmap.h"
    ++#include "hex.h"
     +#include "log-tree.h"
    ++#include "object-name.h"
     +#include "object.h"
    ++#include "parse-options.h"
    ++#include "quote.h"
     +#include "repository.h"
     +#include "revision.h"
     +
    @@ last-modified.c (new)
     +	const char path[FLEX_ARRAY];
     +};
     +
    ++static int last_modified_entry_hashcmp(const void *unused UNUSED,
    ++				       const struct hashmap_entry *hent1,
    ++				       const struct hashmap_entry *hent2,
    ++				       const void *path)
    ++{
    ++	const struct last_modified_entry *ent1 =
    ++		container_of(hent1, const struct last_modified_entry, hashent);
    ++	const struct last_modified_entry *ent2 =
    ++		container_of(hent2, const struct last_modified_entry, hashent);
    ++	return strcmp(ent1->path, path ? path : ent2->path);
    ++}
    ++
    ++struct last_modified {
    ++	struct hashmap paths;
    ++	struct rev_info rev;
    ++	int recursive, tree_in_recursive;
    ++};
    ++
    ++static void last_modified_release(struct last_modified *lm)
    ++{
    ++	hashmap_clear_and_free(&lm->paths, struct last_modified_entry, hashent);
    ++	release_revisions(&lm->rev);
    ++}
    ++
    ++typedef void (*last_modified_callback)(const char *path,
    ++				       const struct commit *commit, void *data);
    ++
    ++struct last_modified_callback_data {
    ++	struct commit *commit;
    ++	struct hashmap *paths;
    ++
    ++	last_modified_callback callback;
    ++	void *callback_data;
    ++};
    ++
     +static void add_path_from_diff(struct diff_queue_struct *q,
    -+			       struct diff_options *opt UNUSED,
    -+			       void *data)
    ++			       struct diff_options *opt UNUSED, void *data)
     +{
     +	struct last_modified *lm = data;
     +
    @@ last-modified.c (new)
     +	return 0;
     +}
     +
    -+static int last_modified_entry_hashcmp(const void *unused UNUSED,
    -+				    const struct hashmap_entry *hent1,
    -+				    const struct hashmap_entry *hent2,
    -+				    const void *path)
    -+{
    -+	const struct last_modified_entry *ent1 =
    -+		container_of(hent1, const struct last_modified_entry, hashent);
    -+	const struct last_modified_entry *ent2 =
    -+		container_of(hent2, const struct last_modified_entry, hashent);
    -+	return strcmp(ent1->path, path ? path : ent2->path);
    -+}
    -+
    -+int last_modified_init(struct last_modified *lm,
    -+		     struct repository *r,
    -+		     const char *prefix,
    -+		     int argc, const char **argv)
    -+{
    -+	memset(lm, 0, sizeof(*lm));
    -+	hashmap_init(&lm->paths, last_modified_entry_hashcmp, NULL, 0);
    -+
    -+	repo_init_revisions(r, &lm->rev, prefix);
    -+	lm->rev.def = "HEAD";
    -+	lm->rev.combine_merges = 1;
    -+	lm->rev.show_root_diff = 1;
    -+	lm->rev.boundary = 1;
    -+	lm->rev.no_commit_id = 1;
    -+	lm->rev.diff = 1;
    -+	if (setup_revisions(argc, argv, &lm->rev, NULL) > 1)
    -+		return error(_("unknown last-modified argument: %s"), argv[1]);
    -+
    -+	if (populate_paths_from_revs(lm) < 0)
    -+		return error(_("unable to setup last-modified"));
    -+
    -+	return 0;
    -+}
    -+
    -+void last_modified_release(struct last_modified *lm)
    -+{
    -+	hashmap_clear_and_free(&lm->paths, struct last_modified_entry, hashent);
    -+	release_revisions(&lm->rev);
    -+}
    -+
    -+struct last_modified_callback_data {
    -+	struct commit *commit;
    -+	struct hashmap *paths;
    -+
    -+	last_modified_callback callback;
    -+	void *callback_data;
    -+};
    -+
     +static void mark_path(const char *path, const struct object_id *oid,
     +		      struct last_modified_callback_data *data)
     +{
    @@ last-modified.c (new)
     +		default:
     +			/*
     +			 * Otherwise, we care only that we somehow arrived at
    -+			 * a final path/sha1 state. Note that this covers some
    ++			 * a final oid state. Note that this covers some
     +			 * potentially controversial areas, including:
     +			 *
     +			 *  1. A rename or copy will be found, as it is the
    @@ last-modified.c (new)
     +	}
     +}
     +
    -+int last_modified_run(struct last_modified *lm, last_modified_callback cb, void *cbdata)
    ++static int last_modified_run(struct last_modified *lm,
    ++			     last_modified_callback cb, void *cbdata)
     +{
     +	struct last_modified_callback_data data;
     +
    @@ last-modified.c (new)
     +
     +		if (data.commit->object.flags & BOUNDARY) {
     +			diff_tree_oid(lm->rev.repo->hash_algo->empty_tree,
    -+				       &data.commit->object.oid,
    -+				       "", &lm->rev.diffopt);
    ++				      &data.commit->object.oid, "",
    ++				      &lm->rev.diffopt);
     +			diff_flush(&lm->rev.diffopt);
     +		} else {
     +			log_tree_commit(&lm->rev, data.commit);
    @@ last-modified.c (new)
     +	}
     +
     +	return 0;
    ++}
    ++
    ++static void show_entry(const char *path, const struct commit *commit, void *d)
    ++{
    ++	struct last_modified *lm = d;
    ++
    ++	if (commit->object.flags & BOUNDARY)
    ++		putchar('^');
    ++	printf("%s\t", oid_to_hex(&commit->object.oid));
    ++
    ++	if (lm->rev.diffopt.line_termination)
    ++		write_name_quoted(path, stdout, '\n');
    ++	else
    ++		printf("%s%c", path, '\0');
    ++
    ++	fflush(stdout);
    ++}
    ++
    ++static int last_modified_init(struct last_modified *lm, struct repository *r,
    ++			      const char *prefix, int argc, const char **argv)
    ++{
    ++	hashmap_init(&lm->paths, last_modified_entry_hashcmp, NULL, 0);
    ++
    ++	repo_init_revisions(r, &lm->rev, prefix);
    ++	lm->rev.def = "HEAD";
    ++	lm->rev.combine_merges = 1;
    ++	lm->rev.show_root_diff = 1;
    ++	lm->rev.boundary = 1;
    ++	lm->rev.no_commit_id = 1;
    ++	lm->rev.diff = 1;
    ++	lm->rev.diffopt.flags.recursive = lm->recursive || lm->tree_in_recursive;
    ++	lm->rev.diffopt.flags.tree_in_recursive = lm->tree_in_recursive;
    ++
    ++	if ((argc = setup_revisions(argc, argv, &lm->rev, NULL)) > 1) {
    ++		error(_("unknown last-modified argument: %s"), argv[1]);
    ++		return argc;
    ++	}
    ++
    ++	if (populate_paths_from_revs(lm) < 0)
    ++		return error(_("unable to setup last-modified"));
    ++
    ++	return 0;
    ++}
    ++
    ++int cmd_last_modified(int argc, const char **argv, const char *prefix,
    ++		      struct repository *repo)
    ++{
    ++	int ret;
    ++	struct last_modified lm;
    ++
    ++	const char * const last_modified_usage[] = {
    ++		N_("git last-modified [-r] [-t] "
    ++		   "[<revision-range>] [[--] <path>...]"),
    ++		NULL
    ++	};
    ++
    ++	struct option last_modified_options[] = {
    ++		OPT_BOOL('r', "recursive", &lm.recursive,
    ++			 N_("recurse into subtrees")),
    ++		OPT_BOOL('t', "tree-in-recursive", &lm.tree_in_recursive,
    ++			 N_("recurse into subtrees and include the tree entries too")),
    ++		OPT_END()
    ++	};
    ++
    ++	memset(&lm, 0, sizeof(lm));
    ++
    ++	argc = parse_options(argc, argv, prefix, last_modified_options,
    ++			     last_modified_usage,
    ++			     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN_OPT);
    ++
    ++	repo_config(repo, git_default_config, NULL);
    ++
    ++	if ((ret = last_modified_init(&lm, repo, prefix, argc, argv))) {
    ++		if (ret > 0)
    ++			usage_with_options(last_modified_usage,
    ++					   last_modified_options);
    ++		goto out;
    ++	}
    ++
    ++	if ((ret = last_modified_run(&lm, show_entry, &lm)))
    ++		goto out;
    ++
    ++out:
    ++	last_modified_release(&lm);
    ++
    ++	return ret;
     +}

    - ## last-modified.h (new) ##
    -@@
    -+#ifndef LAST_MODIFIED_H
    -+#define LAST_MODIFIED_H
    -+
    -+#include "commit.h"
    -+#include "hashmap.h"
    -+#include "revision.h"
    -+
    -+struct last_modified {
    -+	struct hashmap paths;
    -+	struct rev_info rev;
    -+};
    -+
    -+/*
    -+ * Initialize the last-modified machinery using command line arguments.
    -+ */
    -+int last_modified_init(struct last_modified *lm,
    -+		     struct repository *r,
    -+		     const char *prefix,
    -+		     int argc, const char **argv);
    -+
    -+void last_modified_release(struct last_modified *);
    -+
    -+typedef void (*last_modified_callback)(const char *path,
    -+				    const struct commit *commit,
    -+				    void *data);
    -+
    -+/*
    -+ * Run the last-modified traversal. For each path found the callback is called
    -+ * passing the path, the commit, and the cbdata.
    -+ */
    -+int last_modified_run(struct last_modified *lm,
    -+		   last_modified_callback cb,
    -+		   void *cbdata);
    -+
    -+#endif /* LAST_MODIFIED_H */
    + ## command-list.txt ##
    +@@ command-list.txt: git-index-pack                          plumbingmanipulators
    + git-init                                mainporcelain           init
    + git-instaweb                            ancillaryinterrogators          complete
    + git-interpret-trailers                  purehelpers
    ++git-last-modified                       plumbinginterrogators
    + git-log                                 mainporcelain           info
    + git-ls-files                            plumbinginterrogators
    + git-ls-remote                           plumbinginterrogators
    +
    + ## git.c ##
    +@@ git.c: static struct cmd_struct commands[] = {
    + 	{ "init", cmd_init_db },
    + 	{ "init-db", cmd_init_db },
    + 	{ "interpret-trailers", cmd_interpret_trailers, RUN_SETUP_GENTLY },
    ++	{ "last-modified", cmd_last_modified, RUN_SETUP },
    + 	{ "log", cmd_log, RUN_SETUP },
    + 	{ "ls-files", cmd_ls_files, RUN_SETUP },
    + 	{ "ls-remote", cmd_ls_remote, RUN_SETUP_GENTLY },

      ## meson.build ##
    -@@ meson.build: libgit_sources = [
    -   'ident.c',
    -   'json-writer.c',
    -   'kwset.c',
    -+  'last-modified.c',
    -   'levenshtein.c',
    -   'line-log.c',
    -   'line-range.c',
     @@ meson.build: builtin_sources = [
        'builtin/index-pack.c',
        'builtin/init-db.c',
2:  0691884735 = 2:  a017f2c81c t/perf: add last-modified perf script
3:  393f304a3f ! 3:  c739a7dbcc last-modified: use Bloom filters when available
    @@ Commit message

         Comparing the perf test results on git.git:

    -    Test                                        HEAD~             HEAD
    -    ------------------------------------------------------------------------------------
    -    8020.1: top-level last-modified             4.49(4.34+0.11)   2.22(2.05+0.09) -50.6%
    -    8020.2: top-level recursive last-modified   5.64(5.45+0.11)   5.62(5.30+0.11) -0.4%
    -    8020.3: subdir last-modified                0.11(0.06+0.04)   0.07(0.03+0.04) -36.4%
    +        Test                                        HEAD~             HEAD
    +        ------------------------------------------------------------------------------------
    +        8020.1: top-level last-modified             4.49(4.34+0.11)   2.22(2.05+0.09) -50.6%
    +        8020.2: top-level recursive last-modified   5.64(5.45+0.11)   5.62(5.30+0.11) -0.4%
    +        8020.3: subdir last-modified                0.11(0.06+0.04)   0.07(0.03+0.04) -36.4%

         Based-on-patch-by: Taylor Blau <me@ttaylorr•com>
         Signed-off-by: Toon Claes <toon@iotcl•com>

    - ## last-modified.c ##
    + ## builtin/last-modified.c ##
     @@
      #include "git-compat-util.h"
     +#include "bloom.h"
    + #include "builtin.h"
     +#include "commit-graph.h"
      #include "commit.h"
    + #include "config.h"
      #include "diff.h"
    - #include "diffcore.h"
    -+#include "dir.h"
    - #include "last-modified.h"
    - #include "log-tree.h"
    - #include "object.h"
     @@
      struct last_modified_entry {
      	struct hashmap_entry hashent;
    @@ last-modified.c
      	const char path[FLEX_ARRAY];
      };

    -@@ last-modified.c: static void add_path_from_diff(struct diff_queue_struct *q,
    +@@ builtin/last-modified.c: struct last_modified {

    - 		FLEX_ALLOC_STR(ent, path, path);
    - 		oidcpy(&ent->oid, &p->two->oid);
    -+		if (lm->rev.bloom_filter_settings)
    -+			fill_bloom_key(path, strlen(path), &ent->key,
    -+				       lm->rev.bloom_filter_settings);
    - 		hashmap_entry_init(&ent->hashent, strhash(ent->path));
    - 		hashmap_add(&lm->paths, &ent->hashent);
    - 	}
    -@@ last-modified.c: int last_modified_init(struct last_modified *lm,
    - 	if (setup_revisions(argc, argv, &lm->rev, NULL) > 1)
    - 		return error(_("unknown last-modified argument: %s"), argv[1]);
    -
    -+	/*
    -+	 * We're not interested in generation numbers here,
    -+	 * but calling this function to prepare the commit-graph.
    -+	 */
    -+	(void)generation_numbers_enabled(lm->rev.repo);
    -+	lm->rev.bloom_filter_settings = get_bloom_filter_settings(lm->rev.repo);
    -+
    - 	if (populate_paths_from_revs(lm) < 0)
    - 		return error(_("unable to setup last-modified"));
    -
    -@@ last-modified.c: int last_modified_init(struct last_modified *lm,
    -
    - void last_modified_release(struct last_modified *lm)
    + static void last_modified_release(struct last_modified *lm)
      {
     +	struct hashmap_iter iter;
     +	struct last_modified_entry *ent;
    @@ last-modified.c: int last_modified_init(struct last_modified *lm,
      	hashmap_clear_and_free(&lm->paths, struct last_modified_entry, hashent);
      	release_revisions(&lm->rev);
      }
    -@@ last-modified.c: static void mark_path(const char *path, const struct object_id *oid,
    +@@ builtin/last-modified.c: static void add_path_from_diff(struct diff_queue_struct *q,
    +
    + 		FLEX_ALLOC_STR(ent, path, path);
    + 		oidcpy(&ent->oid, &p->two->oid);
    ++		if (lm->rev.bloom_filter_settings)
    ++			fill_bloom_key(path, strlen(path), &ent->key,
    ++				       lm->rev.bloom_filter_settings);
    + 		hashmap_entry_init(&ent->hashent, strhash(ent->path));
    + 		hashmap_add(&lm->paths, &ent->hashent);
    + 	}
    +@@ builtin/last-modified.c: static void mark_path(const char *path, const struct object_id *oid,
      		data->callback(path, data->commit, data->callback_data);

      	hashmap_remove(data->paths, &ent->hashent, path);
    @@ last-modified.c: static void mark_path(const char *path, const struct object_id
      	free(ent);
      }

    -@@ last-modified.c: static void last_modified_diff(struct diff_queue_struct *q,
    +@@ builtin/last-modified.c: static void last_modified_diff(struct diff_queue_struct *q,
      	}
      }

    ++
     +static int maybe_changed_path(struct last_modified *lm, struct commit *origin)
     +{
     +	struct bloom_filter *filter;
    @@ last-modified.c: static void last_modified_diff(struct diff_queue_struct *q,
     +	return 0;
     +}
     +
    - int last_modified_run(struct last_modified *lm, last_modified_callback cb, void *cbdata)
    + static int last_modified_run(struct last_modified *lm,
    + 			     last_modified_callback cb, void *cbdata)
      {
    - 	struct last_modified_callback_data data;
    -@@ last-modified.c: int last_modified_run(struct last_modified *lm, last_modified_callback cb, void
    +@@ builtin/last-modified.c: static int last_modified_run(struct last_modified *lm,
      		if (!data.commit)
      			break;

    @@ last-modified.c: int last_modified_run(struct last_modified *lm, last_modified_c
     +
      		if (data.commit->object.flags & BOUNDARY) {
      			diff_tree_oid(lm->rev.repo->hash_algo->empty_tree,
    - 				       &data.commit->object.oid,
    + 				      &data.commit->object.oid, "",
    +@@ builtin/last-modified.c: static int last_modified_init(struct last_modified *lm, struct repository *r,
    + 		return argc;
    + 	}
    +
    ++	/*
    ++	 * We're not interested in generation numbers here,
    ++	 * but calling this function to prepare the commit-graph.
    ++	 */
    ++	(void)generation_numbers_enabled(lm->rev.repo);
    ++	lm->rev.bloom_filter_settings = get_bloom_filter_settings(lm->rev.repo);
    ++
    + 	if (populate_paths_from_revs(lm) < 0)
    + 		return error(_("unable to setup last-modified"));
    +

base-commit: 41905d60226a0346b22f0d0d99428c746a5a3b14
--
2.50.0.rc0.18.gfcfe60668e

