From: Eric Wong <e@yhbt•net>
To: Ivan Baldo <ibaldo@gmail•com>
Cc: git@vger•kernel.org
Subject: Re: Fastest way to set files date and time to latest commit time of each one
Date: Sat, 29 Aug 2020 04:48:42 +0000
Message-ID: <20200829044842.GA5732@dcvr>
In-Reply-To: <CAEbcw=3mOoYuJo2mQgqB2aJgn-D2i_7ZRmhfPvYNVHD1Kp8wuA@mail.gmail.com>
Ivan Baldo <ibaldo@gmail•com> wrote:
> Hello.
> I know this is not standard usage of git, but I need a way to have
> more stable dates and times in the files in order to avoid rsync
> checksumming.
> So I found this
> https://stackoverflow.com/questions/2179722/checking-out-old-file-with-original-create-modified-timestamps/2179876#2179876
> and modified it a bit to run in CentOS 7:
>
> IFS="
> "
> for FILE in $(git ls-files -z | tr '\0' '\n')
> do
> TIME=$(git log --pretty=format:%cd -n 1 --date=iso -- "$FILE")
> touch -c -m -d "$TIME" "$FILE"
> done
>
> Unfortunately it takes ages for a 84k files repo.
> I see the CPU usage is dominated by the git log command.
Running `git log' once per file isn't necessary.
On Debian, rsync actually ships the `git-set-file-times' script
in /usr/share/doc/rsync/scripts/ which only runs `git log' once
and parses it.
You can also get my (original) version from:
https://yhbt.net/git-set-file-times
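For illustration only, here is a minimal sketch of the single-pass idea
in Python. It is not the script linked above, and the function names
(latest_commit_times, set_file_times) are made up for this example.
It assumes `git log' lists commits newest first, so the first time a
path appears is taken as its latest commit; the committer timestamp
(%ct) is used to match the %cd in the quoted loop. Paths that git
still quotes (for example names containing tabs or newlines) are not
handled.

```python
import os
import subprocess


def latest_commit_times(log_lines):
    """Map each path to the timestamp of the first commit that lists it.

    Expects the output of `git log --format=%x00%ct --name-only`:
    a commit line is a NUL byte followed by the committer time in
    epoch seconds; every other non-empty line is a path.
    """
    times = {}
    current = None
    for raw in log_lines:
        line = raw.rstrip("\n")
        if line.startswith("\0"):
            current = int(line[1:])
        elif line and current is not None:
            # git log is newest-first, so keep only the first sighting
            times.setdefault(line, current)
    return times


def set_file_times(repo="."):
    """Run git log once, then set the mtime of each file still on disk."""
    cmd = ["git", "-C", repo, "-c", "core.quotePath=false",
           "log", "--format=%x00%ct", "--name-only"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True,
                          encoding="utf-8",
                          errors="surrogateescape") as proc:
        times = latest_commit_times(proc.stdout)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    for path, ts in times.items():
        full = os.path.join(repo, path)
        if os.path.isfile(full):  # skip paths deleted since that commit
            atime = os.stat(full).st_atime
            os.utime(full, (atime, ts))  # change mtime only, like touch -m
```

Calling set_file_times("/path/to/repo") spawns one git process in
total instead of one per file. Like any single-pass approach, it holds
every path seen in history in one dictionary.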
> I know a way I could use to split the work for all the CPU threads
> but anyway, I would like to know if you guys and girls know of a
> faster way to do this.
Much of your overhead is going to be from process spawning.
My Perl version reduces that significantly.
I haven't tried it with 84K files, but it'll have to keep all
those filenames in memory. I'm not sure if parallelizing
utime() syscalls is worth it, either; maybe it helps on SSD
more than HDD.
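If someone wants to measure whether parallel utime() calls help, a
rough sketch using a thread pool could look like the following. The
function name apply_mtimes and the default of 4 workers are
assumptions for this example, not anything from the script, and no
speedup is claimed here.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def apply_mtimes(times, workers=4):
    """Set mtimes from a {path: epoch_seconds} mapping using threads.

    Returns the number of files actually updated; paths that no
    longer exist are skipped.
    """
    def touch(item):
        path, ts = item
        try:
            atime = os.stat(path).st_atime
            os.utime(path, (atime, ts))  # leave atime as it was
            return 1
        except FileNotFoundError:
            return 0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(touch, times.items()))
```

Timing this against a plain loop on the actual repository and disk is
the only way to tell whether the extra threads are worth it.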