* [PATCH] t7528: work around ETOOMANY in OpenSSH 10.1 and newer
From: Patrick Steinhardt @ 2025-10-23 7:14 UTC
To: git; +Cc: Xi Ruoyao, brian m. carlson, Jeff King, Lauri Tirkkonen
In t7528 we spawn an SSH agent to verify that we can sign a commit via
it. This test has started to fail on some machines:
    +++ ssh-agent
    unix_listener_tmp: path "/home/pks/Development/git/build/test-output/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.UTulegefEg.agent.UrPHumMXPq" too long for Unix domain socket
    main: Couldn't prepare agent socket
As it turns out this is caused by a change in OpenSSH 10.1 [1]:
    * ssh-agent(1), sshd(8): move agent listener sockets from /tmp to
      under ~/.ssh/agent for both ssh-agent(1) and forwarded sockets
      in sshd(8).
Instead of creating the socket in "/tmp", OpenSSH now creates the socket
in our home directory. And as the home directory gets modified to be
located in our test output directory we end up with paths that are
somewhat long. But Linux has a rather short limit of 108 characters for
socket paths, and other systems have even lower limits, so it is very
easy now to exceed the limit and run into the above error.
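
As a rough illustration of the arithmetic, the following sketch measures the two paths discussed in this message against the limit. The helper name `socket_path_fits` and the variable `SUN_PATH_MAX` are made up for this example, and the assumption that the path must be strictly shorter than 108 bytes (to leave room for a terminating NUL) reflects how such checks are commonly written rather than anything stated in the release notes.

```shell
# Limit taken from the text above: 108 bytes on Linux. Assumption: the path
# must be strictly shorter than that to leave room for a terminating NUL.
SUN_PATH_MAX=108

# socket_path_fits PATH: succeed if PATH is short enough for a Unix socket.
# Note: ${#1} counts characters, which equals bytes only for ASCII paths.
socket_path_fits () {
	test "${#1}" -lt "$SUN_PATH_MAX"
}

trash='/home/pks/Development/git/build/test-output/trash directory.t7528-signed-commit-ssh'
for candidate in \
	"$trash/.ssh/agent/s.UTulegefEg.agent.UrPHumMXPq" \
	"$trash/sshsock"
do
	if socket_path_fits "$candidate"
	then
		echo "${#candidate} characters: fits"
	else
		echo "${#candidate} characters: too long"
	fi
done
```

The first candidate is the path from the error message, the second is the shortened name considered further down.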
Work around the issue by using `ssh-agent -T`, which instructs it to
use the old behaviour and create the socket in "/tmp" again. This switch
has only been introduced with 10.1 though, so for older versions we have
to fall back to not using it. That's fine though, as older versions know
to put the socket into "/tmp" already.
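
The fallback can be sketched in isolation as follows. This is only an illustration: `start_agent_env` and `fake_old_agent` are hypothetical names, the fake agent merely imitates a version that rejects `-T`, and its output line is an assumed shape of what an agent prints. Unlike the patch itself, the sketch also silences the error from the first attempt.

```shell
# start_agent_env CMD: print the environment setup emitted by CMD.
# Try "CMD -T" first and fall back to plain "CMD" if that invocation fails.
start_agent_env () {
	"$1" -T 2>/dev/null || "$1"
}

# Hypothetical stand-in for an older agent that does not know "-T".
fake_old_agent () {
	if test "$1" = "-T"
	then
		echo "unknown option -- T" >&2
		return 1
	fi
	echo "SSH_AUTH_SOCK=/tmp/fake-agent.sock; export SSH_AUTH_SOCK;"
}

agent_setup=$(start_agent_env fake_old_agent)
echo "$agent_setup"
```

With a real agent one would call `start_agent_env ssh-agent` and eval the result, as the test does.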
An alternative approach would be to abbreviate the socket name itself so
that we create it as e.g. "sshsock" in the trash directory. But taking
the above example we'd still end up with a path that is 91 characters
long. So we wouldn't really have a lot of headroom, and it is quite
likely that some developers would see the issue on their machines.
[1]: https://www.openssh.com/txt/release-10.1
Reported-by: Xi Ruoyao <xry111@xry111•site>
Suggested-by: brian m. carlson <sandals@crustytoothpaste•net>
Helped-by: Jeff King <peff@peff•net>
Helped-by: Lauri Tirkkonen <lauri@hacktheplanet•fi>
Signed-off-by: Patrick Steinhardt <ps@pks•im>
---
Hi,
I now started to see the issue reported in [1] on my own machine and in
our CI. I couldn't find a patch yet, so I decided to take the discussion
that happened in this thread and cast it into a patch to fix this.
As I am merely taking what others have debugged and agreed on I went a
bit overboard with giving credit.
Thanks!
Patrick
[1]: <4e2952e512afc780b621d2c153b3e6e4eb7ed89a.camel@xry111•site>
---
t/t7528-signed-commit-ssh.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t7528-signed-commit-ssh.sh b/t/t7528-signed-commit-ssh.sh
index 0f887a3ebee..b50306b9b39 100755
--- a/t/t7528-signed-commit-ssh.sh
+++ b/t/t7528-signed-commit-ssh.sh
@@ -82,7 +82,7 @@ test_expect_success GPGSSH 'create signed commits' '
 test_expect_success GPGSSH 'sign commits using literal public keys with ssh-agent' '
 	test_when_finished "test_unconfig commit.gpgsign" &&
 	test_config gpg.format ssh &&
-	eval $(ssh-agent) &&
+	eval $(ssh-agent -T || ssh-agent) &&
 	test_when_finished "kill ${SSH_AGENT_PID}" &&
 	test_when_finished "test_unconfig user.signingkey" &&
 	mkdir tmpdir &&
---
base-commit: c54a18ef67e59cdbcd77d6294916d42c98c62d1d
change-id: 20251023-b4-pks-t7528-ssh-agent-socket-name-too-long-69fad53b67ea
* Re: [PATCH] t7528: work around ETOOMANY in OpenSSH 10.1 and newer
From: Jeff King @ 2025-10-23 12:43 UTC
To: Patrick Steinhardt; +Cc: git, Xi Ruoyao, brian m. carlson, Lauri Tirkkonen
On Thu, Oct 23, 2025 at 09:14:59AM +0200, Patrick Steinhardt wrote:
> As it turns out this is caused by a change in OpenSSH 10.1 [1]:
>
>     * ssh-agent(1), sshd(8): move agent listener sockets from /tmp to
>       under ~/.ssh/agent for both ssh-agent(1) and forwarded sockets
>       in sshd(8).
>
> Instead of creating the socket in "/tmp", OpenSSH now creates the socket
> in our home directory. And as the home directory gets modified to be
> located in our test output directory we end up with paths that are
> somewhat long. But Linux has a rather short limit of 108 characters for
> socket paths, and other systems have even lower limits, so it is very
> easy now to exceed the limit and run into the above error.
There's a secondary issue, too: even if the path is short enough, the
space in "trash directory" of the path will break the shell eval. That's
relevant below.
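
To make the space problem concrete, here is a small sketch that does not start a real agent. The string in `agent_output` is an assumption about the shape of the agent's output (an unquoted assignment followed by an export), using a made-up path with a space in it.

```shell
# Assumed shape of the agent's output, with an unquoted path containing a space.
agent_output='SSH_AUTH_SOCK=/tmp/trash directory/agent.sock; export SSH_AUTH_SOCK;'

unset SSH_AUTH_SOCK

# The shell reads this as "run the command 'directory/agent.sock' with
# SSH_AUTH_SOCK=/tmp/trash in its environment", so the assignment does not
# stick and the command itself cannot be found.
eval "$agent_output" 2>/dev/null || true

echo "SSH_AUTH_SOCK is ${SSH_AUTH_SOCK-unset}"
```

After the eval the variable is still unset, which is why a short-enough path would not be sufficient on its own.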
> Work around the issue by using `ssh-agent -T`, which instructs it to
> use the old behaviour and create the socket in "/tmp" again. This switch
> has only been introduced with 10.1 though, so for older versions we have
> to fall back to not using it. That's fine though, as older versions know
> to put the socket into "/tmp" already.
OK. I think this is an improvement over the status quo, though it leaves
a lot of loose ends, like:
- what happens if "ssh-agent" does not exist at all; we do not notice
the error because the eval succeeds anyway (with blank input)
- one reason we did not notice this immediately is that the failure
mode is to fall back to using the user's SSH_AUTH_SOCK variable if
set (i.e., their real agent with their keys in it!). We should
perhaps be clearing that variable in test-lib.sh.
But those are not really new issues, and I'm OK with just un-breaking
things in the most expedient way possible.
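
For what it's worth, both loose ends could be tightened along these lines. This is a hypothetical sketch, not a proposal for the patch: `start_agent` is a made-up helper, and clearing the variables is shown inline here rather than in test-lib.sh.

```shell
# Hypothetical hardening: forget any agent inherited from the developer's
# environment before the test starts its own.
unset SSH_AUTH_SOCK SSH_AGENT_PID

# start_agent CMD [ARGS...]: run CMD, and only eval its output when the
# command succeeded and actually printed something.
start_agent () {
	agent_setup=$("$@" 2>/dev/null) || return 1
	test -n "$agent_setup" || return 1
	eval "$agent_setup"
}

# The pattern from the test: the eval succeeds although nothing was started.
eval $(no-such-agent-binary 2>/dev/null)
echo "plain eval exit status: $?"

# The stricter helper reports the failure instead.
if start_agent no-such-agent-binary
then
	echo "agent started"
else
	echo "agent failed to start"
fi
```

The helper still evals whatever the command printed, so it inherits the space problem mentioned earlier.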
> An alternative approach would be to abbreviate the socket name itself so
> that we create it as e.g. "sshsock" in the trash directory. But taking
> the above example we'd still end up with a path that is 91 characters
> long. So we wouldn't really have a lot of headroom, and it is quite
> likely that some developers would see the issue on their machines.
I assume you mean here something like:
    ssh-agent -a "$PWD/sshsock"
Yeah, that is not buying us that much in terms of headroom. Plus it
would still run afoul of the space issue, since we know that $PWD will
always contain "trash directory".
If we are going to provide a fixed name, I think it would have to be a
true relative path like:
    ssh-agent -a ./sshsock
That does work (and SSH_AUTH_SOCK contains the relative path), but is
maybe a bit of a booby trap waiting to spring on somebody who tries to
access the agent with a different current working directory.
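
The booby trap is easy to demonstrate without an agent. In this sketch a regular file stands in for the socket, since only path resolution matters; the directory names are arbitrary.

```shell
# Use a regular file as a stand-in for the agent socket.
workdir=$(mktemp -d) &&
mkdir "$workdir/elsewhere" &&
cd "$workdir" &&
: >sshsock &&
SSH_AUTH_SOCK=./sshsock

# From the directory where the "socket" was created, the path resolves.
test -e "$SSH_AUTH_SOCK" && echo "visible from $PWD"

# After changing directory, the same relative value points at nothing.
cd elsewhere
test -e "$SSH_AUTH_SOCK" || echo "not visible from $PWD"
```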
-Peff
* Re: [PATCH] t7528: work around ETOOMANY in OpenSSH 10.1 and newer
From: Patrick Steinhardt @ 2025-10-23 13:24 UTC
To: Jeff King; +Cc: git, Xi Ruoyao, brian m. carlson, Lauri Tirkkonen
On Thu, Oct 23, 2025 at 08:43:20AM -0400, Jeff King wrote:
> On Thu, Oct 23, 2025 at 09:14:59AM +0200, Patrick Steinhardt wrote:
>
> > As it turns out this is caused by a change in OpenSSH 10.1 [1]:
> >
> >     * ssh-agent(1), sshd(8): move agent listener sockets from /tmp to
> >       under ~/.ssh/agent for both ssh-agent(1) and forwarded sockets
> >       in sshd(8).
> >
> > Instead of creating the socket in "/tmp", OpenSSH now creates the socket
> > in our home directory. And as the home directory gets modified to be
> > located in our test output directory we end up with paths that are
> > somewhat long. But Linux has a rather short limit of 108 characters for
> > socket paths, and other systems have even lower limits, so it is very
> > easy now to exceed the limit and run into the above error.
>
> There's a secondary issue, too: even if the path is short enough, the
> space in "trash directory" of the path will break the shell eval. That's
> relevant below.
>
> > Work around the issue by using `ssh-agent -T`, which instructs it to
> > use the old behaviour and create the socket in "/tmp" again. This switch
> > has only been introduced with 10.1 though, so for older versions we have
> > to fall back to not using it. That's fine though, as older versions know
> > to put the socket into "/tmp" already.
>
> OK. I think this is an improvement over the status quo, though it leaves
> a lot of loose ends, like:
>
> - what happens if "ssh-agent" does not exist at all; we do not notice
> the error because the eval succeeds anyway (with blank input)
>
> - one reason we did not notice this immediately is that the failure
> mode is to fall back to using the user's SSH_AUTH_SOCK variable if
> set (i.e., their real agent with their keys in it!). We should
> perhaps be clearing that variable in test-lib.sh.
>
> But those are not really new issues, and I'm OK with just un-breaking
> things in the most expedient way possible.
Yeah. I was wondering whether we should rather do:
    ( ssh-agent -T || ssh-agent ) >env &&
    source env
ssh-agent(1) knows to detach into the background unless told otherwise,
so we should notice the failure and can then source the environment if
it was successful. But I ultimately decided that for now I'd rather want
to fix the fallout, we can still make it more robust after the fact.
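
Spelled out a bit more, the idea could look like the sketch below. The helper `load_agent_env` and the stand-in `fake_agent` are hypothetical, and the sketch uses the portable `.` builtin where the snippet above says `source`.

```shell
# load_agent_env FILE CMD [ARGS...]: write CMD's output to FILE and source it,
# but only if CMD exited successfully. FILE should contain a slash, because
# "." searches PATH for names without one.
load_agent_env () {
	env_file=$1
	shift
	"$@" >"$env_file" || return 1
	. "$env_file"
}

# Hypothetical stand-in that prints what an agent would print.
fake_agent () {
	echo "FAKE_AUTH_SOCK=/tmp/fake-agent.sock; export FAKE_AUTH_SOCK;"
}

scratch=$(mktemp -d)
load_agent_env "$scratch/env" fake_agent && echo "loaded: $FAKE_AUTH_SOCK"
load_agent_env "$scratch/env" false || echo "failure noticed"
```

Because the command's exit status is checked before anything is sourced, a missing or failing agent is no longer silently ignored.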
> > An alternative approach would be to abbreviate the socket name itself so
> > that we create it as e.g. "sshsock" in the trash directory. But taking
> > the above example we'd still end up with a path that is 91 characters
> > long. So we wouldn't really have a lot of headroom, and it is quite
> > likely that some developers would see the issue on their machines.
>
> I assume you mean here something like:
>
>     ssh-agent -a "$PWD/sshsock"
>
> Yeah, that is not buying us that much in terms of headroom. Plus it
> would still run afoul of the space issue, since we know that $PWD will
> always contain "trash directory".
Yup, that was the idea, and yeah, I don't think it helps us much.
> If we are going to provide a fixed name, I think it would have to be a
> true relative path like:
>
>     ssh-agent -a ./sshsock
>
> That does work (and SSH_AUTH_SOCK contains the relative path), but is
> maybe a bit of a booby trap waiting to spring on somebody who tries to
> access the agent with a different current working directory.
Maybe. On the other hand we only have a single test anyway that uses
ssh-agent, so that's a problem for the future, I guess.
In any case, I'd say for now we should just fix the issue in the easiest
way possible, and we can then follow up and make this more robust in a
subsequent patch series. WDYT?
Patrick
* Re: [PATCH] t7528: work around ETOOMANY in OpenSSH 10.1 and newer
From: Jeff King @ 2025-10-23 13:34 UTC
To: Patrick Steinhardt; +Cc: git, Xi Ruoyao, brian m. carlson, Lauri Tirkkonen
On Thu, Oct 23, 2025 at 03:24:36PM +0200, Patrick Steinhardt wrote:
> In any case, I'd say for now we should just fix the issue in the easiest
> way possible, and we can then follow up and make this more robust in a
> subsequent patch series. WDYT?
Yeah, I'm OK with your patch as-is.
-Peff