From: Martin Schwidefsky <schwidefsky@de•ibm.com>
To: Eric Paris <eparis@redhat•com>
Cc: Heiko Carstens <heiko.carstens@de•ibm.com>,
	Sachin Sant <sachinp@in•ibm.com>,
	linux-s390@vger•kernel.org,
	Stephen Rothwell <sfr@canb•auug.org.au>,
	linux-next@vger•kernel.org,
	Andrew Morton <akpm@linux-foundation•org>,
	linux-arch@vger•kernel.org
Subject: Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
Date: Wed, 18 Nov 2009 18:34:00 +0100
Message-ID: <20091118183400.0cd6b177@mschwide.boeblingen.de.ibm.com>
In-Reply-To: <1258560177.6446.19.camel@dhcp231-106.rdu.redhat.com>

On Wed, 18 Nov 2009 11:02:57 -0500
Eric Paris <eparis@redhat•com> wrote:

> On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:
> > Oh wait, I have to correct myself:
> > 
> > With
> > 
> > long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
> >      	 	       int fd, const char  __user *pathname,
> >                        u64 mask);
> > 
> > we have a 64 bit type as 5th argument. That doesn't work for syscalls
> > on 32 bit s390.
> > I just simplify the reason for this: on 32 bit long longs will be passed via
> > two consecutive registers _unless_ the first register would be r6 (which is
> > the case here). In that case the whole 64 bits would be passed on the stack.
> > Our glibc syscall code will always put the contents of the first parameter
> > stack slot into register r7, so we have six registers for parameter passing
> > (r2-r7). So with the 64 bit value put into two stack slots we would miss
> > the second part of the 5th argument.
> 
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
> 
> sys_fallocate_wrapper:
>         lgfr    %r2,%r2                 # int
>         lgfr    %r3,%r3                 # int
>         sllg    %r4,%r4,32              # get high word of 64bit loff_t
>         lr      %r4,%r5                 # get low word of 64bit loff_t
>         sllg    %r5,%r6,32              # get high word of 64bit loff_t
>         l       %r5,164(%r15)           # get low word of 64bit loff_t
>         jg      sys_fallocate
> 
> Does this work?  It's basically the same thing, right?  I'm willing to
> hear "that's fine you are clueless"   Just saw it and hoping that we
> have everything right....

Ok, we need the full version of the story..
The 32 bit ELF ABI specifies that the 32 bit registers %r2 to %r6 are
used for parameter passing. 64 bit values are passed as register pairs
with the first register an even numbered register. The effect of that
rule is that parameter registers may be skipped or that the whole 64 bit
value is passed on the stack. Examples:

fn(int a, int b, long long c)
a is passed in %r2, b is passed in %r3, c is passed in %r4/%r5.

fn(int a, long long b, int c)
a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
skipped.

fn(int a, int b, int c, int d, long long e)
a is passed in %r2, b is passed in %r3, c is passed in %r4, d is passed
in %r5, e is passed on the stack, %r6 is skipped.
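
The three examples above can be reproduced with a small model of the
rule. This is only a sketch of the rule as described in this mail, not
something taken from the ABI document; the names assign_params,
FIRST_REG, LAST_REG and ON_STACK are made up for illustration.

```c
#define FIRST_REG 2
#define LAST_REG  6
#define ON_STACK  (-1)

/*
 * sizes[i] is the size of parameter i in bytes (4 or 8).
 * out[i] receives the first register number used for parameter i,
 * or ON_STACK if the parameter is passed on the stack.
 */
static void assign_params(const int *sizes, int n, int *out)
{
	int next = FIRST_REG;

	for (int i = 0; i < n; i++) {
		if (sizes[i] == 8) {
			if (next & 1)
				next++;	/* pair must start at an even register */
			if (next + 1 <= LAST_REG) {
				out[i] = next;	/* uses next and next + 1 */
				next += 2;
			} else {
				out[i] = ON_STACK;
				next = LAST_REG + 1;	/* %r6 is skipped */
			}
		} else if (next <= LAST_REG) {
			out[i] = next++;
		} else {
			out[i] = ON_STACK;
		}
	}
}
```

Calling it with sizes {4, 8, 4} gives registers 2, 4 and 6, which is
the second example: b lands in %r4/%r5 and %r3 is skipped.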

The second fact to understand is how the system call arguments are
passed. The original system call ABI used the same calling conventions
as the ELF ABI. That is, only registers %r2 to %r6 are used. Now futex
came along with 6 parameters. We did not want to use the user process
stack to pass the parameters because that would require a
copy_from_user which is expensive. Instead we used a little trick. The
6th parameter is passed by glue code in glibc in register %r7 (no user
copy). The code in entry.S stores %r7 to the beginning of the pt_regs
structure:

struct pt_regs
{
        unsigned long args[1];
	...
};

The C function that implements a system call with 6 32-bit parameters
expects 5 parameters in registers, the 6th is located on the stack. The
args element of pt_regs "happens" to be at the same offset where the C
function is looking for the first overflow argument (= the 6th
parameter).

Now consider a system call with an overflowing 64 bit parameter. The
glue code in glibc could be hacked in a way that the 64 bit value is
split into %r6 and %r7. But the system call function is just a C
function. It follows the ELF ABI and expects the 64 bit argument on the
stack. It would take two 32 bit overflow registers in pt_regs to make
one 64 bit parameter. With the current code that won't work. We would
need a wrapper function in the kernel to untangle this parameter mess.

To avoid all this, all 64 bit parameters have to be placed at positions
where no register is skipped because of the even/odd rule and where it
is not affected by the %r7 trick (= may not be the last parameter).
Easy, no?
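
As a rough sketch of that placement rule, a checker could look like the
code below. It assumes the even/odd pairing and the %r7 trick exactly as
described in this mail; syscall_params_ok and the LAST_*_REG constants
are hypothetical names, not existing kernel code.

```c
#define FIRST_REG        2
#define LAST_ABI_REG     6	/* last parameter register of the ELF ABI */
#define LAST_SYSCALL_REG 7	/* %r7 carries the 6th 32 bit parameter */

/*
 * sizes[i] is the size of parameter i in bytes (4 or 8).
 * Returns 1 if every 64 bit parameter lands in an even/odd register
 * pair within %r2..%r6 without skipping a register, 0 otherwise.
 */
static int syscall_params_ok(const int *sizes, int n)
{
	int next = FIRST_REG;

	for (int i = 0; i < n; i++) {
		if (sizes[i] == 8) {
			if (next & 1)
				return 0;	/* a register would be skipped */
			if (next + 1 > LAST_ABI_REG)
				return 0;	/* would go to the stack */
			next += 2;
		} else {
			if (next > LAST_SYSCALL_REG)
				return 0;	/* more than six 32 bit slots */
			next++;
		}
	}
	return 1;
}
```

With the sys_fanotify_mark prototype quoted above the sizes are
{4, 4, 4, 4, 8} and the check fails. A reordering that puts the u64
third, {4, 4, 8, 4, 4}, passes; it is given here only as an
illustration of the rule.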

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
