public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Wangnan (F)" <wangnan0@huawei•com>
To: Alexei Starovoitov <alexei.starovoitov@gmail•com>,
	Peter Zijlstra <a.p.zijlstra@chello•nl>
Cc: <acme@kernel•org>, <linux-kernel@vger•kernel.org>,
	<pi3orama@163•com>, <lizefan@huawei•com>,
	<netdev@vger•kernel.org>, <davem@davemloft•net>,
	Adrian Hunter <adrian.hunter@intel•com>,
	Arnaldo Carvalho de Melo <acme@redhat•com>,
	David Ahern <dsahern@gmail•com>, Ingo Molnar <mingo@kernel•org>,
	Yunlong Song <yunlong.song@huawei•com>
Subject: Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
Date: Wed, 13 Jan 2016 12:34:19 +0800	[thread overview]
Message-ID: <5695D3CB.3030604@huawei.com> (raw)
In-Reply-To: <20160112195641.GA34601@ast-mbp.thefacebook.com>



On 2016/1/13 3:56, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 08:36:23PM +0800, Wangnan (F) wrote:
>>> hmm, in this kernel patch I see that you're adding 8 bytes for
>>> every record via this extra TAILSISZE flag and in perf you're
>>> walking the ring buffer backwards by reading this 8 byte
>>> sizes, comparing header sizes and so on until reaching beginning,
>>> where you start dumping it as normal.
>>> So for this 'signal to perf' approach to work the ring buffer
>>> will contain tailsizes everywhere just so that user space can
>>> find the beginning. That's not very pretty. imo if kernel
>>> can do header read to adjust data_tail it would make user
>>> space side clean. May be there are other solutions.
>>> Adding tailsize seems like brute force hack.
>>> There must be some nicer way.
>> Hi Peter,
>>
>>   What's your opinion? Should we reconsider moving size field from header the
>> end?
>> Or moving whole header to the end of a record?
> I think moving the whole header under new TAILHEADER flag is
> actually very good idea. The ring buffer will be fully utilized
> and no extra bytes necessary. User space would need to parse it
> backwards, but for this use case it fits well.

I have another crazy suggestion: can we make kernel writing to
the ring buffer from the end to the beginning? For example:

This is the initial state of the ring buffer, head pointer
pointes to the end of it:

       -------------> Address increase

                                     head
                                       |
                                       V
  +--+---+-------+----------+------+---+
  |                                    |
  +--+---+-------+----------+------+---+


Write the first event at the end of the ring buffer, and *decrease*
the head pointer:

                                 head
                                   |
                                   V
  +--+---+-------+----------+------+---+
  |                                | A |
  +--+---+-------+----------+------+---+


Another record:
                           head
                            |
                            V
  +--+---+-------+----------+------+---+
  |                         |   B  | A |
  +--+---+-------+----------+------+---+


Ring buffer rewind, A is fully overwritten and B is broken:

                                head
                                  |
                                  V
  +--+---+-------+----------+-----+----+
  |F | E |   D   | C        | ... | F  |
  +--+---+-------+----------+-----+----+

At this time user can parse the ring buffer normally from
F to C. From timestamp in it he know which one is the
oldest.

By this perf don't need too much extra work to do. There's no
performance penalty at all, and the 8 bytes are saved.

Thought?

Thank you.

  reply	other threads:[~2016-01-13  4:34 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
2016-01-12  9:43   ` Jiri Olsa
2016-01-11 13:47 ` [PATCH 02/53] perf tools: Fix phony build target for build-test Wang Nan
2016-01-11 13:47 ` [PATCH 03/53] perf tools: Set parallel making options build-test Wang Nan
2016-01-11 13:47 ` [PATCH 04/53] perf tools: Pass O option to Makefile.perf in build-test Wang Nan
2016-01-11 13:47 ` [PATCH 05/53] perf tools: Test correct path of perf " Wang Nan
2016-01-11 15:24   ` Arnaldo Carvalho de Melo
2016-01-11 22:06     ` Arnaldo Carvalho de Melo
2016-01-11 22:39       ` Arnaldo Carvalho de Melo
2016-01-11 22:39         ` Arnaldo Carvalho de Melo
2016-01-12  7:16           ` Wangnan (F)
2016-01-12 14:08             ` Arnaldo Carvalho de Melo
2016-01-11 13:47 ` [PATCH 06/53] perf tools: Fix PowerPC native building Wang Nan
2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
2016-01-11 13:52   ` Wangnan (F)
2016-01-11 14:10     ` Arnaldo Carvalho de Melo
2016-01-11 13:47 ` [PATCH 08/53] perf tools: Add missing sources in perf's MANIFEST Wang Nan
2016-01-11 13:48 ` [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf Wang Nan
2016-01-11 13:48 ` [PATCH 10/53] tools build: Add BPF feature check to test-all Wang Nan
2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
2016-01-11 14:25   ` Sergei Shtylyov
2016-01-11 14:58     ` Arnaldo Carvalho de Melo
2016-01-11 15:32       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases Wang Nan
2016-01-11 13:48 ` [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine Wang Nan
2016-01-11 15:42   ` Arnaldo Carvalho de Melo
2016-01-12  7:03     ` Wangnan (F)
2016-01-12 14:07       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 14/53] perf test: Check environment before start real BPF test Wang Nan
2016-01-11 21:55   ` Arnaldo Carvalho de Melo
2016-01-12  7:40     ` Wangnan (F)
2016-01-12 14:10       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 15/53] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
2016-01-11 13:48 ` [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code Wang Nan
2016-01-11 13:48 ` [PATCH 17/53] perf test: Improve bp_signal Wang Nan
2016-01-11 21:37   ` Arnaldo Carvalho de Melo
2016-01-12  4:13     ` Wangnan (F)
2016-01-12  9:21     ` Jiri Olsa
2016-01-12 14:11       ` Arnaldo Carvalho de Melo
2016-01-12 14:17         ` Will Deacon
2016-01-11 13:48 ` [PATCH 18/53] perf tools: Add API to config maps in bpf object Wang Nan
2016-01-11 13:48 ` [PATCH 19/53] perf tools: Enable BPF object configure syntax Wang Nan
2016-01-11 13:48 ` [PATCH 20/53] perf record: Apply config to BPF objects before recording Wang Nan
2016-01-11 13:48 ` [PATCH 21/53] perf tools: Enable passing event to BPF object Wang Nan
2016-01-11 13:48 ` [PATCH 22/53] perf tools: Support perf event alias name Wang Nan
2016-01-11 13:48 ` [PATCH 23/53] perf tools: Support setting different slots in a BPF map separately Wang Nan
2016-01-11 13:48 ` [PATCH 24/53] perf tools: Enable indices setting syntax for BPF maps Wang Nan
2016-01-11 13:48 ` [PATCH 25/53] perf tools: Introduce bpf-output event Wang Nan
2016-01-11 13:48 ` [PATCH 26/53] perf data: Support converting data from bpf_perf_event_output() Wang Nan
2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
2016-01-11 18:09   ` Alexei Starovoitov
2016-01-12  5:33     ` Wangnan (F)
2016-01-12  6:11       ` Alexei Starovoitov
2016-01-12 12:36         ` Wangnan (F)
2016-01-12 19:56           ` Alexei Starovoitov
2016-01-13  4:34             ` Wangnan (F) [this message]
2016-01-13  5:14               ` Alexei Starovoitov
2016-01-12 14:05       ` Peter Zijlstra
2016-01-12 14:14   ` Peter Zijlstra
2016-01-11 13:48 ` [PATCH 28/53] perf tools: Move timestamp creation to util Wang Nan
2016-01-11 13:48 ` [PATCH 29/53] perf tools: Make ordered_events reusable Wang Nan
2016-01-11 21:33   ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 30/53] perf record: Extract synthesize code to record__synthesize() Wang Nan
2016-01-11 13:48 ` [PATCH 31/53] perf tools: Add perf_data_file__switch() helper Wang Nan
2016-01-11 13:48 ` [PATCH 32/53] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
2016-01-11 13:48 ` [PATCH 33/53] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
2016-01-11 13:48 ` [PATCH 34/53] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
2016-01-11 13:48 ` [PATCH 35/53] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
2016-01-11 13:48 ` [PATCH 36/53] perf record: Split output into multiple files via '--switch-output' Wang Nan
2016-01-11 13:48 ` [PATCH 37/53] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
2016-01-11 13:48 ` [PATCH 38/53] perf record: Disable buildid cache options by default in switch output mode Wang Nan
2016-01-11 13:48 ` [PATCH 39/53] perf record: Re-synthesize tracking events after output switching Wang Nan
2016-01-11 13:48 ` [PATCH 40/53] perf record: Generate tracking events for process forked by perf Wang Nan
2016-01-11 13:48 ` [PATCH 41/53] perf record: Ensure return non-zero rc when mmap fail Wang Nan
2016-01-11 13:48 ` [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-01-11 14:21   ` Sergei Shtylyov
2016-01-11 15:00     ` Arnaldo Carvalho de Melo
2016-01-11 15:01       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 43/53] perf tools: Add evlist channel helpers Wang Nan
2016-01-11 13:48 ` [PATCH 44/53] perf tools: Automatically add new channel according to evlist Wang Nan
2016-01-11 13:48 ` [PATCH 45/53] perf tools: Operate multiple channels Wang Nan
2016-01-11 13:48 ` [PATCH 46/53] perf tools: Squash overwrite setting into channel Wang Nan
2016-01-11 13:48 ` [PATCH 47/53] perf record: Don't read from and poll overwrite channel Wang Nan
2016-01-11 13:48 ` [PATCH 48/53] perf tools: Enable overwrite settings Wang Nan
2016-01-11 13:48 ` [PATCH 49/53] perf tools: Consider TAILSIZE bit when caclulate is_pos Wang Nan
2016-01-11 13:48 ` [PATCH 50/53] perf tools: Set tailsize attribut bit for overwrite events Wang Nan
2016-01-11 13:48 ` [PATCH 51/53] perf record: Read from tailsize ring buffer Wang Nan
2016-01-11 13:48 ` [PATCH 52/53] perf record: Toggle tailsize ring buffer for reading Wang Nan
2016-01-11 13:48 ` [PATCH 53/53] perf record: Allow generate tracking events at the end of output Wang Nan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5695D3CB.3030604@huawei.com \
    --to=wangnan0@huawei$(echo .)com \
    --cc=a.p.zijlstra@chello$(echo .)nl \
    --cc=acme@kernel$(echo .)org \
    --cc=acme@redhat$(echo .)com \
    --cc=adrian.hunter@intel$(echo .)com \
    --cc=alexei.starovoitov@gmail$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=dsahern@gmail$(echo .)com \
    --cc=linux-kernel@vger$(echo .)kernel.org \
    --cc=lizefan@huawei$(echo .)com \
    --cc=mingo@kernel$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=pi3orama@163$(echo .)com \
    --cc=yunlong.song@huawei$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox