* network load 8245 vs. 8347E
@ 2007-06-21 14:15 Marc Leeman
2007-06-21 15:33 ` Kumar Gala
0 siblings, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-06-21 14:15 UTC (permalink / raw)
To: linuxppc-dev
[-- Attachment #1.1: Type: text/plain, Size: 804 bytes --]
Ok, I guess it's comparing apples to lemons here, but I'll have a go at
it anyway.
I'm trying to figure out why partial network decoding on an 8347E is
disappointingly slow wrt an older 8245 processor.
When simply receiving (cf. the attachment) a 12 Mbps multicast stream on an
8245/uclibc 0.9.28 @350 MHz, ppc arch, the system runs smoothly at a
load of around 4% with an e100-based MAC (pci: 8086:1209).
When doing the same thing on an 8347E/uclibc 0.9.28 @400 MHz, powerpc arch, the
system is loaded at around 34%.
Any clues?
This program just takes in data and does nothing with it, to limit the
search area :)
--
greetz, marc
You know until today, I never really realized how much I love my feet.
Chiana - Vitas Mortis
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #1.2: recv_mcast.c --]
[-- Type: text/x-csrc, Size: 3067 bytes --]
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>  /* inet_ntoa() */
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>     /* close() */

#define BUFSIZE (64*1024) /* Maximal packet content is 64k bytes */
#define MAXPARAMLEN 80

void error(char *msg);

int main(int argc, char *argv[])
{
    int socket_val, bind_val, recvfromval, rc, recvbuff;
    socklen_t ctrlboardlen, cameralen, optlen;
    unsigned short ctrlboardPort;
    struct ip_mreq mreq;
    struct sockaddr_in ctrlboard, camera;
    struct in_addr mcast_address;
    struct hostent *h;
    unsigned char buffer[BUFSIZE];

    /* Initialize the buffer */
    memset(buffer, 0, BUFSIZE);
    if (argc != 3) {
        fprintf(stderr, "usage : %s <mcast address> <mcast port>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Get mcast address to listen to */
    h = gethostbyname(argv[1]);
    if (h == NULL) {
        fprintf(stderr, "Unknown group %s\n", argv[1]);
        exit(EXIT_FAILURE);
    }
    ctrlboardPort = atoi(argv[2]);
    memcpy(&mcast_address, h->h_addr_list[0], h->h_length);

    /* Check given address is multicast */
    if (!IN_MULTICAST(ntohl(mcast_address.s_addr))) {
        fprintf(stderr, "Given address '%s' is not multicast\n",
                inet_ntoa(mcast_address));
        exit(EXIT_FAILURE);
    }

    /* Create socket for incoming connections */
    socket_val = socket(AF_INET, SOCK_DGRAM, 0);
    if (socket_val < 0)
        error("Error opening socket");

    /* Set content of ctrlboard */
    ctrlboardlen = sizeof(ctrlboard);
    memset(&ctrlboard, 0, ctrlboardlen);

    /* Fill in the UDP Receiver properties */
    ctrlboard.sin_family = AF_INET;
    ctrlboard.sin_addr.s_addr = htonl(INADDR_ANY);
    ctrlboard.sin_port = htons(ctrlboardPort);

    /* Set socket options */
    recvbuff = 128*1024;
    if (setsockopt(socket_val, SOL_SOCKET, SO_RCVBUF, (char *) &recvbuff, sizeof(recvbuff)) < 0)
        error("Error setting socket options");

    /* Get socket options */
    optlen = sizeof(recvbuff);
    if (getsockopt(socket_val, SOL_SOCKET, SO_RCVBUF, (char *) &recvbuff, &optlen) < 0)
        error("Error getting socket options");
    fprintf(stdout, "Receive buffer size: %d \n", recvbuff);

    /* Bind to associate port number with the socket */
    bind_val = bind(socket_val, (struct sockaddr *)&ctrlboard, ctrlboardlen);
    if (bind_val < 0)
        error("Error bind");

    /* join multicast group */
    mreq.imr_multiaddr.s_addr = mcast_address.s_addr;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    rc = setsockopt(socket_val, IPPROTO_IP, IP_ADD_MEMBERSHIP, (void *) &mreq, sizeof(mreq));
    if (rc < 0) {
        fprintf(stderr, "Cannot join multicast group '%s'\n", inet_ntoa(mcast_address));
        exit(EXIT_FAILURE);
    }
    else
        fprintf(stdout, "Listening to mgroup %s:%d\n", inet_ntoa(mcast_address), ctrlboardPort);

    /* Loop: take the data in and drop it */
    while (1) {
        /* cameralen is a value-result argument, so reset it for every call */
        cameralen = sizeof(camera);
        recvfromval = recvfrom(socket_val, buffer, BUFSIZE, 0, (struct sockaddr *)&camera, &cameralen);
        if (recvfromval < 0)
            error("Error recvfrom");
    }

    /* Not reached */
    close(socket_val);
    return 0;
}

void error(char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
}
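A note on the "Receive buffer size" line that the attached receiver prints: on Linux, socket(7) documents that the kernel doubles the value passed to SO_RCVBUF (to leave room for bookkeeping) and caps it at net.core.rmem_max, so the reported size usually differs from the 128 KiB requested. The sketch below isolates that round trip. The helper name effective_rcvbuf is made up for illustration and is not part of recv_mcast.c.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper (not from the attachment): request `wanted` bytes of
 * receive buffer on a fresh UDP socket and return the size the kernel
 * reports back, or -1 on error. */
int effective_rcvbuf(int wanted)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int got = -1;
    socklen_t len = sizeof(got);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof(wanted)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
        got = -1;
    close(fd);
    return got;
}
```

Comparing effective_rcvbuf(128 * 1024) on both boards would show whether the two kernels actually grant the same buffer, which is one variable less when comparing loads.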
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: network load 8245 vs. 8347E
2007-06-21 14:15 network load 8245 vs. 8347E Marc Leeman
@ 2007-06-21 15:33 ` Kumar Gala
2007-06-21 15:56 ` Marc Leeman
0 siblings, 1 reply; 12+ messages in thread
From: Kumar Gala @ 2007-06-21 15:33 UTC
To: Marc Leeman; +Cc: linuxppc-dev

On Jun 21, 2007, at 9:15 AM, Marc Leeman wrote:

> Ok, I guess it's comparing apples to lemons here, but I'll have a
> go at it anyway.
>
> I'm trying to figure out why partial network decoding on an 8347E is
> disappointingly slow wrt an older 8245 processor.
>
> When simply receiving (cf. att) a multicast stream of 12 Mbps, an
> 8245/uclibc 0.9.28 @350 MHz, ppc arch , the system runs smoothly at a
> load of around 4% on a e100 based MAC (pci: 8086:1209 ).
>
> When doing the same thing on 8347e/0.9.28 @400 Mhz, powerpc arch, the
> system is loaded at around 34%.
>
> any clues?

How are you measuring load? I'm assuming the 8245 and 8347 are using
the same kernel.

- k
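Since the question is how load is measured, here is a minimal sketch of one common approach: sample the aggregate "cpu" line of /proc/stat twice and compare the deltas. It assumes the usual layout, where 2.4 kernels export four counters (user, nice, system, idle) and 2.6 kernels append iowait, irq and softirq. Those last two matter for a receive-only workload, because interrupt and softirq time is counted there. The struct and function names are illustrative and do not come from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Cumulative tick counters from the first line of /proc/stat.
 * Fields a kernel does not export (2.4 has only the first four) stay 0. */
struct cpu_sample {
    unsigned long user, nice, system, idle, iowait, irq, softirq;
};

/* Read the aggregate "cpu" line; returns 0 on success, -1 on failure. */
int read_cpu_sample(struct cpu_sample *s)
{
    FILE *fp = fopen("/proc/stat", "r");
    int n;

    if (fp == NULL)
        return -1;
    memset(s, 0, sizeof(*s));
    n = fscanf(fp, "cpu %lu %lu %lu %lu %lu %lu %lu",
               &s->user, &s->nice, &s->system, &s->idle,
               &s->iowait, &s->irq, &s->softirq);
    fclose(fp);
    return n >= 4 ? 0 : -1;
}

/* Busy percentage between two samples; 0.0 if no ticks have elapsed. */
double busy_percent(const struct cpu_sample *prev, const struct cpu_sample *cur)
{
    unsigned long busy = (cur->user - prev->user) + (cur->nice - prev->nice)
                       + (cur->system - prev->system)
                       + (cur->irq - prev->irq) + (cur->softirq - prev->softirq);
    unsigned long total = busy + (cur->idle - prev->idle)
                        + (cur->iowait - prev->iowait);

    return total ? 100.0 * (double)busy / (double)total : 0.0;
}
```

Taking one sample before and one after a fixed interval (say ten seconds of streaming) and passing both to busy_percent() gives a figure that can be compared across kernels, independent of how each version of top attributes time to processes.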
* Re: network load 8245 vs. 8347E
2007-06-21 15:33 ` Kumar Gala
@ 2007-06-21 15:56 ` Marc Leeman
2007-06-21 17:13 ` Marc Leeman
0 siblings, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-06-21 15:56 UTC
To: Kumar Gala; +Cc: linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1769 bytes --]

> How are you measuring load? I'm assuming the 8245 and 8347 are using
> the same kernel.

Simply the load of the processor. The 8245 is running ppc/2.6.17, the
8347e is running powerpc/2.6.21.1.

The 8245 kernel is not being upgraded anymore since
1. we're not using them anymore on the new designs
2. I didn't bother fixing the board support after the interrupt
   handling changed in 2.6.18 because of 1.

Disabling NAPI seems to improve the situation a bit, but there's still a
load difference of 25% on a marginally faster processor.

Mem: 10788K used, 116940K free, 0K shrd, 0K buff, 4212K cached
Load average: 0.39 0.43 0.29
  PID USER     STATUS   VSZ  PPID %CPU %MEM COMMAND
  361 barco    SW       152   270 30.8  0.1 recv
 2222 barco    RW      1124   270  0.3  0.8 top
  119 root     SW      1216     1  0.0  0.9 dropbear
  270 barco    SW      1132     1  0.0  0.8 sh
   94 root     SW      1132     1  0.0  0.8 syslogd
    1 root     SW      1128     0  0.0  0.8 init
   95 root     SW      1112     1  0.0  0.8 klogd

Mem: 8144K used, 20944K free, 0K shrd, 888K buff, 2384K cached
Load average: 0.00 0.00 0.00 (Status: S=sleeping R=running, W=waiting)
  PID USER     STATUS   RSS  PPID %CPU %MEM COMMAND
  407 barco    R        124     1  6.4  0.4 recv
  510 root     S        668   130  0.7  2.2 dropbear
13889 root     R        368  7377  0.3  1.2 top
  511 barco    S        476   510  0.0  1.6 sh
   52 root     S        376     1  0.0  1.2 syslogd
    1 root     S        352     0  0.0  1.2 init
   59 root     S        340     1  0.0  1.1 klogd

--
greetz, marc
Oh no, no, no, no I don't boogie.
Crichton - Won't Get Fooled Again
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
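The %CPU column in the two listings is a per-process figure as computed by each board's top. As a cross-check, the receiving process can measure itself with the POSIX getrusage() call, which reports user and system CPU time separately. The following is a sketch under that assumption, with made-up helper names. How much interrupt and softirq work ends up charged to the process depends on the kernel's accounting, so this complements a system-wide measurement and does not replace it.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds. */
double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* CPU time (user + system) consumed by the calling process so far,
 * in seconds, or -1.0 if getrusage() fails. */
double process_cpu_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return tv_seconds(ru.ru_utime) + tv_seconds(ru.ru_stime);
}

/* Share of one CPU used over an interval, in percent. */
double cpu_percent(double cpu_before, double cpu_after, double wall_seconds)
{
    if (wall_seconds <= 0.0)
        return 0.0;
    return 100.0 * (cpu_after - cpu_before) / wall_seconds;
}
```

Calling process_cpu_seconds() at the start and end of a known wall-clock interval (measured with gettimeofday(), for example) and feeding both values to cpu_percent() yields a number directly comparable to the recv lines above.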
* Re: network load 8245 vs. 8347E
2007-06-21 15:56 ` Marc Leeman
@ 2007-06-21 17:13 ` Marc Leeman
[not found] ` <467ABCDC.8060401@genesi-usa.com>
2007-06-28 18:06 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
0 siblings, 2 replies; 12+ messages in thread
From: Marc Leeman @ 2007-06-21 17:13 UTC
To: Kumar Gala; +Cc: linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 574 bytes --]

> Simply the load of the processor. The 8245 is running ppc/2.6.17, the
> 8347e is running powerpc/2.6.21.1.

Sorry, I'm confusing my boards: this particular board with the 8245
processor is running 2.4.34 [1].

so
8245: 2.4.34
8347e: 2.6.21.1

[1] the 2.4 line balanced the load of multiple incoming streams better
than the 2.6 kernels; this is the reason we stuck with 2.4 for this
platform.

--
greetz, marc
He claims to be a human from a planet called Erp.
Aeryn - Premiere
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
[parent not found: <467ABCDC.8060401@genesi-usa.com>]
* Re: network load 8245 vs. 8347E
[not found] ` <467ABCDC.8060401@genesi-usa.com>
@ 2007-06-21 18:33 ` Marc Leeman
2007-06-21 18:53 ` Matt Sealey
0 siblings, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-06-21 18:33 UTC
To: Matt Sealey; +Cc: Linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1611 bytes --]

> Isn't it just because the Intel chipset is a REALLY nice ethernet
> controller and the integrated one in the 8347E isn't as good? :)

Well, that's what I'm afraid of: that for any kind of decent
network performance we'd need an external chipset. But there are some
other hiccups we need to investigate with the 8347.

I just hope it's either a configuration problem or a not-so-efficient
driver implementation -> no redesign :)

> You could try turning the interrupt coalescing off in the e100
> driver and see how well it does, then. Or knock the bundling
> threshold or timer down to something similar to the 8347E is using..
> or turn the ones on the 8347E up to match the ones the e100 driver
> is using :)
>
> All these options used to be modprobe options but the latest
> e100 driver seems to hardcode a bunch of them probably for best
> performance. Anyway, putting them on a level peg would mean at
> least you are comparing onboard apples with pci apples.

I started doing this this evening, but at first glance, changing sysfs
params (on the gianfar driver) didn't change much.

I'll start comparing 8245/e100/2.4.34, 8245/e100/2.6.17 and
8347E/gianfar/2.6.21.1 tomorrow.

Anyway, a load of 30% for a single process where an older processor only
takes around 5% seems too much of a difference to be solved with simple
parameter settings.

--
greetz, marc
Open your ears, or your tentacles, or whatever orifice it is you listen
with!
Crichton - Back and Back and Back to the Future
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: network load 8245 vs. 8347E
2007-06-21 18:33 ` Marc Leeman
@ 2007-06-21 18:53 ` Matt Sealey
2007-06-21 19:09 ` Marc Leeman
0 siblings, 1 reply; 12+ messages in thread
From: Matt Sealey @ 2007-06-21 18:53 UTC
To: Marc Leeman; +Cc: Linuxppc-dev

Sure; the 2.4 ethernet driver for the e100 might be extremely efficient
on the CPU but not very good on bandwidth. They've (Intel) rewritten it
a short while back according to the documentation, so it may be that the
newer version on 2.4 is just as CPU intensive as the gianfar driver,
and you're lucky to be using the old one.

I haven't looked at the 2.4.x version you're running to see what driver
it's using, but Intel do have their latest driver version backported to
the 2.4 kernel series on their site.

You might just be lucky that 2.4 is less bloated and has to handle fewer
features than 2.6 and it's lowering the CPU usage :)

So, I suppose, comparing at least 2.6 series kernels is a start; making
sure you compare the old 2.4 e100 driver to the new 2.6 e100 driver is
another idea. I think there are too many variables.

That said, usually inbuilt ethernets on SoCs.. at least in my
experience.. tend to be more efficient in some ways but not in others.
In any case an Intel network card has always kicked the pants off it.. :(

--
Matt Sealey <matt@genesi-usa•com>
Genesi, Manager, Developer Relations

Marc Leeman wrote:
>> Isn't it just because the Intel chipset is a REALLY nice ethernet
>> controller and the integrated one in the 8347E isn't as good? :)
>
> Well, that's something what I'm afraid off: that for any kind of decent
> network performance we'd need an external chipset. But there are some
> other hickups we need to investigate with the 8347.
>
> I just hope it's either a configuration problem, a not so efficient
> driver implementation -> no re-design :)
>
>> You could try turning the interrupt coalescing off in the e100
>> driver and see how well it does, then. Or knock the bundling
>> threshold or timer down to something similar to the 8347E is using..
>> or turn the ones on the 8347E up to match the ones the e100 driver
>> is using :)
>>
>> All these options used to be modprobe options but the latest
>> e100 driver seems to hardcode a bunch of them probably for best
>> performance. Anyway, putting them on a level peg would mean at
>> least you are comparing onboard apples with pci apples.
>
> I started doing this this evening, but at first glance, changing sysfs
> params (on the gianfar driver) didn't change much.
>
> I'll start comparing 8245/e100/2.4.34, 8245/e100/2.6.17 and
> 8347E/gianfar/2.6.21.1 tomorrow.
>
> Anyway a load of 30% for a single process where an older processor only
> takes around 5% seems too much of a difference to be solved with simple
> parameter settings.
>
* Re: network load 8245 vs. 8347E
2007-06-21 18:53 ` Matt Sealey
@ 2007-06-21 19:09 ` Marc Leeman
0 siblings, 0 replies; 12+ messages in thread
From: Marc Leeman @ 2007-06-21 19:09 UTC
To: Matt Sealey; +Cc: Linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 624 bytes --]

> You might just be lucky that 2.4 is less bloated and has to handle less
> features than 2.6 and it's lowering the CPU usage :)

Actually, I backported the e100 driver from the 2.6 series to the 2.4
series. It performed better.

Perhaps I should re-sync with what is currently in 2.6, but as
I said, not much work is put into the older platforms because a load of
prototype 8347e boards are being ported.

--
greetz, marc
Long enough for me to see your blue backside meditating, but not long
enough for you to touch me.
Rygel - PK Tech Girl
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* 2.4/2.6/ppc/powerpc/8245/8347e
2007-06-21 17:13 ` Marc Leeman
[not found] ` <467ABCDC.8060401@genesi-usa.com>
@ 2007-06-28 18:06 ` Marc Leeman
2007-06-29 14:59 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
1 sibling, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-06-28 18:06 UTC
To: linuxppc-dev
[-- Attachment #1.1: Type: text/plain, Size: 3224 bytes --]

a small update:

> 8245: 2.4.34
> 8347e: 2.6.21.1

I've tried the following setup: multicast stream @8192 kbps, one process
taking in and dumping the data on each board [1].

a) 8245/2.4.34/e100: 2.3.43-k1, @400 MHz
b) 8245/2.6.17/e100: 2.3.43-k1 [2] @350 MHz
c) 8347e/2.6.21.1/gianfar @400 MHz
d) XScale-IXP42x/2.6.18-4/ixp4xx @266 MHz (NSLU2)

(2.3.43-k1 is the e100 driver version).

The process load for taking in the data is:

a) 4-5% [3]
b) 10-11%
c) 13-14%
d) 4-5%

While the current 8347/gianfar platform is the worst performer, the 2.6
kernel with the 2.4 e100 (before the rewrite) seems to perform poorly
too [4].

So the 834x performs worse than the 8245-based configuration even
though it is clocked slightly higher. It seems I have bumped into the
problem that led me to stick with 2.4 in the first place for this
8245 platform, but which I never got round to investigating.

I find these results especially intriguing considering that an ARM
platform (an NSLU2 device) that I had around, clocked at only 66% of the
8347 and at 80% of the 8245, certainly performs on par with the latter...

Even though I will need to recheck this (results to follow), a quick
test didn't reveal any significant difference between a ppc and powerpc
arch in the kernel.

It does look like, on our 8245/83xx platforms, the 2.6.x kernel performs
worse than the 2.4 ppc kernels, and the 83xx configuration is worse
than the 8245-based configuration [5].

In retrospect, we had signals that there was a problem with the
8245/83xx performance over the network last year when investigating
gstreamer, but due to time pressure we assumed it was due to gstreamer
and not the processor. This came as a surprise to some of the people on
the gstreamer mailing list who reported performant ports to ARM
architectures.

The results with the NSLU2 will certainly put heat on us from management
when redesigning or for follow-up designs :(

Anyhow, I'm currently extending my test setups since this is an
important problem and setback. If anyone has a hint that explains what
is going on here, please share it, since solving this will certainly
beat redesigning (esp. considering the timeframe we've been assigned).

I've only found one relevant reference to a 2.4/2.6 network performance
decrease at this point [6].

[1] sources attached
    mcrecv -a 225.1.2.3 -p 12345
    mcsend -a 225.1.2.3 -p 12345 -b 8192
    I'm preparing more tests in the next days, to try to figure out
    what is really going on here.
[2] 2.4 driver ported to 2.6 kernel.
[3] This figure is read from top and not from the app, since the app's
    own figure seems to be an underestimate (./fs/proc/array.c).
[4] I believe I ported the 2.4 e100 to 2.6 two years ago because it
    performed much better, but I'll verify that in the next days.
[5] Obviously, testing 834x against the 2.4 kernel is not really an
    option :)
[6] http://www.mail-archive.com/linux-net@vger.kernel.org/msg01283.html

--
greetz, marc
Don't think I'm going to miss you, any of you. I'm not. Well, maybe a
little bit.
Rygel - Into the Lion's Den - Wolf in Sheep's Clothing chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux [-- Attachment #1.2: recv_mcast.c --] [-- Type: text/x-csrc, Size: 9890 bytes --] #include <sys/types.h> #include <sys/socket.h> #include <netinet/in.h> #include <arpa/inet.h> #include <netdb.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/time.h> #include <time.h> #define BUFSIZE 64*1024 /* Maximal packet content is 64k bytes */ #define MAXPARAMLEN 80 #define STATFILE "/proc/stat" #define SELFSTATFILE "/proc/self/stat" void error(char *msg); void usage(const char* program) { fprintf(stdout,"usage : %s -a address -p port\n",program); } float getsysload() { unsigned long user = 0ul, nice = 0ul, system = 0ul, idle = 0ul; char buf[BUFSIZ]; FILE *fp; char buffer[BUFSIZ]; memset(buffer,0x0,BUFSIZ); if((fp=fopen(STATFILE,"r"))<=0){ fprintf(stderr, "Problem opening %s\n", STATFILE); return EXIT_FAILURE; } fread(buffer, sizeof(char), BUFSIZ, fp); fclose(fp); if(sscanf(buffer,"%s %lu %lu %lu %lu",buf, &user, &nice, &system, &idle)){ return (float)(user+system)/(user+system+nice+idle); } else{ fprintf(stdout, "no matching strings found\n"); } } unsigned long getcurrjiffies() { FILE *fp; char buffer[BUFSIZ]; memset(buffer, 0x0, BUFSIZ); fp=popen("cat /proc/self/stat | cut -d \\ -f 22","r"); fread(buffer, sizeof(char), BUFSIZ, fp); pclose(fp); return atoi(buffer); } unsigned long scan_stat(unsigned long *user, unsigned long *kernel) { FILE *fp; char buffer[BUFSIZ]; uint32_t scanned = 0u; signed int pid = 0; char tcomm[BUFSIZ]; char state = 0x0; signed int ppid = 0; signed int pgid = 0; signed int sid = 0; signed int tty_nr = 0; signed int tty_pgrp = 0; unsigned long flags = 0ul; unsigned long min_flt = 0ul; unsigned long cmin_flt = 0ul; unsigned long maj_flt = 0ul; unsigned long cmaj_flt = 0ul; unsigned long utime = 0ul; unsigned long stime = 0ul; signed long cutime = 0l; signed long cstime = 0l; signed long 
priority = 0l; signed long nice = 0l; signed int num_threads = 0; unsigned long long start_time = 0ul; unsigned long vsize = 0ul; signed long rss = 0l; unsigned long rsslim = 0ul; unsigned long start_code = 0ul; unsigned long end_code = 0ul; unsigned long start_stack = 0ul; unsigned long esp = 0ul; unsigned long eip = 0ul; /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. * Use /proc/#/status for real-time signals. */ unsigned long signalpending = 0ul; unsigned long signalblocked = 0ul; unsigned long sigign = 0ul; unsigned long sigcatch = 0ul; unsigned long wchan = 0ul; unsigned long dummy0 = 0ul; unsigned long dummy1 = 0ul; int exit_signal = 0; int task_cpu = 0; unsigned long rt_priority = 0ul; unsigned long policy = 0ul; unsigned long long delayticks = 0ull; memset(buffer,0x0,BUFSIZ); memset(tcomm,0x0,BUFSIZ); if(!(fp=fopen(SELFSTATFILE,"r"))){ fprintf(stderr,"Error opening \"%s\".\n",SELFSTATFILE); return 0; } fread(buffer, sizeof(char), BUFSIZ, fp); fclose(fp); #if 0 fprintf(stdout,"Reference:\n"); fprintf(stdout,"-------\n"); fprintf(stdout,"%s\n",buffer); fprintf(stdout,"-------\n"); #endif scanned = sscanf(buffer,"%d %s %c %d %d %d %d %d %lu %lu \ %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \ %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n", &pid, tcomm, &state, &ppid, &pgid, &sid, &tty_nr, &tty_pgrp, &flags, &min_flt, &cmin_flt, &maj_flt, &cmaj_flt, &utime, &stime, &cutime, &cstime, &priority, &nice, &num_threads, &start_time, &vsize, &rss, &rsslim, &start_code, &end_code, &start_stack, &esp, &eip, /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. * Use /proc/#/status for real-time signals. 
*/ &signalpending, &signalblocked, &sigign, &sigcatch, &wchan, &dummy0, &dummy1, &exit_signal, &task_cpu, &rt_priority, &policy, &delayticks); #if 0 fprintf(stdout,"scanned %u items.\n",scanned); fprintf(stdout,"%d %s %c %d %d %d %d %d %lu %lu \ %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \ %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n", pid, tcomm, state, ppid, pgid, sid, tty_nr, tty_pgrp, flags, min_flt, cmin_flt, maj_flt, cmaj_flt, utime, stime, cutime, cstime, priority, nice, num_threads, start_time, vsize, rss, rsslim, start_code, end_code, start_stack, esp, eip, /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. * Use /proc/#/status for real-time signals. */ signalpending, signalblocked, sigign, sigcatch, wchan, dummy0, dummy1, exit_signal, task_cpu, rt_priority, policy, delayticks); #endif *user = utime; *kernel = stime; return utime + stime; } int main(int argc, char *argv[]) { extern int getopt(); extern int optind; extern char *optarg; int c_opt; int socket_val, bind_val, recvfromval, ctrlboardlen, rc, recvbuff; unsigned optlen, cameralen; uint16_t mc_port; char mc_addr[16]; struct ip_mreq mreq; struct sockaddr_in ctrlboard, camera; struct in_addr mcast_address; struct hostent *h; unsigned char buffer[BUFSIZE]; FILE *fp; uint32_t ucnt = 0u,i,cnt=0u; uint64_t usecs[8]; uint64_t received[8]; uint64_t treceived = 0ul; struct timeval newtime; unsigned long p_jiffies, c_jiffies; unsigned long puser, pkernel; unsigned long cuser, ckernel; unsigned long long p_start, c_start; /* Initialize the buffer */ memset(mc_addr, 0x0, 16); memset(buffer,0x0,BUFSIZE); for(i=0;i<8;i++){ usecs[i] = 0ul; received[8] = 0ul; } /* handling of command line options */ while ((c_opt = getopt(argc, argv, "a:b:p:")) != EOF) { switch (c_opt) { case 'a': strncpy(mc_addr,optarg,16); break; case 'p': mc_port = (atoi(optarg)); break; default: fprintf(stderr, "%s: Bad Option -%c\n", argv[0], c_opt); 
exit(EXIT_FAILURE); } } if(!mc_addr || !mc_port){ usage(argv[0]); return EXIT_FAILURE; } /* Get mcast address to listen to */ h=gethostbyname(mc_addr); if(h==NULL) { fprintf(stdout,"Unknown group %s\n",mc_addr); exit(1); } memcpy(&mcast_address, h->h_addr_list[0],h->h_length); /* Check given address is multicast */ if(!IN_MULTICAST(ntohl(mcast_address.s_addr))) { fprintf(stdout,"Given address '%s' is not multicast\n", inet_ntoa(mcast_address)); exit(1); } /* Create socket for incoming connections */ socket_val=socket(AF_INET, SOCK_DGRAM, 0); if (socket_val < 0) error("Error opening socket"); /* Set content of ctrlboard */ ctrlboardlen = sizeof(ctrlboard); bzero(&ctrlboard,ctrlboardlen); /* Fill in the UDP Receiver properties */ ctrlboard.sin_family=AF_INET; ctrlboard.sin_addr.s_addr=htonl(INADDR_ANY); ctrlboard.sin_port=htons(mc_port); /* Set socket options */ recvbuff=128*1024; if (setsockopt(socket_val,SOL_SOCKET,SO_RCVBUF,(char *) &recvbuff, sizeof(recvbuff)) < 0) error("Error setting socket options"); /* Get socket options */ optlen=sizeof(recvbuff); if (getsockopt(socket_val,SOL_SOCKET,SO_RCVBUF,(char *) &recvbuff, &optlen) < 0) error("Error getting socket options"); fprintf(stdout,"Receive buffer size: %d \n",recvbuff); /* Bind to associate port number with the socket */ bind_val = bind(socket_val,(struct sockaddr *)&ctrlboard,ctrlboardlen); if (bind_val < 0) error("Error bind"); /* join multicast group */ mreq.imr_multiaddr.s_addr=mcast_address.s_addr; mreq.imr_interface.s_addr=htonl(INADDR_ANY); rc = setsockopt(socket_val,IPPROTO_IP,IP_ADD_MEMBERSHIP, (void *) &mreq, sizeof(mreq)); if(rc<0){ fprintf(stdout,"Cannot join multicast group '%s'", inet_ntoa(mcast_address)); exit(1); } else fprintf(stdout,"Listening to mgroup %s:%d\n", inet_ntoa(mcast_address), mc_port); /* Fill in length of struct sockaddr_in camera */ cameralen = sizeof(camera); gettimeofday(&newtime,NULL); p_jiffies = scan_stat(&puser, &pkernel); p_start = getcurrjiffies(); /* Loop */ while (1) 
{ recvfromval = recvfrom(socket_val,buffer,BUFSIZE,0,(struct sockaddr *)&camera,&cameralen); treceived += recvfromval; if (recvfromval < 0) error("Error recvfrom"); if(!(cnt&0xff)){ double cbitrate = 0.0; uint8_t ccnt = ucnt&0x7; uint8_t pcnt = ccnt?ccnt-1:0x7; uint64_t dt; c_jiffies = scan_stat(&cuser,&ckernel); c_start = getcurrjiffies(); gettimeofday(&newtime,NULL); usecs[ccnt] = (uint64_t)(newtime.tv_sec*1e6+newtime.tv_usec); received[ccnt] = treceived; dt = usecs[ccnt] - usecs[pcnt]; if(usecs[ccnt]<usecs[pcnt]){ ucnt = 0; } else{ unsigned lcnt = 0u; for(i=0;i<8;i++){ lcnt = received[i]; } if(dt){ cbitrate = (((double)lcnt*8)/((double)dt))*1e3; fprintf(stdout,"Approx bitrate is %2.2lf kbps, system load is %2.2f%%, process %2.2f%% (u %2.2f%%, s %2.2f).\n", cbitrate, getsysload()*100, (double)(c_jiffies - p_jiffies)*100/(c_start - p_start), (double)(cuser-puser)*100/(c_start - p_start), (double)(ckernel-pkernel)*100/(c_start - p_start) ); } } ucnt++; treceived = 0; p_jiffies = c_jiffies; p_start = c_start; pkernel = ckernel; puser = cuser; } cnt++; } close(socket_val); fclose(fp); } void error(char *msg) { perror(msg); exit(0); } [-- Attachment #1.3: send_mcast.c --] [-- Type: text/x-csrc, Size: 9247 bytes --] #include <stdio.h> #include <stdlib.h> #include <sys/types.h> #include <netinet/in.h> #include <sys/socket.h> #include <arpa/inet.h> #include <fcntl.h> #include <netdb.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #include <sys/time.h> #include <time.h> #define STATFILE "/proc/stat" #define SELFSTATFILE "/proc/self/stat" void usage(const char* program) { fprintf(stdout,"Usage: %s -a address -p port -b bitrate\n",program); fprintf(stdout," bitrate in kbps.\n"); } float getsysload() { unsigned long user = 0ul, nice = 0ul, system = 0ul, idle = 0ul; char buf[BUFSIZ]; FILE *fp; char buffer[BUFSIZ]; memset(buffer,0x0,BUFSIZ); if((fp=fopen(STATFILE,"r"))<=0){ fprintf(stderr, "Problem opening %s\n", STATFILE); return EXIT_FAILURE; } fread(buffer, 
sizeof(char), BUFSIZ, fp); fclose(fp); if(sscanf(buffer,"%s %lu %lu %lu %lu",buf, &user, &nice, &system, &idle)){ return (float)(user+system)/(user+system+nice+idle); } else{ fprintf(stdout, "no matching strings found\n"); } } unsigned long getcurrjiffies() { FILE *fp; char buffer[BUFSIZ]; memset(buffer, 0x0, BUFSIZ); fp=popen("cat /proc/self/stat | cut -d \\ -f 22","r"); fread(buffer, sizeof(char), BUFSIZ, fp); pclose(fp); return atoi(buffer); } unsigned long scan_stat(unsigned long *user, unsigned long *kernel) { FILE *fp; char buffer[BUFSIZ]; uint32_t scanned = 0u; signed int pid = 0; char tcomm[BUFSIZ]; char state = 0x0; signed int ppid = 0; signed int pgid = 0; signed int sid = 0; signed int tty_nr = 0; signed int tty_pgrp = 0; unsigned long flags = 0ul; unsigned long min_flt = 0ul; unsigned long cmin_flt = 0ul; unsigned long maj_flt = 0ul; unsigned long cmaj_flt = 0ul; unsigned long utime = 0ul; unsigned long stime = 0ul; signed long cutime = 0l; signed long cstime = 0l; signed long priority = 0l; signed long nice = 0l; signed int num_threads = 0; unsigned long long start_time = 0ul; unsigned long vsize = 0ul; signed long rss = 0l; unsigned long rsslim = 0ul; unsigned long start_code = 0ul; unsigned long end_code = 0ul; unsigned long start_stack = 0ul; unsigned long esp = 0ul; unsigned long eip = 0ul; /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. * Use /proc/#/status for real-time signals. 
*/ unsigned long signalpending = 0ul; unsigned long signalblocked = 0ul; unsigned long sigign = 0ul; unsigned long sigcatch = 0ul; unsigned long wchan = 0ul; unsigned long dummy0 = 0ul; unsigned long dummy1 = 0ul; int exit_signal = 0; int task_cpu = 0; unsigned long rt_priority = 0ul; unsigned long policy = 0ul; unsigned long long delayticks = 0ull; memset(buffer,0x0,BUFSIZ); memset(tcomm,0x0,BUFSIZ); if(!(fp=fopen(SELFSTATFILE,"r"))){ fprintf(stderr,"Error opening \"%s\".\n",SELFSTATFILE); return 0; } fread(buffer, sizeof(char), BUFSIZ, fp); fclose(fp); #if 0 fprintf(stdout,"Reference:\n"); fprintf(stdout,"-------\n"); fprintf(stdout,"%s\n",buffer); fprintf(stdout,"-------\n"); #endif scanned = sscanf(buffer,"%d %s %c %d %d %d %d %d %lu %lu \ %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \ %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n", &pid, tcomm, &state, &ppid, &pgid, &sid, &tty_nr, &tty_pgrp, &flags, &min_flt, &cmin_flt, &maj_flt, &cmaj_flt, &utime, &stime, &cutime, &cstime, &priority, &nice, &num_threads, &start_time, &vsize, &rss, &rsslim, &start_code, &end_code, &start_stack, &esp, &eip, /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. * Use /proc/#/status for real-time signals. */ &signalpending, &signalblocked, &sigign, &sigcatch, &wchan, &dummy0, &dummy1, &exit_signal, &task_cpu, &rt_priority, &policy, &delayticks); #if 0 fprintf(stdout,"scanned %u items.\n",scanned); fprintf(stdout,"%d %s %c %d %d %d %d %d %lu %lu \ %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \ %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n", pid, tcomm, state, ppid, pgid, sid, tty_nr, tty_pgrp, flags, min_flt, cmin_flt, maj_flt, cmaj_flt, utime, stime, cutime, cstime, priority, nice, num_threads, start_time, vsize, rss, rsslim, start_code, end_code, start_stack, esp, eip, /* The signal information here is obsolete. * It must be decimal for Linux 2.0 compatibility. 
 * Use /proc/#/status for real-time signals. */
        signalpending, signalblocked, sigign, sigcatch, wchan,
        dummy0, dummy1, exit_signal, task_cpu, rt_priority, policy,
        delayticks);
#endif
    *user = utime;
    *kernel = stime;
    return utime + stime;
}

int main(int argc, char *argv[])
{
    extern int getopt();
    extern int optind;
    extern char *optarg;
    int c_opt;
    int mc_server_socket;   /* int, so a failed socket() (-1) can be detected */
    struct sockaddr_in mc_addr_sockaddr;
    uint8_t TTL = 0u;
    uint8_t buffer[BUFSIZ];
    int retval;
    uint16_t mc_port = 0u;
    char mc_addr[16];
    uint32_t mc_bitrate = 0u, i, cnt = 0u, ucnt = 0u;
    float delay = 0;
    unsigned long usecs[8];
    unsigned long p_jiffies, c_jiffies;
    unsigned long puser, pkernel;
    unsigned long cuser, ckernel;
    unsigned long long p_start, c_start;
    struct timeval newtime;

    /* Init */
    memset(mc_addr, 0x0, sizeof(mc_addr));
    for (i = 0; i < 8; i++) {
        usecs[i] = 0ul;
    }
    for (i = 0; i < BUFSIZ >> 2; i++) {
        ((uint32_t *)buffer)[i] = 0xbadc0ffe;
    }

    /* handling of command line options */
    while ((c_opt = getopt(argc, argv, "a:b:p:")) != EOF) {
        switch (c_opt) {
        case 'a':
            /* copy at most 15 chars so the zeroed buffer stays NUL-terminated */
            strncpy(mc_addr, optarg, sizeof(mc_addr) - 1);
            break;
        case 'b':
            mc_bitrate = (atoi(optarg));
            break;
        case 'p':
            mc_port = (atoi(optarg));
            break;
        default:
            fprintf(stderr, "%s: Bad Option -%c\n", argv[0], c_opt);
            exit(EXIT_FAILURE);
        }
    }
    /* mc_addr is an array and never NULL; test for an empty string instead */
    if (mc_addr[0] == '\0' || !mc_port || !mc_bitrate) {
        usage(argv[0]);
        return EXIT_FAILURE;
    }

    /* Create a multicast socket */
    mc_server_socket = socket(AF_INET, SOCK_DGRAM, 0);
    if (mc_server_socket < 0) {
        fprintf(stdout, "ERROR socket() failed with %d \n", mc_server_socket);
        return EXIT_FAILURE;
    }

    /* Create multicast group address information */
    mc_addr_sockaddr.sin_family = AF_INET;
    mc_addr_sockaddr.sin_addr.s_addr = inet_addr(mc_addr);
    mc_addr_sockaddr.sin_port = htons(mc_port);

    /* Set the TTL for the sends using a setsockopt() */
    TTL = 1;
    retval = setsockopt(mc_server_socket, IPPROTO_IP, IP_MULTICAST_TTL,
                        (char *)&TTL, sizeof(TTL));
    if (retval < 0) {
        fprintf(stdout, "ERROR setsockopt() failed with %d \n", retval);
        return EXIT_FAILURE;
    }

    /* get estimated us delay */
    delay = 1e6 / ((float)(mc_bitrate << 10) / (sizeof(buffer) << 3));

    /* Send MC message */
    fprintf(stdout, "Multicast to socket %s:%u.\n", mc_addr, mc_port);
    fprintf(stdout, "Requested bitrate is %u kbps.\n", mc_bitrate);
    /* sizeof yields size_t; cast it to match the %u conversion */
    fprintf(stdout, "Need %2.2f packets of %u bytes per second.\n",
            ((float)(mc_bitrate << 10) / (sizeof(buffer) << 3)),
            (unsigned int)sizeof(buffer));
    fprintf(stdout, "Setting interpacket delay at %2.2f usec.\n", delay);

    gettimeofday(&newtime, NULL);
    p_jiffies = scan_stat(&puser, &pkernel);
    p_start = getcurrjiffies();
    // usecs[ucnt++] = newtime.tv_sec*1e6+newtime.tv_usec;

    while (1) {
        /* Send buffer as a datagram to the multicast group */
        sendto(mc_server_socket, buffer, sizeof(buffer), 0,
               (struct sockaddr *)&mc_addr_sockaddr, sizeof(mc_addr_sockaddr));
        usleep(delay);

        /* every 256 packets: measure the bitrate and adjust the delay */
        if (!(cnt & 0xff)) {
            double cbitrate = 0.0;
            uint8_t ccnt = ucnt & 0x7;
            uint8_t pcnt = ccnt ? ccnt - 1 : 0x7;
            unsigned long dt;

            c_jiffies = scan_stat(&cuser, &ckernel);
            c_start = getcurrjiffies();
            gettimeofday(&newtime, NULL);
            usecs[ccnt] = newtime.tv_sec * 1e6 + newtime.tv_usec;
            dt = usecs[ccnt] - usecs[pcnt];
            if (usecs[ccnt] < usecs[pcnt]) {
                ucnt = 0;
            }
            else {
                cbitrate = ((((double)sizeof(buffer) * 8) * 0xff) / ((float)dt)) * 1e3;
                fprintf(stdout, "Approx bitrate is %2.2lf kbps, system load is %2.2f%%, process %2.2f%% (u %2.2f%%, s %2.2f%%).\n",
                        cbitrate,
                        getsysload() * 100,
                        (double)(c_jiffies - p_jiffies) * 100 / (c_start - p_start),
                        (double)(cuser - puser) * 100 / (c_start - p_start),
                        (double)(ckernel - pkernel) * 100 / (c_start - p_start));
                if (((cbitrate + 256) < mc_bitrate)) {
                    delay -= 500;
                    fprintf(stdout, "Interpacket delay adjusted to %2.2f usec\n", delay);
                }
                else if ((cbitrate - 256) > mc_bitrate) {
                    delay += 500;
                    fprintf(stdout, "Interpacket delay adjusted to %2.2f usec\n", delay);
                }
                if (delay < 0) {
                    fprintf(stderr, "Cannot send data out fast enough.\n");
                    mc_bitrate -= 1024;
                    delay = 0;
                    fprintf(stdout, "Limiting data to %u kbps.\n", mc_bitrate);
                }
                ucnt++;
            }
            p_jiffies = c_jiffies;
            p_start = c_start;
            pkernel = ckernel;
            puser = cuser;
        }
        cnt++;
    }

    /* Close and clean-up */
    close(mc_server_socket);
    return EXIT_SUCCESS;
}
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
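The interpacket delay that the sender above derives from the requested bitrate can be isolated into a small function. What follows is only a sketch of that arithmetic, not part of the posted attachment: `interpacket_delay_usec` is a hypothetical name, and the packet size is passed in explicitly because `BUFSIZ` differs between C libraries.

```c
#include <stdint.h>

/* Hypothetical helper, not from the posted mcsend source: microseconds to
 * wait between datagrams so that payloads of packet_bytes bytes add up to
 * roughly bitrate_kbps. One kbps is counted as 1024 bit/s, matching the
 * "mc_bitrate << 10" in the original. Returns 0.0 if either input is 0. */
double interpacket_delay_usec(uint32_t bitrate_kbps, uint32_t packet_bytes)
{
    if (bitrate_kbps == 0 || packet_bytes == 0)
        return 0.0;

    double bits_per_second = (double)bitrate_kbps * 1024.0;
    double bits_per_packet = (double)packet_bytes * 8.0;
    double packets_per_second = bits_per_second / bits_per_packet;

    return 1e6 / packets_per_second;
}
```

For example, with `-b 8192` and an assumed packet size of 8192 bytes, this works out to 128 packets per second, i.e. one packet every 7812.5 usec. The value is only a starting point: the sender then nudges the delay in 500 usec steps based on the measured bitrate.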
* Re: 2.4/2.6/ppc/powerpc/8245/8347e
2007-06-28 18:06 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
@ 2007-06-29 14:59 ` Marc Leeman
2007-07-09 15:47 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
0 siblings, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-06-29 14:59 UTC (permalink / raw)
To: linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 3657 bytes --]
More platforms and higher bitrate tests (I've left the previous post in
comment):

> a) 8245/2.4.34/e100: 2.3.43-k1, @400 MHz
> b) 8245/2.6.17/e100: 2.3.43-k1 [2] @350 MHz
> c) 8347e/2.6.21.1/gianfar @400 MHz
> d) XScale-IXP42x/2.6.18-4/ixp4xx @266 MHz (NSLU2)
e) 8245/2.6.17/e100: 3.5.10-k2-NAPI @350 MHz
f) 405/2.6.22-rc6/smsc9117 @200 MHz
g) 405/2.4.32/IBM OCP EMAC: 2.0 @266 MHz
h) Coppermine/2.6.18/e100: 3.5.10-k2-NAPI @930 MHz

> (2.3.43-k1 is the e100 driver version).

Platform h is just an old server, used as a reference to see whether a
2.6.x kernel scales as badly with an e100 on a different architecture.

>
> The process load for taking in the data is:
>
> a) 4-5% [3]
> b) 10-11%
> c) 13-14%
> d) 4-5%
e) 10-11%
f) 2-3%
g) 5%
h) 0%

This situation gets (a lot) worse when increasing the bitrate. With a
bitrate of 12 Mbps, we get the following results:

a) 4-5%
b) 18%
c) 35%
d) 4-7%
e) 18%
f) -
g) -
h) 1-2%

> While the current 8347/gianfar platform is the worst performer, the
> 2.6 kernel with the 2.4 e100 (before the rewrite) seems to perform
> poorly too [4].
>
> So the 834x performs worse wrt the 8245 based configuration even though
> it is slightly higher clocked.
>
> It seems as if I bumped into the problem that led me to stick with the 2.4
> in the first place for this 8245 platform; but never got round to
> investigating. I find these results especially intriguing when
> considering an ARM platform (NSLU2 device) that I had around, clocked at
> only 66% of the 8347 and at 80% of the 8245, performs certainly on par
> with the last one...

The load even worsens non-linearly as the bitrate goes up (I could not
test all the platforms for this, since not all the embedded platforms are
located in our network; I've rallied some colleagues from across the
company to get some other platforms tested, so I will probably have more
data next week).

> Even though I will need to recheck this (results to follow), a quick
> test didn't reveal any significant difference between a ppc and powerpc
> arch in the kernel.
>
> It does look like, on our 8245/83xx platforms, the 2.6.x kernel performs
> worse wrt the 2.4 ppc kernels and the 83xx configuration is worse wrt
> the 8245 based configuration [5]. In retrospect, we had signals that
> there was a problem with the 8245/83xx performance over the network last
> year when investigating gstreamer, but due to time pressure we assumed
> it was due to gstreamer and not the processor. This came as a surprise to
> some of the ppl on the gstreamer mailing list that reported performant
> ports to ARM architectures.

If I look at platform (f), 405/2.6.22-rc6, it doesn't seem to be a
general powerpc problem, but just an 824x/83xx or platform issue.

> The results with the NSLU2 will certainly put heat on us from management
> when redesigning or for follow up designs :(
>
> Anyhow, I'm currently extending my test setups since this is an
> important problem and set back.
>
> If anyone has a hint to explaining what is going on here, please do
> since solving this will certainly beat redesigning (esp. considering the
> timeframe we've been assigned).
>
>
> I've only found one relevant reference to 2.4/2.6 network performance
> decrease at this point [6].
>
>
> [1] sources attached
mcrecv -a 225.1.2.3 -p 12345
mcsend -a 225.1.2.3 -p 12345 -b 8192
--
greetz, marc
Aeryn, did I say or do anything to piss you off? I mean other than
caving in the side of your head?
Crichton - Die Me, Dichotomy
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
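For reading the "process load" percentages in this message: the attached sender reports the share of elapsed jiffies that the process itself consumed between two samples. A minimal sketch of that ratio in C follows, assuming the receiver computes its figure the same way (that part of the source is not shown here). `process_load_percent` is a hypothetical name, not a function from the attachments.

```c
/* Hypothetical helper: percentage of the elapsed system jiffies that the
 * process consumed between two samples. proc_* are the process's
 * utime + stime counters, total_* the system-wide jiffies counters taken
 * at the same moments. Returns 0.0 when no time has elapsed or when a
 * counter went backwards. */
double process_load_percent(unsigned long proc_prev, unsigned long proc_curr,
                            unsigned long long total_prev,
                            unsigned long long total_curr)
{
    if (total_curr <= total_prev || proc_curr < proc_prev)
        return 0.0;

    return (double)(proc_curr - proc_prev) * 100.0 /
           (double)(total_curr - total_prev);
}
```

As an illustration with made-up counter values: a process that accumulates 35 jiffies while the system advances by 100 is at 35%, the same order as the figure reported for platform c) at 12 Mbps.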
* Re: 2.4/2.6/ppc/powerpc/8245/8347e
2007-06-29 14:59 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
@ 2007-07-09 15:47 ` Marc Leeman
2007-07-09 19:56 ` 2.4/2.6/ppc/powerpc/8245/8347e Linas Vepstas
0 siblings, 1 reply; 12+ messages in thread
From: Marc Leeman @ 2007-07-09 15:47 UTC (permalink / raw)
To: linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 834 bytes --]
> More platforms and higher bitrate tests (I've left the previous post in
> comment):

I was finally able to figure out the culprit:

CONFIG_SLOB=y instead of CONFIG_SLAB=y

--------
CONFIG_SLAB:

Disabling this replaces the advanced SLAB allocator and
kmalloc support with the drastically simpler SLOB allocator.
SLOB is more space efficient but does not scale well and is
more susceptible to fragmentation.
--------

I was expecting a lower DMM performance but wasn't expecting such a
drain on kernel/network load.

The original reason for this change was a fixed flashmap and a larger
2.6 kernel that didn't fit in this region (backwards compatible).
--
greetz, marc
I feel like I had a spiritual enema.
Jool - Losing Time
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
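One way to catch this kind of configuration difference early is to scan a kernel `.config` for the allocator option lines. The following is a minimal sketch in C, assuming the config text has already been read into memory. `has_config_line` and `selected_allocator` are hypothetical helper names. `CONFIG_SLUB` is included as the third allocator choice, which only exists from 2.6.22 on.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if `line` occurs in `text` as a complete
 * line, so "# CONFIG_SLOB is not set" does not count as CONFIG_SLOB=y. */
static int has_config_line(const char *text, const char *line)
{
    size_t len = strlen(line);
    const char *p = text;

    while ((p = strstr(p, line)) != NULL) {
        int at_line_start = (p == text) || (p[-1] == '\n');
        char after = p[len];

        if (at_line_start &&
            (after == '\0' || after == '\n' || after == '\r'))
            return 1;
        p += len;
    }
    return 0;
}

/* Hypothetical helper: name of the allocator selected in a kernel .config
 * text, or "unknown" if none of the three options is set to y. */
const char *selected_allocator(const char *config_text)
{
    if (has_config_line(config_text, "CONFIG_SLAB=y"))
        return "SLAB";
    if (has_config_line(config_text, "CONFIG_SLOB=y"))
        return "SLOB";
    if (has_config_line(config_text, "CONFIG_SLUB=y"))
        return "SLUB";
    return "unknown";
}
```

Comparing the result for the configs of two boards would have flagged the SLOB/SLAB difference described above before any network measurements were taken.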
* Re: 2.4/2.6/ppc/powerpc/8245/8347e
2007-07-09 15:47 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
@ 2007-07-09 19:56 ` Linas Vepstas
2007-07-10 7:55 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
0 siblings, 1 reply; 12+ messages in thread
From: Linas Vepstas @ 2007-07-09 19:56 UTC (permalink / raw)
To: Marc Leeman; +Cc: linuxppc-dev
On Mon, Jul 09, 2007 at 05:47:23PM +0200, Marc Leeman wrote:
>
> Disabling this replaces the advanced SLAB allocator and
> kmalloc support with the drastically simpler SLOB allocator.
> SLOB is more space efficient but does not scale well and is
> more susceptible to fragmentation.
> --------
>
> I was expecting a lower DMM performance but wasn't expecting such a
> drain on kernel/network load.

OK, to be clear: you seem to be saying that using the SLOB instead
of the SLAB allocator results in such terrible memory fragmentation
that network performance is degraded by large factors (2x or 5x or
something like that, if I remember your earlier emails). Is that right?

I thought I heard about some memory-defrag patches being posted.
What happens if these are used together with SLOB? Does one regain the
lost performance? Perhaps maybe one gets even better performance?

--linas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.4/2.6/ppc/powerpc/8245/8347e
2007-07-09 19:56 ` 2.4/2.6/ppc/powerpc/8245/8347e Linas Vepstas
@ 2007-07-10 7:55 ` Marc Leeman
0 siblings, 0 replies; 12+ messages in thread
From: Marc Leeman @ 2007-07-10 7:55 UTC (permalink / raw)
To: Linas Vepstas; +Cc: linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 1539 bytes --]
> > I was expecting a lower DMM performance but wasn't expecting such a
> > drain on kernel/network load.
>
> OK, to be clear: you seem to be saying that using the SLOB instead
> of the SLAB allocator results in such terrible memory fragmentation
> that network performance is degraded by large factors (2x or 5x or
> something like that, if I remember your earlier emails). Is that right?

Yep, I thought I would at least post my findings after harassing the
list with my posts.

Well, I don't really know if it is the fragmentation that comes into
play, or if it is simply that the implementation of the slob allocator
is that much more inefficient in allocating free blocks of memory; but
that's about right.

> I thought I heard about some memory-defrag patches being posted.
> What happens if these are used together with SLOB? Does one regain the
> lost performance? Perhaps maybe one gets even better performance?

In the ChangeLog of the 2.6.22, I saw something about a slub allocator
that I want to test; I'll give your suggestion a go too, though I would
not expect significant improvements: I suspect it's the slob
implementation that is slower.

But I had a small problem with my flash not being detected anymore when
quickly booting the 2.6.22. I'll look into it today; there was a note in
the ChangeLog for powerpc about this IIRC.
--
greetz, marc
Better wed than dead.
Crichton - Look at the Princess - A Kiss is Just a Kiss
chiana 2.6.18-4-ixp4xx #1 Tue Mar 27 18:01:56 BST 2007 GNU/Linux
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2007-07-10 7:55 UTC | newest]
Thread overview: 12+ messages
2007-06-21 14:15 network load 8245 vs. 8347E Marc Leeman
2007-06-21 15:33 ` Kumar Gala
2007-06-21 15:56 ` Marc Leeman
2007-06-21 17:13 ` Marc Leeman
[not found] ` <467ABCDC.8060401@genesi-usa.com>
2007-06-21 18:33 ` Marc Leeman
2007-06-21 18:53 ` Matt Sealey
2007-06-21 19:09 ` Marc Leeman
2007-06-28 18:06 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
2007-06-29 14:59 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
2007-07-09 15:47 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman
2007-07-09 19:56 ` 2.4/2.6/ppc/powerpc/8245/8347e Linas Vepstas
2007-07-10 7:55 ` 2.4/2.6/ppc/powerpc/8245/8347e Marc Leeman