Forum Topic - Strange performance issue: (7 Items)

Strange performance issue
If I compile a very small program:

qcc -V3.3.5,gcc_ntox86 program.c

It compiles in .05 seconds, and I can see that there were 916 reads from the cache and 3 from the disk.

qcc -V4.3.2,gcc_ntox86 program.c

It compiles in 1.72 seconds, and I can see that there were 1202 reads from the cache and 5789!!!! from the disk. Even if
I run it 10 times in a row, it's always the same. Yet the filesystem cache is 128 Meg, so it's not because it doesn't fit
in the cache.

This is odd?


Re: Strange performance issue
>
> If I compile a very small program:
>
> qcc -V3.3.5,gcc_ntox86 program.c
>
> It compiles in .05 seconds, and I can see that there were 916 reads from the
> cache and 3 from the disk.
>
> qcc -V4.3.2,gcc_ntox86 program.c
>
> It compiles in 1.72 seconds, and I can see that there were 1202 reads from the
> cache and 5789!!!! from the disk. Even if I run it 10 times in a row, it's
> always the same. Yet the filesystem cache is 128 Meg, so it's not because it
> doesn't fit in the cache.
>
> This is odd?

Yes, I find it strange. I have not observed any performance similar to what you describe. May I ask how you are
measuring the reads?

I'm very interested in investigating this behavior, as compilation times using gcc 4.2 need to be faster than with gcc 3.3.5.


Regards,

Ryan Mansfield

Re: Strange performance issue
Take a trace, take a trace, take a trace. :-)

Hmmm, gotta get moving on dtrace, that would be handy in this case.


-- 
cburgess@qnx.com

Re: Strange performance issue
> Take a trace, take a trace, take a trace. :-)
> 
> Hmmm, gotta get moving on dtrace, that would be handy in this case.
> 

I have attached a trace file (compressed with 7za).

cc1 is in the WaitPage state most of the time???

In this sample qcc -V4.2.1 took 1.87 seconds, while -V3.3.5 took .03 seconds...

Before taking the trace I ran qcc -V4.2.1 foo.c a few times to make sure it all sits in the file system cache.

Attachment: Text trenton-trace-071128-164416.7z 1.08 MB

Re: Strange performance issue
I took a trace of my own, and I can confirm that cc1 is in WaitPage most of
the time.

This is a combination of a larger binary, plus the sticky bit was
forgotten on the binary.
This sticky bit hints to procnto that you use the program a lot, and
keeps it in memory for 30 seconds after exit. It can make a big
difference for some programs, especially cc1.

You can replace the sticky bit with

# chmod +t cc1

I notice that it is missing on a great many binaries in $QNX_HOST -
you might want to set it on them all.

Colin
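
A minimal sketch of how the "set it on them all" suggestion above could be scripted. The helper names list_missing_sticky and set_sticky are hypothetical, not part of any QNX tooling, and the sketch assumes a find that accepts octal -perm tests and a user with permission to chmod the files (possibly root). It only touches regular files that are executable by their owner.

```shell
#!/bin/sh
# Hypothetical helpers: not part of QNX, just an illustration.

# Print regular, owner-executable files under $1 that lack the sticky bit.
list_missing_sticky() {
    find "$1" -type f -perm -100 ! -perm -1000 -print
}

# Set the sticky bit (chmod +t) on those same files.
set_sticky() {
    find "$1" -type f -perm -100 ! -perm -1000 -exec chmod +t {} \;
}

# Only act when QNX_HOST is set; otherwise do nothing.
if [ -n "${QNX_HOST:-}" ]; then
    list_missing_sticky "$QNX_HOST"   # review what will change
    set_sticky "$QNX_HOST"
fi
```

Running list_missing_sticky on its own first is a cheap way to see how many binaries are affected before changing anything.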


-- 
cburgess@qnx.com

Re: Strange performance issue
> I took a trace of my own, and I can confirm that cc1 is in WaitPage most of
> the time.
>
> This is a combination of a larger binary, plus the sticky bit was
> forgotten on the binary.
> This sticky bit hints to procnto that you use the program a lot, and
> keeps it in memory for 30 seconds after exit. It can make a big
> difference for some programs, especially cc1.
>
> You can replace the sticky bit with
>
> # chmod +t cc1
> 

Now you're talking!!!

I still find it strange that it was that slow, because I set up the file system cache to be 64 Meg, which should be enough
to hold everything needed to compile a very small program, no?

cc1 is 5 Meg, so it shouldn't take long to load it from the cache into memory. Unless it's because it's allocating lots of
memory?

Re: Strange performance issue
Mario Charest wrote:
>
> > You can replace the sticky bit with
> >
> > # chmod +t cc1
> >
>
> Now you're talking!!!
>
> I still find it strange that it was that slow, because I set up the
> file system cache to be 64 Meg, which should be enough to hold
> everything needed to compile a very small program, no?
>
> cc1 is 5 Meg, so it shouldn't take long to load it from the cache
> into memory. Unless it's because it's allocating lots of memory?
>
Well, it's faulting it in, so each fault means a pulse, procnto running, a read
from devb, and a restart of cc1, which adds a lot of overhead.

Colin
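
For a rough sense of scale, the figures from the first post can be turned into a per-read cost. This is only a back-of-the-envelope sketch: it assumes that all of the extra wall-clock time (1.72 s versus .05 s) is spent on the 5789 disk reads, which the thread does not establish, and per_read_ms is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: average extra time per read, in milliseconds.
# Arguments: slow elapsed seconds, fast elapsed seconds, number of reads.
per_read_ms() {
    awk -v slow="$1" -v fast="$2" -v reads="$3" \
        'BEGIN { printf "%.3f\n", (slow - fast) / reads * 1000 }'
}

# Figures quoted in the first post of this thread.
per_read_ms 1.72 0.05 5789
```

Under that assumption each read costs roughly 0.3 ms, which is small on its own but adds up over thousands of fault round trips.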

-- 
cburgess@qnx.com