qcc using emulated tls rather than %gs
09/15/2009 2:03 PM
post38018
I am starting to port a large multi-threaded application from SMP Linux to SMP QNX 6.4.1. It uses the gcc __thread
keyword for thread-local variables in performance-critical code. Under gcc 3.4.4 on Linux, this source snippet:
    __thread int _local_number;

    int getLocalNumber(void)
    {
        return _local_number;
    }
generates this assembly:
    083202b4 <getLocalNumber>:
     83202b4:   55                      push   %ebp
     83202b5:   89 e5                   mov    %esp,%ebp
     83202b7:   a1 14 81 38 08          mov    0x8388114,%eax
     83202bc:   65 8b 00                mov    %gs:(%eax),%eax
     83202bf:   5d                      pop    %ebp
     83202c0:   c3                      ret
Note the use of the %gs segment register to access TLS variables.
qcc v4.3.3 for QNX 6.4.1 instead generates this assembly for the same snippet:
    0834ee10 <getLocalNumber>:
     834ee10:   55                      push   %ebp
     834ee11:   89 e5                   mov    %esp,%ebp
     834ee13:   83 ec 08                sub    $0x8,%esp
     834ee16:   c7 04 24 c0 6f 40 08    movl   $0x8406fc0,(%esp)
     834ee1d:   e8 0a 50 05 00          call   83a3e2c <__emutls_get_address>
     834ee22:   8b 00                   mov    (%eax),%eax
     834ee24:   c9                      leave
     834ee25:   c3                      ret
The __emutls_get_address function appears to use the POSIX pthread_getspecific function to obtain thread-local data.
Emulated TLS is significantly slower than access through %gs. My questions are:
1. Is there a qcc option (with corresponding support in the runtime) that provides TLS via the %gs
register?
2. Is there any plan to update qcc and QNX to support %gs for TLS in the future?
3. If I allocate my own per-thread data structs and point %gs at them, can I expect %gs to be preserved
across context switches of my application threads?
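
For reference, here is a rough sketch of what the emulated path seems to amount to, assuming __emutls_get_address behaves approximately like a pthread_getspecific lookup with lazy allocation. This is not the libgcc source; local_number_key, local_number_address and setLocalNumber are hypothetical names made up for illustration. It shows why each access costs a function call plus a key lookup instead of a single %gs-relative load.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the emulated TLS path; not the libgcc source. */
static pthread_key_t local_number_key;
static pthread_once_t local_number_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() is registered as the destructor for each thread's slot. */
    pthread_key_create(&local_number_key, free);
}

/* Return this thread's slot, allocating it zero-initialized on first use. */
static int *local_number_address(void)
{
    int *slot;

    pthread_once(&local_number_once, make_key);
    slot = pthread_getspecific(local_number_key);
    if (slot == NULL) {
        slot = calloc(1, sizeof *slot);
        if (slot == NULL)
            abort();
        pthread_setspecific(local_number_key, slot);
    }
    return slot;
}

/* Equivalent of reading the __thread variable. */
int getLocalNumber(void)
{
    return *local_number_address();
}

/* Equivalent of assigning to the __thread variable. */
void setLocalNumber(int value)
{
    *local_number_address() = value;
}
```

If the real implementation works along these lines, one possible workaround in hot loops would be to fetch the address once per function and reuse the pointer, rather than going through the lookup on every access.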