Hi,

note: all I know about OpenMP is what I just found with a quick search ... so take anything I say with caution 😉


In general, please supply as much information as possible when writing things like "better performance", e.g.:

- what are the actual numbers? (or even more generally, what is considered "better performance"?)

- how was this measured?

- is this reproducible?

- what else did you try to understand what's going on, and what were the results?
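As an illustration of the "how was this measured?" point, here is a rough sketch of one way to get comparable numbers. The names time_vector_add and REPEATS are made up for this example, the logging call is left out so that only the arithmetic is timed, and the timespec_get fallback is only there so the code still builds without OpenMP. Treat it as a starting point, not as the right way to benchmark.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 10000
#define REPEATS 10000   /* hypothetical repeat count, mirrors the outer loop in the post */

float a[N], b[N], c[N];

/* Wall-clock seconds: omp_get_wtime() with OpenMP, C11 timespec_get() otherwise. */
static double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Runs the vector add REPEATS times using `threads` threads and
 * returns the elapsed wall-clock time in seconds. */
double time_vector_add(int threads) {
    assert(threads > 0);

    for (int i = 0; i < N; i++) {
        a[i] = i * 2.0f;
        b[i] = i * 3.0f;
    }

    double start = now_seconds();
    #pragma omp parallel num_threads(threads)
    for (int j = 0; j < REPEATS; ++j) {
        /* Work-sharing loop: iterations are split across the team,
         * with an implicit barrier at the end of every pass. */
        #pragma omp for
        for (int i = 0; i < N; i++) {
            c[i] = a[i] + b[i];
        }
    }
    return now_seconds() - start;
}
```

Calling time_vector_add(1) and time_vector_add(4) several times each and posting all the values (plus compiler flags and the target hardware) would give everyone something concrete to compare.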


Looking at your example, and comparing it to what I found, I believe you may have missed segmenting 'c' into per-thread sections, and you may have heavy contention when writing to this array.
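To make the "per-thread sections" idea concrete, here is a minimal sketch, assuming a static schedule is acceptable. The function name add_in_contiguous_chunks and the owner array are hypothetical and exist only to show which thread wrote which element. Whether this is really the cause of the slowdown is something the measurements would have to confirm.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Computes c[i] = a[i] + b[i] and records in owner[i] the number of the
 * thread that wrote element i. With schedule(static) and no chunk size,
 * each thread gets at most one contiguous block of iterations, so every
 * thread writes its own section of c. */
void add_in_contiguous_chunks(const float *a, const float *b, float *c,
                              int *owner, int n, int threads) {
    assert(n >= 0 && threads > 0);

    #pragma omp parallel for num_threads(threads) schedule(static)
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
#ifdef _OPENMP
        owner[i] = omp_get_thread_num();
#else
        owner[i] = 0;   /* built without OpenMP: a single thread does all the work */
#endif
    }
}
```

Printing the points where owner[i] changes shows the section boundaries; only the cache lines that straddle those boundaries are touched by more than one thread.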


HTH

-- Michael



From: Wenbin Wang <community-noreply@qnx.com>
Sent: Tuesday, January 21, 2020 06:18
To: ostech-core_os
Subject: openmp poor performance
 
OpenMP performance is much worse in multi-threaded mode than in single-threaded mode.

code:
```
#include <stdio.h>
#include <omp.h>

#define N 10000
int main(int argc, char** argv) {
  float a[N], b[N], c[N];
  int i;

  for (i = 0; i < N; i++) {
    a[i] = i * 2.0;
    b[i] = i * 3.0;
  }

 
  #pragma omp parallel num_threads(4) shared(a, b, c) private(i)
  for (size_t j = 0; j < 10000; ++j) {
    #pragma omp for
    for (i = 0; i < N; i++) {
      c[i] = a[i] + b[i];
      VLOG(3) << c[i];
    }
  }

}
```

Changing num_threads(4) to num_threads(1) gives better performance.

Any idea why?



_______________________________________________

OSTech
http://community.qnx.com/sf/go/post120163