Forum Topic - pthread_rwlock_unlock( ) return-value discrepancy?: (10 Items)
   
pthread_rwlock_unlock( ) return-value discrepancy?  
Some code encapsulates pthread_rwlock_trywrlock( ) in one method, and pthread_rwlock_unlock( ) in another, of a 
singleton class/object.  Another class calls these two methods, in lock-then-unlock sequence with some other work in 
between.

The write-lock operation succeeds; pthread_rwlock_trywrlock( ) returns 0 (EOK).  The unlock operation fails; 
pthread_rwlock_unlock( ) returns 1 (EPERM).

I have two problems.

1) It's not clear why pthread_rwlock_unlock() is returning EPERM.  The Library Reference Manual says that this indicates
 that "No thread has a read or write lock on rwl or the calling thread doesn’t have a write lock on rwl," but in this 
case it seems pretty obvious that the calling thread DOES "have a write lock on rwl."  

2) In order to determine what was going on, I had to print the return values as ints; passing the value 1 (EPERM) to 
strerror( ) returns the string "No error," which I couldn't distinguish from 0 (EOK).  Comments in errno.h suggest that 
EPERM should be stringified instead as "Not owner."

Chris
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
Following up to myself...

Looks like strerror(EPERM) is okay after all.  I was feeding it the wrong variable, which had a different value.  
Expectations vs. implementation was messed up.

Still don't understand why I'm getting EPERM, though.  I've confirmed that the same lock (the only one in the program, 
but you never know) is being successfully "trywrlock"-ed and unsuccessfully unlocked.

Um, is it possible for pthread_rwlock_trywrlock() to return EOK while _not_ locking the lock?

Chris
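
For readers following along, here is a minimal sketch of the lock-then-unlock sequence described above, with each return value logged both as an int and via strerror( ).  The pthread_rwlock_* functions return the error number directly rather than setting errno, so the value passed to strerror( ) should be the very variable that captured the return.  The names report( ) and lock_work_unlock( ) are hypothetical and are not taken from the poster's code.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: log the numeric code next to its strerror() text so
// that 0 (EOK) and 1 (EPERM) cannot be confused, then pass the code through.
int report(const char* what, int rc) {
    std::fprintf(stderr, "%s returned %d (%s)\n", what, rc, std::strerror(rc));
    return rc;
}

// Hypothetical sequence: try to take the write lock, do some work, unlock.
// Returns 0 on success, or the first non-zero code from either call.
int lock_work_unlock(pthread_rwlock_t* rwl) {
    assert(rwl != nullptr);

    int rc = report("pthread_rwlock_trywrlock", pthread_rwlock_trywrlock(rwl));
    if (rc != 0) {
        return rc;  // e.g. EBUSY: the lock was NOT taken, so do not unlock
    }

    // ... work that needs the write lock would go here ...

    return report("pthread_rwlock_unlock", pthread_rwlock_unlock(rwl));
}
```

With an uncontended lock both calls should log 0.  If the unlock logs a non-zero value, the log at least shows which call produced which code.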
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
On Fri, Mar 19, 2010 at 09:59:23AM -0400, Chris Chiesa wrote:
> Following up to myself...
> 
> Looks like strerror(EPERM) is okay after all.  I was feeding it the wrong variable, which had a different value.  
> Expectations vs. implementation was messed up.
> 
> Still don't understand why I'm getting EPERM, though.  I've confirmed that the same lock (the only one in the program,
> but you never know) is being successfully "trywrlock"-ed and unsuccessfully unlocked.
> 
> Um, is it possible for pthread_rwlock_trywrlock() to return EOK while _not_ locking the lock?
> 
> Chris

Can you post a test case?

-seanb
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
> Can you post a simple test case? 

Not really; I was trying to write a simple program to demonstrate it -- but it turns out that outside of my company's 
(SConstruct.py-based) construction scheme I can't even get 'hello world' to run.  (It compiles, but at run time I get a 
"syntax error:'(' unexpected" error message.  That's a topic for a whole other post, someday when I have time.)

Chris 
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
On 19/03/2010 10:49, Chris Chiesa wrote:
>
>> Can you post a simple test case?
>
> Not really; I was trying to write a simple program to demonstrate it -- but it turns out that outside of my company's 
> (SConstruct.py-based) construction scheme I can't even get 'hello world' to run.  (It compiles, but at run time I get a 
> "syntax error:'(' unexpected" error message.  That's a topic for a whole other post, someday when I have time.)

That looks like you are trying to run a binary built for one 
target architecture on another.

---
Aleksandar
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
> That looks like you are trying to run a binary built for one 
> target architecture on another.

Thanks.  I'll look into it.  I'm pretty sure I'm using the right compiler, but maybe I'm missing a crucial command-line 
switch or something.

Re: pthread_rwlock_unlock( ) return-value discrepancy?  
> 
> > That looks like you are trying to run a binary built for one 
> > target architecture on another.
> 
> Thanks.  I'll look into it.  I'm pretty sure I'm using the right compiler, but
>  maybe I'm missing a crucial command-line switch or something.
> 

Figured it out.  First, I had to (correctly) use company SConstruct procedures after all.  Second, the FileZilla FTP 
client, in AUTO mode, claimed to transmit my binary (from development system to target system) in BINARY mode, but 
didn't really.  When I forcibly set FileZilla to BINARY mode, the same binary arrived on the target in a form that 
executed successfully.  Whew.
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
I have written several versions of a test-case program.  It exhibits
precisely the behavior I would expect, under non-locking,
non-blocking-locking, and blocking-locking conditions.  It
specifically fails to exhibit the precise behavior that (I think) I'm
seeing in my production code.  However, it has made me aware that
non-blocking locking requires a somewhat different thought process
than the more familiar blocking kind, and I am now struggling with
coming up with the right way to use non-blocking locking.  I can't
afford to have my production code block here, but it seems the only
alternative is to massively restructure my code to continuously retry
operations that are buried deep in a resource manager, where retrying
is awkward at best.  On the other hand, the blocking versions of the
functions appear to be very stable so maybe I can get away with using
them anyway.  Decisions, decisions.  Are there any standard paradigms/
patterns for using the non-blocking (...try_...) locking functions?
If so, I would greatly appreciate hearing about them.
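
One pattern sometimes used with the non-blocking calls is "try, else defer": if the lock is busy, the caller records the work and moves on, and the recorded work is applied the next time the lock is obtained.  The following is only a sketch under stated assumptions, not a recommendation for the production code discussed here.  The type SharedText and the function try_update( ) are hypothetical, and the sketch assumes a single writer thread, so the pending queue itself needs no lock.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical shared state guarded by a read-write lock.
struct SharedText {
    pthread_rwlock_t lock;
    std::string text;                  // protected by 'lock'
    std::vector<std::string> pending;  // assumption: touched by ONE writer thread only

    SharedText() {
        int rc = pthread_rwlock_init(&lock, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    ~SharedText() { pthread_rwlock_destroy(&lock); }

    SharedText(const SharedText&) = delete;
    SharedText& operator=(const SharedText&) = delete;
};

// Hypothetical writer-side update that never blocks.
// Returns 0 if the update (and any deferred ones) were applied,
// EBUSY if the update was queued for later, or another error code.
int try_update(SharedText& s, const std::string& value) {
    int rc = pthread_rwlock_trywrlock(&s.lock);
    if (rc == EBUSY) {
        s.pending.push_back(value);  // lock not taken: defer, do not unlock
        return rc;
    }
    if (rc != 0) {
        return rc;                   // unexpected error: nothing locked
    }

    // Lock held: replay deferred updates in order, then the new one.
    s.pending.push_back(value);
    for (const std::string& p : s.pending) {
        s.text = p;
    }
    s.pending.clear();

    return pthread_rwlock_unlock(&s.lock);
}
```

The point of the sketch is that the retry lives in one place (the next successful update) instead of being spread through the callers.  Whether deferring is acceptable depends on the application; if stale data cannot be tolerated, the blocking calls may be the simpler choice.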

Those not wishing/needing to examine/experiment further can stop
reading here; I thank you for your time.  For those who do wish to go
deeper, I have attached a zip archive containing several relevant
files, as follows.

File tryit.cpp is, of course, the program source code.  As far as I
can tell, it should build on anybody's QNX platform (FWIW, I'm using
6.3.2).  Program history and output are as follows.

The first version of the program had no locking, and exhibited race
conditions: some of the output lines showed the string containing a
run of one character followed by a run of another; i.e. the reader had
caught the writer in the act of changing the string's contents.  See
thread_interlace.txt.  This is exactly as I expected (and it took some
time to cobble up a situation in which the effects of not locking
would be easily visible).

The second version of the program used the non-blocking (...try...)
versions of the locking functions.  See file radio_try_lockfails.txt.
It appears that the writer and first three readers started up
more-or-less immediately, and ran without locking problems but also
with the readers producing no output, i.e. perhaps not really running,
perhaps starved by the writer, though it's not clear how this can
occur.  Startup of the fourth reader is reported quite belatedly, and
locking problems begin immediately after that startup is announced.
The locking problem is specifically that the first reader tries to
lock while the writer holds the lock; this is as I would expect, and
is in more-or-less the same "category" as the problem I see in my
production code -- but is DIFFERENT FROM the behavior of my production
code, in which problems begin when the WRITER *UN*LOCKS the lock that
IT ITSELF allegedly holds.  (I say "allegedly" because it is
impossible to know for sure without doing your own bookkeeping which
requires making assumptions about the state of the underlying
lock.  If only struct pthread_rwlock_t weren't opaque, I could examine
the lock itself under various conditions and see what IT (i.e. the
locking facility) thinks is going on, and probably solve all of this
quite quickly.  Many years of experiences like this one have given me
an abiding dislike for opaque data structures.)

Several of the behaviors of this program are difficult to explain: 1) The belated
indication of Thread 4 startup, the lack of output prior to that
point, and the coincidence of that thread startup with the beginning
of locking problems, suggests that the readers are not fully started
until after the writer has been running for some time -- which is very
odd since the readers are started BEFORE the writer.  Well, maybe
starvation is occurring.  2) Even though "locking is failing," I see
no evidence of the race condition that appeared in the no-locking
version of the program.  Conversely, note that while the...
Attachment: Text example.zip 9.41 KB
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
I may have nailed it down.  I added a bunch more debugging statements to my production code and I now see it "locking 
once, unlocking twice."  Now have to figure out how/where THAT's happening.  Details irrelevant to this discussion.

Glad to know it's not a problem with the locking facility.  That was really bugging me.
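
A "lock once, unlock twice" bug of this kind can be caught with the sort of bookkeeping mentioned earlier in the thread.  The sketch below is a hypothetical per-caller token class (WriteLockToken is not from the poster's code) that remembers whether this caller currently holds the write lock and refuses a second unlock without touching the underlying lock.  It assumes each token is used by one thread only.  Note that POSIX leaves the result of unlocking a lock the caller does not hold undefined, so the EPERM seen on QNX should not be relied on elsewhere.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical bookkeeping wrapper: one token per caller, used by one thread.
class WriteLockToken {
public:
    explicit WriteLockToken(pthread_rwlock_t* rwl) : rwl_(rwl), held_(false) {
        assert(rwl != nullptr);
    }

    // Release the lock if the caller forgot to, so it is never leaked.
    ~WriteLockToken() {
        if (held_) {
            pthread_rwlock_unlock(rwl_);
        }
    }

    WriteLockToken(const WriteLockToken&) = delete;
    WriteLockToken& operator=(const WriteLockToken&) = delete;

    // Non-blocking attempt; records success so unlock() can be checked.
    int try_lock() {
        if (held_) {
            return EDEADLK;  // this token already holds the write lock
        }
        int rc = pthread_rwlock_trywrlock(rwl_);
        if (rc == 0) {
            held_ = true;
        }
        return rc;
    }

    // Returns EPERM for an unmatched unlock WITHOUT calling
    // pthread_rwlock_unlock(), so the underlying lock stays consistent.
    int unlock() {
        if (!held_) {
            return EPERM;
        }
        held_ = false;
        return pthread_rwlock_unlock(rwl_);
    }

    bool held() const { return held_; }

private:
    pthread_rwlock_t* rwl_;
    bool held_;
};
```

Putting a breakpoint or a log statement on the EPERM branch of unlock( ) is one way to find where the second unlock comes from.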
Re: pthread_rwlock_unlock( ) return-value discrepancy?  
On Mon, 2010-03-22 at 14:30 -0400, Chris Chiesa wrote:
> Are there any standard paradigms/patterns for using the non-blocking
> (...try_...) locking functions? If so, I would greatly appreciate
> hearing about them.

The usual advice I give is "Don't" ;-)

As you have noted, it is usually a system design choice and not
something that should be shoehorned in later....