Forum Topic - dlopen fails but not a helpful dlerror response: (5 Items)
   
dlopen fails but not a helpful dlerror response  
Any ideas how to get a useful error message from the dynamic library loading facility? I have an issue where the 
debugger is loading my library into /opt/target and my code is calling dlopen with the correct name. I get NULL back 
from the call and the error string from dlerror() is "(Shared objects still referenced)"

As you can see, it is not terribly useful in figuring out what the OS thought was wrong. I have verified that the 
permissions are correct (777) and the name is spelled correctly so it is not permissions or a bad name.

With fopen() you get an errno value such as EEXIST, or other useful detailed information.
Here is the offending code:
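	dlerror();	// clear any stale error left over from an earlier dl* call
	void *Library=dlopen(Path.c_str(), RTLD_NOW | RTLD_NODELETE);
	if ( Library==NULL )
	{
		// The string is owned by the loader; NULL means no error is pending
		const char *ErrorStr=dlerror();
		printf( "%s::%s(): ERROR - failed to load '%s' (%s)\n",
				CLASS_NAME, __FUNCTION__, Path.c_str(),
				ErrorStr ? ErrorStr : "no error reported" );
	}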
And here is the output:
IComponent::RegisterDynamic(): ERROR - failed to load '/opt/target/libModbusTransceiver_g.so' (Shared objects still referenced)

Any hints on how to find the real error are appreciated. My guess is that there is some error with the format of the 
library since the file exists and the name in the call matches the name on disk.

Ray
Re: dlopen fails but not a helpful dlerror response  
Try running your app with the environment variable LD_DEBUG=1 set.



  Original Message
From: Ray Mack
Sent: Tuesday, July 14, 2015 11:14
To: ostech-core_os
Reply To: ostech-core_os@community.qnx.com
Subject: dlopen fails but not a helpful dlerror response


Re: dlopen fails but not a helpful dlerror response  
Thanks so much for the pointer on how to display loader debug information.

It turns out that there was appropriate debug information coming out to let me know that the error was due to unresolved
 symbols, but it appeared in my console log in a place where it looked to be associated with a different set of errors.

Now the problem is that a shared object library is not getting resolved at run time correctly. We are using Stephan 
Raimbault's MODBUS library that is compiled as one shared object library. It gets loaded and used by one of our threads 
(yet another runtime loaded object) correctly.  *All* of the symbols are resolved for that thread. Next another thread 
is loaded as a shared object and it also uses the MODBUS library. In that case, *all* of the symbols from the MODBUS 
library show up as unresolved (including a large number that are also in the other thread). I tried loading the new 
module right after the MODBUS library in case it was an order issue, but still no joy.

I ran nm on all three libraries and the symbols are correctly named in all three. I suspect there is some mismatch in 
the object definitions in the thread I am trying to add, but I have not found it. It is unclear how I can further 
diagnose the new object library and "hand resolve" the symbols to find where the error is. If it were Intel object 
format, I would have a fighting chance since I know that format.

The loader debug output shows that the MODBUS library is getting loaded into the program's memory space and stays there 
throughout the process of loading all these libraries.

The loader debug output is not sufficient to see exactly how the symbols are determined to be unresolved.
Re: dlopen fails but not a helpful dlerror response  
Symbols are resolved within their resolution scope. This is not dependent
on the calling thread, only on the resolution scope.

To start troubleshooting, I suggest collecting 'resolution scopes' from
the LD_DEBUG logs.

Observe which object *defines* the unresolved symbol(s). This object,
let's call it libFOO, needs to appear in the resolution scope of MODBUS
since the MODBUS library needs the symbol.
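
As a complement to reading the LD_DEBUG logs, here is a minimal sketch of checking from inside the process which loaded object defines a given symbol. It assumes that dlsym() with RTLD_DEFAULT and dladdr() are available on your target (both are widely implemented, but dladdr() is not part of older POSIX versions). The helper name DefiningObject is hypothetical, not something from this thread.

```cpp
// Diagnostic sketch: report which loaded object, if any, defines a symbol
// that is visible in the global resolution scope of this process.
// Assumption: <dlfcn.h> exposes RTLD_DEFAULT, dladdr() and Dl_info
// (on glibc this needs _GNU_SOURCE, which g++ defines by default).
#include <cassert>
#include <dlfcn.h>
#include <string>

// Returns the path of the object that defines `symbol`, or an empty string
// if the symbol is not found in the global scope.
std::string DefiningObject(const char *symbol)
{
    assert(symbol != nullptr);

    dlerror();                                  // clear any stale error
    void *addr = dlsym(RTLD_DEFAULT, symbol);   // search the global scope
    if (addr == nullptr)
        return std::string();                   // not visible from here

    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr)
        return std::string();                   // address not attributable
    return std::string(info.dli_fname);
}
```

Calling DefiningObject() with one of the unresolved names right before the failing dlopen() shows whether the symbol is visible at that point and where it comes from. Note that this only searches the global scope, so an object that was dlopened without RTLD_GLOBAL will not show up even though it is loaded.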

Normally, you would add libFOO as a dependency to MODBUS. However, if you
cannot change MODBUS, there are a few workarounds (options A to C below).


From what you are describing, I would expect that the working "thread"
(shared object) lists libFOO as one of its own dependencies, so libFOO
gets included in the MODBUS resolution scope; MODBUS then gets the symbol
from it.


In the non-working scenario, libFOO does not get pulled into the
resolution scope and the symbol remains unresolved.

To include libFOO in the resolution scope, there are a few scenarios (as
workarounds to the ideal, which is adding all needed libraries to the
MODBUS library).


A) Assuming your "threads" (shared objects) link against MODBUS (as
opposed to dlopening it), you can remedy the situation by adding the
defining object to the NEEDED list of the non-working shared object (you
add libFOO as a library passed to the linker 'ld').


B) Another workaround is to link your defining library to your executable.
That way, it will end up in the resolution scope of MODBUS. You dlopen
your objects with '0' passed as flags.

C) You can explicitly dlopen the defining object with
RTLD_GROUP|RTLD_GLOBAL|RTLD_WORLD flags, but you have to do this *before*
any dlopen(MODBUS). Then, the dlopen of MODBUS must include
RTLD_GLOBAL|RTLD_WORLD to pull in the object dlopened above.
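
Option C could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names LoadInOrder and OpenOrReport are hypothetical, RTLD_NOW is added here because the original post used it, and the fallback definitions of RTLD_GROUP and RTLD_WORLD exist only so the sketch compiles on hosts whose <dlfcn.h> does not provide them (this assumes they are preprocessor macros where they do exist).

```cpp
// Sketch of option C: dlopen the defining library first, then MODBUS.
#include <cassert>
#include <dlfcn.h>
#include <string>

// Assumption: RTLD_GROUP and RTLD_WORLD are platform-specific macros.
// Where they are missing, define them as 0 so the flags become no-ops.
#ifndef RTLD_GROUP
#define RTLD_GROUP 0
#endif
#ifndef RTLD_WORLD
#define RTLD_WORLD 0
#endif

// Opens `path` with `flags`; on failure stores the loader message in `error`.
static void *OpenOrReport(const char *path, int flags, std::string &error)
{
    dlerror();                                  // clear any stale error
    void *handle = dlopen(path, flags);
    if (handle == nullptr) {
        const char *msg = dlerror();            // may be NULL
        error = std::string(path) + ": " + (msg ? msg : "unknown error");
    }
    return handle;
}

// Loads the defining library before the library that needs its symbols.
// Returns true only if both loads succeed.
bool LoadInOrder(const char *definingLib, const char *modbusLib,
                 std::string &error)
{
    assert(definingLib != nullptr && modbusLib != nullptr);

    void *foo = OpenOrReport(
        definingLib, RTLD_NOW | RTLD_GROUP | RTLD_GLOBAL | RTLD_WORLD, error);
    if (foo == nullptr)
        return false;

    void *modbus = OpenOrReport(
        modbusLib, RTLD_NOW | RTLD_GLOBAL | RTLD_WORLD, error);
    if (modbus == nullptr) {
        dlclose(foo);                           // undo the first load
        return false;
    }
    return true;                                // both stay loaded
}
```

The point of the ordering is that the defining object is already in the global scope when MODBUS is loaded. On failure, the error string names the path that failed together with whatever dlerror() reported.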



If the above doesn't help, please post more detail: which symbol was not
resolved, which shared object defines it, and the portions of the LD_DEBUG
output listing "Resolution scope" for each of the 'thread' shared objects,
both working and non-working ones. Also, exactly which OS version are you
using?


HTH,

Aleksandar

Re: dlopen fails but not a helpful dlerror response  
Aleksandar:

Thanks for the detailed information. It is as you suspected. There are two steps where a symbol gets linked. For some 
reason, the engineer who started this project did not go into the build properties and set up the MODBUS library as a 
dependency.  It was not obvious at the beginning that you need to add library names and paths into the linker build data.

It was also not obvious that there were several drop down menus where additional library information needed to be 
included. Once I found those and compared against a working project, it became clear what was missing from the linker.

As nearly as I can tell, the linker puts both the symbol names as well as all referenced library names into a shared 
library object. That is where it was broken.

Thanks so much for your well thought out explanations.

Ray