[URGENT] System stopped working after Oracle upgrade -> Oracle.DataAccess.dll to version 2.102.4.0

changomarcelo
Posts: 62
Joined: 15-Feb-2007
# Posted on: 01-Dec-2008 12:37:46   

Hi, this is a long story, and we are really in a hurry. This is for a car plant in Mexico, and production is on hold because of this. They are supposed to start producing in 2 hours.

We have been using LLBLGen 2.5 with Oracle 10g release 2, and ODP.NET, through a DLL called Oracle.DataAccess.dll version 2.102.2.20.

So far the system had been behaving well, but a memory leak error made my client think an Oracle patch needed to be installed. That Oracle patch upgraded the DB and Oracle.DataAccess.dll to version 2.102.4.0.

One of our guys tested the patch on a test server and everything went fine.

My boss went to the plant in Mexico to install the patch (I work from Argentina). Apparently he left the server patched, but the app is not working.

I wrote a simple console app. The following works fine:

    Console.WriteLine("Test connection through Oracle.DataAccess.dll");
    Oracle.DataAccess.Client.OracleConnection conn = new Oracle.DataAccess.Client.OracleConnection();
    conn.ConnectionString = ConfigurationManager.ConnectionStrings["ConnectionString"].ConnectionString.ToString();
    conn.Open();
    Console.WriteLine("Connection successfully opened");
    conn.Close();
    Console.WriteLine("Connection successfully closed");

But when I try this, it fails:

    Console.WriteLine("Test connection through LLBLGenPro");
    Console.WriteLine("Retrieving Base Models from DB...");
    BaseModelCollection baseModels = new BaseModelCollection();
    baseModels.GetMulti(null);
    foreach (BaseModelEntity bm in baseModels)
    {
        Console.WriteLine("- {0} ({1})", bm.Name, bm.Year);
    }
    Console.WriteLine("Connection through LLBLGenPro succeeded.");

The error we get is this:

Exception: SD.LLBLGen.Pro.ORMSupportClasses.ORMQueryExecutionException: An exception was caught during the execution of a retrieval query: Data provider internal error(-3001) [Oracle.DataAccess.Client.OracleCommand]. Check InnerException, QueryExecuted and Parameters of this exception to examine the cause of this exception.

Unhandled Exception: System.EntryPointNotFoundException: Unable to find an entry point named 'OpsSqlTimeout' in DLL 'OraOps10w.dll'.
   at Oracle.DataAccess.Client.OpsSql.Timeout(OpoSqlValTimeoutCtx& opoSqlValTimeoutCtx)
   at Oracle.DataAccess.Client.OracleCommand.Timeout(Object state)
   at System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)

We found out that Oracle.DataAccess.dll relies on an unmanaged DLL, OraOps10w.dll. It seems that our app is not resolving a method in that DLL. We checked with Process Explorer which versions of the DLLs are loaded, and they are both 2.102.4.0, which is the way it should be.

Now I wonder if I broke something with LLBLGen Pro, because this is what I did:

  1. I wanted to make sure that the new version of Oracle.DataAccess.dll was being used, and since SD.LLBLGen.Pro.DQE.Oracle10g.NET20.dll was referencing the old version (2.102.2.20), I recompiled the source code using the new one as a reference (2.102.4.0).

  2. Then, since the LLBL runtime source code I had was from version 2.6, and the project was 2.5, I had to regenerate the data access project.

I wonder if I messed it up there. I'm beginning to learn how the .NET runtime resolves assemblies and I'm realizing that it's not that simple. So now I ask myself (and you folks) these questions:

  • Should I have rebuilt the LLBL runtime with the new Oracle.DataAccess.dll (2.102.4.0)?
  • Should it work if I simply use the 2.6 runtime DLLs out of the box and my new code regenerated for 2.6?
  • Isn't it a problem that the 2.6 runtimes are compiled against Oracle.DataAccess.dll 2.102.2.20 and we want to use 2.102.4.0?

(Some of those questions could be answered by simple tests on the real environment, but I don't have access to it right now since my boss went to take a nap after 20 hours of work and there's no remote access.)

The weird thing is that the application works fine in our testing server in any flavor. So I imagine that this should be a versioning problem or something like that.

Thanks!

Otis
LLBLGen Pro Team
Posts: 39865
Joined: 17-Aug-2003
# Posted on: 01-Dec-2008 13:03:23   

You can't update just the ODP.NET .NET DLL; you have to run the full ODP.NET installer, which comes with a client (200MB+ of stuff).

If you received a patch from Oracle and it's just the DLL, call Oracle and tell them their patch is broken. If you received a full ODP.NET install from Oracle (and if I recall correctly, patches for ODP.NET are distributed as full ODP.NET installations), please install the full ODP.NET install you received from Oracle.

You don't need to recompile DQE's to make the DQE work with the new ODP.NET version: you can use an assembly redirect in your application's .config file:

    <dependentAssembly>
        <assemblyIdentity name="Oracle.DataAccess" publicKeyToken="89b483f429c47342"/>
        <bindingRedirect oldVersion="2.102.2.20-2.102.3.99" newVersion="2.102.4.0"/>
    </dependentAssembly>

(see the llblgenpro.exe.config file for an example)
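
For reference, here is a minimal sketch of where such a redirect sits inside an application's .config file. The wrapping elements (configuration, runtime, assemblyBinding with its xmlns) are the standard .NET Framework layout; the assembly name, public key token and version range are the ones given in this post. Adjust the version range if the application was built against a different ODP.NET version.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <!-- Binding redirects only apply to strong-named assemblies such as Oracle.DataAccess -->
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Oracle.DataAccess" publicKeyToken="89b483f429c47342"/>
        <!-- Any reference to a version in oldVersion is resolved to newVersion at load time -->
        <bindingRedirect oldVersion="2.102.2.20-2.102.3.99" newVersion="2.102.4.0"/>
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

The file has to be named after the executable (for example MyApp.exe.config, a hypothetical name) and sit next to it, or be the web.config for a web application.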

So the best thing for you is to pick the v2.5 DQE for ODP.NET and add this redirect, IF you get ODP.NET resolve errors. The thing is that Oracle also installs policy files most of the time, so these redirects often aren't needed.

All LLBLGen Pro does is call the ODP.NET code, so we never call into other DLLs or client code; it's just OracleConnection, OracleCommand, OracleParameter and OracleTransaction.

(edit) Be sure to UNinstall the older ODP.NET first. It might be that ODP.NET was installed twice (once the 2.102.2.20 version and once the 2.102.4.0 version), which gives two Oracle homes (check the 'path' environment variable on a command prompt). This could lead to the weird situation where the code initially uses the new Oracle home but then switches to the old Oracle home with the old client.
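
A rough way to check for a second Oracle home from code, instead of reading the 'path' variable by eye, is sketched below. It walks the PATH entries and prints every copy of OraOps10w.dll (the DLL named in the stack trace) and oci.dll (the Oracle client library) it finds, with the file version. The class name is made up for illustration, and PATH is only one of the places Windows looks for a native DLL (the application's own directory comes first), so treat the output as a hint, not as proof of which copy gets loaded.

```csharp
using System;
using System.Diagnostics;
using System.IO;

class OracleHomeScan   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Native libraries to look for: the ODP.NET helper DLL from the
        // stack trace and the Oracle client library.
        string[] fileNames = { "OraOps10w.dll", "oci.dll" };
        string path = Environment.GetEnvironmentVariable("PATH") ?? "";

        // PATH entries are listed in search order, so the first hit matters most.
        foreach (string dir in path.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            foreach (string fileName in fileNames)
            {
                string candidate;
                try
                {
                    candidate = Path.Combine(dir.Trim().Trim('"'), fileName);
                }
                catch (ArgumentException)
                {
                    continue; // malformed PATH entry, skip it
                }

                if (!File.Exists(candidate))
                {
                    continue;
                }

                FileVersionInfo info = FileVersionInfo.GetVersionInfo(candidate);
                Console.WriteLine("{0}  (file version {1})", candidate, info.FileVersion);
            }
        }
    }
}
```

If this prints the same DLL from two different directories, or a file version that does not match the installed Oracle.DataAccess.dll, that points to a leftover client installation.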

Frans Bouma | Lead developer LLBLGen Pro
changomarcelo
Posts: 62
Joined: 15-Feb-2007
# Posted on: 01-Dec-2008 13:12:36   

Hi Otis,

Thanks for your answer!

I didn't do the patch. My boss did and I understand that he received the full patch from Oracle and applied it in full, as you suggest. He even got phone support from Oracle to do the patch.

It's good to read that I don't need to rebuild the LLBLGen runtime. But I'm not sure how I should configure the references to Oracle.DataAccess.dll on my development PC to build the application. My dev PC is not patched. Here I have the ODP.NET version that ships Oracle.DataAccess.dll 2.102.2.20. I don't know if I should reference that assembly or get the 2.102.4.0 version and reference that instead. Can you help with this?

For the other LLBLGenPro assemblies, I think I will reference the 2.6 ones directly, out of the box. Is that right?

Finally, I think I will try the assembly redirect and check that there are not two ODP.NET homes.

Otis
LLBLGen Pro Team
Posts: 39865
Joined: 17-Aug-2003
# Posted on: 01-Dec-2008 13:28:45   

changomarcelo wrote:

Hi Otis,

Thanks for your answer!

I didn't do the patch. My boss did and I understand that he received the full patch from Oracle and applied it in full, as you suggest. He even got phone support from Oracle to do the patch.

It's good to read that I don't need to rebuild the LLBLGen runtime. But I'm not sure how I should configure the references to Oracle.DataAccess.dll on my development PC to build the application. My dev PC is not patched. Here I have the ODP.NET version that ships Oracle.DataAccess.dll 2.102.2.20. I don't know if I should reference that assembly or get the 2.102.4.0 version and reference that instead. Can you help with this?

On your dev box, simply build against 2.102.2.20, referencing the normal DQEs from us. At the site where ODP.NET 2.102.4.0 is installed, you can then run into one of two things:

  1. The application you built against 2.102.2.20 runs normally. This is because Oracle installed a policy file in the GAC which automatically redirects 2.102.2.20 references to 2.102.4.0. The RTM builds of ODP.NET do install this kind of policy file; patches from Oracle, however, don't always do that.

  2. The application you built against 2.102.2.20 crashes immediately because it can't resolve the reference to Oracle.DataAccess v2.102.2.20. In that case, add the assembly redirect I specified above.

For more details, see also this post on Walaa's blog: http://walaapoints.blogspot.com/2007/06/odpnet-llblgen-pro.html

In either case: uninstall ODP.NET 2.102.2.20 on the PRODUCTION machine. You don't have to do anything on your dev box.

For the other LLBLGenPro assemblies, I think I will reference the 2.6 ones directly, out of the box. Is that right?

If your application is built with v2.5 and the code is generated using v2.5, you have to reference the runtime libs from v2.5, not 2.6. This is not difficult, because you can install them side-by-side on your dev machine.

Frans Bouma | Lead developer LLBLGen Pro
changomarcelo
Posts: 62
Joined: 15-Feb-2007
# Posted on: 01-Dec-2008 14:03:14   

We verified that we only have one ODP.NET home (and another home for db). We deleted all the old assemblies from the GAC too. We built the new test app as I described above, with assembly redirection, but I still get this error:

Exception: Oracle.DataAccess.Client.OracleException: Data provider internal error(-3001) [Oracle.DataAccess.Client.OracleCommand]

Unhandled Exception: System.EntryPointNotFoundException: Unable to find an entry point named 'OpsSqlTimeout' in DLL 'OraOps10w.dll'.
   at Oracle.DataAccess.Client.OpsSql.Timeout(OpoSqlValTimeoutCtx& opoSqlValTimeoutCtx)
   at Oracle.DataAccess.Client.OracleCommand.Timeout(Object state)
   at System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)

Any other ideas on how I could find the cause?

Otis
LLBLGen Pro Team
Posts: 39865
Joined: 17-Aug-2003
# Posted on: 01-Dec-2008 14:29:04   

This comes from the fact that the query timeout is set to a value (OracleCommand.CommandTimeout). As this apparently fails, it looks like Oracle's code has a bug. Please contact Oracle about this, as their code calls into an ODP.NET client DLL which apparently lacks the method required for this.

What I wonder is: you say you tested this and it works on your dev machine. However, is this with 2.102.4.0? In short: can you rebuild the situation locally (on a different box than your dev box), with 2.102.4.0 installed and your code built against 2.102.2.20, and see what happens?

The CommandTimeout is always set in the Oracle DQEs, so you can't avoid it unless you comment out the line in the DQE for ODP.NET (DynamicQueryEngine.cs -> CreateCommand).
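
To check whether the timeout path fails without LLBLGen Pro in the picture at all, a small console test along these lines could help. It is only a sketch: the connection string name is the one from the test app earlier in this thread, while the query and the 30 second value are arbitrary choices made here, not what the DQE uses. Whether it reproduces the exact same exception is not guaranteed, since the stack trace shows the failure on a thread pool thread.

```csharp
using System;
using System.Configuration;
using Oracle.DataAccess.Client;

class TimeoutRepro   // hypothetical name, for illustration only
{
    static void Main()
    {
        string connectionString =
            ConfigurationManager.ConnectionStrings["ConnectionString"].ConnectionString;

        using (OracleConnection conn = new OracleConnection(connectionString))
        {
            conn.Open();

            using (OracleCommand cmd = new OracleCommand("SELECT 1 FROM DUAL", conn))
            {
                // Setting a non-zero timeout is the only difference from a plain
                // command; 30 seconds is an arbitrary value for this test.
                cmd.CommandTimeout = 30;

                object result = cmd.ExecuteScalar();
                Console.WriteLine("Query with CommandTimeout set returned: {0}", result);
            }
        }
    }
}
```

If this fails in the same way on the production box, the problem sits entirely between Oracle.DataAccess.dll and OraOps10w.dll. If it runs fine, compare it with a run where the CommandTimeout line is removed and with the LLBLGen Pro test to narrow things down further.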

Still, I think this error is a sign that there's something else wrong with the Oracle installation, so even if you manually work around this in the DQE, it's likely to fail elsewhere in obscure situations.

So my advice:

  • At the production site: roll back ODP.NET to 2.102.2.20, so they can start working and avoid losing money there. Of course the memory leak might still be present, but you can first test things further. (You should be able to connect to the patched 10gR2 box without problems.)
  • At your site: build an environment which mimics the production environment, so a box with 2.102.4.0 installed, your code built against 2.102.2.20, an unmodified DQE etc., and an assembly redirect if required. Then test to see if that works OK. If not, either call Oracle to see if they're aware of the problem (as the crash happens inside their code calling a DLL which apparently is out of sync), or patch the DQE and see if that workaround fixes it.

As always, use our latest runtime libs and templates for your version (v2.5).

Frans Bouma | Lead developer LLBLGen Pro