Big number of open sessions in Oracle (116!)

Posts   
 
    
changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 10-Oct-2011 22:49:58   

Hi

I have an application that uses LLBLGen Pro 2.6, with Oracle 10g. The application parses a file but we are seeing a problem where the application gets stucked at certain lines and the reason is that the database begins to throw timeouts.

We analyzed what could be happening and we discovered that there is a huge number of open sessions with Oracle (all belonging to my program and username).

This is something we can check with select count(*) from v$session WHERE program='our-program' in SQLPLUS. We increased the Max Pool Size from 100 to 200 and that make it work, with 116 open sessions at the same time.

The only database access layer we use is LLBLGen Pro so, my questions is how could it be that we reach 116 open sessions so fast. It is true that the files we are parsing are really big (100MB) and there's a lot of data to parse (40 mins x file). We use a transaction per-line.

I wonder how LLBLGen Pro handles these sessions, why it is opening so many, why they are not being released, or if we should close them explicitly.

I really appreciate your help.

I'm not attaching real code because the code is spread among different files, but if you are willing to watch, I could zip all the files and post it here.

Thank you!

changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 10-Oct-2011 23:40:25   

I want to add that most of the sessions are in the inactive state, what I understand that is the pool of connections (I'm using connection pooling). But why it needs to open so many connections and why the database starts to throw timeout errors after the max is reached?

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 11-Oct-2011 06:06:42   

Hi there,

As a matter of fact we need the code you are using to this, to understand how are you using the DataAccessAdapter (or SelfServicing). If you can reduce the code in a way it still reproduces the problem, would be great.

Additionally info we need: LLBLGen version, RuntimeLibrary version, connector used (ODP, MSOracle) and build nr of that connector: http://llblgen.com/tinyforum/Messages.aspx?ThreadID=7719

Also, post the connection string used, just to be sure you are not turning off oracle connection pooling. Are you using any profiling tool for this? btw: http://www.ormprofiler.com

David Elizondo | LLBLGen Support Team
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39959
Joined: 17-Aug-2003
# Posted on: 11-Oct-2011 10:03:51   

If you're using selfservicing, it could be a lack of dispose calls in the framework, something we fixed in v3.1. It's also important to know whether the open sessions are really open or pooled connections: a .NET connection object is a wrapper around a pooled connection (in ODP.NET's case this is a java object). So when the .NET connection is closed and/or disposed, the pooled connection is added to the pool again. These pooled connection stay open ! That's the point of connection pooling.

So if the .NET connections are still open, something's leaking. If the .net connections are not open, you've to check whether there are a lot of connections in the pool unused. You can check these objects in the windows performance monitor.

Frans Bouma | Lead developer LLBLGen Pro
changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 12-Oct-2011 15:41:48   

Thank you. Making a query to the v$session table, in Oracle, which shows me the open sessions, I could see most of them are inactive, but still there and blocking the creation of new. I wondered why, if most were inactive, could not be re-used.

So I read again the ODP.NET documentation and I found that there's a connection string attribute called Connection Lifetime. I wasn't setting that attribute and learned that the default was 0. So I set it to 60 seconds, and that seemed to fix it.

So, what I suspect now is that those connections, with a Lifetime of 0, were not reused and stayed there until the pooling service closed them, according to the OPD.NET documentation, this happens every 3 minutes.

Do you think this could have been the problem?

Thanks!

Walaa avatar
Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 12-Oct-2011 16:13:40   

This indeed looks like the cause of the problem. You can validate that by not setting LifeTime attribute, and wait for 3 minutes to check back the v$session table. And see if these sessions were freed (deleted).

changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 14-Oct-2011 00:45:42   

Well, that didn't seem to definite solve the problem.

What my application (actually a windows service) does is parse a big file.

Each line is handled in a transaction, and each line demands many queries and inserts. A line could tipically demand 100 inserts, although it's supposed to support 2000.

If the service is stopped, I have a method to resume from the last line we have reached.

What seems odd to me is that I don't have concurrent connections in this application. What it means that I read the file sequentially in one thread and I never intentionally open more than one connection at the same time. Actually, with LLBLGen Pro I never open connections myself.

So, what it happens is that at some point, the Max Pool Size is reached and I start getting errors like "[b]Connection request timed out[/b]" or "[b]ORA-12542: TNS:address already in use.[/b]". The service, in this case, automatically restarts and the same error messages are thrown all the time.

When I make queries to the v$session table I get that the number of sessions equals the Max Pool Size + 1. This remains like this for hours.

My question is why, if I'm not supposed to have more than one connection in use at the time, I can't get new connections from the pool. Seems that the pool does not give any of the inactive connections to the application.

Another odd thing is if I stop the service, wait for the sessions go to zero and restart it, it starts again with the Max Pool Size + 1 value. So I'm stucked again in the same situation.

Finally, if I disable the Connection Pooling, the "[b]Connection request timed out[/b]" or "[b]ORA-12542: TNS:address already in use.[/b]" happen more often.

I will prepare a zip with the sample code to upload in a next post.

Thanks

changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 14-Oct-2011 00:57:03   

This is a zip with the code.

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 14-Oct-2011 06:00:57   

You forgot to include your DAL project in the zip. I tried to create it from DB but you have special things in it apparently (typeConverters, typedLists, renamed fields, etc). Also even if we can run it, we don't have archives to parse.

Looking into the code I see you are using v3.1 and SelfServicing. Please make sure you are using the latest Runtime Library version. Also I think that putting the ConnectionLifeTime in your connection string actually could improve the pool recycling. However if you still got errors it seems that something is not disposing it. Please monitor the pool while you debug the service to see where in your code the connection isn't disposing it. Although you seem to to a lot of large fetch and commits I don't see any piece of code that could cause this. So please try to debug it and figure it out. Maybe this could be helpful: http://blog.ilab8.com/2011/09/02/odp-net-pooling-and-connection-request-timed-out/

David Elizondo | LLBLGen Support Team
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39959
Joined: 17-Aug-2003
# Posted on: 14-Oct-2011 10:53:29   

As the thread is public I removed the attachment.

Frans Bouma | Lead developer LLBLGen Pro
changomarcelo avatar
Posts: 62
Joined: 15-Feb-2007
# Posted on: 14-Oct-2011 23:27:14   

I want to add that I've finally found the solution to the problem.

The issue was that I was creating a new transaction and, because of an uncommon value in one of the files, I was returning from the method which creates the transaction without properly disposing it.

That seemed to be adding to the Oracle sessions.

A simply validation before creating the transaction solved it beautifully.

To make things clearer, the code was more or less like this:

CreateTransaction() try { if (position > 2000) // this was the root of the issue. We never had before positions greater than 2000, so this is when it began to fail. return false; DoTheWork() } finally { DisposeTransaction() }

Thank you for all your support!