StackOverflow in UOW2.ConstructSaveProcessQueues

Posts   
 
    
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 12-Feb-2019 20:00:37   

This code, followed by a call to ConstructSaveProcessQueue on the resulting UOW, has been working fine for a long time.

        public UnitOfWork2 CreateUnitOfWork()
        {
            var result = new UnitOfWork2();

            result.AddCollectionForSave(LegalBodies, false, true);
            result.AddCollectionForSave(Resources, false, true);
            result.AddCollectionForSave(Patrons, false, true);

            return result;
        }

And now it throws a StackOverflowException. No idea why or what changed, I guess I just hit a limit.

I used your suggestion in https://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=23861 to run ConstructSaveProcessQueue on a thread with a massive stack and a quick mashup seems to confirm that will do the trick. (Incidentally, the first hyperlink in the second to last post in that thread doesn't go anywhere)

Phew because I was stuffed otherwise! There are 138,734 entities all being inserted into an empty database in one UOW.

My questions are:- 1) Did you ever change the internal workings of UOW2 so stackoverflow is no longer an issue? If so, which version for reference? 2) Can you think of any reason that calling ConstructSaveProcessQueue would not be thread-safe? I would be pausing the original thread holding the UOW2 to wait until the big-stack has called that method.

Walaa avatar
Walaa
Support Team
Posts: 14950
Joined: 21-Aug-2005
# Posted on: 12-Feb-2019 20:06:11   

Which runtime library version are you using?

simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 13-Feb-2019 07:09:15   

v4.2

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 13-Feb-2019 10:01:37   

simmotech wrote:

This code, followed by a call to ConstructSaveProcessQueue on the resulting UOW, has been working fine for a long time.

        public UnitOfWork2 CreateUnitOfWork()
        {
            var result = new UnitOfWork2();

            result.AddCollectionForSave(LegalBodies, false, true);
            result.AddCollectionForSave(Resources, false, true);
            result.AddCollectionForSave(Patrons, false, true);

            return result;
        }

And now it throws a StackOverflowException. No idea why or what changed, I guess I just hit a limit.

I used your suggestion in https://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=23861 to run ConstructSaveProcessQueue on a thread with a massive stack and a quick mashup seems to confirm that will do the trick. (Incidentally, the first hyperlink in the second to last post in that thread doesn't go anywhere)

Phew because I was stuffed otherwise! There are 138,734 entities all being inserted into an empty database in one UOW.

My questions are:- 1) Did you ever change the internal workings of UOW2 so stackoverflow is no longer an issue? If so, which version for reference?

No, the internal workings are still depth-first-search. We did research if we could reimplement it with breadth first search but it turned out it resulted (at that time) in a different ordering (still a working order that honors the dependencies) and we didn't proceed with it (as at the time we thought it might break people's code). It's also a fundamental part of the save algorithm and we didn't want to risk errors in that as the current code is fine. That said, with the introduction of batching, we do change the ordering a bit (which has resulted in some questions regarding that as expected) while still honoring the dependencies so we could have a second look and offer a BFS implementation. I'll re-open the issue and we'll see what we can do.

2) Can you think of any reason that calling ConstructSaveProcessQueue would not be thread-safe? I would be pausing the original thread holding the UOW2 to wait until the big-stack has called that method.

It traverses the entities in-memory and sorts them in the right order, and does that by collecting information in datastructures. If you have multiple threads doing that, they could overwrite data written by other threads as the design isn't meant to be thread safe. If you have 1 thread handling the UoW and no other thread is altering the entities, then you're fine.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 14-Feb-2019 10:09:36   

The weird thing is that the source for the entities, the import data, hasn't changed for months.

I think all I did was load 2 known entities from the database first and then assign them to the new entities at various places during the import. I even tried, for one of them, setting the ID rather than the entity and still got the crash (it wasn't possible to use the ID for the other - had to assign the entity itself).

So now I need a 5GB stack to run the code just because processing is now in a slightly different order - I think the number of entities is actually the same.

Anyway I am working again with this code in case it helps anyone else out of a spot: (I had help: see the - aptly named - StackOverflow site - https://stackoverflow.com/questions/54655372/run-process-on-special-thread-and-await-result/54656592)

            ThreadHelper.RunOnThreadWithCustomStack(5_000_000, () =>
                {
                    var uow = CreateUnitOfWork();

                    using (var adapter = new DataAccessAdapter())
                    {
                        adapter.StartTransaction(IsolationLevel.Serializable, "X");

                        uow.Commit(adapter);

                        adapter.Commit();
                    }
                });

    public static class ThreadHelper
    {
        public static Task<TOut> RunOnThreadWithCustomStack<TIn, TOut>(int stackSize, Func<TIn, TOut> action, TIn arg)
        {
            var tcs = new TaskCompletionSource<TOut>();

            var thread = new Thread(() =>
                {
                    try
                    {
                        tcs.SetResult(action(arg));
                    }
                    catch (Exception e)
                    {
                        tcs.SetException(e);
                    }
                }, stackSize);

            thread.Start();
            thread.Join();

            return tcs.Task;
        }

        public static Task<TOut> RunOnThreadWithCustomStack<TOut>(int stackSize, Func<TOut> action)
        {
            var tcs = new TaskCompletionSource<TOut>();

            var thread = new Thread(() =>
                {
                    try
                    {
                        tcs.SetResult(action());
                    }
                    catch (Exception e)
                    {
                        tcs.SetException(e);
                    }
                }, stackSize);

            thread.Start();
            thread.Join();

            return tcs.Task;
        }

        public static void RunOnThreadWithCustomStack(int stackSize, Action action)
        {
            Exception exception = null;

            var thread = new Thread(() =>
                {
                    try
                    {
                        action();
                    }
                    catch (Exception e)
                    {
                        exception = e;
                    }
                }, stackSize);

            thread.Start();
            thread.Join();

            if (exception != null) throw exception;
        }

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 14-Feb-2019 15:48:33   

One extra entity could lead to a path that's too long (as it uses recursion to traverse through the graph, hence the stack overflow if the path is too long). If the data is exactly the same then that's odd, but as your data is tremendously big, it's very hard to track down I think.

Having that much entities in a single UoW likely also runs into performance problems when navigating that tree (I think we optimized it a bit in v5.5, but have to check). It might be a good idea (but you likely already thought of this and decided against it wink ) to consider having less entities in the graph before processing it. After all, you can execute multiple UoW's in the same transaction. This does result a bit more work as you might need to cut up the graph manually and do a pre-sort of the graph yourself (roughly) so you can use 1 UoW instance per chopped up part of the graph. (it does also give the problem that if you have one massive graph in memory, adding 1 entity of that graph to a UoW will still traverse the whole graph).

In any case, it's scheduled to be looked into for v5.6 as said the reason it wasn't done before isn't valid anymore.

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 18-Feb-2019 14:56:39   

There are two recursive processes: the discovery of the adjacency lists (the 'edges' in the graph and with those which entity depends on which) and the traversal of those adjacency lists using a DFS algorithm to topological sort them. This latter algorithm uses the Karjan approach (which requires DFS).

Looking at wikipedia there's also another algorithm, from Kahn, which can use BFS (or DFS, doesn't matter). It looks a bit less efficient tho, but not that much. The algorithm from Kahn can be done imperatively, so without recursion (as BFS doesn't use recursion).

Additionally the gathering of the adjacency lists can be done using an imperative algorithm as well, using a queue with a while loop instead of recursion. It might allocate a bit more memory although recursion does too (indirectly).

We'll see if these 2 changes are suitable for implementation in v5.6. simple_smile

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 19-Feb-2019 13:54:22   

Implemented the adjacency list construction using a whileloop and a queue, which was straighforward, and as we don't have to track what to recurse we could avoid an enumeration and a couple of lists so it's 10% faster.

The topological sort is now implemented using: https://stackoverflow.com/questions/20153488/topological-sort-using-dfs-without-recursion/20153735#20153735 so using a stack and a pair. Performance is on par with the recursive one. Haven't profiled the memory yet. All tests pass so I think it'll work out fine.

This will be in v5.6, so this should solve your problem simple_smile

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 19-Feb-2019 17:14:10   

I'm impressed - you da man!

I would love to use the latest version but I've written so many add-ons and extensions based around LLCoolJ v4.2, I'm now too scared to upgrade!

I even have two projects, Framework.LLBLGen and Framework.LLBLGen.WinForms solely based around it. But you claim v5.x is 30-40% faster and has a much smaller code base and you know I like that sort of stuff! Soooo tempting....