Removing entities from EntityCollection dramatically slow

miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 25-Nov-2011 12:34:15   

Hi,

I've just found out that removing a few or most of the entities from an EntityCollection is about 70 times slower than removing the same entities from an array. In a test scenario I created 3000 elements and removed about 2200 of them (randomly; they were not in the same order as in the entity collection). It took almost one second in the EntityCollection scenario. The profiler says that during such an operation you're rebuilding the entity index each time, so the method get_ObjectID() is called 8.5 million times and get_Count() 4.3 million times. Could you optimize that somehow, or provide a method to remove a larger set of entities more quickly?

Best Regards, MiloszeS

daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 25-Nov-2011 19:29:33   

Hi MiloszeS,

Please show us more info about your tests (real code, LLBLGen runtime version, etc.). http://llblgen.com/tinyforum/Messages.aspx?ThreadID=7725

David Elizondo | LLBLGen Support Team
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 28-Nov-2011 10:13:14   

It indeed tosses the index after an entity is successfully removed. This is easier than potentially updating every element in the index. It does, however, hurt bulk deletes.

There's also no workaround: all remove calls end up in one method which tosses the index out the window, and as it does an IndexOf call (which uses the index, or rebuilds it), it always hits the bottleneck. The index is actually there to avoid a linear search for the IndexOf, which is necessary because it has to execute event handlers before the removal, so it has to know whether the entity to remove is in the collection.

It's a tough problem though. At first I thought adding a RemoveRange which doesn't remove the index at the first element would work, but that's not true: the index is wrong after the first removal.

It can be done, but it's a bit cumbersome: the index is used to determine whether the element is in the collection, and a wrong index will still state whether an element is in the collection (i.e. it's in the index) or not.
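To make the mechanism above concrete, here is a minimal sketch of a collection that keeps a position index and tosses it on every removal. It is a toy, not the LLBLGen Pro runtime code: the class `IndexedList<T>`, the `IndexRebuilds` counter and the `Demo` helper are all hypothetical names made up for illustration, and it assumes the items are distinct and non-null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy collection; it is NOT the LLBLGen Pro runtime code.
public class IndexedList<T>
{
    private readonly List<T> _items = new List<T>();
    private Dictionary<T, int> _index;   // item -> position; null means "tossed"

    public int IndexRebuilds { get; private set; }
    public int Count { get { return _items.Count; } }

    public void Add(T item)
    {
        _items.Add(item);
        _index = null;
    }

    // Uses the index to avoid a linear search, rebuilding it when needed.
    public int IndexOf(T item)
    {
        EnsureIndex();
        int position;
        return _index.TryGetValue(item, out position) ? position : -1;
    }

    // Single removal: later positions shift, so the index is tossed.
    public bool Remove(T item)
    {
        int position = IndexOf(item);
        if (position < 0) return false;
        _items.RemoveAt(position);
        _index = null;
        return true;
    }

    // Bulk removal: membership is checked against one index build,
    // the items are removed in one pass, and the index is tossed once.
    public int RemoveRange(IEnumerable<T> itemsToRemove)
    {
        EnsureIndex();
        var found = new HashSet<T>();
        foreach (T item in itemsToRemove)
        {
            if (_index.ContainsKey(item)) found.Add(item);
        }
        int removed = _items.RemoveAll(found.Contains);
        if (removed > 0) _index = null;
        return removed;
    }

    private void EnsureIndex()
    {
        if (_index != null) return;
        _index = new Dictionary<T, int>(_items.Count);
        for (int i = 0; i < _items.Count; i++) _index[_items[i]] = i;
        IndexRebuilds++;
    }
}

public static class Demo
{
    // Removes every second item from two identical collections and
    // prints how often each one had to rebuild its index.
    public static void Run()
    {
        var perItem = new IndexedList<int>();
        var bulk = new IndexedList<int>();
        for (int i = 0; i < 3000; i++) { perItem.Add(i); bulk.Add(i); }

        var toRemove = new List<int>();
        for (int i = 0; i < 3000; i += 2) toRemove.Add(i);

        foreach (int item in toRemove) perItem.Remove(item);
        bulk.RemoveRange(toRemove);

        Console.WriteLine("per-item index rebuilds: " + perItem.IndexRebuilds);
        Console.WriteLine("bulk index rebuilds: " + bulk.IndexRebuilds);
    }
}
```

In this sketch the per-item loop rebuilds the index once per Remove call, while RemoveRange builds it a single time. Each rebuild walks the whole collection, which is the kind of cost the profiler numbers in the first post point at.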

What version are you using so we know to which version we have to backport the patch which we added to v3.5 for this?

Frans Bouma | Lead developer LLBLGen Pro
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 28-Nov-2011 14:29:52   

Latest version of LLBLGen. I don't connect to the DB, because the problem can be seen in the following test.

It doesn't matter which entity you use. I've figured out that it's the same problem with tables with many columns as well as with 3 elements. If you need it I can prepare and send you a whole solution, but the following code can also be used to reproduce the problem. Please let me know whether a solution will be necessary, and sorry for the code quality; I'm really busy today.

Best Regards, MiloszeS

// Needs: using System; using System.Collections.Generic;
//        using System.Diagnostics; using System.Linq;
// EntityCollection<T> and TestEntity come from the LLBLGen Pro generated code.

public void TestEntityCollectionRemoveTime()
{
    int maxItems = 3000;
    int itemsToRemoveCount = 2700;

    EntityCollection<TestEntity> entityCollection = new EntityCollection<TestEntity>();
    entityCollection.AllowRemove = true;
    entityCollection.IsReadOnly = false;

    GenerateNewEntities(entityCollection, maxItems);

    TestEntity[] array = entityCollection.ToArray();
    List<TestEntity> list = array.ToList();

    List<TestEntity> elementsToRemove = ChooseElementsToRemove(array, itemsToRemoveCount);

    Stopwatch sw = new Stopwatch();

    // Time the removal from the EntityCollection.
    sw.Start();
    RemoveElements(entityCollection, elementsToRemove);
    sw.Stop();
    Console.WriteLine("EntityCollection: " + sw.ElapsedMilliseconds + " ms");

    // Time the removal of the same entities from a plain List<T>.
    sw.Reset();
    sw.Start();
    RemoveElements(list, elementsToRemove);
    sw.Stop();
    Console.WriteLine("List<T>: " + sw.ElapsedMilliseconds + " ms");
}

private List<TestEntity> ChooseElementsToRemove(TestEntity[] array, int itemsToRemoveCount)
{
    // key: index in the array, value: element at that index
    SortedList<int, TestEntity> ret = new SortedList<int, TestEntity>();
    Random random = new Random();

    // Pick distinct random indexes until enough elements are selected.
    // Assumes itemsToRemoveCount <= array.Length.
    while (ret.Count < itemsToRemoveCount)
    {
        // The upper bound of Random.Next is exclusive.
        int elementIndex = random.Next(array.Length);
        if (!ret.ContainsKey(elementIndex))
        {
            ret.Add(elementIndex, array[elementIndex]);
        }
    }

    return ret.Values.ToList();
}

private void GenerateNewEntities(EntityCollection<TestEntity> collection, int maxItems)
{
    for (int i = 0; i < maxItems; ++i)
    {
        collection.Add(new TestEntity() { A = DateTime.Now, B = DateTime.UtcNow });
    }
}

private void RemoveElements(ICollection<TestEntity> collection, IEnumerable<TestEntity> elementsToRemove)
{
    // Removes the entities one by one; this is the slow path in question.
    foreach (var entityToRemove in elementsToRemove)
    {
        collection.Remove(entityToRemove);
    }
}
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 28-Nov-2011 14:38:05   

LLBLGen 3.1, latest build (it was the latest on Friday).

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 28-Nov-2011 18:14:34   

Thanks, we have implemented a patch for this (collection.RemoveRange(IEnumerable)), which removes the elements in bulk and doesn't re-create the index with every element. We'll backport this patch to v3.1 tomorrow (Tuesday) so you can test it out in your code.

Frans Bouma | Lead developer LLBLGen Pro
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 29-Nov-2011 08:27:41   

Great.

Thank you for the great support (as usual), Otis :)

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 29-Nov-2011 09:53:33   

Please see attached dll. Call collection.RemoveRange(enumerable) once, instead of looping through the enumerable and calling Remove. This should solve your performance problems.

Attachments
Filename File size Added on Approval
SD.LLBLGen.Pro.ORMSupportClasses.NET20.zip 275,042 29-Nov-2011 09:53.42 Approved
Frans Bouma | Lead developer LLBLGen Pro
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 29-Nov-2011 10:15:02   

Yep, works perfectly :) Is it a production assembly, or should I wait for an official build?

Walaa
Support Team
Posts: 14950
Joined: 21-Aug-2005
# Posted on: 29-Nov-2011 10:37:18   

You can use this in production. And usually we release new RTM builds for fixes like these very shortly, after confirming the fix is functioning correctly.

miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 29-Nov-2011 10:40:59   

Great.

Best Regards, MiloszeS

miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 29-Nov-2011 10:48:35   

Sorry for asking in the same thread, but it could be relevant, or I just didn't find it.

Do you have a similar RemoveAll() method which will not just clear the collection, but remove all items and automatically add them to the removed entities tracker?

Sorry for the silly question, but my IntelliSense broke because of the assembly change and I don't see the methods from the new assembly.

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 29-Nov-2011 14:52:21   

There's a Clear() method :)

Frans Bouma | Lead developer LLBLGen Pro
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 29-Nov-2011 15:04:25   

But does it work? I remember that we had a problem in the past where the Clear method didn't add items to the removed entities tracker (which is a good solution for some other cases, e.g. when you want to filter something but not delete it). That's why we made a workaround which removes all entities one by one.

You can check it using the following pseudo code:

int maxItems = 3000;

EntityCollection<TestEntity> entityCollection = new EntityCollection<TestEntity>();
entityCollection.AllowRemove = true;
entityCollection.IsReadOnly = false;

GenerateNewEntities(entityCollection, maxItems);

// Attach an empty tracker and print its count before Clear().
entityCollection.RemovedEntitiesTracker = new EntityCollection<TestEntity>();

int elementsCount = entityCollection.RemovedEntitiesTracker.Count;

Console.WriteLine("# :" + elementsCount);

entityCollection.Clear();

// Print the tracker count again after Clear().
elementsCount = entityCollection.RemovedEntitiesTracker.Count;

Console.WriteLine("# :" + elementsCount);

I hope I didn't do something stupid :)

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 30-Nov-2011 10:36:39   

Clear() doesn't add the entities to the RemovedEntitiesTracker, as it might be that this is exactly what you want: to just get rid of the entities. This has been the behavior since the beginning and we can't change this without breaking code.

I'm not happy with a RemoveAll() next to a Clear() method. What you could do is: collection.RemoveRange(collection.ToList());

This will add the entities to the removed entities tracker. Does that work for you?
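The difference between the two calls, and presumably the reason for the ToList() snapshot, can be sketched with a small stand-in class. Everything here is hypothetical (`TrackedList<T>`, `RemovedTracker`, `Items`, `TrackerDemo`) and only mimics the behavior described in this thread; whether the real RemoveRange would fail when handed the collection itself is not stated here, so the failure shown is that of a plain List<T> being modified while it is enumerated.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a collection with a removed-items tracker.
public class TrackedList<T>
{
    private readonly List<T> _items = new List<T>();

    public TrackedList() { RemovedTracker = new List<T>(); }

    public List<T> RemovedTracker { get; private set; }
    public int Count { get { return _items.Count; } }

    // Live view of the items (no copy is made).
    public IEnumerable<T> Items { get { return _items; } }

    public void Add(T item) { _items.Add(item); }

    // Snapshot: an independent copy of the current items.
    public List<T> ToList() { return new List<T>(_items); }

    // Clear just discards the items; nothing reaches the tracker.
    public void Clear() { _items.Clear(); }

    // RemoveRange records every removed item in the tracker.
    public void RemoveRange(IEnumerable<T> itemsToRemove)
    {
        foreach (T item in itemsToRemove)
        {
            if (_items.Remove(item)) RemovedTracker.Add(item);
        }
    }
}

public static class TrackerDemo
{
    private static TrackedList<string> Create()
    {
        var list = new TrackedList<string>();
        list.Add("a"); list.Add("b"); list.Add("c");
        return list;
    }

    public static void Run()
    {
        // Clear(): the collection is emptied, the tracker stays empty.
        var cleared = Create();
        cleared.Clear();
        Console.WriteLine("after Clear, tracked: " + cleared.RemovedTracker.Count);

        // RemoveRange over a snapshot: every item lands in the tracker.
        var snapshot = Create();
        snapshot.RemoveRange(snapshot.ToList());
        Console.WriteLine("after RemoveRange, tracked: " + snapshot.RemovedTracker.Count);

        // RemoveRange over the live view: the list changes while it is
        // being enumerated, which List<T> rejects.
        var live = Create();
        try
        {
            live.RemoveRange(live.Items);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("enumerating the live list while removing failed");
        }
    }
}
```

The snapshot costs one extra list allocation per call, which is the overhead discussed in the following posts.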

Frans Bouma | Lead developer LLBLGen Pro
miloszes
User
Posts: 222
Joined: 03-Apr-2007
# Posted on: 30-Nov-2011 12:18:02   

I do it that way right now (RemoveRange(coll.ToList())). I also understand why you don't want to add RemoveAll (it wouldn't be clear at first glance which one the programmer should use). Is it possible to overload the Clear method so that it takes a switch which controls whether the tracker will be filled or not?

I know that creating a list with 2000 or more elements isn't that costly. But unfortunately I'm currently stuck with a tool validating and fixing many millions of records with about 30 prefetch paths. It takes a few days (now, with RemoveRange, much less) to validate them all, and it uses a gob of memory. Creating a collection each time only to remove all elements from a collection is like mowing a lawn with a combine harvester. I know that I'm exaggerating a little bit :) The code is part of a generic method which is called simultaneously, so it would be a problem to cache it somehow.

I can live with that, and I know that you may be very busy right now and have more important things to do. If you add it, it'll be great. If not, I'll live with that.

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 01-Dec-2011 11:06:31   

Adding a Clear(bool) could be useful indeed, although I must say, it's likely not used that much, as the tracker is often used for tracking remove actions performed by bound controls, which are impossible / hard to intercept.

I'll add it to the todo list for a future addition.

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 31-May-2018 10:34:51   

Implemented in the upcoming v5.5

Frans Bouma | Lead developer LLBLGen Pro