Xml writing/reading/size :)

Otis
LLBLGen Pro Team
Posts: 39612
Joined: 17-Aug-2003
# Posted on: 25-May-2007 15:37:36   

You might think I fell off the face of the earth, but work is still progressing nicely. We're getting closer to the beta, and one of the time-consuming things I'm currently working on is improving XML serialization.

Besides the polymorphic deserialization stuff which is currently not possible in v2, two things aren't that great about LLBLGen Pro's current XML serialization/deserialization support:
- the code isn't the fastest
- the XML is still very verbose compared to plain element-data XML.

So, what's coming is a new XmlFormatAspect: Compact25. If specified, this aspect overrules the existing one, Compact, and Compact25 is the format of choice in webservices. We kept the other two formats (none and Compact) so people who serialized data to XML in v2 can read that data back in v2.x. Compact25 is supported in Adapter.

XML serialization/deserialization is possible for selfservicing and adapter. Adapter however is also the paradigm of choice in webservice scenarios where entity graphs are sent over the wire. This is also why Adapter gets support for Compact25 and selfservicing does not. This is mostly a maintenance issue: code can't be shared between adapter and selfservicing for this, and the scope where graph serialization is used in webservice scenarios with selfservicing is very limited (because selfservicing shouldn't be used when entities are exposed by the webservice). You might say: ahhhh come on, add it for selfservicing as well, but it won't be added, at least not now. Selfservicing will still be able to serialize/deserialize to XML as before, so no worries.

Main goal

The main goal is to be able to send a full entity graph over the wire in XML format and back; it's NOT meant to be a simple entity-to-XML data transformation. Please understand that an XML tree isn't a graph, it's the data in XML format. You can't rebuild a graph from that without extra info. Furthermore, state tracking is essential. This means that this info is still present in the XML, because the goal is to send entity graphs, be able to manipulate them on the client, and send them back to the service.

Details

Anyway, let's look at some numbers first (debug build code). I loaded all customers from Germany from Northwind, with their orders and employees, using a prefetch path. The object graph is then serialized to 3 files in the 3 formats, and then deserialized into a new collection (and verified).

Here are the numbers:

Writing compact25
Done. Took: 36ms
Writing compact
Done. Took: 81ms
Writing verbose
Done. Took: 81ms
Reading back Compact25
Reading xml back done. Took: 146ms
Reading back Compact
Reading xml back done. Took: 529ms
Reading back verbose
Reading xml back done. Took: 431ms
Done

The file sizes:
- Compact25 xml: 340KB
- Compact xml: 833KB
- Verbose xml: 939KB

Verbose is almost as small as compact because there aren't any validators etc. set. If these are set, verbose will grow a lot because the types are serialized into the XML; in compact they're not.

Reading back compact turns out to be slower than reading back verbose, mainly because it slows down on the lookup of single nodes in the XmlDocument, as these can be optional. Not many people will use this format in adapter anyway, as the native webservice format is Compact25 now.

All formats now use an XmlWriter for writing the XML. This should give much higher throughput because profiling showed that the XmlDocument manipulation was slowing the routine down. Reading it back is done for Compact25 with an XmlReader. The other 2 formats will keep the existing routine, because reading XML with an XmlReader is a true pain and getting it right takes a lot of time, especially with optional elements etc. etc.
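To illustrate the difference in approach, here is a minimal sketch of forward-only writing and reading with the standard System.Xml XmlWriter and XmlReader, without building an XmlDocument in memory. It is not the LLBLGen Pro implementation: the class StreamingXmlSketch and its methods are hypothetical names, and it only handles flat elements with text content, which sidesteps the optional-element handling that makes real XmlReader code painful.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only, not LLBLGen Pro code.
public static class StreamingXmlSketch
{
    // Streams name/value pairs as plain element-data XML.
    public static string WriteEntity(string elementName, IDictionary<string, string> fields)
    {
        var buffer = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Indent = false };
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            writer.WriteStartElement(elementName);
            foreach (KeyValuePair<string, string> field in fields)
            {
                // Null values are skipped in this sketch.
                if (field.Value == null) continue;
                writer.WriteElementString(field.Key, field.Value);
            }
            writer.WriteEndElement();
        }
        return buffer.ToString();
    }

    // Reads the child elements of the root back in a single forward pass.
    public static Dictionary<string, string> ReadEntity(string xml)
    {
        var fields = new Dictionary<string, string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();                 // position on the root element
            if (reader.IsEmptyElement) return fields;
            reader.ReadStartElement();              // step into the root element
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                string name = reader.LocalName;
                // Consumes the element and moves past its end tag.
                fields[name] = reader.ReadElementContentAsString();
            }
        }
        return fields;
    }
}
```

Calling WriteEntity("CustomerEntity", fields) and passing the result to ReadEntity should return the non-null fields again. The reader never looks up a node by name, which is the kind of XmlDocument lookup the profiling pointed at.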

The Compact25 XML looks like: (I indented it for display purposes. Of course normally there's no whitespace)


<EntityCollection Factory="Northwind.FactoryClasses.CustomerEntityFactory, Northwind" Format="Compact25">
    <CustomerEntity ObjectID="16cf6938-1dbc-46bf-9d71-819964b5877d">
        <CustomerId>ALFKI</CustomerId>
        <CompanyName>Alfreds Futterkiste </CompanyName>
        <ContactName>Maria Anders</ContactName>
        <ContactTitle>Sales Representative</ContactTitle>
        <Address>Obere Str. 57</Address>
        <City>Berlin</City>
        <PostalCode>12209</PostalCode>
        <Country>Germany</Country>
        <Phone>030-0074321</Phone>
        <Fax>030-0076545</Fax>
        <OrderCollection>
            <OrderEntity ObjectID="a3e37c2f-4fdb-44a7-bb13-044df519d0fc">
                <OrderId>10643</OrderId>
                <CustomerId>ALFKI</CustomerId>
                <EmployeeId>6</EmployeeId>
                <OrderDate>1997-08-25T00:00:00.0000000+02:00</OrderDate>
                <RequiredDate>1997-09-22T00:00:00.0000000+02:00</RequiredDate>
                <ShippedDate>1997-09-02T00:00:00.0000000+02:00</ShippedDate>
                <ShipVia>1</ShipVia>
                <Freight>29,4600</Freight>
                <ShipName>Alfreds Futterkiste</ShipName>
                <ShipAddress>Obere Str. 57</ShipAddress>
                <ShipCity>Berlin</ShipCity>
                <ShipPostalCode>12209</ShipPostalCode>
                <ShipCountry>Germany</ShipCountry>
                <Customer Ref="16cf6938-1dbc-46bf-9d71-819964b5877d" />
                <Employee ObjectID="c67c2c59-653b-4538-970d-5f18ecaf3262">
                    <EmployeeId>6</EmployeeId>
                    <LastName>Suyama</LastName>
                    <FirstName>Michael</FirstName>
                    <Title>Sales Representative</Title>
                    <TitleOfCourtesy>Mr.</TitleOfCourtesy>
                    <BirthDate>1963-07-02T00:00:00.0000000+02:00</BirthDate>
                    <HireDate>1993-10-17T00:00:00.0000000+02:00</HireDate>
                    <Address>
                        Coventry House
                        Miner Rd.
                    </Address>
                    <City>London</City>
                    .....
        </OrderCollection>
        <_lps fs="BCAA" es="1">
            <dbv>
                <CompanyName>Alfreds Futterkiste</CompanyName>
            </dbv>
            <ee>TestError</ee>
            <efes>
                <CompanyName>TestErrorCompanyname</CompanyName>
                <CustomerId>TestErrorCustomerId</CustomerId>
            </efes>
        </_lps>
    </CustomerEntity>
    <CustomerEntity Ref="d16bbe93-fc9e-493e-8ddf-10ed6c234268" />
    <CustomerEntity Ref="f1d3f48b-c561-4b66-9079-29b9065ce5ec" />
    <CustomerEntity Ref="7ae0e6cf-ed5f-4580-9435-036823843db7" />
    <CustomerEntity Ref="35a13d51-bb10-4ce5-81fe-af666084ee4d" />
    <CustomerEntity Ref="1c34ea82-144c-4176-a3c5-7e816915b0ef" />
    <CustomerEntity Ref="eb75dfa8-598e-46c1-b10c-2ee54feafa26" />
    <CustomerEntity Ref="a5434975-ea04-479f-80a9-0c5734a87cae" />
    <CustomerEntity Ref="e980fe69-d9fa-4ff0-b288-99c8aa228c77" />
    <CustomerEntity Ref="bcdd69e0-12a6-49f2-be63-3d3032869ba2" />
    <CustomerEntity Ref="4caefdf0-cb59-4ff3-9cbf-b7517a51f5b5" />
    <_lps f="0" />
</EntityCollection>

The _lps block is an optional element which contains the state info. It only contains info which is necessary. Flags for fields are packed in a bitarray, elements with error info are only emitted if there are errors defined. The dbv block contains the DbValues of fields which were changed and where DbValue wasn't null. Fields which are null aren't emitted.
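As an illustration of the flag packing idea, the sketch below packs a set of boolean field flags into bytes with System.Collections.BitArray and writes them as a hex string. The hex encoding, the bit order and the names FlagPackingSketch, Pack and Unpack are assumptions made for this example; this is not necessarily the encoding that produced the fs value in the sample above.

```csharp
using System;
using System.Collections;
using System.Text;

// Hypothetical helper for illustration only; the real attribute encoding may differ.
public static class FlagPackingSketch
{
    // Packs one bool per field into bytes and returns them as uppercase hex.
    public static string Pack(bool[] flags)
    {
        var bits = new BitArray(flags);
        var bytes = new byte[(flags.Length + 7) / 8];
        bits.CopyTo(bytes, 0);   // flag i ends up in bytes[i / 8], bit position i % 8
        var hex = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes) hex.Append(b.ToString("X2"));
        return hex.ToString();
    }

    // Reverses Pack; flagCount is needed because the last byte may be padded.
    public static bool[] Unpack(string hex, int flagCount)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        var bits = new BitArray(bytes);
        var flags = new bool[flagCount];
        for (int i = 0; i < flagCount; i++) flags[i] = bits[i];
        return flags;
    }
}
```

With this layout, eight flags cost two characters in the attribute instead of eight separate elements or attributes, which is where the size win comes from.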

I kept the ObjectIDs, as it's important to have the same ObjectIDs at the client and the service so you can use a context object. To leverage existing code I used the guids, instead of a guid-to-int map which could save a couple of bytes. I didn't go that far because after all, it's XML and it's verbose by nature. If a few bytes are a problem, don't use XML (but instead the very fast binary serialization we'll have thanks to Simon).
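The ObjectID and Ref attributes are the extra info that turns the XML tree back into a graph. The sketch below shows one way a reader could resolve them: a first pass creates one object per ObjectID, a second pass links each Ref to the object already created. It uses System.Xml.Linq for brevity rather than an XmlReader, and Node and GraphRebuildSketch are hypothetical types for this example, not LLBLGen Pro classes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical stand-in for an entity; only identity and relations are tracked.
public sealed class Node
{
    public string Name { get; set; } = "";
    public Guid ObjectId { get; set; }
    public List<Node> Related { get; } = new List<Node>();
}

public static class GraphRebuildSketch
{
    public static Node Rebuild(string xml)
    {
        XElement root = XElement.Parse(xml);

        // Pass 1: create exactly one Node per ObjectID found in the document.
        var byId = new Dictionary<Guid, Node>();
        foreach (XElement element in root.DescendantsAndSelf())
        {
            XAttribute id = element.Attribute("ObjectID");
            if (id == null) continue;
            Guid key = Guid.Parse(id.Value);
            byId[key] = new Node { Name = element.Name.LocalName, ObjectId = key };
        }

        // Pass 2: walk the tree and link owners to full elements and to Refs.
        var container = new Node { Name = "(root)" };
        Visit(root, container, byId);
        return container;
    }

    private static void Visit(XElement element, Node owner, Dictionary<Guid, Node> byId)
    {
        XAttribute reference = element.Attribute("Ref");
        if (reference != null)
        {
            // A Ref points at an object defined elsewhere; reuse that instance.
            // An unknown Ref throws KeyNotFoundException in this sketch.
            owner.Related.Add(byId[Guid.Parse(reference.Value)]);
            return;
        }

        Node next = owner;
        XAttribute id = element.Attribute("ObjectID");
        if (id != null)
        {
            next = byId[Guid.Parse(id.Value)];
            owner.Related.Add(next);
        }
        // Elements without ObjectID or Ref (fields, collections) are passed through.
        foreach (XElement child in element.Elements()) Visit(child, next, byId);
    }
}
```

For a customer that contains an order whose Customer element is a Ref back to that customer, the rebuilt order points at the same Node instance as the customer, so the cycle is restored instead of duplicated.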

Frans Bouma | Lead developer LLBLGen Pro
Answer
User
Posts: 363
Joined: 28-Jun-2004
# Posted on: 29-May-2007 18:21:00   

Frans,

Is it possible, for comparison purposes, to run this same test using the ISerializable interface?

I am currently looking at WCF and am very disappointed that it must use the IXmlSerializable interface (I don't see a way of changing the order ISerializable > IXmlSerializable).

I suppose moving IXmlSerializable out from the EntityBase2 class and into the generated code would be out of the question.

Otis
LLBLGen Pro Team
Posts: 39612
Joined: 17-Aug-2003
# Posted on: 29-May-2007 18:53:35   

Answer wrote:

Frans,

Is it possible, for comparison purposes, to run this same test using the ISerializable interface?

I am currently looking at WCF and am very disappointed that it must use the IXmlSerializable interface

WCF is always XML indeed. The XML I now get is very compact and emits/deserializes pretty fast.

Writing fast binary
Done. Took: 92ms. Total size is: 190334 bytes.
Deserializing fast from stream
Done. Took: 65ms
Writing fast binary
Done. Took: 9ms. Total size is: 190334 bytes.
Deserializing fast from stream
Done. Took: 9ms
Writing normal binary
Done. Took: 146ms. Total size is: 422944 bytes.
Deserializing fast from stream
Done. Took: 172ms

Code: (without prints/stopwatch calls)


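// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
MemoryStream stream = new MemoryStream();
// Select the fast serialization path before formatting.
SerializationHelper.Optimization = SerializationOptimization.Fast;
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(stream, customers);
// Rewind so deserialization starts at the beginning of the stream.
stream.Seek(0, SeekOrigin.Begin);
EntityCollection<CustomerEntity> customersFastDeserialized = (EntityCollection<CustomerEntity>)formatter.Deserialize(stream);
stream.Close();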

I don't know why it takes longer the first time; I guess that's the internal caches building up. I have to profile it to see where it eats the most time. The second time it creates a new stream etc. and it's lightning fast. It's also more compact than the XML and the normal binary serialization.
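To separate a slow first run from steady-state speed, one option is to time the same operation several times and look at the runs individually. The sketch below does that with System.Diagnostics.Stopwatch; WarmupTimingSketch and Measure are hypothetical names, and the action passed in is whatever serialization call is being measured.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration only.
public static class WarmupTimingSketch
{
    // Runs the action the given number of times and returns the elapsed
    // milliseconds of each run, so the first (cold) run can be compared
    // with the later (warm) runs.
    public static double[] Measure(Action action, int runs)
    {
        var timings = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            timings[i] = watch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}
```

The first entry of the returned array is the cold run; if only that one is slow, the cost is likely one-time setup rather than the serialization work itself.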

I tried to remove all bottlenecks in the current XML code, so it is much faster now and also more compact. I'm not sure how fast the NetDataContractSerializer is, but it will produce a truckload more XML due to the type names it needs to keep track of.

I suppose moving IXmlSerializable out from the EntityBase2 class and into the generated code would be out of the question

Moving it to the generated code won't win you a lot: cyclic refs, interface-based types, change tracking data, etc. make the XML serializer go berserk. If you need very high-speed webservices, the only thing to do is to take the proper SOA route: a high-level webservice API with a message-based interface. Then you can send small DTOs back and forth, which you can fill with projections if needed, but likely you want to work with a different set of data at that high level.

Frans Bouma | Lead developer LLBLGen Pro
Answer
User
Posts: 363
Joined: 28-Jun-2004
# Posted on: 29-May-2007 19:26:33   

Thanks for posting the numbers, that definitely puts the Compact25 serializing into perspective now.

I currently use remoting + binary serialization since I control both client and server. I would like to move to WCF though, as it contains a lot of extra stuff that remoting doesn't have.

Since the new compact format coming out is just as fast as the current remoting + binary, I think I'm in pretty good shape.

Then again, maybe a more SOA approach would be better, so later on down the road I don't have to create another API.

PS. That fast binary is impressive!

Otis
LLBLGen Pro Team
Posts: 39612
Joined: 17-Aug-2003
# Posted on: 30-May-2007 10:58:52   

Answer wrote:

Thanks for posting the numbers, that definitely puts the Compact25 serializing into perspective now.

I currently use remoting + binary serialization since I control both client and server. I would like to move to WCF though, as it contains a lot of extra stuff that remoting doesn't have.

Since the new compact format coming out is just as fast as the current remoting + binary, I think I'm in pretty good shape.

Yes, I think the new compact format will make using WCF + entity graphs a solid combination. Still, I'd go for high-level webservices if you're using webservices.

PS. That fast binary is impressive!

Thanks to Simon Hewitt, who wrote the fast serialization code for v2.0, which we licensed from him.

Frans Bouma | Lead developer LLBLGen Pro
Answer
User
Posts: 363
Joined: 28-Jun-2004
# Posted on: 01-Jun-2007 17:00:32   

Now is the IXmlSerializable interface on the entities going to produce a schema as well?

Otis
LLBLGen Pro Team
Posts: 39612
Joined: 17-Aug-2003
# Posted on: 01-Jun-2007 17:40:36   

Answer wrote:

Now is the IXmlSerializable interface on the entities going to produce a schema as well?

With the webservice templates enabled, yes, for the schema importer. Otherwise: no.

Frans Bouma | Lead developer LLBLGen Pro