Friday, March 28, 2008

Implementing Disconnected Deletion Change Tracking

In one of my previous blog posts, I described some of the difficulties with change tracking entities which have been removed (i.e. deleted).

The main problem was that once you remove an entity whilst "disconnected", it's no longer referenced by anything, and so the object disappears and hence the entity is no longer available when re-attaching to a new data context.

In the short term, I added a property called "IsDeleted" to the entity base which people could use instead of using the remove method (or setting a reference to a child property to null), but this had its disadvantages - mainly that the user would have to set this themselves (i.e. it wouldn't get picked up automatically on remove) and would unnaturally need to keep the object around.

So the obvious thing to do was to keep a reference somewhere to the entity when it's deleted (removed), so it can be re-attached and deleted later on. But where would this entity be kept? In the parent that deleted it? In the root object perhaps? In an external change tracking object?

To keep the Entity Base consistent, I decided to keep all the functionality in the Entity Base class, which ruled out having an external object tracking the changes.

Then I went through a lot of options regarding where to store the detached objects and came up with the simplest solution possible - I used the existing infrastructure provided by my Entity Base class - the ToEntityTree() method - as this was the option which seemed the least troublesome for the developer to use.

So, what I have done is implement a "SetAsChangeTrackingRoot()" method which the developer can call before making any changes to the entity objects.

The developer would use this method to mark the section of the Entity Tree (the Entity branch) that would be change tracked.

When this method is invoked on an entity, the following would happen:

1. A snapshot of the entity branch would be taken, starting from the entity that the method was invoked on.

2. Each entity in the branch would be told that it is being change tracked.

This meant a snapshot of the entity branch would be kept locally with the root of the branch, and that the root would also be the entity used for synchronisation with the data context later on.

From there, it was just a matter of waiting for the property changed event to fire on an entity (exposed by INotifyPropertyChanged), and checking whether the property being changed was a Foreign Key reference (meaning a child to parent relationship) and whether the value was being set to NULL (i.e. detaching the entity from its parent). Once these conditions were met, I set the IsDeleted flag automatically, marking the object for delete.
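The detection step described above could look roughly like the sketch below. It is not the actual Entity Base code: TrackedEntity, IsChangeTracked and the sample Customer/Order classes are hypothetical names made up for illustration. Only INotifyPropertyChanged and AssociationAttribute.IsForeignKey are real framework members (the latter needs a reference to System.Data.Linq).

```csharp
using System;
using System.ComponentModel;
using System.Data.Linq.Mapping;
using System.Reflection;

// Hypothetical stand-in for the entity base class, not the library's actual code.
public abstract class TrackedEntity
{
    public bool IsChangeTracked { get; set; }
    public bool IsDeleted { get; private set; }

    protected TrackedEntity()
    {
        // Generated LINQ to SQL entities implement INotifyPropertyChanged.
        INotifyPropertyChanged notifier = this as INotifyPropertyChanged;
        if (notifier != null)
            notifier.PropertyChanged += OnPropertyChanged;
    }

    private void OnPropertyChanged(object sender, PropertyChangedEventArgs e)
    {
        if (!IsChangeTracked || string.IsNullOrEmpty(e.PropertyName))
            return;

        PropertyInfo property = GetType().GetProperty(e.PropertyName);
        if (property == null)
            return;

        // Only child-to-parent (foreign key) associations are of interest.
        AssociationAttribute association = (AssociationAttribute)
            Attribute.GetCustomAttribute(property, typeof(AssociationAttribute));
        if (association == null || !association.IsForeignKey)
            return;

        // Parent reference set to null: the entity has been detached.
        if (property.GetValue(this, null) == null)
            IsDeleted = true;
    }
}

// Hypothetical sample entities, hand-written to mimic generated ones.
public class Customer : TrackedEntity { }

public class Order : TrackedEntity, INotifyPropertyChanged
{
    private Customer _customer;

    public event PropertyChangedEventHandler PropertyChanged;

    [Association(IsForeignKey = true)]
    public Customer Customer
    {
        get { return _customer; }
        set
        {
            _customer = value;
            if (PropertyChanged != null)
                PropertyChanged(this, new PropertyChangedEventArgs("Customer"));
        }
    }
}
```

Setting order.Customer to another customer leaves the flag alone; only setting it to null marks the order for deletion. In real code you would probably cache the reflection lookups rather than repeat them on every event.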

Next, I modified the ToEntityTree() method to include these "deleted" entities, as they would no longer be picked up in the traversal of the entity tree. It now returns a complete list of all entities, including the deleted objects.

The SyncroniseWithDataContext() method then used the information returned from the ToEntityTree() method to figure out what to attach, insert and delete.

One issue I came across was the deletion of child entities under the entity that was marked for deletion. If you simply removed an entity that already had children, the submit changes would fail, because LINQ to SQL doesn't support cascading deletes unless they are specified in the database schema, and so any foreign key constraint linking to the record being deleted would cause SQL Server to throw an exception.

I also couldn't rely on the developer to delete the child entities first and then delete the top most entity as they could do it in any order and the order of the deletions is crucial - the child objects must be deleted first.

Instead, I decided by default to have my own cascade delete functionality, so when an object is removed, I automatically remove any child objects, starting with the child leaves of the branch. This was achieved by calling the ToEntityTree() method internally and using the reverse function, so that the order runs from the child leaves all the way back to the root of the change tracking.

Even though by default calling the SyncroniseWithDataContext() will perform cascading deletes, I have added an optional parameter so that it can be disabled if need be - which is handy if you didn't expect there to be children of the object you are deleting OR you are handling cascading deletes in the database anyway.
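To make the ordering idea concrete, here is a small sketch under simplifying assumptions. Node, DeletePlanner and the cascadeDelete parameter name are hypothetical, and children are held in a plain list rather than discovered by reflection. It only shows why reversing a parent-first list gives a safe deletion order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal entity; the real base class discovers children by reflection.
public class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();

    public Node(string name) { Name = name; }

    // Parent-first flattening, in the spirit of ToEntityTree().
    public IEnumerable<Node> ToEntityTree()
    {
        yield return this;
        foreach (Node child in Children)
            foreach (Node descendant in child.ToEntityTree())
                yield return descendant;
    }
}

public static class DeletePlanner
{
    // Returns the entities to delete, in the order they should be deleted.
    public static List<Node> GetDeleteOrder(Node deletedEntity, bool cascadeDelete)
    {
        // Without cascading, only the entity itself is deleted
        // (the database may handle the children, or there are none).
        if (!cascadeDelete)
            return new List<Node> { deletedEntity };

        // In a parent-first list every parent precedes its descendants,
        // so the reversed list puts every child before its parent.
        return deletedEntity.ToEntityTree().Reverse().ToList();
    }
}
```

For an Order1 with children OrderDetails1 and OrderDetails2, the cascading order comes out as OrderDetails2, OrderDetails1, Order1, so no foreign key still points at a row when it is deleted.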

So that's how I've achieved automatic deletion tracking :).

Some more thoughts

After building the LINQ Entity base class in this way, I realised it would be reasonably easy to move all the logic into an external object (not an entity) which was similar to the standard DataContext but performed the tasks in an offline way.

Some people would feel more comfortable with this perhaps, because of the similarities with the existing data context.

I may investigate this further in the near future, and perhaps we'll have an alternative if people want it for change tracking whilst disconnected.

Thursday, March 27, 2008

New Version of LINQ to SQL Entity Base Class (Beta 2.0)

Hi there!

Just wanted everyone to know I've released a new version of the LINQ to SQL Entity base class.

It now supports change tracking for deletes, as well as the ability to cascade delete.

I guess from my point of view it's feature complete for ASP.NET use, assuming you store the entities in the session.

Still have to work on serialization for WCF and other uses.

LINQ to SQL Entity Base Class Version 1.0 Beta 2.0

Anyhow, check it out and let me know what you think.


Cheers

Matthew Hunter

Friday, March 14, 2008

Implementing Change Tracking when disconnected.

If you have a look at the LINQ to SQL Entity Base source code on CodePlex, you'll see that the way I've implemented change tracking is by putting a few flags on the base class: IsNew, IsModified and IsDeleted.

I've been able to get IsNew & IsModified to set automatically. Here's how they work:

IsNew
This can be established by checking the entity's RowVersion (TimeStamp) field (which BTW is a requirement for this to work). If the RowVersion is null, it's never been applied to the database (as the database sets this value, not the developer) and hence we can tell with absolute certainty that it's a new object.
But the RowVersion property is in the child class, so how did I accomplish this from the base class?
Since the RowVersion property is in the child class, there are a few options we can use to achieve this:

(1) Write extra code in the child entity class
Nope, this is out of the question! We are trying to avoid coding here!

(2) Create an interface (OK)
This is probably the best action for performance, but you need to make sure that all entities use the same column name for the RowVersion. If you use this method, you could cast the current object to this interface and get the value that way. Of course, to get the entity to implement this, you'll need to force it to implement the interface by adding it to the DataContext dbml file (just as described here).
However, as I'm writing something to share with one and all, and no doubt everyone's gonna want to name it differently, this isn't the appropriate option (however it's still a damn good one!).

(3) Use a virtual property and override it in the entity (OK)
This is also good for performance, however it also means that you need to set every RowVersion property to have its access modifier set to "override", which is a bit annoying. Personally I'm impressed that you can do this in the DBML model viewer, but it's still a little hard to maintain when you are adding tables - just another thing to remember, and again everything has to be set to the same name for it to work.

(4) Use reflection (OK)
This is not so bad, we can simply get the properties using reflection and find which property is marked with ColumnAttribute.IsVersion = true. Seeing there can only be 1 of these per table (enforced by SQL Server) this is pretty safe. It also means I can throw a custom exception with a message if I discover that there is no RowVersion field and let the developer know.

So, after considering the options, I went with the latter option, reflection, mainly because it's the most flexible for this situation, but all things being equal I think the best option, if you can control it, is to use an interface as in (2) above.
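As a rough illustration of option (4), the sketch below looks up the version column by reflection and treats a null value as "new". RowVersionHelper and the sample Customer class are hypothetical names for this example. ColumnAttribute.IsVersion and System.Data.Linq.Binary are the real LINQ to SQL types, so a reference to System.Data.Linq is assumed.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Reflection;

// Hypothetical helper; the real base class may organise this differently.
public static class RowVersionHelper
{
    // Finds the property mapped with [Column(IsVersion = true)],
    // or throws a descriptive exception if the entity has none.
    public static PropertyInfo FindRowVersionProperty(Type entityType)
    {
        foreach (PropertyInfo property in entityType.GetProperties())
        {
            ColumnAttribute column = (ColumnAttribute)
                Attribute.GetCustomAttribute(property, typeof(ColumnAttribute));
            if (column != null && column.IsVersion)
                return property;
        }

        throw new InvalidOperationException(
            "Entity '" + entityType.Name + "' has no RowVersion (timestamp) column; " +
            "one is required for change tracking.");
    }

    // A null RowVersion means the database has never written the row.
    public static bool IsNew(object entity)
    {
        PropertyInfo rowVersion = FindRowVersionProperty(entity.GetType());
        return rowVersion.GetValue(entity, null) == null;
    }
}

// Hypothetical entity, hand-written to mimic a generated one.
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public int CustomerId { get; set; }

    [Column(IsVersion = true, IsDbGenerated = true)]
    public Binary RowVersion { get; set; }
}
```

Because the lookup goes by attribute rather than by name, the column can be called anything. The cost is a reflection scan, which you would likely want to do once per type and cache.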



IsModified
This one's easy: there's already an interface supplied called INotifyPropertyChanged that each entity implements, which you can then use by attaching a handler to the child class's PropertyChanged event from your parent (base) class. Whenever this event is raised, we know a column has been changed and we know to set the IsModified flag to true.
Interestingly enough too, if the event is raised and it's the RowVersion (TimeStamp) property that's being updated, we know that the data has just been applied to the database, and hence we know we can reset IsNew, IsModified and IsDeleted to false. So this is something we definitely want to do if, after committing the data, we want to keep working with our Entity Tree.
One problem though: I noticed that this event is also raised for child entities and entity collections, not just columns. I need to avoid these non-column events because they are not the type of property changes I am looking for. So, with a little reflection, I can find out if the property has an AssociationAttribute applied to it and ignore the change events raised for these. So that's solved too.
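Put together, the IsModified rules above might be sketched like this. FlaggedEntity, Address and the sample Customer are hypothetical and hand-written for the example. The attributes and INotifyPropertyChanged are the real framework types (System.Data.Linq reference assumed).

```csharp
using System;
using System.ComponentModel;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Reflection;

// Hypothetical base class for illustration; not the library's actual code.
public abstract class FlaggedEntity
{
    public bool IsNew { get; set; }
    public bool IsModified { get; set; }
    public bool IsDeleted { get; set; }

    protected FlaggedEntity()
    {
        INotifyPropertyChanged notifier = this as INotifyPropertyChanged;
        if (notifier != null)
            notifier.PropertyChanged += OnPropertyChanged;
    }

    private void OnPropertyChanged(object sender, PropertyChangedEventArgs e)
    {
        PropertyInfo property = string.IsNullOrEmpty(e.PropertyName)
            ? null : GetType().GetProperty(e.PropertyName);
        if (property == null)
            return;

        // Child entities and collections raise the event too; skip them.
        if (Attribute.IsDefined(property, typeof(AssociationAttribute)))
            return;

        ColumnAttribute column = (ColumnAttribute)
            Attribute.GetCustomAttribute(property, typeof(ColumnAttribute));
        if (column != null && column.IsVersion)
        {
            // The database has just written a new RowVersion: all flags reset.
            IsNew = false;
            IsModified = false;
            IsDeleted = false;
            return;
        }

        IsModified = true;
    }
}

public class Address { }

// Hypothetical entity, hand-written to mimic a generated one.
public class Customer : FlaggedEntity, INotifyPropertyChanged
{
    private string _name;
    private Binary _rowVersion;
    private Address _homeAddress;

    public event PropertyChangedEventHandler PropertyChanged;

    [Column]
    public string Name
    {
        get { return _name; }
        set { _name = value; Raise("Name"); }
    }

    [Column(IsVersion = true, IsDbGenerated = true)]
    public Binary RowVersion
    {
        get { return _rowVersion; }
        set { _rowVersion = value; Raise("RowVersion"); }
    }

    [Association]
    public Address HomeAddress
    {
        get { return _homeAddress; }
        set { _homeAddress = value; Raise("HomeAddress"); }
    }

    private void Raise(string propertyName)
    {
        if (PropertyChanged != null)
            PropertyChanged(this, new PropertyChangedEventArgs(propertyName));
    }
}
```

Changing Name flips IsModified to true, changing HomeAddress is ignored, and a new RowVersion value clears all three flags.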

IsDeleted
*** UPDATE --> I've come up with a solution to this problem, see this link for more details ***

I'm still looking for a good way to do this. Unfortunately, the one drawback with the way the entities are organised when disconnected is that there's no good place to handle this, because if you remove the entity, you remove the entity - it's gone - not much use setting a flag if you can no longer find it!

One idea I have is to store the object in the parent, but I haven't got around to working this one out. It's definitely the trickiest of the lot.

So for now, there's just a simple flag indicating that the object needs to be deleted, which is not ideal at this stage, but it mostly works, apart from where you have a single child entity (not a collection of entities) and you want to delete it and replace it with a different object - currently you have to commit to the database in between, otherwise, again, you'll lose the original object.

Cheers

Matt.

Entity Base Project Added to Codeplex

For anyone that's interested, I've added an example project of some of the things I've found to CodePlex and called it the LINQ to SQL Entity Base.

Check it out at the following address:

http://www.codeplex.com/LINQ2SQLEB

It just demonstrates some change tracking, the entity tree enumeration feature and auto-syncing to the database.

Anyway, go check it out if you want to see how some of these things I've found out can be put to use!

Cheers

Matt.

Thursday, March 13, 2008

Disconnected LINQ To SQL Tips Part 2



How to Enumerate the Entity Tree Graph


I really, really like the way that the standard IEnumerator interface works, and in conjunction with the "yield" keyword and a little reflection, I came up with a cunning plan that would help me with my Disconnected model.

Basically, I needed a way to find all objects that were change tracked in an entity tree, otherwise it would be very manual and an awful lot of code to re-attach the entities. It's not so much of a problem when you have just a parent and some children like so:

Customer1
-> Order1
-> Order2
-> Order3


Here you could just re-attach your objects, first the customer and then, in a nice little loop, the orders.


This will be fairly light and not a lot of code, but I thought that in a lot of circumstances the tree would be a lot more complicated like this:


Customer1
-> Order1
->->OrderDetails1
->->OrderDetails2
-> Order2
->->OrderDetails3
->->OrderDetails4
-> Order3
-> Order4


This starts to look a little complicated because now you have to write a lot of code that loops through each level of the tree attaching as necessary... and you can start to imagine trees that may be 20 entities deep and all over the place. Yuk.


So, basically with a little help from reflection and with the use of the base class, I could put together a nice little function that would traverse the entire tree and return it as one list. This is useful for a lot of reasons, not just for finding changed objects: you now have the ability to find an object or objects in the tree without hardcoding paths in your LINQ statements.


Of course, I've only done parent->child relationships and ignored foreign key ones (Child <- Parent) to avoid overflowing the stack.

This code is in the base class for all entities that will allow you to enumerate against a tree of entities. Please note the following:

  • Yes, it will work in connected mode as well, i.e. when a datacontext is attached.

  • You can query it with LINQ, for example to pick out just the orders:

    var temp = from o in customer.GetEntityHierarchy().OfType<Order>() select o;

  • If you're wondering why I've put it in a private class instead of just exposing IEnumerator/IEnumerable on the entity itself - it's because at runtime it seems to fail. I think Microsoft have put some sort of runtime check on it, perhaps because they want to reserve IEnumerator for implementation later on.




using System;
using System.Collections;
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;
using System.Text;
using System.ComponentModel;
using System.Runtime.Serialization;
using System.Reflection;
 
namespace LINQEntityBaseExample1
{
    public abstract class LINQEntityBase
    {
        // stores the property info for associations
        private Dictionary<string, PropertyInfo> _entityAssociationProperties 
                    = new Dictionary<string, PropertyInfo>(); 
        //used to hold the private class that allows entity hierarchy to be enumerated
        private EntityHierarchy _entityHierarchy; 
        
        /// <summary>
        /// Constructor!
        /// </summary>
        protected LINQEntityBase()
        {
            // Note: FindAssociations() finds association property infos
            // using reflection (where IsForeignKey != true).
            // I have left this function out just to keep this short.
            _entityAssociationProperties = FindAssociations();
            // pass in the current object and its property associations
            _entityHierarchy = new EntityHierarchy(this, _entityAssociationProperties);
        }
 
        /// <summary>
        /// This method flattens the hierarchy of objects into a single list that can be queried by LINQ
        /// </summary>
        /// <returns></returns>
        public IEnumerable<LINQEntityBase> GetEntityHierarchy()
        {
            return (from t in _entityHierarchy
                    select t);
        }
 
        /// <summary>
        /// This class is used internally to implement IEnumerable, so that the hierarchy can
        /// be enumerated by LINQ queries.
        /// </summary>
        private class EntityHierarchy : IEnumerable<LINQEntityBase>
        {
            private Dictionary<string, PropertyInfo> _entityAssociationProperties;
            private LINQEntityBase _entityRoot;
 
            public EntityHierarchy(LINQEntityBase EntityRoot, Dictionary<string, PropertyInfo> EntityAssociationProperties)
            {
                _entityRoot = EntityRoot;
                _entityAssociationProperties = EntityAssociationProperties;
            }
 
            // implement the GetEnumerator Type
            public IEnumerator<LINQEntityBase> GetEnumerator()
            {
                // return the current object
                yield return _entityRoot;
 
                // return the children (using reflection)
                foreach (PropertyInfo propInfo in _entityAssociationProperties.Values)
                {
                    // A null value means the association was never loaded
                    // (e.g. deferred loading is disabled), so there is nothing to walk.
                    object value = propInfo.GetValue(_entityRoot, null);
                    if (value == null)
                        continue;

                    // Is it an EntitySet<> ?
                    if (propInfo.PropertyType.IsGenericType && propInfo.PropertyType.GetGenericTypeDefinition() == typeof(EntitySet<>))
                    {
                        // It's an EntitySet<>, so loop through each item and
                        // return its section of the tree.
                        foreach (object item in (IEnumerable)value)
                        {
                            LINQEntityBase currentEntity = item as LINQEntityBase;
                            if (currentEntity == null)
                                continue;

                            foreach (LINQEntityBase subEntity in currentEntity.GetEntityHierarchy())
                            {
                                yield return subEntity;
                            }
                        }
                    }
                    else if (propInfo.PropertyType.IsSubclassOf(typeof(LINQEntityBase)))
                    {
                        // Ask this single child for its section of the tree.
                        foreach (LINQEntityBase subEntity in ((LINQEntityBase)value).GetEntityHierarchy())
                        {
                            yield return subEntity;
                        }
                    }
                }
            }
 
            // implement the GetEnumerator type
            IEnumerator IEnumerable.GetEnumerator()
            {
                return this.GetEnumerator();
            }
        }
 
    }
 
}
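The listing above leaves FindAssociations() out. Purely as a guess at what such a function could look like (this is not the author's omitted code), here is a standalone sketch that collects the properties marked with an AssociationAttribute where IsForeignKey is not set. AssociationFinder and the sample Customer/Order classes are hypothetical, and a reference to System.Data.Linq is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Linq.Mapping;
using System.Reflection;

// Hypothetical helper; a guess at the shape of the omitted function.
public static class AssociationFinder
{
    // Returns parent->child association properties keyed by property name.
    public static Dictionary<string, PropertyInfo> FindAssociations(Type entityType)
    {
        Dictionary<string, PropertyInfo> result = new Dictionary<string, PropertyInfo>();

        foreach (PropertyInfo property in
                 entityType.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            AssociationAttribute association = (AssociationAttribute)
                Attribute.GetCustomAttribute(property, typeof(AssociationAttribute), false);

            // Skip child->parent (foreign key) references so that the
            // traversal never walks back up the tree and loops forever.
            if (association != null && !association.IsForeignKey)
                result.Add(property.Name, property);
        }

        return result;
    }
}

// Hypothetical entities; generated ones would use EntitySet<Order> and EntityRef<Customer>.
public class Customer
{
    [Association]
    public List<Order> Orders { get; set; }
}

public class Order
{
    [Association(IsForeignKey = true)]
    public Customer Customer { get; set; }
}
```

For the sample classes, Customer yields a single entry ("Orders") while Order yields none, because its only association points back at its parent.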

Tuesday, March 4, 2008

Superclass your entities without using SQLMetal

Often I look through posts and find that people are repeating a lot of code in partial classes, when some of this work could be done in a parent class. The only documented way to do this seems to be the SQLMetal.exe command line tool. However, it's entirely possible to do this without SQLMetal.exe. To make this possible, it's just a simple matter of using notepad to edit your existing dbml file and adding the following to the "Database" element:

EntityBase="[EntityBase]"

Where [EntityBase] can be replaced with the name of your superclass.

Next, save the file and go back to your project. Right click on the DBML file in the VS project and select "Run custom tool"... the next thing you know all your LINQ to SQL objects will be subclasses to what you specified.

You can update this anytime, without screwing anything up... and the way it does it is pretty lazy. There's no checking of any sort, so you can prefix your class with a namespace, add multiple interfaces or just have your entities implement an interface without a base class.

E.g.

EntityBase="Sample.EntityBase"
EntityBase="Sample.EntityBase, IMyInterface1, IMyInterface2"
EntityBase="IMyInterface"
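To picture the effect, the sketch below shows a hand-written superclass and roughly the shape of entity declaration you would expect the code generator to produce for the second example above. The Describe() method and the body of Customer are made up for illustration, and the exact generated code may well differ.

```csharp
using System;
using System.ComponentModel;

namespace Sample
{
    public interface IMyInterface1 { }
    public interface IMyInterface2 { }

    // Hand-written superclass: shared behaviour lives here once,
    // instead of being repeated in every partial class.
    public abstract class EntityBase
    {
        // Hypothetical shared helper, just to show inherited behaviour.
        public string Describe()
        {
            return GetType().Name;
        }
    }

    // Roughly the shape of the generated declaration (an assumption):
    // the EntityBase attribute text is placed ahead of the usual interfaces.
    public partial class Customer : Sample.EntityBase, IMyInterface1, IMyInterface2,
                                    INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;
    }
}
```

With that in place, new Sample.Customer().Describe() returns "Customer" without any code in the entity's own partial class.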

Anyway, I've already used this for a number of reasons - quite useful.

Cheers

Matt.

Monday, March 3, 2008

Disconnected LINQ to SQL Tips Part 1

Intro!
In my research on LINQ to SQL and trying to work around the limitation of no "out of the box" disconnected (n-tier) mode, I've come across a lot of things that others may find useful.

I was somewhat disappointed by this drawback of LINQ to SQL, seeing that it was only intended to be used in a "connected" scenario. I saw it as a challenge to figure out ways in which a "disconnected" scenario could be done.

Hence, here I am a first time virgin blogger - who felt compelled to reduce the sweat and tears of others while dealing with this double edged sword.

I'll be blogging how "Disconnected LINQ" can be achieved in a later post, but first up here's some tips for some of you out there that might be struggling with "Disconnected LINQ".

The 'How to do Disconnected' Rules
First up, here are the rules for allowing disconnected LINQ to SQL. To successfully disconnect a LINQ to SQL entity from a Data Context and allow it to re-connect to a different Data Context, you must do the following:

1. Enable Concurrency Tracking
You can enable concurrency tracking by adding a timestamp (rowversion) field to your database and including it in your LINQ to SQL model.

Alternatively, if you don't care for concurrency tracking, you can set the Update Check property of every column to Never.

2. Disable Deferred Loading, Load Everything or Serialize the objects
Disabling lazy loading is fairly straightforward, it's just a simple matter of setting DeferredLoadingEnabled = false like so:



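List<Customer> customers;

using (EntitiesDataContext db = new EntitiesDataContext())
{
    db.DeferredLoadingEnabled = false;

    // ToList() runs the query now, while the data context is still alive.
    customers = (from c in db.Customers
                 where c.CustomerId == CustomerId
                 select c).ToList();
}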

This tells LINQ to SQL not to query the database when an association (another object or collection) is referenced; instead it will simply return null (or an empty collection).

Alternatively, you can also load all related objects so there is nothing to lazy load. This can be done by using the datacontext load options object like so:


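List<Customer> customers;

using (EntitiesDataContext db = new EntitiesDataContext())
{
    // Load options must be set before the first query runs on this context.
    DataLoadOptions lo = new DataLoadOptions();
    lo.LoadWith<Customer>(c => c.Dependants);
    db.LoadOptions = lo;

    // ToList() runs the query now, while the data context is still alive.
    customers = (from c in db.Customers
                 where c.CustomerId == CustomerId
                 select c).ToList();
}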

Basically, this is telling the data context that whenever it loads a Customer, it should also load the Dependants for that customer. If all the related objects or collections are covered in this way, LINQ to SQL won't even consider deferred loading because it believes it has all the possible connected objects.

As for serialization, this automatically disables deferred loading as above ... and how to serialize is covered next....

Serializing and copying the objects
LINQ to SQL entities cannot be serialized using standard serialization techniques, which means you can't just pop one in the view state or ASP.NET state server without running into some trouble.

In order to serialize a LINQ to SQL object graph (a root object and its child entities) you'll need to use the WCF data contract serializer instead.

But wait! Even before you do that, you'll need to set your Data Context's serialization mode to "Unidirectional" (which is available in the model properties). This means that only references in the object graph from parent to child will be serialized, and therefore child to parent references will be ignored. This is good because it means that circular references won't cause problems. The standard serializer can't be used for exactly this reason: it would fail when it tried to serialize the circular references.
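The effect of "Unidirectional" mode can be imitated with a couple of hand-written classes. In the sketch below (hypothetical Customer/Order classes, with a plain List instead of an EntitySet), only the parent to child member carries [DataMember], so the data contract serializer never follows the reference back to the parent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical classes imitating what a Unidirectional model marks up.
[DataContract]
public class Customer
{
    [DataMember(Order = 1)]
    public int CustomerId { get; set; }

    // Parent -> child: marked as a data member, so it is serialized.
    [DataMember(Order = 2)]
    public List<Order> Orders { get; set; }
}

[DataContract]
public class Order
{
    [DataMember(Order = 1)]
    public int OrderId { get; set; }

    // Child -> parent: no [DataMember], so the serializer skips it
    // and the circular reference is never followed.
    public Customer Customer { get; set; }
}

public static class RoundTripDemo
{
    // Serializes the graph to memory and reads it straight back.
    public static Customer RoundTrip(Customer source)
    {
        DataContractSerializer serializer = new DataContractSerializer(typeof(Customer));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Customer)serializer.ReadObject(stream);
        }
    }
}
```

After a round trip the copy still has its orders, but each order's Customer reference comes back as null. That is worth remembering: child to parent references need re-establishing after deserialization if your code relies on them.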

Here are a couple of functions which can be used to serialize/deserialize a LINQ to SQL object graph, and also a handy copy function which will make a completely separate copy of an object graph.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.Serialization;
using System.Text;
using System.IO;
using System.Xml;
 
namespace SampleFramework
{
    public static class LINQHelper
    {       
 
        /// <summary>
        /// Makes a copy of an existing LINQ to SQL entity and its children.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="entitySource">The LINQ to SQL entity to copy</param>
        /// <returns></returns>
        public static T CopyEntityDeep<T>(T entitySource)
        {
            if (entitySource == null)
                return default(T);
 
            return (T)DeserializeEntity(SerializeEntity(entitySource), entitySource.GetType());
        }
 
        /// <summary>
        /// Makes a copy of a list of existing LINQ to SQL entities and their children.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="entitySourceList">The list of LINQ to SQL entities to copy</param>
        /// <returns></returns>
        public static List<T> CopyEntityListDeep<T>(List<T> entitySourceList)
        {
            List<T> result = new List<T>();
 
            if (entitySourceList == null)
                return null;
 
 
            foreach (T entitySource in entitySourceList)
            {
                T entityTarget = CopyEntityDeep(entitySource);
 
                result.Add(entityTarget);
            }
 
            return result;
 
        }
     
        public static string SerializeEntity<T>(T entitySource)
        {
            // Check for null before calling GetType(), otherwise this throws.
            if (entitySource == null)
                return null;

            DataContractSerializer dcs = new DataContractSerializer(entitySource.GetType());

            StringBuilder sb = new StringBuilder();
            XmlWriter xmlw = XmlWriter.Create(sb);
            dcs.WriteObject(xmlw, entitySource);
            xmlw.Close();

            return sb.ToString();
        }
 
        public static object DeserializeEntity(string entitySource, Type entityType)
        {
            object entityTarget;
 
            // Nothing to deserialize without a target type or any source XML.
            if (entityType == null || entitySource == null)
                return null;
 
            DataContractSerializer dcs = new DataContractSerializer(entityType);
 
            StringReader sr = new StringReader(entitySource);
            XmlTextReader xmltr = new XmlTextReader(sr);
            entityTarget = (object)dcs.ReadObject(xmltr);
            xmltr.Close();
 
            return entityTarget;
        }
    }
}

Wrapping up
Next post, I hope to move further into the disconnected model that I've come up with that uses some of the above techniques to track changes.

Cheers

Matt.