Thursday, June 17, 2010

Entity Framework Magic

The new ADO.NET Entity Framework has been all the buzz recently. I have to admit it is quite interesting. The idea of querying your database through LINQ is quite amazing. However, as somebody who's always written my own stored procedures and implementing my business logic in them, I've been skeptical.
But as we programmers get older, I've realized we can easily get skeptical about things we don't fully understand or when we see a technology that seems "too simple" and no doubt will be less efficient. While this is usually true, it's not always the best reasoning. A programmer who writes in byte-code certainly can scoff at my use of a "high-level language" knowing that he is able to write things that are far more processor efficient. While he is right, it does take much more time to write it and because machines are much more powerful than they used to be, practically "high-level language" is going to be the better choice most but certainly not all of the time.
It is with that in mind, that I've tried to approach this new technology. But before I do, I want to understand the magic. I want to see what is going on under the hood. What kind of SQL is being generated? Fortunately, I've been able to do that.

Take a look at this LINQ query from the NerdDinner tutorial:

public class DinnersController : Controller
{
DinnerRepository dinnerRepository = new DinnerRepository();
//
// GET: /Dinners/
public ActionResult Index(int? page)
{
const int pageSize = 4;
var dinners = dinnerRepository.FindUpcomingDinners();
var paginatedDinners = dinners.Skip((page ?? 0) * pageSize)
.Take(pageSize)
.ToList();
return View("Index", paginatedDinners);
}
//...
}

public class DinnerRepository
{
//...

public IQueryable FindUpcomingDinners()
{
//TODO: 1/1/2009 is a temporary hack so I can see the old data, replace with DateTime.Now before go live!
DateTime now = DateTime.Parse("1/1/2009");
return from dinner in db.Dinners
where dinner.EventDate > now
orderby dinner.EventDate
select dinner;
}
//...
}

The LINQ query in the DinnersController (and notice there is a nested LINQ query coming from another class) returns a page of 4 records. Now my skeptical mind would assume that the Framework would probably return all the records from the database that are in the nested query and then get the paged results. Wrong! The beautiful thing about LINQ is that it only executes when you actually begin to use the data and until then it is only a "query" waiting to be run.
I hooked up SQL Server Profiler to my SqlExpress database to see how this was actually run. Here is the final SQL that hit the database:

--It does an exec sp_executesql
exec sp_executesql N'SELECT TOP (4)
[Project1].[DinnerID] AS [DinnerID],
[Project1].[Title] AS [Title],
[Project1].[EventDate] AS [EventDate],
[Project1].[Description] AS [Description],
[Project1].[HostedBy] AS [HostedBy],
[Project1].[ContactPhone] AS [ContactPhone],
[Project1].[Address] AS [Address],
[Project1].[Country] AS [Country],
[Project1].[Latitude] AS [Latitude],
[Project1].[Longitude] AS [Longitude]
FROM ( SELECT [Project1].[DinnerID] AS [DinnerID], [Project1].[Title] AS [Title], [Project1].[EventDate] AS [EventDate], [Project1].[Description] AS [Description], [Project1].[HostedBy] AS [HostedBy], [Project1].[ContactPhone] AS [ContactPhone], [Project1].[Address] AS [Address], [Project1].[Country] AS [Country], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], row_number() OVER (ORDER BY [Project1].[EventDate] ASC) AS [row_number]
FROM ( SELECT
[Extent1].[DinnerID] AS [DinnerID],
[Extent1].[Title] AS [Title],
[Extent1].[EventDate] AS [EventDate],
[Extent1].[Description] AS [Description],
[Extent1].[HostedBy] AS [HostedBy],
[Extent1].[ContactPhone] AS [ContactPhone],
[Extent1].[Address] AS [Address],
[Extent1].[Country] AS [Country],
[Extent1].[Latitude] AS [Latitude],
[Extent1].[Longitude] AS [Longitude]
FROM [dbo].[Dinners] AS [Extent1]
WHERE [Extent1].[EventDate] > @p__linq__0
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 8
ORDER BY [Project1].[EventDate] ASC',N'@p__linq__0 datetime2(7)',@p__linq__0='2009-01-01 00:00:00'

--Which looks like this:
SELECT TOP (4)
[Project1].[DinnerID] AS [DinnerID],
[Project1].[Title] AS [Title],
[Project1].[EventDate] AS [EventDate],
[Project1].[Description] AS [Description],
[Project1].[HostedBy] AS [HostedBy],
[Project1].[ContactPhone] AS [ContactPhone],
[Project1].[Address] AS [Address],
[Project1].[Country] AS [Country],
[Project1].[Latitude] AS [Latitude],
[Project1].[Longitude] AS [Longitude]
FROM ( SELECT [Project1].[DinnerID] AS [DinnerID], [Project1].[Title] AS [Title], [Project1].[EventDate] AS [EventDate], [Project1].[Description] AS [Description], [Project1].[HostedBy] AS [HostedBy], [Project1].[ContactPhone] AS [ContactPhone], [Project1].[Address] AS [Address], [Project1].[Country] AS [Country], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], row_number() OVER (ORDER BY [Project1].[EventDate] ASC) AS [row_number]
FROM ( SELECT
[Extent1].[DinnerID] AS [DinnerID],
[Extent1].[Title] AS [Title],
[Extent1].[EventDate] AS [EventDate],
[Extent1].[Description] AS [Description],
[Extent1].[HostedBy] AS [HostedBy],
[Extent1].[ContactPhone] AS [ContactPhone],
[Extent1].[Address] AS [Address],
[Extent1].[Country] AS [Country],
[Extent1].[Latitude] AS [Latitude],
[Extent1].[Longitude] AS [Longitude]
FROM [dbo].[Dinners] AS [Extent1]
WHERE [Extent1].[EventDate] > @p__linq__0
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 8
ORDER BY [Project1].[EventDate] ASC


I did notice that in situations where one record was expected to be returned, the SQL was asking for the top 2. I assume that it does this so that if two records do come back, it can throw an exception. Interesting.
From what I could find, it appears that Entity Framework takes the LINQ query to generate Entity SQL. An Entity Client provider, for a particular database, then generates SQL in the language of the target database. All that to say there is a lot of magic going on but it seems to be a lot smarter than I had imagined.
However, I do think that the SQL Query Profiler is going to be the Entity Framework developer's new best friend. This can be difficult because in some work environments, developers may not have access to tools like this and will have to petition their over-worked DBAs to get these things for them. I know that was the situation at my company a few years ago, thankfully we now at least have these privileges on our Development server.
It will be good to know, particularly if your app is going to be heavily utilized, what kind of SQL will actually be hitting your database. You can then try and tune your database to increase performance before you push this out to Q/A and especially before production.

No comments:

Post a Comment