Monday, May 26, 2008

Blog moved

My blog moved a long long time ago to http://tonesdotnetblog.wordpress.com

Thursday, April 12, 2007

Transforming Xml using an Xsl stylesheet in .net 2

Here is a simple way to transform an xml document using xsl in .net 2.0. Note that the class name has changed from XslTransform to XslCompiledTransform between .net 1.1 and .net 2.0.

// Requires: using System.IO; using System.Xml; using System.Xml.Xsl;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xmlDocumentAsString);

XslCompiledTransform xslTransform = new XslCompiledTransform();
xslTransform.Load(xslFilename); // compiles the stylesheet

StringWriter writer = new StringWriter();
// OutputSettings carries the stylesheet's xsl:output options (method, indent, ...)
using (XmlWriter xmlWriter = XmlWriter.Create(writer, xslTransform.OutputSettings))
{
    xslTransform.Transform(xmlDoc, xmlWriter);
} // disposing the XmlWriter flushes it into the StringWriter
string transformedString = writer.ToString();

Tuesday, April 03, 2007

Impressed owner of a new Zune!

I am in no way a gadget person. You don't see me sitting at a desk with my earphones on. Which is why when I found myself the owner of a new Zune I was skeptical about what I would do with it.

You see, I entered Readify's internal SQL Server competition. It was a 20-week competition with roughly 5 difficult questions a week, and was put together by MVP Greg Low (see http://msmvps.com/blogs/greglow/Default.aspx). Well, I won! It looks like I'm the 2nd best SQL Server person at Readify (after Greg, of course). And I now own a Zune.

I plugged it in for the first time last night to try it out. It's actually pretty good. It has a 30 GB hard drive and gave me a 14-day free pass to access online content (music). So I downloaded a whole lot of stuff from the Zune marketplace. It's US-centric, as the Zune hasn't been released here in Australia yet. I was disappointed it didn't have some albums I like, but it did have enough music that I did like to pretty much make up for it.

I liked the quality of the movie playback. That was impressive. I might put a few Doctor Who episodes onto it and sync a few webcasts. Then it might become really useful. But I still don't think you'll see me sitting at a desk with earphones on.

The kids are over at Mum and Dad's today, and Jarod pleaded to take it with him (he really IS into gadgets). So he's over there listening to Snee Snore Snappy and the Frog Song (the new mix of the Axel F theme from Beverly Hills Cop). I do hope I get it back in one piece!

Tuesday, March 27, 2007

Use Workstation Garbage Collection (GC)

On March 8, 2007, Jeff Stuckey, the Systems Engineer Manager from Microsoft, gave a webcast on IIS and the Garbage Collector which was, well, pretty surprising really!

I have painstakingly transcribed what he said about the Garbage Collector and its performance in large, multi-application environments. Basically, the gist of it is "Use Workstation GC" and, secondly, beware of bad caching.

Here is a fragment of what he said:
"We're now running into a problem where we have 11 worker processes. Each of them have server gc running. Each of them have 4 high-priority threads trying to do Garbage collection work. This is potentially dangerous because depending on timing of when garbage collection happens you could really drag out the collections for one particular worker process if this gets interrupted in any way by any of the other high priority GC threads or any of the worker processes, so the guidance that we've been given is to go with what they call the workstation GC. This setting is in aspnet.config, as follows:

notepad c:\windows\microsoft.net\framework64\v2.0.50727\aspnet.config

<configuration>
  <runtime>
    <gcServer enabled="false" />
  </runtime>
</configuration>


Now what this does is that it converts the behaviour instead of having 4 dedicated GC threads running at high priority with their own segments, you have one GC thread and the allocation that it does for a native 64 bit machine is 256Meg with a 128Meg Large object heap. So if you have a lot of worker processes, 64-bit or 32-bit, doesn't matter, if you have a lot of worker processes on a multi-processor machine, the guidance is to run this workstation GC. It reduces the overall footprint and the initial footprint of the CLR. I believe it also kind of starts up faster. On 32-bit machines the segments are even smaller, 16Meg Initial segment and 16 Meg Large object heap.

Another option we've played around with, a little bit undocumented, is the GC segment size registry key...
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework]
"GCSegmentSize"=dword:04000000
This basically configures the segment size for server or workstation but we have been mainly using it for server GC to configure the 64-bit machines down to the 32-bit machine segment sizes just to see how it impacts performance. We've had pretty positive results. So anyway, that's just for your information...

[Later on he said...]

GC behaviour in production. This is what we see in our systems in production. ASP.NET caching generally drives your managed growth on, I don't know, 90% of our applications, when we look at GC activity, and we look at over activity, the CPU utilisation because of GC, ASP.NET caching is generally the culprit, either caching items with no expiration policy, or they're caching as much as they can, that's our experience. Gen 1 size is typically very small compared to Gen 0 and Gen 2........Cost of GC is driven by the number of objects that survive, not dead objects.

On x64 with many application pools the system can experience memory pressure which can be catastrophic in terms of GC activity. So basically what happens is that you have a lot of applications and they're all caching and they're driving the system down to the point where GC says "what's the memory load" and the system says "ow, I'm this full" and the GC says, "ow, this is memory pressure, I better start collection". All 11 app pools see this at the same time and they all start collecting, and when this happens, since the GC threads are running at high priority they can block, even, http.sys from taking connections so that the behaviour that we saw, in the availability tests was that it was failing on some requests to static gif files, which is just bizarre as you wouldn't expect static gifs to have a connection failure or any kind of failure for that matter and it turned out that it was GC activity that was killing us.

....

XML is usually a large culprit because you end up with this XML document that has literally hundreds of objects that are linked to it, most of them strings, and people like to cache up XML for performance reasons. Caching's not bad, I'm not saying that, but the guidance would be that you really get a handle on exactly what you're caching and you cache it in an intelligent way so that you're not overutilising the cache. You cache only stuff that's hot, not stuff that's only hit, you know, once or twice. The goal is to improve performance of your caching. So you don't want to cache just about everything because you will eventually drive the system into memory pressure which has a very negative impact on performance.

...."
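
After changing the setting it is worth confirming which mode a process actually ended up in. The following is a minimal sketch of my own, not something from the webcast: it uses the GCSettings class that was added in .NET 2.0, and the class name GcModeCheck is just a placeholder. In an ASP.NET site you would log the same property from somewhere like Application_Start rather than from a console Main.

```csharp
using System;
using System.Runtime;

class GcModeCheck
{
    static void Main()
    {
        // True when the runtime is using server GC, false for workstation GC.
        Console.WriteLine("Server GC: " + GCSettings.IsServerGC);

        // Processor count as seen by the runtime, for context when reading the flag above.
        Console.WriteLine("Processors: " + Environment.ProcessorCount);
    }
}
```

If this still reports server GC after editing aspnet.config, the worker process probably hasn't been recycled yet, or the file that was edited belongs to a different framework directory (32-bit vs. 64-bit).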

Move SQL Server Execution Plans to the middle tier

Most Enterprise developers are familiar with n-tier development. Over the years there has been some argument as to where you put certain logic.

In n-tier development, there is usually a UI layer for layout, a UI-oriented Business layer for UI-oriented validation logic, a Data-oriented Business layer for business rules, and a Data layer. However, it's not that simple.

You see, conventional Enterprise development says that business logic should be in the business layer(s), but then it turns around and says that if you need a particular piece of business logic to be more performant, then it should be re-written as a stored procedure.

Rewriting business logic as a database stored procedure would undoubtedly be faster. However, the very idea that business logic is now turning up in the database is conceptually wrong.

Let's take it a step further. Say you want every piece of data-oriented business logic to be performant. Suddenly, every routine that previously existed in the middle-tier business layer is now in the database. So that middle tier now provides just a pass-through to the routines in the database.

This strategy actually works, and works well. I have seen a number of medium sized systems that are implemented this way. There are actually many benefits in coding this way. You can now write a patch script to change business logic that is transactional. To get an application upgraded generally requires more bureaucracy than writing a patch script for a few stored procedures. Again, this is not a catch-all, and I wouldn't write every system this way. As always, there are many ways to skin a cat.

A major benefit of stored procedures, and why when they are written well they are so fast, is that when they are compiled, they produce an execution plan. The stored procedure knows which indexes to use and exactly where to get the data from. The downside is that not everyone has experience writing stored procedures, and maybe they shouldn't have to.

What this all comes down to for me is this. Every developer that works in the data-oriented business layer should need to know at least SQL. Stored procedures themselves are just routines, and there should be a way to replicate this sort of functionality in the middle-tier.

At this point, the smart people at Microsoft should be able to work out a system to ensure that the execution plans are compiled in the middle tier based on the routines written there. No, I don't know exactly how they'll do it - Perhaps they'll have to originally write some sort of polling/replication to the database to achieve it.

The best place to put this is probably in a hybrid of the new LINQ framework. If LINQ is as good as I think it should be, then it should be relatively easy for companies like Microsoft to plug in an optimised block for compiling execution plans in the middle tier.

If the outcome is better performing applications, then I'm all for it. And if every Enterprise application world-wide suddenly becomes faster and more efficient, with all the business logic in one location, and without developers having to learn anything new, then that is a great thing.

Monday, March 26, 2007

ASP.NET 2.0 unhandled exceptions tear down IIS worker process

According to Jeff Stuckey, Systems Engineer Manager for Microsoft, unhandled exceptions cause the whole IIS worker process to be torn down. This has serious implications: if you're running a whole lot of sites on IIS, you don't just lose the App Pool for the one that causes the error, you lose the whole lot!

Don't believe me? Check out the webcast called "Debugging CLR Internals" on iis.net: http://www.iis.net/default.aspx?tabid=2&subtabid=26&i=1059 at about the 36:06 mark. The workaround is to switch back to the legacy exception handling behaviour described at http://msdn2.microsoft.com/en-us/library/ms228965.aspx
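
For reference, here is roughly what that workaround looks like. This is a sketch based on my reading of the MSDN page, assuming it goes into the runtime section of the framework's aspnet.config file (the same file used for the gcServer setting in the Workstation GC post), so check the article before rolling it out:

```xml
<configuration>
  <runtime>
    <!-- Revert to the .NET 1.x policy for unhandled exceptions on non-request threads -->
    <legacyUnhandledExceptionPolicy enabled="true" />
  </runtime>
</configuration>
```

Bear in mind that the legacy policy hides the exception rather than fixing it, so the longer-term answer is still to catch and log exceptions on any background threads you start.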

Thursday, October 13, 2005

Some fix time data

Someone at work wanted some info on estimating, so I went to my trusty bookshelf and pulled out a copy of Watts Humphrey's "A Discipline for Software Engineering". It's got some info on how to estimate better based on historical estimates vs. actuals.

But that's not what this post is about. One little gem I came across is on page 275. It's about the cost of resolving a defect found during the different phases of the development process. As follows:

Some Fix Time Data. There are not many published data on the time required to identify software defects. Following are some that are available:

  • IBM: An unpublished IBM rule of thumb for the relative costs to identify software defects: during design, 1.5; prior to coding, 1; during coding, 1.5; prior to test, 10; during test, 60; in field use, 100.
  • TRW: The relative times to identify defects: during requirements, 1; during design, 3 to 6; during coding, 10; in development test, 15 to 40; in acceptance test, 30 to 70; during operation, 40 to 1000 [Boehm 81]
  • IBM: The relative time to identify defects: during design reviews, 1; during code inspections, 20; during machine test, 82 [Remus]
  • JPL: Bush reports an average cost per defect: $90 to $120 in inspections and $10,000 in test [Bush] {Note-published 1995}
  • Freedman and Weinberg: They report that projects that used reviews and inspections had tenfold reduction in the number of defects found in test and a 50 percent to 80 percent reduction in test costs, including the costs of the reviews and inspections.

Clearly, defect identification costs are higher during test and use. Thus anyone who seeks to reduce development cost or time should focus on preventing or removing defects before starting test.