Generating Excel Workooks without Excel


Why I wrote Excel Xml Writer
One day I found myself having to build a Web Application that one of the requirements involved generating a nice Excel Workbook that users could then play with. This required quite some styling and several other features that you can only do with Excel (like setting printer options and document properties).
 
Well, coming from Consulting, this requirement was no strange at all to me, and I had to dealt with this problem many times years ago. However then, it was a different story, Managed code and Xml were not even created, and COM was the only option I had. The only solution was using Excel Automation to build Workbooks that supported all the features I required. Yes, I know HTML could do the trick to just generate a table and set the content type to an Excel Application but this certainly leaves you lacking of control in several Excel features like document properties (Author, custom properties, etc), printer settings and more.
 
Excel Automation
If you ever worked with Excel Automation you know that it is an extremely powerful (and complicated) object model. However, this power does not come for free, every time you create an Excel.Application, you are essentially running a new Excel.exe instance which is nothing but cheap, and for that reason (and many more) you certainly do not want to do that in a Web Application where thousands of users might run the nice “Export to Excel” link and you end up with thousand of processes being created and destroyed.
 
Just to illustrate my point, I created the following sample.
 
C#
using System;
using 
Excel Microsoft.Office.Interop.Excel;
using 
System.Runtime.InteropServices;
using 
Missing System.Reflection.Missing;

static class 
Program {

    
static void Main() {
        
int tick Environment.TickCount;
        
// Create the Excel Application
        
Excel.Application excel = new Excel.Application();

        try 
{
            
// make it visible for demostration purposes
            
excel.Visible = true;

            
// Add a Workbook
            
Excel.Workbook workbook excel.Workbooks.Add(Missing.Value);

            
// Set the author
            
workbook.Author "CarlosAg";

            
// Create a Style
            
Excel.Style style workbook.Styles.Add("Style1", Missing.Value);
            
style.Font.Bold = true;

            
// Add a new Worksheet
            
Excel.Worksheet sheet =
                
(Excel.Worksheet)workbook.Worksheets.Add(Missing.Value, Missing.Value, Missing.Value, Missing.Value);

            
// Set some text to a cell
            
Excel.Range range ((Excel.Range)sheet.Cells[11]);
            
range.Style style;
            
range.Value2 "Hello World";

            
workbook.SaveAs(@"c:\test.xls", Missing.Value, Missing.Value, Missing.Value,
                            Missing.Value, Missing.Value, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlExclusive,
                            Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value)
;

            
// Finally close the Workbook and save it
            
workbook.Close(false, Missing.Value, Missing.Value);

            
// Close Excel
            
excel.Quit();

        
}
        
finally {
            
// Make sure we release the reference to the underlying COM object
            
Marshal.ReleaseComObject(excel);
        
}

        Console.WriteLine(
"Time: {0}", Environment.TickCount - tick);
    
}
}

Colorized by: CarlosAg.CodeColorizer Note: If you want to run the application you need to add a Reference to Microsoft Excel COM library.
 
Well, I ran this really simple application cold in my machine that has Office 2003, and running it took almost 3000 milliseconds. Of course if you run it again it runs in about 1 second. But this solution will just not scale in a Web Application.
 
Another big problem with this approach is the code itself, if you take a close look I had to type almost 20 references to Missing.Value.
 
Solution
Luckily ever since Office XP, Excel supports a new file format called Xml Workbook (or something like that), this allow you to create an Xml document that follows a certain schema and Excel will work as if it were the binary XLS format (though not all features at supported like Charts).
 
Now I have new options, I could just generate the Xml using an XmlDocument or even better using an XmlWriter; but doing so it is quite cumbersome, since you need to understand a lot of Xml, Schemas, and Namespaces and it is quite probably that you will mess up something like closing an element incorrectly or adding the wrong namespace, or prefix, etc.
 
For that reason I thought to build a lightweight fast wrapper to the Excel Xml Workbook schema. This way my application manipulates an object model that looks similar to Excel Automation OM but it is lightweight, 100% managed, and that in the end serialize itself into an Xml using an XmlWriter.
 
This is exactly what Excel Xml Writer is, just a simple object model that generates Xml following the Excel Xml Workbook schema. After almost done with it, I thought I could add the ability to load the Xml as well, so I added that feature. This turned out to be extremely useful when loading Excel worksheets from the Office Web Components, and really cool usage, so that you can embed workbooks in your page, and then use AJAX like technology to post the XMLData property back and load it in the server side to do the actual processing in your Database, etc.
 
You can download it for free at http://www.carlosag.net/Tools/ExcelXmlWriter/Default.aspx.
 
Now, the code to generate the same workbook we just did using my library the code looks like:
 
 
using System;
using 
CarlosAg.ExcelXmlWriter;

class 
Program {
    
static void Main(string[] args) {
        
int ticks Environment.TickCount;

        
// Create the workbook
        
Workbook book = new Workbook();
        
// Set the author
        
book.Properties.Author "CarlosAg";

        
// Add some style
        
WorksheetStyle style book.Styles.Add("style1");
        
style.Font.Bold = true;

        
Worksheet sheet book.Worksheets.Add("SampleSheet");

        
WorksheetRow Row0 sheet.Table.Rows.Add();
        
// Add a cell
        
Row0.Cells.Add("Hello World", DataType.String, "style1");

        
// Save it
        
book.Save(@"c:\test.xls");

        
Console.WriteLine("Time:{0}", Environment.TickCount - ticks);
    
}
}

Colorized by: CarlosAg.CodeColorizer

Several differences:
1)      You don’t actually need Excel installed in your server to run this program since it does not uses Excel at all, just Xml.
2)      Working set of your application is way smaller than using the Interop libraries
3)      This is more than 100 times faster to run.
4)      Code looks much more simpler.
5)      Since it generates Xml, you can actually stream it directly in an ASP.NET application to the Response.OutputStream without ever saving it into the file system.
6)      This solution will scale to thousands of users since it does not require any creation of processes.
 
Now, even better I decided to write a code generator tool so that you don’t need to write all the styling code and superficial stuff and just focus on the actual data. This tool allows you to open an Excel Xml Workbook that you have created in Excel, and it will generate the C# or VB.NET code that you can use to generate it. This means that you can create the skeleton with all the formatting options in Excel and then just generate the code.
 
 
Conclusion
Don’t get me wrong Excel team did an awesome job with the Automation support for all Office products, however this has been around for several years, but it definitely lacks of support for Web based applications (asside from Office Web Components). Luckily they are addressing this in the next release of Office where they will have some awesome support for server side applications and many exciting stuff.
 
In the mean time you might find really exciting working with Xml in Office since they have great support and it will only get better with time.

No Comments