Brian Swan's Blog

  • Analyzing Windows Azure Storage Logs

    A few weeks ago, I wrote a post that described how to maximize throughput between HDInsight clusters and Windows Azure Storage. One of the suggestions I made was to adjust your HDInsight cluster’s self-throttling mechanism, i.e., to tune the fs.azure.selfthrottling.read.factor and fs.azure.selfthrottling.write.factor parameters. I also suggested that the best way to find the optimal parameter values was ultimately to turn on storage account logging and analyze the logs after you had run a job or two. This post describes how to use a new command-line tool (available as part of the .NET SDK for Hadoop) that makes analysis of storage account logs easy.

  • Accessing Hadoop Logs in HDInsight

    One of the questions the HDInsight team sees a lot is a variation of the question “How do I figure out what went wrong when something does go wrong?” If you are familiar with Hadoop, you are probably also familiar with rolling up your sleeves and digging into Hadoop logs to answer this question. However, we’ve found that many folks using HDInsight don’t realize that much of the logging information they are accustomed to using is readily available to them for HDInsight clusters. This is a quick post to outline the types of logs that are written to your Azure storage account when you spin up an HDInsight cluster.

  • Maximizing HDInsight throughput to Azure Blob Storage

    The HDInsight service supports both HDFS and Windows Azure Storage (Blob service) for storing data. Using Blob Storage with HDInsight gives you low-cost, redundant storage, and allows you to scale your storage needs independently of your compute needs. However, Windows Azure Storage allocates a limited amount of bandwidth to each storage account, and an HDInsight cluster of sufficient size can exceed that allocation. If this occurs, Windows Azure Storage will throttle requests. This article describes when throttling may occur and how to maximize throughput to Blob Storage by avoiding throttling.

  • Insights on HDInsight

    I think it’s about time I dust off this blog and realign it with my current focus: HDInsight. I’ve been heads-down since February (when I joined the HDInsight team) learning about “big data” and Hadoop. I haven’t had much time for writing, but I’m hoping to change that. I’ve learned quite a bit in the last few months, and I find that writing is the best way to solidify my learning (not to mention share what I’ve learned). If you have topics you’d like to see covered, let me know in the comments or on Twitter (@brian_swan) – I’ll do what I can to cover them.

  • Azure Real World: Managing and Monitoring Drupal Sites on Windows Azure

    A few weeks ago, I co-authored an article (with my colleague Rama Ramani) about how the Screen Actors Guild Awards website migrated its Drupal deployment from LAMP to Windows Azure: Azure Real World: Migrating a Drupal Site from LAMP to Windows Azure. Since then, Rama and another colleague, Jason Roth, have been working on writing up how the SAG Awards website was managed and monitored in Windows Azure. The article below is the fruit of their work…a very interesting/educational read.

  • Deploying Drupal at Scale on Microsoft Platform

    I find that every conference I attend is a humbling experience. There are just so many knowledgeable people that I’m constantly reminded of how much I don’t know. The pre-conference training at DrupalCon Denver was no exception (and the real conference hadn’t even begun!). In the Deploying Drupal at Scale on Microsoft Platform training yesterday, Alessandro Pilotti delivered a densely packed training session that, once again, left me feeling humble. Alessandro’s breadth and depth of knowledge about running PHP applications on Windows (and Drupal in particular) is truly impressive.

  • Azure Real World: Migrating Drupal from LAMP to Windows Azure

    Last month, the Interoperability team at Microsoft highlighted work done to move the Screen Actors Guild Awards Drupal website from a Linux-Apache-MySQL-PHP (LAMP) environment to the Windows Azure platform: SAG Awards Drupal Website Moves to Windows Azure. The move was the result of collaboration between SAG Awards engineers and engineers from Microsoft’s Interoperability Team and Customer Advisory Team (CAT). The move allowed the SAG Awards website to handle a sustained traffic spike during the SAG Awards show in January. Since then, I’ve had the opportunity to talk with some of the engineers who helped with the move. In this post I’ll describe the challenges and steps taken in moving the SAG Awards website from a LAMP environment to the Windows Azure platform.

  • What is Microsoft Doing at DrupalCon Denver?

    Microsoft will be at DrupalCon Denver next week, and I have the good fortune of being one of the Microsoft representatives who will be attending. The program looks great – it’s packed with great speakers and sessions, and there are lots of fun events planned. I’m excited about going for those reasons, but also because I’m curious about how this conference will be different from the last DrupalCon I attended (DrupalCon San Francisco, 2010). At that conference, I was frequently asked “What is Microsoft doing here?” You can read more about that in the post I wrote after the conference, What was Microsoft Doing at DrupalCon? (be sure to read the comments), but suffice it to say that I hope the fact that we will be at a Drupal conference (as a sponsor, no less) isn’t as surprising as it was then. In that post, I essentially said that our commitment to Drupal would (and should) be judged by our continued involvement with and contributions to the community, so I’m going to Denver with great interest in the community’s reaction to us today. Now that two years have passed since my last DrupalCon, I hope that our actions do speak to our continued involvement and contribution.

  • What SQL Server 2012 Means for PHP Developers

    Last week, Microsoft held a virtual conference to announce the availability of SQL Server 2012. The conference included a number of events (speakers, videos, training activities, etc.) that focused on the new functionality available in this release. Now that the fanfare has died down a bit, I’d like to take a look at what some of that new functionality means for PHP developers. Combined with the release of the Microsoft Driver for SQL Server for PHP, I think the SQL Server 2012 release brings some big improvements to developing PHP/SQL Server applications.