Eliminating Magic Strings in ASP.NET MVC

One man's quest to get rid of quotes.
Published on Monday, January 5, 2015

Except for direct screen output, I really dislike coded string literals. Using strings to refer to properties, methods, classes, etc. makes it much easier to introduce code quality problems. This includes things like mistyped identifiers, missed refactoring renames, and poor code analysis capabilities. I am constantly on the hunt for ways to remove these "magic strings" and replace them with strongly-typed counterparts. This post describes several of the tools and techniques I've found that work well. While this post addresses the elimination of magic strings from ASP.NET MVC web applications, many of the strategies are applicable to other code as well.

What Are Magic Strings?

Before discussing how to get rid of them, lets consider what a "magic string" is. A magic string is just a string literal that's used to refer to a code artifact. They often become necessary when a language uses dynamic features that require references to other code. For example, when you want to generate a link in ASP.NET MVC you need to pass in the name of the controller and action:

@Html.ActionLink("User Profile", "UserProfileAction", "ProfileController")

Another area where you see a lot of magic strings in .NET is when you need to refer to code artifacts in attribute arguments. Since attributes are essentially constructed at compile time, they can only use primitive constants as arguments. This results in attribute declarations that look like this:

[MyAttribute("SomeClassName", "SomePropertyName")]

To some extent a lot of view code is also magic strings. For example, when you refer to a CSS class or the path to a stylesheet or JavaScript file. Heck, even the HTML tag names are magic strings. Generally speaking, a string (or identifier) that is not meant for output and that is not checked at compile-time is a magic string. Notice that I also included identifiers that aren't checked at compile-time. Magic strings don't necessarily have to be strings. The identifiers used in dynamic and anonymous objects in C# are really just syntactic sugar for strings (like the keys in a dictionary) and subject to the same shortcomings.

So why are these so bad? There are many reasons why you want to try and avoid this type of string literal if you can:

  • Mistyping. It's very easy to mistype a string literal. Without compile-time checking you'd never know you got it wrong until something doesn't work right at run-time. If you're lucky you'll get an exception. If you're not so lucky (such as a mistyped CSS class name) things just won't look quite right.
  • Refactoring. It's much easier to refactor strongly-typed identifiers. When refactoring magic strings you have to make sure you change every instance. And if you miss one, there's no warning or error until run-time.
  • Duplicative. Magic strings cause you to repeat yourself. Sometimes a lot. If you refer to the same controller and action from multiple views, you may end up using those strings in many places.
  • Lack of static code analysis. We now have so many great tools for performing static code analysis. These tools can quickly identify all the uses of a particular identifier, perform templated refactoring, and generally help us write better code faster. This includes IntelliSense, which uses static code analysis. Magic strings are invisible to static code analyzers. They can't be searched for (other than by plain text), won't show up in IntelliSense, etc.

I will admit that perhaps "magic string" is a bad term for what I'm trying to describe and eventually eliminate. In fact, Rob Conery writes a good case against this term and argues it should apply only to conceptual mismatches. He also takes issue with the underlying premise that this type of string literal is bad. Regardless, the term seems to be accepted within the C# community and I've certainly experienced complications from some of the problems above, so I do find them to be a real issue.

Now that we've considered what a magic string is and why they're bad, lets investigate some strategies for getting rid of them.

T4MVC

By far the best thing you can do in ASP.NET MVC is to get T4MVC. This library uses a T4 template to scan your project and generate all sorts of strongly-typed helpers and string constants for your controllers, actions, resources, and more. It even encapsulates all of this in special helper extensions so that you never even have to use strings and instead get a nice IntelliSense experience.

For example, you can write:

@Html.ActionLink("User Profile", MVC.Profile.UserProfile())

Instead of:

@Html.ActionLink("User Profile", "UserProfileAction", "ProfileController")

There are too many features to list here, so check out the documentation to learn more.

Also, since T4MVC uses a T4 template, the template results need to be rebuilt when your code changes. Visual Studio usually only processes T4 templates when the template itself changes. To get Visual Studio to regenerate the T4MVC helpers on every build you'll want to check out AutoT4MVC.

MetaGen

While T4MVC works wonders for all the MVC code in your application, what about the other stuff? MetaGen is a small T4 template similar to T4MVC that reads all the classes and interfaces in the project and outputs static metadata classes with const strings for their members. It's especially helpful when creating bindings in view code to properties of your view models. For example, instead of having to put a magic string into the name and/or id attributes of HTML form fields like this:

<input id="Username" name="Username" type="text">

You can refer instead to the generated const string from MetaGen like this:

<input id="@LoginMetadata.Username" name="@LoginMetadata.Username" type="text">

As with T4MVC, the MetaGen T4 template won't normally be run on code changes. AutoT4 is another Visual Studio extension that will force processing of any T4 template on project rebuild.

CSS Classes

Referring to your CSS classes can be another area where magic strings can really trip you up. Because it's so tempting to fine-tune stylesheets as you see problems, it's not unusual to accidentally remove or rename a CSS class that's being used from your view code. Thankfully there's a relatively easy way to use T4 to create a static class with const strings for all your CSS classes. This not only guards against accidental deletion or renaming, but also provides nice IntelliSense support when writing view code (and yes, Visual Studio can now scan CSS stylesheets and provides IntelliSense anyway so this last point is less important now).

I don't have a particular library to point you towards for this one, but the T4 template is relatively simple and is included below. It's not perfect and will probably need to be tweaked for your particular stylesheet (this one is designed for the Bootstrap CSS files and is used from FluentBootstrap), but it should catch most of your CSS classes.

<#@ template language="C#" hostSpecific="true" #>
<#@ assembly name="System.Core" #> 
<#@ import namespace="System.Linq" #>
<#@ import namespace="System.Globalization" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="System.Text.RegularExpressions" #>
<# Process(); #>
<#+
	// Regex for CSS classes from http://paul.kinlan.me/regex-to-get-class-names-from-css-2-0/
	readonly Regex regex = new Regex(@"\.[-]?[_a-zA-Z][_a-zA-Z0-9-]*|[^\0-\177]*\\[0-9a-f]{1,6}(\r\n[ \n\r\t\f])?|\\[^\n\r\f0-9a-f]*", RegexOptions.Compiled);

	// Regexes for removing comments from http://stackoverflow.com/questions/3524317/regex-to-strip-line-comments-from-c-sharp/3524689#3524689
	string blockComments = @"/\*(.*?)\*/";
	string lineComments = @"//(.*?)\r?\n";
	string strings = @"""((\\[^\n]|[^"\n])*)""";
	string verbatimStrings = @"@(""[^""]*"")+";
	TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

	public void Process()
	{
		WriteLine("namespace FluentBootstrap");
		WriteLine("{");
		WriteLine("\tpublic static class Css");
		WriteLine("\t{");

		// Read the CSS file and strip comments
		string css = System.IO.File.ReadAllText(Host.ResolvePath(@"Content\bootstrap.css"));
		css = css.Replace("\r\n", "\n");
		css = Regex.Replace(css,
			blockComments + "|" + lineComments + "|" + strings + "|" + verbatimStrings,
			me => {
				if (me.Value.StartsWith("/*") || me.Value.StartsWith("//"))
					return me.Value.StartsWith("//") ? Environment.NewLine : "";
				// Keep the literal strings
				return me.Value;
			},
			RegexOptions.Singleline);

		// Get all CSS classes in the file (except for icons)
		HashSet<string> cssClasses = new HashSet<string>();
		foreach (Match match in regex.Matches(css))
		{
			if(match.Success && !string.IsNullOrWhiteSpace(match.Groups[0].Value) && !match.Groups[0].Value.StartsWith(@"\") && !match.Groups[0].Value.StartsWith(".glyphicon-"))
			{
				cssClasses.Add(match.Groups[0].Value.Substring(1));	
			}
		}
		// Separate alpha & numeric portions for ordering http://stackoverflow.com/a/19288352/3042939
		// Generate the const strings
		foreach(string cssClass in cssClasses.OrderBy(x => new string(x.Where(char.IsLetter).ToArray()) + new string(x.Where(char.IsDigit).ToArray()).PadLeft(2, '0') ))
		{
			WriteLine("\t\tpublic const string " + String.Join(null, cssClass.Split(new char[]{'-'}, StringSplitOptions.RemoveEmptyEntries)
				.Select(x => textInfo.ToTitleCase(x))) + " = \"" + cssClass + "\";");		
		}

		WriteLine("\t}");
		WriteLine("}");
	}
#>

Icons and Icon Fonts

Icons are another area where CSS can come in handy, as long as your icons are declared in CSS as well. For a detailed post of how to do this, see here.

HTML Elements

You can even eliminate most HTML elements and rely on code to generate them instead. I'm not suggesting this is always a good idea. For example, there's really no reason why you should replace a simple <p> tag with code, but it can be helpful for more exotic HTML tags or when using a library like Bootstrap.

There are several libraries to assist in generating HTML from your view code. For starters, native ASP.NET MVC contains a number of HtmlHelper extensions to generate form elements. In the third-party space, the FubuMVC project published a library to create generic HTML elements via code. There are also wrappers for common CSS frameworks like Bootstrap such as TwitterBootstrapMVC or my own FluentBootstrap. There is also planned support for tag-like syntax for code helpers in ASP.NET vNext via Tag Helpers.

DIY

There is also a lot you can do in terms of best practice to eliminate magic strings. For example, try not to rely on dynamic objects if possible and use strongly-typed view models instead. Jimmy Bogard also has some good tips.

nameof()

Finally, it looks like the compiler and language designers are recognizing some of the problems with certain kinds of magic strings. One of the features I'm most looking forward to in C# 6 is the introduction of a new nameof() expression. It will return a string literal for any property and can help eliminate magic strings from error messages, validation checks, UI output, etc.