Sunday, December 26, 2010

Impress your colleagues with your knowledge about… PDB files

Most developers know that PDB files help you in some way with debugging, but that's about it. They are a dark art for most developers, completely understood only by a few evil magicians. Let me help you understand what PDB files are and how they can make your debugging experience a lot easier. First read the following 3 important rules and never forget them!

Rule 1 – PDB files are as important as source code

First and foremost, PDB files are as important as source code! Debugging a problem on a production server without the PDB files that match the deployed build can cost you tons of money. Without the matching PDB files, your debugging challenge becomes nearly impossible.

Rule 2 – As a development shop, I should have a Symbol Server

At a minimum, every development shop must set up a Symbol Server. A Symbol Server stores the PDBs and binaries for all your public builds. That way, no matter which build someone reports a crash or problem against, the debugger can access the exact matching PDB file for that public build. Both Visual Studio and WinDBG know how to access Symbol Servers, and if the binary is from a public build, the debugger will get the matching PDB file automatically.
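To make this a bit more concrete, here is a minimal sketch of how a debugger is usually pointed at a Symbol Server: through a symbol path of the form srv*<local cache>*<server>, typically set in the _NT_SYMBOL_PATH environment variable. The cache directory and the server share below are made-up names for illustration, and the helper function is hypothetical, so substitute your own values.

```python
def build_symbol_path(cache_dir: str, server: str) -> str:
    """Return a symbol path of the form srv*<cache>*<server>."""
    return f"srv*{cache_dir}*{server}"


# Hypothetical cache directory and server share; replace with your own.
symbol_path = build_symbol_path(r"C:\SymbolCache", r"\\buildserver\symbols")

# The resulting string is what you would assign to _NT_SYMBOL_PATH.
print(symbol_path)
```

The local cache sits in the middle so that a PDB file downloaded once does not have to be fetched from the server again.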

Rule 3 – A Source Server is a Symbol Server's best friend

A Symbol Server is not that useful without one extra step. That step is to run the Source Server tools across your public PDB files, which is called source indexing. The indexing embeds the version control commands needed to pull the exact source file used in that particular public build. Thus, when you are debugging that public build, you never have to worry about finding the source file for that build. If you are using TFS 2010, the Build server has the build tasks for Source Indexing and Symbol Server copying enabled as part of your build out of the box.

Now that you know these 3 rules, let's have a look at the PDB file itself. A .NET PDB contains only two pieces of information: the source file names with their line numbers, and the local variable names. All the other information is already in the .NET metadata, so there is no need to duplicate it in a PDB file.

When you load a module into the process address space, the debugger uses two pieces of information to find the matching PDB file. The first is obviously the name of the file. If you load ABC.DLL, the debugger looks for ABC.PDB. The extremely important part is how the debugger knows this is the exact matching PDB file for this binary. That's done through a GUID that's embedded in both the PDB file and the binary. If the GUID does not match, no debugging at source code level is possible.
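The matching described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: it looks for the CodeView "RSDS" record that binaries commonly carry in their debug data (a 4-byte "RSDS" signature, a 16-byte GUID, a 4-byte "age" counter that is stored next to the GUID, and a null-terminated PDB path), and it finds that record by simply scanning the bytes rather than walking the PE headers properly. The function names are made up for this example.

```python
import struct
import uuid
from pathlib import PureWindowsPath


def expected_pdb_name(binary_name: str) -> str:
    """ABC.DLL -> ABC.pdb: same base name, .pdb extension."""
    return str(PureWindowsPath(binary_name).with_suffix(".pdb"))


def read_codeview_record(data: bytes):
    """Return (guid, age, pdb_path) from the first RSDS record, or None.

    Assumption: a plain byte scan for b"RSDS" is good enough for a demo.
    A real tool would locate the record through the PE debug directory.
    """
    start = data.find(b"RSDS")
    if start < 0 or len(data) < start + 24:
        return None
    # The GUID is stored in the Windows (mixed-endian) layout.
    guid = uuid.UUID(bytes_le=data[start + 4:start + 20])
    (age,) = struct.unpack_from("<I", data, start + 20)
    # The PDB path follows as a null-terminated string.
    end = data.find(b"\x00", start + 24)
    if end < 0:
        end = len(data)
    pdb_path = data[start + 24:end].decode("utf-8", errors="replace")
    return guid, age, pdb_path


def is_matching_pdb(binary_guid: uuid.UUID, pdb_guid: uuid.UUID) -> bool:
    """The debugger only accepts a PDB whose GUID equals the binary's."""
    return binary_guid == pdb_guid
```

Given the raw bytes of a binary, read_codeview_record returns the GUID the debugger will insist on, which is handy when you want to check by hand whether a PDB file you found belongs to a given build.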

With the knowledge of how the debugger determines the correctly matching PDB file, the last question that remains is where the debugger looks for the PDB files. You can see this search order yourself by looking at the Symbol File column of the Visual Studio Modules window while debugging. The first place searched is the directory from which the binary was loaded. If the PDB file is not there, the second place the debugger looks is the hard-coded build directory embedded in the Debug Directories in the PE file. If the PDB file is not in the first two locations, and a Symbol Server is set up on the machine, the debugger looks in the Symbol Server cache directory. Finally, if the debugger does not find the PDB file in the Symbol Server cache directory, it looks in the Symbol Server itself.
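The search order above can be written down as a small Python sketch. It is a simplification, not how any debugger is actually implemented: the function name, the paths and the server share are invented for the example, and a real symbol store keeps each PDB file in a per-build subdirectory, which is left out here.

```python
from pathlib import PureWindowsPath


def pdb_search_order(binary_path, embedded_pdb_path, cache_dir=None, server=None):
    """List candidate PDB locations in the order the debugger tries them."""
    pdb_name = PureWindowsPath(binary_path).with_suffix(".pdb").name
    candidates = [
        # 1. The directory the binary was loaded from.
        str(PureWindowsPath(binary_path).with_name(pdb_name)),
        # 2. The build-time path embedded in the PE file.
        embedded_pdb_path,
    ]
    if cache_dir and server:
        # 3. The local Symbol Server cache directory.
        candidates.append(str(PureWindowsPath(cache_dir) / pdb_name))
        # 4. The Symbol Server itself.
        candidates.append(str(PureWindowsPath(server) / pdb_name))
    return candidates


# Hypothetical paths, for illustration only.
for location in pdb_search_order(
    r"C:\app\ABC.DLL",
    r"D:\build\obj\ABC.pdb",
    cache_dir=r"C:\SymbolCache",
    server=r"\\buildserver\symbols",
):
    print(location)
```

Without a Symbol Server configured, only the first two locations are tried, which is exactly why a build that was never published to a Symbol Server is so hard to debug on another machine.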

I hope this information helped you understand PDB files and that you will start using their full potential.
