Friday, June 26, 2009

Review: Regular Expressions Cookbook

Authors: Jan Goyvaerts and Steven Levithan
Format: Paperback, 510 pages
Publisher: O'Reilly Media, Inc.; 1st edition (June 4, 2009)
ISBN-10: 0596520689
ISBN-13: 978-0596520687
According to Wikipedia, "In computing, regular expressions provide a concise and flexible means for identifying strings of text of interest, such as particular characters, words, or patterns of characters". If you even dabble in open source, shell scripts, or writing code, you are likely at least somewhat aware of regular expressions. Some people are really good at using them, but they can be a struggle for others. This entry in O'Reilly's Cookbook series offers over 100 "recipes" that use regular expressions to solve common tasks. The question is, will this cookbook help you whip up the dish you need to serve?
I hate to cave in to the consensus of reviews on the web, but this book is hot! That said, there is a caveat. The Intended Audience section of the book's front matter states that the book is written for anyone who "regularly work(s) with text on a computer, whether that's searching through a pile of documents, manipulating text in a text editor, or developing software...". Even so, many of the people who edit articles, books, or web page source won't need or want the power regular expressions have to offer. Also, a large number of the book's recipes are written for programming languages or for web development, so to get the most out of this book, you'll need to be doing work in those areas.
I was a little surprised that the book doesn't require any prior experience with regular expressions at all. The first chapter gives a complete introduction to what regular expressions are, and chapter two teaches the basic skills. I doubt it will replace O'Reilly's Mastering Regular Expressions, but it might be a "side door" into regular expressions for someone who doesn't need to "master" them.
For those of you who are programmers (most likely the majority of people reading this review), the content that will interest you starts in Chapter 3. The languages covered in the recipes include .NET, Java, JavaScript, Perl, PHP, Python, and Ruby. Regardless of your level of expertise, there's bound to be a recipe or two in this book that will make your life easier.
A regular expressions cookbook doesn't require that you be able to write regular expressions "from scratch", any more than a food-related cookbook expects you to create a cheese soufflé from scratch. The "ingredients" are all listed, along with how they are to be mixed, the temperature of the oven, and how long to bake; all the little details should be included. Of course, you'll need to know the difference between a cup and a tablespoon, and how to use the tools typically found in a kitchen. The regular expressions cookbook is no different, except that the ingredients and process are designed around text rather than food. You also have to know what task you want to perform, and the cookbook must contain information on that task. If you want to make an omelet and your cookbook doesn't contain that recipe, you're out of luck. That's the limit of the Regular Expressions Cookbook: at least some of the recipes within its pages must apply to the tasks you need to perform.
A random example (I just flipped to a page) is as follows. In recipe 3.5, Test Whether a Match Can Be Found Within a Subject String, the problem is presented in summary and then the solutions are offered. In this case, solutions are presented for C#, VB.NET, Java, JavaScript, PHP, Perl, Python, and Ruby. The discussion section covers any additional information or special cases involved in each language, and in this recipe there is a discussion for every one of the aforementioned languages. The reader is also referred to Recipes 3.6 and 3.7 for more information.
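To give a sense of the kind of task recipe 3.5 addresses, here is a minimal Python sketch of testing whether a match can be found within a subject string. This is my own illustration, not the book's solution; the helper name `contains_match`, the pattern, and the sample strings are all hypothetical.

```python
import re


def contains_match(pattern: str, subject: str) -> bool:
    """Return True if the pattern matches anywhere in the subject string."""
    # re.search scans the whole string; re.match would only test the start.
    return re.search(pattern, subject) is not None


# Hypothetical pattern and subject strings, chosen only for illustration.
print(contains_match(r"\bregex\b", "a regex engine"))
print(contains_match(r"\bregex\b", "regular expressions"))
```

The first call finds the whole word "regex" and the second does not, since the word never appears in that subject string.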
I just described the general format of recipes in any of O'Reilly's Cookbook texts, and this is what you can expect when you use this specific book. If you want (and this is the way the common cookbook is used), just thumb through the table of contents or index to find the particular recipe you require and then have at it. Occasionally, a novice cook will consult the beginning sections of a cookbook to become familiar with food preparation basics. If that describes you relative to regular expressions, you'll find the early parts of the book quite handy.
Bottom line, I'd have to say that the Regular Expressions Cookbook is best used by someone who doesn't use regular expressions regularly (sorry), but whose work efficiency would be enhanced by using a regex engine. The recipes in this book are also correctly "biased" toward that audience, so you won't find that only one or two bits are useful while the rest are impractical or unrealistic for typical coding and text manipulation tasks. I'd have to say that authors Goyvaerts and Levithan and O'Reilly have hit a home run with this book.