Wednesday, March 17, 2010

Search Patterns: Design for Discovery

Authors: Peter Morville and Jeffery Callender
Format: Paperback, 192 pages
Publisher: O'Reilly Media; 1st edition (January 26, 2010)
ISBN-10: 0596802277
ISBN-13: 978-0596802271

Whether you think "search" is sexy or not, you probably can't live without it. In fact, according to the blurb on the book's back cover, "It (search) influences what we buy and where we go. It shapes how we learn and what we believe." That's a powerful statement, and probably more true than we realize (or wish). While most of us experience search as users, Morville and Callender provide a practical guide that allows you to build your own search applications...but how good a guide is it? I decided to find out (hence this review).

I'm a visual learner (who isn't?) and this book fairly panders to my needs and desires as a student. The high-quality glossy paper used in this book helps produce very slick and vivid graphics. The first page of the Preface even has a full-color cartoon strip featuring the two authors, though Peter seemed to lack some "dimension" as a toon. I guess that's the difference between a graphic designer (Jeff) and an information architect (Peter).

Humor and personality weren't limited to the visuals of this book. The authors managed to project their personalities into their writing along with the technical aspects of search, right from page one. Search is immediately presented as a tool that needs to talk to and interact with human beings and adapt to who we are, rather than our adapting to the "needs" of an application.

Not only is this a fun book to read, but it is really useful, particularly in communicating about both the conceptual and nuts-and-bolts aspects of search design. A great deal of information about "usability" is leveraged in the creation of this book since, without users, search is without a purpose. The whole idea of building search is building for people.

The approach is so effective that you may not notice when you move deeper into the technical aspects of the book. I got a distinct sense of being pulled along, page after page, as I was reading. I can't say that I completely absorbed every single detail as I progressed, but that's more an effect of my need to understand search design better than any fault of the authors.

For the beginner interested in learning how to build search, Search Patterns is an excellent introduction. Yet the book was also written for designers and information architects (but not so much for developers, as far as I can see) who need to learn more about not only the current state of search but also its future implementation.

The only criticism I can offer is that the book seems to be the proverbial "a mile wide but an inch deep". It is an introduction, but it won't tell you all you need to know about designing search. This book will get you started and enhance whatever knowledge you may already possess, but once your appetite is whetted, you'll want more.


Monday, March 15, 2010

Ubuntu 10.04: Waiting for the Lucid Lynx

This'll be short. I read a review of the current incarnation of Ubuntu 10.04, code-named "Lucid Lynx," at the In a Tux blog this morning. The author pointed out a number of flaws, great and small, with the Lynx but finished up the review by saying, "This version of Ubuntu 10.04 is not a stable or final release of Ubuntu, so some of these thing my change. Please do not judge them to soon" (and the spelling errors are the sole property of the In a Tux author).

Since the Daylight Saving Time change has "jet lagged" me into near-incomprehensibility (and that's hard to spell when you're really tired), I wasn't quite sure when the Lynx was to be released, so I decided to look up the release schedule at Ubuntu.

According to the release schedule, we've got a solid 6 weeks until the Final Release becomes available on April 29th. In fact, the Beta 1 release is still four days (March 18th) off as I write this. While some betas can function almost as well as the final product, you should expect a beta, and particularly an alpha (and the best the In a Tux author could have been working with is Alpha 3), to have a few, or more than a few, outstanding bugs.
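
The countdown is easy to double-check with a few lines of Python. This is only a minimal sketch: it assumes the date at the top of this post (March 15, 2010) as "today", and the helper name `countdown` is made up for illustration. Only the April 29th release date comes from the schedule itself.

```python
from datetime import date


def countdown(today: date, target: date) -> tuple:
    """Return (whole weeks, leftover days) from today until target."""
    days = (target - today).days
    return divmod(days, 7)


# Assumption: the date this post was published serves as "today".
posted = date(2010, 3, 15)
final_release = date(2010, 4, 29)  # Final Release date quoted in this post

weeks, days = countdown(posted, final_release)
print(f"{weeks} weeks and {days} days until the Final Release")
```

Counting from the post date, that works out to 45 days, or a little over six weeks. Swap in a different `posted` value to count from whatever day you happen to be reading this.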

I'm not being critical of reviewing pre-release software; in fact, it's a necessary part of the development process, particularly in the open source world where all contributions are important. Yet I agree that the Lynx shouldn't be judged too harshly while still in the womb, so to speak. You can look up all of the currently known Lucid Lynx bugs, so you'll know what bumps in the road to expect if you decide to sample the Lynx as it exists today.

I'm especially interested in this release of Ubuntu since I've been using the previous Ubuntu LTS, 8.04 Hardy Heron, and am looking forward to upgrading. For those of you in the same boat, if you want to keep current on the moment-by-moment (almost, anyway) changes to Lucid, you can sign up to receive email notifications.

Waiting for something can be difficult and, after all, in the world of technology, six weeks is almost an eternity. If patience is your virtue though, April 29th is right around the corner.


Friday, March 5, 2010

Converting a PDF to a Word Doc with KWord

I was presented with a challenge yesterday and fortunately, the challenge was cancelled. Let me explain why I say "fortunately". At my day job, my boss wanted me to convert a document produced in LaTeX to a Word document. I work with LaTeX in Kile, and that doesn't seem to be an available option. The native output of my little setup is PDF, but the PDF to Word doc conversion options didn't look promising either.

As I said, the immediacy of the challenge was cancelled and another solution was found, but the request could come up again and I thought it would be nice to find an answer now while I have a bit of time on my hands. Long story short, I haven't found a way to convert LaTeX to a Word doc format, but there is a way to open a PDF and save it as a Word doc, using KWord.

I did a fair amount of searching and finally discovered a blog article. It's older information...almost three years old, but I thought I'd see if the solution works, since it promises to be able to open a PDF, save it as an odt or doc, and preserve the formatting. This last part is important, because I really need tables in the PDF to still be tables in the doc.

I dutifully installed KWord on my Ubuntu machine and gave it a shot. While the latest incarnation of KWord does a more or less OK job of preserving format, it is far from perfect. Here are my examples. The first image is the sample PDF page I chose to work with. No, it doesn't have tables, but I'll get to that in a minute.

The second image is the same page opened in KWord. Not exactly a stunning likeness of the first image, but it is pretty good. That said, I tried it on the actual pages I had been asked to work with yesterday and the table formatting completely disappeared when imported into KWord.

I looked at the example of the process in the 2007 blog article vs. what I performed, and the steps and features seem identical. While it looks like KWord (as part of KOffice) is continuing to be developed and maintained, this particular feature doesn't appear to have changed much, if at all, in the past almost three years.

I guess I can't complain too much. This is the closest I've come to solving my little problem, but if converting a PDF to a Word doc is a task on someone's plate at KOffice, I humbly request that it get a little more attention. It would be a big help. Honest.

Afterword: I regularly use OpenOffice.org Writer to convert odt and doc files to PDF and it works just great. Too bad the abundant resources being fed into OOo development can't also be used to include reversing the process.

Thursday, March 4, 2010

When Did We Forget about Big Brother?

I remember a time when every criticism about Government surveillance invoked George Orwell's classic novel 1984 and the spectre of Big Brother. In Orwell's novel, Big Brother had sort of an appearance, but it was made deliberately vague and we couldn't be sure if he represented an actual character or was more a projection of "the Party". I don't hear much about Big Brother anymore, which is odd...but then again, maybe it's not so odd.

When's the last time you worried about your privacy? Sure, maybe you worry about it all the time, particularly when your very identity can not only be invaded, but stolen and used for all sorts of purposes, not the least of which is to buy just tons of stuff using your name and credit card number.

While we still register danger at the thought of identity theft for the purpose of fraud, when's the last time you considered just how many people and agencies have access to the most intimate details about your life? How many private and public databases contain your name, date of birth, social security number, and a raft of other sensitive information about you? I'm talking about those entities that you've authorized to possess such data, never mind Government security agencies and the like (as if the NSA really cares about what you say on your cell phone).

I was prompted to write this blog by the use of something with a rather benign name: Einstein technology. More specifically, I was prompted by today's article "Feds weigh expansion of Internet monitoring," in which Homeland Security Secretary Janet Napolitano assures the American public that the DHS's proposed plan to extend the Einstein technology, now monitoring the public areas of the Internet, into private networks won't constitute an invasion of privacy. Really?

The Einstein technology is designed to "detect and prevent electronic attacks, to networks operated by the private sector" and was created for use on federal communications networks. However, according to the CNET article, the latest version of Einstein can read email content, and AT&T has been asked to test its capacities on their system. In response to concerns about the proposed use of Einstein, Greg Schaffer, assistant secretary for cybersecurity and communications, warmed my heart by speaking thus:
"I don't think you have to be Big Brother in order to provide a level of protection either for federal government systems or otherwise," Schaffer said. "As a practical matter, you're looking at data that's relevant to malicious activity, and that's the data that you're focused on. It's not necessary to go into a space where someone will say you're acting like Big Brother. It can be done without crossing over into a space that's problematic from a privacy perspective."
Nice to know my "old friend" Big Brother has been let out of history's basement for a breath of fresh air.

Not that the bogeyman of Cyberterrorism is anything to sneeze at (and it shows up often enough in fiction, such as the recent film Live Free or Die Hard). I fully believe that security must be established and maintained along our electronic and cybercommunications frontiers as well as any of our physical borders, and that insufficient protections invite attack, but there's always a price to be paid.

In 1968, the first federal seat belt law for motor vehicles (except for buses) came into being. Few people would dispute that seat belts save lives and provide a measure of protection in car accidents, but the cost of that protection is the loss of a certain amount of freedom within the interior of the car. Just ask any parent who's tried to turn around at the wheel (while the car was stopped, of course) to yell at misbehaving children in the back seat. Potentially saving your life vs. some lack of mobility. Seems like a reasonable trade off.

The trade off for having more (but is it enough?) security when flying on an airplane is to have you and your personal property scanned and searched by federal officials. Less likelihood of a terrorist planting a bomb on your flight vs. having your body wanded and your luggage ransacked. Do you consider that a tough choice?

What is the trade off for protection against Cyberterrorism? What are the dangers and how much are we as citizens willing to surrender for protection from said dangers? Are we talking about defacing the IRS website, a DDoS attack against the INS database servers, or what? How imminent is the threat?

Turns out this is nothing new. According to an example cited at Wikipedia:
In 1999 hackers attacked NATO computers. The computers flooded them with email and hit them with a denial of service (DoS). The hackers were protesting against the NATO bombings in Kosovo. Businesses, public organizations and academic institutions were bombarded with highly politicized emails containing viruses from other European countries.
1999? Certainly Governments have gotten better at protecting themselves against such intrusions since then. Yes they have. Enter Einstein. Of course, we have to assume the tools to create such attacks have gotten better, too. Still, is all this worth the possibility of having your private or business communications potentially accessed?

You can't really say that's a personal choice. The private sector is being asked to cooperate and to allow Einstein in the door, so to speak. It's not like wearing a seat belt where you could say "screw the rules" and take the risk anyway. It's more like getting on a commercial flight where you don't have a choice. You will be scanned and potentially searched. Well, yes you do have a choice. You can choose to drive or take a train (do they still have trains?) if you don't want to put up with the intrusion, but travel will take longer and getting from San Francisco to Tokyo is kind of tough by car without the world's longest bridge being available.

In your personal life, you probably spew just a ton of personal information in social networking venues such as Facebook and Twitter, but we're not talking about Einstein peeping in your bedroom, at least not at this point. In your business life, you are likely required or at least expected to use email and other forms of electronic information transfer and data storage. As far as the company is concerned, when you use their computers, servers, and email, the information that moves across them belongs to the company. Now, at least to some degree, security for your company is not just the business of your company, it's the business of the Federal government, too.

I Googled "cyberterrorism" to try to get a handle on just how real this threat is, but it's a topic presenting too much data, much of it conflicting. The DHS's plan for Einstein is a specific example of how Governments tend to operate. As with programs such as Health Care or the Stimulus, plans are created and then enacted upon masses of people, some of whom don't mind and others of whom object, and yet all experience the same impact. It's like turning on the lights in a bedroom. Maybe one person wanted to read a book but the other person wanted to go to sleep. In a house, the reader can go to another room, but a nation is just one big room. In effect, so is the Internet, and so is Einstein's potential for peeking through your company's windows or mine.

Despite everything I've just written, I don't wind myself up so much that I can't sleep at night worrying about this stuff. One of the reasons I figure Big Brother isn't talked about much anymore is that we've all gotten used to the idea that we don't have a great deal of privacy anyway as individuals or corporate entities. As long as it doesn't have a visible impact on our day-to-day lives, most people don't care what information is gathered about them. At this point, it's being proposed that Einstein enter the private sector but not the private home. Business is being asked to cooperate with the DHS to ensure the greater good, and whatever information is gathered is to be squirrelled away behind the "national security" curtain for the country's protection. Is it worth the trade off?

Afterword: While I was writing this article, I was struck with the urge to look up an old textbook I used back in the late 1970s, The American Police State: The Government Against the People by David Wise. As I recall, it's about the abuses of the Nixon administration against corporate entities and private citizens in the cause of suppressing dissent against the administration's interests. One site summarizes the book in part:
This contribution to the spate of books to emerge out of Watergate was one of the better efforts. Two chapters concern the CIA -- one on domestic surveillance and the other on CIA involvement in Watergate. Additional chapters include the FBI and IRS and their role in suppressing domestic dissent, and the machinations of the CREEP plumbers, Kissinger, and black-bag jobs in general. His final chapter is an editorial against the official methods: "If we accept the values of the enemy as our own, we will become the enemy."
I wonder why I started thinking about Wise's book now?