Yes, I know there are a lot of software development "top n" lists out there, and many dealing with writing "-[i/a]ble" software (extensible, maintainable, saleable, etc.), but personally I wanted to get these concepts that have been rattling around in my brain for some time written down somewhere, and hopefully start a discussion around them. Even though this isn't a wiki, I'll probably end up treating this like a living document myself.
For a long time (in software years), I've been maintaining the code base for the JDBC driver for MySQL. I also work very closely with the support team on various connectivity-related issues (JDBC, ODBC, ADO.Net, P-this-or-that, etc.) that our customers have, and have had the (mis)fortune of debugging all kinds of problems in various stacks over the years.
Sometimes we (MySQL) make debugging more of a problem than it needs to be, and sometimes it's just inherent problems in the "stack". Given that I have a Java (or at least VM-based language) "bent", maybe not all of these concepts are workable at a technical level, but I'd like to at least see that on my team we adhere to as many of them as possible.
Keep as low a bug count as possible - this is common sense. Keeping the bug count low (through quality design, maintenance, frequent releases, etc.) makes sure that your users aren't running into stupid issues that waste their time (and yours, in diagnosing the issues and working with the user on issues that really just shouldn't happen).
Grow empathy, not contempt, for your users by being a user, rather than just acting like one (or relying on your QA team to act like one) - I'm a firm believer in eating one's own dogfood. I run MySQL "internally" (i.e. personally) for a lot of different things, and am always looking for more opportunities to use the software I work on for day-to-day tasks.
If you use your own software for day-to-day things (especially if it helps you run the photo gallery that the family all looks at, helps with monitoring your network, and keeps the e-mail flowing), you'll run into similar issues that your users will, and end up creating an empathy for your users that is much better than the contempt for them that I have felt and seen from developers of various products.
In my opinion, it's time to "throw in the towel" if you get to the point where as a developer you start to have contempt for your users and the issues they have with your software. If it weren't for the users, why would the software you work on exist in the first place?
Stay close to your users - blog (yes, I could be better about this!), answer questions on the forums, participate in IRC, etc. Make sure that you help out the "newbies" as well as the "old hands with tough problems", because together they run the gamut of user types your software will have, and will bring more than one viewpoint on how easy to use (and thus supportable) your software is.
Stay close to the sales and support teams (if you're working in a commercial software house) - I think you can do no wrong by staying close to the sales and support teams. Both teams have the information you need that you might not always get from the community, and you'll get a lot of real-world, grounded feedback about what's working and what's not from issues that existing and prospective customers are having. You'll also hear about the extra work your software is causing (or preventing) by how supportable it is through your interactions with these teams.
"Never ask a customer to re-crash the car" - this is lifted from Elliot Murphy's blog, but I wholeheartedly believe in what he writes - don't make users run a "special" build of your software to diagnose problems if you can at all avoid it. Either provide built-in diagnostic tools or make it possible to figure out what's going on (at some level) by the error information you provide, which leads me to....
Make error messages meaningful - any time your software encounters an error and interacts with the user to inform them of it, you have an opportunity to show, as a developer, how much you actually value the user's time.
In my opinion, an error message should give as much of the story to the user as that tenet you learned about in elementary school: "Who, What, Where, When, Why" (and perhaps "how" to resolve the issue).
Too often we just write up a string that barely describes the issue in our own terminology, which then just sends the user (or the person supporting them) into some search for what is really happening when that error message is encountered.
Luckily, the "who" is usually assumed, the "where" is covered by a stack trace (at least in VM-based languages), and the "when" is covered by the application log. That leaves coming up with a good "what", "why" and "how" to the developer.
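To make the "what", "why" and "how" harder to forget, one option is an exception type that refuses to be constructed without all three. The following is only a minimal sketch of that idea; `DiagnosticException` and its fields are hypothetical names made up for illustration, not something that exists in Connector/J or the JDK.

```java
import java.util.Objects;

/**
 * Hypothetical exception whose message always carries a "what", a "why"
 * and a "how". The "where" is still left to the stack trace and the
 * "when" to the application log.
 */
class DiagnosticException extends Exception {
    private final String what;
    private final String why;
    private final String how;

    DiagnosticException(String what, String why, String how, Throwable cause) {
        // Build the user-facing message once, from all three parts.
        super(format(what, why, how), cause);
        this.what = what;
        this.why = why;
        this.how = how;
    }

    /** Joins the three parts, rejecting any that are missing. */
    static String format(String what, String why, String how) {
        Objects.requireNonNull(what, "what");
        Objects.requireNonNull(why, "why");
        Objects.requireNonNull(how, "how");
        return what + "\n\n" + why + "\n\n" + how;
    }

    String what() { return what; }
    String why()  { return why; }
    String how()  { return how; }
}
```

A caller would then write something like `throw new DiagnosticException("Could not open the report file.", "This is usually caused by another program holding a lock on it.", "Close the other program and try again.", ioException)`, and a lazy "General error" no longer fits the constructor.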
One error message I'm proud of in Connector/J is the one that you get when the connection to the database is lost, or can't be made in the first place. The driver knows the various kinds of exceptions that can happen at connect-time (connection refused, bind refused, etc.) and how long it's been since the driver has (ever) communicated with the server. Based on this information it presents different kinds of error messages, and even points to solutions. For example, if there are no client-side ephemeral ports to bind to, the user gets an error message like this:
"The driver was unable to create a connection due to an inability to establish the client portion of a socket.
This is usually caused by a limit on the number of sockets imposed by the operating system.
This limit is usually configurable. For Unix-based platforms, see the manual page for the 'ulimit' command. Kernel or system configuration may also be required. For Windows-based platforms, see Microsoft Knowledge Base Article 196271 (Q196271)."
If the driver thinks that your connection has been idle for longer than the value "wait_timeout" on the server, you get an error message like this:
"The last communications with the server was nnnn seconds ago, which is longer than the server configured value of 'wait_timeout'.
You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem."
There are a few more types of messages that you get depending on the state of things, but the underlying theme is that the software scrapes up what it knows about the current situation, posits why the situation might be happening, and even tries to offer suggestions of what to do to prevent the error from happening.
Not to say that everything's perfect in Connector/J. I do have this very useful error message still lurking about:
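For the curious, the general shape of that logic can be sketched in a few lines of Java. This is not the actual Connector/J code: `ConnectionFailureAdvisor`, its parameters, and the abbreviated message wording are all assumptions made up for illustration. Only `java.net.BindException` and `java.net.ConnectException` are real JDK classes.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ConnectException;

/** Hypothetical helper that turns what the driver knows into a message. */
final class ConnectionFailureAdvisor {
    private ConnectionFailureAdvisor() {}

    /**
     * @param cause              the I/O failure seen on the socket
     * @param idleSeconds        seconds since the last communication with the server
     * @param waitTimeoutSeconds the server's configured 'wait_timeout'
     */
    static String describe(IOException cause, long idleSeconds, long waitTimeoutSeconds) {
        if (cause instanceof BindException) {
            // No client-side port could be bound: usually an OS socket limit.
            return "The driver was unable to create a connection due to an inability "
                 + "to establish the client portion of a socket. This is usually caused "
                 + "by a limit on the number of sockets imposed by the operating system. "
                 + "For Unix-based platforms, see the manual page for the 'ulimit' command.";
        }
        if (cause instanceof ConnectException) {
            // Illustrative wording only, not a real driver message.
            return "The server refused the connection. Check that the server is running "
                 + "and that the host and port in the connection URL are correct.";
        }
        if (idleSeconds > waitTimeoutSeconds) {
            // The server has most likely closed an idle connection.
            return "The last communications with the server was " + idleSeconds
                 + " seconds ago, which is longer than the server configured value of "
                 + "'wait_timeout'. Consider testing connection validity before use.";
        }
        // Fall back to whatever is known rather than a bare "General error".
        return "Communications with the server failed: " + cause;
    }
}
```

The order of the checks matters in this sketch: the exception type is the strongest evidence, so it is consulted first, and the idle-time guess is only used when nothing more specific is known.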
"General error"
The general work of collecting stupid error messages has already been done, but needless to say, I've got work to do fixing some of the few bonehead ones left in our wares.
Okay, so that's six points. I hope to grow this to at least ten, but I've got this blog entry off of my "todo" list for now...