2005-12-14

Yahoo, I just left but I may be back

I remember the days when Yahoo! was basically a public bookmark list by some clever fellows at Stanford. They came a long way quickly. Even when portals were in vogue and everyone's least favorite uncle was creating his own portal, I still managed to stay fairly faithful to Yahoo. That of course changed in recent years, most recently with the rise of Google and the plethora of social networking sites. I'll be honest and say that in my opinion Yahoo remained kind of bloated and stagnant from a user experience point of view most of that time; for a lot of it they were still on top in my book, but more in a "lesser of the evils" kind of way. They've been on the prowl this year. Acquiring Flickr and now del.icio.us almost seems like an act of desperation. Something like "we need to be cool fast, let's buy the cool stuff and claim it as our own." I guess that's common and fortunately they're nowhere near as atrocious about it as... some other big monopolistic companies.

My take is that it took some serious ingenuity and competition to get them off their arses. That's a good thing. Before, they were not happy with their place but still didn't truly experiment or encourage imagination; now they see that this is the key: making stuff that's cool and works well. So I finally moved my email over to their new beta email client and I have to say I'm impressed. It looks and works pretty much like what most email clients (e.g. Thunderbird) have evolved into, except through a web interface. I like it. Good for them. I hated having to page through my emails 26 at a time, with abysmal threading and search features. They even upped their storage quota to 2 GB (a direct response to Gmail), which was something I used to have to pay for. Actually I'm still annoyed that I have to pay for POP3 access, spam filtering, etc., things that are free on at least one other amazingly popular free email service. If this had come along, oh, about half a year or a year ago, I would have been even more pleased. Sadly, it's just a liiiiiittle bit too late. I really dig the Googlemail interface. Yes it has bugs and not all the features I want are there, but I consider the basic label+conversation+search paradigm the wave of the future and the way I want to handle my email from now on. Sadly, their interface is also just a web interface and not the way my normal machine clients work, so I'm kind of stuck doing things the old way with my work email and such. Ah well, maybe someday.

I'm glad to see Yahoo getting aggressive. Although I may not fully embrace their new direction, especially in the presence of some serious competition, it's still good for Joe Consumer.

2005-11-30

A Ruby in the rough

I've noticed in the past few months more articles in the nerd channels about Ruby (mostly with regard to Rails, which I don't use). I'm sure it's simply selective attention on my part because I really love Ruby, more fervently than I did Perl when I first started using it some 10 years ago.

Ruby is a scripting language and I've been using it nigh unto a year now. It is currently my favorite scripting language and has been practically from the first day I started playing with it. You can find plenty of advocacy, zealotry, demagoguery, and fanaticism in the usual places online; I'm sure even more if you can read and understand Japanese. Fortunately for me I did not see or read any of that online bantering about how "my scripting language can beat up yours" when I decided to play with it, only enough about it to give it a try.

Two things sold me on it immediately:
  • Nice data structures
  • The presence of "=~"
I'd been a Perl die-hard for years, was forced to use Python for work reasons, learned to really appreciate Python for what it was, and then cursed both for not having what I liked and needed to have in each. Perl makes for great quick-and-dirty guerrilla-warfare scripting but became clumsy for larger projects. Python was systematic, clean, and good for such larger projects, mostly due to its data structures, but it really dropped the ball when it chose not to add that little simple piece of syntactic sugar ("=~") to handle regular expressions. I deal with lots of text data but even if I didn't I do consider regular expression tools a basic necessity to life on the command line. And using regular expressions in Python just blows! There are other annoyances in each but those are currently of primary importance to me and my trade.
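To make the "=~" point concrete, here is a minimal sketch of the kind of text handling I mean. The log line and the field names are invented for illustration; only "=~" and the Perl-style $1, $2, $3 captures are standard Ruby.

```ruby
# A made-up log line, invented for this example.
line = "2005-11-30 ERROR disk full on /dev/hda1"

# "=~" returns the match position (or nil when nothing matches) and,
# as in Perl, fills in $1, $2, ... with the captured groups.
if line =~ /^(\d{4}-\d{2}-\d{2}) (\w+) (.*)$/
  date, level, message = $1, $2, $3
end

# In Python the same job goes through the re module and a match object,
# which is the extra ceremony I was grumbling about.
```

No imports, no match object to carry around: the test and the capture are one expression.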

Enter Ruby, which upon first encounter appears to have the best of both worlds. Oh joy! I got busy with it immediately. It also had an interactive mode like Python, something Perl really needed. With a bit more experience, I discovered and began to appreciate some of its other and even novel features:
  • Truly object oriented in presentation (even the number 1 acts like an object) which adds a kind of consistency.
  • Interesting use of code blocks (I didn't like lambdas in Python, does anyone?).
  • Ease with which to extend even basic classes and objects.
With more use I found myself rethinking my former ways of coding a task (e.g. I found myself considering even simple things more often in terms of iterators). There are also things I find truly elegant (e.g. using code blocks for initializing).
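A rough sketch of the features listed above, just to show their flavor. The variable names and the String#shout method are made up for this example; the rest is plain Ruby.

```ruby
# Everything is an object, even a literal number.
squares = []
3.times { |i| squares << i * i }            # squares is now [0, 1, 4]

# Thinking in iterators rather than index loops.
evens = (1..10).select { |n| n % 2 == 0 }   # [2, 4, 6, 8, 10]

# A block used for initializing: each slot is computed from its index.
table = Array.new(4) { |i| i * 10 }         # [0, 10, 20, 30]

# Extending a basic class. String#shout is a made-up method name.
class String
  def shout
    upcase + "!"
  end
end

greeting = "hello".shout                    # "HELLO!"
```

Reopening a core class like String is handy in small scripts, though I'd assume it needs more care in a larger shared code base.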

The unfortunate aspects:
  • It's still rather new (outside Japan) and so the community and support are rather smaller than Perl's and Python's.
  • The syntax and keyword naming is a mite unfortunate and takes some getting used to. But at least it doesn't strongly rely on syntactically significant white spaces like Python.
  • It has automatic variables. You don't have to use them, and I don't mind them, but it freaks some people out. It does mean you can run cool stuff on the command line like Perl though.
  • It's yet another scripting language, and there will be some inertia to overcome, especially in the staunch work environment.
  • I miss Python's use of named and initialized arguments. This is almost a deal-breaker. Grrr... nothing's perfect it seems. Fortunately, that might change in the future for Ruby. There are hacks to simulate it now, but they are just hacks and I don't like them.
  • At present there are only relatively few books on Ruby in English. Fortunately the first one out is a great one, on par with the Camel book in my opinion.
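For the named-argument complaint in the list above, this is a sketch of the usual hack as I understand it: pass a trailing hash and merge it over a hash of defaults. The connect method and its option names are hypothetical, made up for this example.

```ruby
# Hypothetical method: "options" stands in for named arguments,
# and "defaults" stands in for their initial values.
def connect(host, options = {})
  defaults = { :port => 80, :timeout => 30 }
  settings = defaults.merge(options)   # caller's values override the defaults
  "#{host}:#{settings[:port]} (timeout #{settings[:timeout]}s)"
end

# Only the option being changed has to be named at the call site.
summary = connect("example.com", :timeout => 5)
```

It works, but nothing checks the option names: a misspelled key is silently ignored, which is part of why it feels like a hack.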

Currently I'm the only one at my office who uses Ruby. It will probably stay that way due to the inertia problem. But I've decided to continue writing and infesting the company research code with it until someday they all succumb to my will and see the light.

Comment first, code second

Read this article on commenting. Kind of skimmed it actually. It did make me think about it though. I was going to post a comment on how I do it but I saw a comment from someone else that does it like I do.

Basically, I try to comment BEFORE I code. It solidifies what I'm about to code, its purpose and method. When implementing an algorithm it often helps to explain the steps in words, or even in pseudocode. I've seen people cite papers or web resources as well, which is acceptable. It also helps to note the limitations, what needs to happen before and after, the overall use of the code snippet and what-not. I find that doing this also helps me to code better. The steps are clearer, and potential problems and bugs tend to get avoided when you see a description of the algorithm laid out in layman's terms.
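Here is a small sketch of what that looks like in practice. The median function is just an example I picked for illustration; the comment block was written first and the code filled in underneath it.

```ruby
# Purpose: return the median of a list of numbers.
# Method:
#   1. Sort a copy of the list (the caller's list is left alone).
#   2. Odd number of items: the middle item is the median.
#   3. Even number of items: average the two middle items.
# Limitations: returns nil for an empty list; assumes every item is numeric.
def median(values)
  return nil if values.empty?
  sorted = values.sort
  mid = sorted.length / 2
  if sorted.length % 2 == 1
    sorted[mid]
  else
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end
```

Writing step 3 down before coding is what reminded me to divide by 2.0 rather than 2, so the even case doesn't get truncated by integer division.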

Like most people I often work with other people's code. I get frustrated a lot at the lack of comments or the poor descriptions. I've also done evil in the past, writing code myself that was uncommented; I try not to do that anymore. The bigger the system, the greater the need for good comments. I understand that in the throes of coding marathons such sensibilities often fall by the wayside. This is probably why there are auto-documentation systems out there. I'm convinced that there are people who think the output of these systems is actual documentation and that they don't need to write their own. In my experience, the generated docs do almost nothing to help me understand what I'm looking at. I would not miss them if they disappeared. Granted, sometimes it does help to automatically insert comments in places that need them (functions and such) as mere placeholders and to take stock of higher level stuff. But all in all, comments should be made as the code is written if at all possible.

2005-11-22

The infamous Python indentation

I don't care how good it looks, how clean it appears, how visually aesthetic to the reader it becomes... having the proper indentation as part of the Python syntax is just an annoyingly bad idea. I use emacs for most coding but occasionally use vi(m) for quick fixes, especially when I'm running from another terminal or someone else's machine. They are not always configured identically, so they produce different indentation lengths. It's just an annoyance but it occurs often enough to make me groan. I've read people defending this type of syntactically significant white space by claiming that all editors worth their salt can be configured to do the right thing and the clean listing is worth the "slight" inconvenience. I think it's wrong to dictate a certain level of editor in order to properly and easily edit code. I should be able to use whatever I want without having to configure it, even a DOS editor. But here's a short list of how it inconveniences my edits:
  • When developing, I often disable chunks of code with an "if (0)" or "if (false)". Because there's no end-of-block keyword (indentation marks it) I can't do this in Python without touching every line. I have to resort to highlighting the entire section and re-indenting it en masse. Or I have to highlight it all and comment it out. Again, this requires an editor with the correct macros, and it is still more work than the simple if(true|false) I can use with most other languages.
  • Python's lack of block end keywords makes it easy to unintentionally nest blocks. Ever have consecutive if-statements and the second if auto-indents under the first one instead of at the same level? This is a common gotcha when I'm editing and replacing a line of code.
  • In emacs I often use tab to auto-indent/clean-up code. Lack of block end keywords means the indentation is not unique. It also means that you cannot have a beautifier program.
  • It's dangerous to cut and paste code from one section to the other, or off of web pages or other programs because the indent may be different (and difficult to clean up) or it may be at the wrong level.
  • Using Python interactively from the command line becomes a pain too since you have to keep track of indents for every line.
I sympathise with the desire to produce consistent, pleasant, and hence more maintainable code. I think enforcing it this way created more annoyances than it meant to solve. A simple beautify program often does the trick when you want the script cleaned up after a lot of cut/paste/reorganization.
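For contrast, this is a sketch of the "if (false)" trick from the first bullet in a language with an explicit block end (Ruby here). The log array and its messages are made up for the example.

```ruby
log = []
log << "setup"

# Disable the chunk by wrapping it; the body keeps its original indentation
# because "end", not the indent level, closes the block.
if false   # flip to true to switch the chunk back on
log << "expensive debug dump"
log << "more debugging"
end

log << "done"
```

Two added lines, no re-indenting, and any editor can do it.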

2005-11-16

YAML

YAML once meant "Yet Another Markup Language" and now means "YAML Ain't Markup Language" but I would prefer it to mean "What do you get when you breed a Yak and a Camel". It's a data serialization format that (and this is the important part) is human readable text. It was brought to my attention a year ago by a coworker and I have since hacked it into our system to better handle log file parsing. I completely agree that it's the right idea. It has made my log file parsing much less stressful and more consistent. It just sucks that the online community for it is still... lacking. Some of the main YAML sites seem grossly out of date and nearly empty.

My recent frustration is dealing with incomplete implementations in the different scripting languages. I'm currently parsing Python generated YAML streams with Ruby. The two implementations seem to conform to different versions of the specification; most notable for me right now is how they each treat boolean values. My Python implementation wants to turn "+" and "-" into true/false (I have since hacked that out). Ruby wants to turn strings like "true/false", "yes/no", "Y/N", and even "on/off" into boolean true/false. This is causing all sorts of minor annoyances since I'm dealing with text that will have these as literal strings and not as code for boolean values. I in fact think that having the YAML spec interpret strings as booleans is just a bad idea, with perhaps some exceptions. Auto-interpretation of data is a touchy subject and you're bound to upset someone no matter which way you go. I'd prefer the "everything's a string" method, since that would at least guarantee consistency. But for now I just hack in fixes for my own purposes. I may submit a patch someday. I'd feel safer using the format if I had a sense of more active development, however.
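A minimal sketch of the boolean surprise, using the yaml library that ships with Ruby. The record and its keys are invented for illustration, and the exact set of words treated as booleans may well differ between versions and implementations, which is rather the point.

```ruby
require 'yaml'

# Made-up record: three values that look like plain text to a human reader.
text = <<EOS
unquoted: yes
switch: off
quoted: "yes"
EOS

record = YAML.load(text)

# The unquoted words come back as booleans, not strings;
# only the explicitly quoted value survives as text.
unquoted_is_string = record["unquoted"].is_a?(String)   # false
quoted_is_string   = record["quoted"].is_a?(String)     # true
```

So the workaround on the producing side is to quote anything that must stay a literal string, assuming you control the producer.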

Ruby seems to be the only safe haven for YAML at the moment. The Ruby folks seem to be trying to turn it into a Pickling or Marshalling replacement though. That's not really what I want out of YAML. It's just a really good way to display nested list and hash structures in a clean, pleasant, and consistent way. Anything more may make it too complicated to be portable and general purpose. I'd still rather use it than write my own though, since I don't really have the time.

2005-11-15

Simp[l]y Delicious

I like using the social bookmarking system. It allows me to easily access bookmarks from anywhere. More importantly it's a more powerful and flexible system than the bookmarks built into my browser because labels are more versatile than folders (Google knows this) and with most browsers you can only organize by folders. It also allows me to be more shameless about the number of things I can bookmark without becoming overburdened with organizing the links. It also lets me search through topics other people have marked with the same kinds of labels. There are other features related to this of course.

I started off with del.icio.us which is still the most popular. Then a number of other ones popped up. I started using simpy because it had some more features. Then I ended up coming back to del.icio.us, and for nearly the same reason people stuck with VHS over Betamax, Atari 2600 over Intellivision, and to some degree Microsoft over Apple. There are plenty of examples of sad cases where the superior bows to the popular. With del.icio.us however, it made more sense to favor the popular because it's a social web concept whereby the better service was the one with the most participants.

Why again?

I'm not a hardcore programmer and only a mild computing aficionado but it is part of my trade. As I sit at my desk I often stumble across or realize things that wrinkle my brow and are sometimes worth a quick note. I mean, as long as I'm not working I might as well look like I'm doing something by writing on things I use in relation to work. Right? So why not meta-tab over to my browser, log in, and make a quick blurb about it?