jk's blog

Python str.split, annoying design.

Even after a year of diddling, I’m still a Python newb, and things like str.split(None) are why.

Everyone knows split splits strings on a delimiter (or in the civilized world, a regex). str.split(None) does something different: it splits on runs of whitespace and ignores leading and trailing whitespace, so the result never contains empty strings. It’s a great feature, but why not call it str.split_whitespace?

'a b'.split(None) returns ['a', 'b'].

'a,,,b'.split(',') does not return ['a', 'b']. It returns ['a', '', '', 'b'].
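Here is a quick sketch of the two modes side by side, using only the built-in str.split. The variable names are mine, picked for illustration. The third case is the one that tends to bite: passing a literal space is not the same as passing None.

```python
# Whitespace mode: None (or no argument) collapses runs of whitespace
# and ignores whitespace at the edges.
whitespace_mode = "  a   b ".split(None)

# Separator mode: every occurrence of the separator is a cut,
# so empty strings show up between adjacent separators.
separator_mode = "a,,,b".split(",")

# A literal space is just another separator, so the empty strings come back.
space_mode = "  a   b ".split(" ")

print(whitespace_mode)  # ['a', 'b']
print(separator_mode)   # ['a', '', '', 'b']
print(space_mode)       # ['', '', 'a', '', '', 'b', '']
```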

This makes Python harder to learn, because the same method name behaves differently depending on its argument. Split is like an “irregular verb” that requires a little extra brain-space.

There’s another side benefit to distinguishing the algorithm with a unique name: more algorithms.

There’s already str.splitlines. There are obvious variations that could be added: str.split_csv, str.split_path.
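To make the idea concrete, here is a rough sketch of what those names could look like as plain functions. None of these exist on str; split_whitespace, split_csv and split_path are hypothetical names from this post, and I'm assuming the standard csv and pathlib modules are a reasonable way to back them (with POSIX-style paths for split_path).

```python
import csv
from pathlib import PurePosixPath


def split_whitespace(text):
    """Hypothetical name for what str.split(None) already does."""
    return text.split()


def split_csv(line):
    """Hypothetical: split one CSV line, honoring quoted fields."""
    return next(csv.reader([line]))


def split_path(path):
    """Hypothetical: split a POSIX-style path into its components."""
    return list(PurePosixPath(path).parts)


print(split_whitespace("  a   b "))    # ['a', 'b']
print(split_csv('a,"b,c",d'))          # ['a', 'b,c', 'd']
print(split_path("/usr/local/bin"))    # ['/', 'usr', 'local', 'bin']
```

Each name says which algorithm you get, so there is nothing to remember about magic arguments.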