29 January 2009

Bad Python

I've seen quite a lot of bad Python, even though Python makes the Path of Good Code easier to find than other languages, where spaghetti is the result unless the programmer brings extra discipline and years of dedicated study of the language. Such is the Tao of Perl. Much bad Python, however, comes from programmers who only know statically typed OO languages (Java/C++/C#) and have not yet grokked dynamic typing, first-class functions, pervasive use of iterators, properties, etc., leading to eyesores such as:
  • Accessors such as getDistance() and setDistance(), instead of using an attribute. In Python, attributes can be turned into properties later, preserving the class interface.
  • Asserting the type of every argument and returned value, taking up maybe 30% of the code itself and 80% of the unit test code. Such checking is usually pointless, because the interpreter itself will let you know if someduck didn't .quack() like a duck, and it makes the code less flexible.
  • Uber-private attributes (for no good reason), going so far as to use double-underscores on each side, which are supposed to be reserved for language features.
  • Dozens of customising parameters in constructors, such as reversed and strip and maxlen - when passing in a general transform function would be so much more elegant and could do so much more than just reverse the strings that the class works with.
  • Delegates where a first-class function will do.
  • Wrapping things in classes when dicts and tuples would be cleaner.
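To make the first bullet concrete, here is a minimal sketch of an attribute that later becomes a property. The class names JourneyV1 and JourneyV2 and the non-negative check are invented for illustration; the point is only that calling code stays the same, with no getDistance() or setDistance() in sight.

```python
class JourneyV1:
    """First version: a plain public attribute."""

    def __init__(self, distance):
        self.distance = distance


class JourneyV2:
    """Later version: the same attribute, now a validating property."""

    def __init__(self, distance):
        self.distance = distance  # goes through the setter below

    @property
    def distance(self):
        return self._distance

    @distance.setter
    def distance(self, value):
        # Hypothetical rule, added only to show why one might need a property.
        if value < 0:
            raise ValueError("distance must be non-negative")
        self._distance = value


# Calling code is identical for both versions.
for cls in (JourneyV1, JourneyV2):
    trip = cls(12.5)
    trip.distance = 20
    print(cls.__name__, trip.distance)
```

Callers written against JourneyV1 keep working against JourneyV2, which is why starting with accessors "just in case" buys nothing.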
Partly, the developers don't recognise the ways in which many Design Patterns become trivial in Python, to the extent that they are more like one-line idioms than chapter-worthy Patterns with a capital "P". Singleton Pattern? Write a module. Iterator? It's fundamental to the language. Need a Factory Pattern? Write a function, and thanks to dynamic typing you can substitute makedummywidget for makewidget during testing. Flyweight objects or Command Dispatch? Just use a dictionary. See also Python Patterns. (Correction: this previously referred to Abstract Factory, which is not a single factory but a group of related factories, e.g. for widgets from a given UI toolkit.)
Besides bad habits acquired from static OO, one can write bad code in any language with:
  • Vague and misleading identifiers (topic of future post)
  • Massive 'god' classes
  • Source files having no discernible structure
  • Awkward decompositions of function
    • Forces similar logic to be repeated in dozens of places
    • Prevents parts from being reused e.g. in unit tests
    • Prevents dependencies from being stubbed out e.g. in unit tests
    • Mixing logic with orthogonal aspects like error-handling and logging for a harder-to-maintain mess.
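Going back to the earlier remark that Command Dispatch needs no more than a dictionary, a minimal sketch might look like the following. The command names and handler functions are made up for illustration.

```python
def cmd_add(args):
    """Hypothetical command: sum the numeric arguments."""
    return sum(float(a) for a in args)


def cmd_echo(args):
    """Hypothetical command: join the arguments back together."""
    return " ".join(args)


# The whole "pattern": a dict mapping command names to first-class functions.
COMMANDS = {
    "add": cmd_add,
    "echo": cmd_echo,
}


def dispatch(line):
    """Look up the first word of the line and call its handler with the rest."""
    name, *args = line.split()
    try:
        handler = COMMANDS[name]
    except KeyError:
        raise ValueError(f"unknown command: {name}") from None
    return handler(args)


print(dispatch("add 1 2 3"))
print(dispatch("echo hello world"))
```

Adding a command means adding one function and one dictionary entry; there is no class hierarchy to extend.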
Oh well. I recently spent a day re-doing about 30% of the functionality of 8,600 lines of someone else's Java-style Python in 200 lines of real Python to support the greater flexibility I needed. Code can be that bad. (Later expanded to just under 300 lines thanks to feature creep.) From the comments:

25 comments:

Jonathan Ballet said...

It would be cool to have a document (such as "Patterns in Python") for the problems you highlight here.

Something like "Bad Python - Good Python".

james ronic said...

I agree, if you are not part of the solution then you are part of the problem. Let's not just whine about it; show some concrete examples for those of us who are learning Python after Java/C#/C++. Everyone knows that breaking bad habits is a very hard thing to do. But the first step is recognizing the bad habit.

Mathieu Pagé said...

I am one of those who write Python as C++. I recognize myself in the problems you exposed.

I agree with james and Jonathan, what we need is a document that shows the pythonic ways of doing things to programmers used to a more OO paradigm.

The honey monster said...

What the hell is static OO?

Do you mean statically typed?

If you are going to write a critical piece, you might want to start from a solid foundation. Or do you think it is perfectly acceptable to redefine the domain nomenclature?

TAO said...

Your cliches are getting old. "Look at us, we are so much cleaner than Perl". Surely, after the first trillion times, you are no longer processing this ad, but come on, get over it already. Find new ones, because if you can only define Python in relation to Perl aesthetics...
"Oh noes, bad code is not the TAO of Python, but of Perl, how come code written in our dear language is so increasingly full of it ?"
Ah, so there might be some hope for you to begin to understand that the language itself or the rules it has have nothing to do with your aesthetic perceptions and that bad programmers are ubiquitous, regardless. This, of course, if you can get past the ads you've been munging for the past 10 years or so. Or maybe not... :)

Anonymous said...

I've also seen people doing needless imports. Ex:

import threading
import threading.Thread
thread=threading.Thread(target=foo)

This seems like a holdover from Java, which (iirc) requires you to do an import statement separately for every class you want to import. This shows a fundamental misunderstanding of modules.
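For comparison, a minimal sketch of the single import that is needed. Since threading is a module rather than a package, the extra "import threading.Thread" line is not merely redundant: Python rejects it with an import error. The function foo and the results list are placeholders.

```python
import threading  # one import gives access to everything in the module

results = []


def foo():
    """Placeholder target for the thread."""
    results.append("ran")


thread = threading.Thread(target=foo)
thread.start()
thread.join()
print(results)

# Alternatively, bind just the class name:
#     from threading import Thread
#     thread = Thread(target=foo)
```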

Jason Baker said...

Philip Eby (the guy who wrote the WSGI spec) has a good article along these lines: http://dirtsimple.org/2004/12/python-is-not-java.html

Graham said...

@Jonathan Ballet: Indeed, I have not seen a comprehensive document about "writing good python" beyond the Python Style Guide (PEP 8 and PEP 257)

@james ronic: I didn't want to implicate anyone. Go to http://sourceforge.net/projects/febrl and judge for yourself. I replaced dataset.py and indexing.py due to their inflexibility - the compactness and style is secondary.

@Mathieu Pagé: I'll look into writing something on "Pitfalls to avoid when moving from statically typed languages to Python"

@The honey monster: Yes, by "static" I mean statically typed. I also mean that coding in C++ and Java especially feels "static" as in stodgy because you can't fling arbitrary objects and transformation functions around freely, instead having to write so much code to achieve so little.

@TAO: The problem is indeed bad programmers. I do however claim that other languages make it harder than necessary to write good code. It requires a lot more skill and discipline to write structured Perl or PHP than to write structured Python.

So, TAO, if you can write beautiful Perl (and I know people who do), hats off to you.

@Jason Baker: Thank you, I remember the "Python is Not Java" article now. Read it a long time ago. I especially agree with "XML is not the answer". To me, XML is for data interchange - you should not be introducing XML to do internal things.

shevegen said...

"So, TAO, if you can write beautiful Perl (and I know people who do), hats off to you."

I guess beauty lies in the eye of the beholder, or whoever created it. For example, I consider 80% of my ruby code beautiful. However, I only consider 20% of my very OLD ruby code beautiful. In other words, my own definition of beauty constantly changes.

It is not necessarily that I get better (but I do think I improve a little bit over time), it is mostly because some aspects change...

Like, in the past I thought very terse code to be hard to read. This is true - but as long as the code works, there is simply no reason against making it as short as possible. And when I realized this, I started to like terse short code (if it does not confuse me)

In general I found the shortest way to good code to be a very GOOD general rule.

What I have however noticed is that there are not many websites which focus on beauty and illustrate it with examples.

I guess patterns of beauty are not that easy to describe. :)

Anonymous said...

I think that Java is a pretty cool guy. eh writes good code and doesn't afraid of it.

Justin Akehurst said...

Author said:
I recently spent a day re-doing about 30% of the functionality of 8,600 lines of someone else's Java-style Python into 200 lines of real Python to support the greater flexibility I needed. Code can be that bad.

I would be interested if you could post the offending Java-style Python file, and step through how you refactored it into something Pythonic. Perhaps we can all learn some good ways to refactor the bad looking Python.

noid said...

A commenter on reddit shared a link to Code Like a Pythonista: Idiomatic Python by David Goodger.

Anonymous said...

This is just bad advice. If you expect a type, make it explicit, Python won't do it for you.

These are not bad habits, they are conservative habits. If you truly still think this is true then Python is a giant ghetto and I'm leaving right now.

Safety matters and don't pretend because your trivial webapp doesn't need it that other people don't need it.

Heikki Toivonen said...

Could you post some good examples of transform function usage?

When you say you can swap out a function at will, do you mean monkeypatching? That isn't really good advice. Use it when you must, but avoid it if you can. Using a factory class would enable you to cleanly override whatever you want. (There was a recent debate about this at San Francisco Bay Area Python User's group meeting and I think Alex Martelli advocated using classes for more maintainable and robust code. My memory is a bit hazy on this, though.)

Heikki Toivonen said...

@Anonymous: More often than not you don't care about type, you just want a compatible interface. In the rare cases when type does matter, then of course you need to explicitly test for it.

Graham said...

@Anonymous: I find I rarely need to explicitly check for a type in Python. Often I avoid it with a cast: float(x) means my function will work even if x is an integer. Otherwise much of my code is "generic", taking iterables of pairs of hashables, or streams of formatted text, and not caring about much else.

I mainly use type-checking to make hyperpolymorphic functions, as in "for x in fields: getfield(record, x)". x might be a getitem string or an instance attribute, or an integer index, or a function operating on the record, but my getfield function will still return whatever it is that x refers to. It's a little bit of Perlish DWIM (Do What I Mean), but it means transformation code will work with records that are tuples, dicts or namedtuples, and with advanced forms of field lookup.
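The actual getfield is not shown in this thread, so the following is only a guess at how such a function could be written, under the assumption that x is a callable, an integer index, or a string naming either a key or an attribute.

```python
from collections import namedtuple


def getfield(record, field):
    """Return whatever `field` refers to in `record` (hypothetical sketch)."""
    if callable(field):
        return field(record)          # a function operating on the record
    if isinstance(field, int):
        return record[field]          # positional index into a tuple or list
    if isinstance(field, str):
        try:
            return record[field]      # dict-style key lookup
        except (TypeError, KeyError, IndexError):
            return getattr(record, field)  # fall back to an attribute
    raise TypeError(f"unsupported field specifier: {field!r}")


Person = namedtuple("Person", "name age")
as_tuple = Person("Ada", 36)
as_dict = {"name": "Ada", "age": 36}

fields = [0, "age", lambda r: getfield(r, "name").upper()]
print([getfield(as_tuple, x) for x in fields])
print([getfield(as_dict, x) for x in fields[1:]])
```

This is one of the few places where checking types earns its keep: it widens what the function accepts rather than narrowing it.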

A function doesn't need to assert the type if it can accept both ducks and geese. The interpreter will throw an exception if someone passes it a goat, but that's just a bug in the code that should be caught by the unit tests.

@Heikki Toivonen: by transform function, I was thinking in particular of transforming database record fields prior to comparing or encoding them. Instead of having options to reverse or truncate the field, just accept a transform to do anything.
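A minimal sketch of that idea, assuming a hypothetical FieldComparator class (the name and its equal method are invented here): a single transform parameter replaces separate reversed, strip and maxlen options.

```python
class FieldComparator:
    """Compares record fields after applying a caller-supplied transform."""

    def __init__(self, transform=None):
        # Default to the identity function when no transform is given.
        self.transform = transform if transform is not None else (lambda s: s)

    def equal(self, a, b):
        return self.transform(a) == self.transform(b)


plain = FieldComparator()

# What reversed=True, strip=True, maxlen=4 would have done, as one function.
reverse_first4 = FieldComparator(lambda s: s.strip()[::-1][:4])

# Something the fixed options could never express without a new parameter.
case_blind = FieldComparator(str.lower)

print(plain.equal("Smith", "SMITH"))
print(case_blind.equal("Smith", "SMITH"))
print(reverse_first4.equal(" Johnson", "Ronson "))
```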

@Heikki: re factories, I mean you can pass a makewidget abstract factory function to the constructor normally, but pass makedummywidget for unit-testing. By "swap out" I mean dependency injection of the needed factory, not monkey-patching of the class.
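A minimal sketch of that kind of injection. Only makewidget and makedummywidget come from the discussion above; Widget, DummyWidget, Toolbar and the widget_factory parameter are hypothetical names.

```python
class Widget:
    """Stand-in for a real widget, e.g. one backed by a UI toolkit."""

    def __init__(self, label):
        self.label = label

    def render(self):
        return f"<widget {self.label}>"


class DummyWidget:
    """Test double: records calls instead of drawing anything."""

    def __init__(self, label):
        self.label = label
        self.render_calls = 0

    def render(self):
        self.render_calls += 1
        return "dummy"


def makewidget(label):
    return Widget(label)


def makedummywidget(label):
    return DummyWidget(label)


class Toolbar:
    def __init__(self, labels, widget_factory=makewidget):
        # The factory is an ordinary argument; nothing is monkey-patched.
        self.widgets = [widget_factory(label) for label in labels]

    def render(self):
        return " ".join(w.render() for w in self.widgets)


print(Toolbar(["open", "save"]).render())

test_bar = Toolbar(["open"], widget_factory=makedummywidget)
test_bar.render()
print(test_bar.widgets[0].render_calls)
```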

blankfrank said...

There's something like this for PHP. Check out:

http://badphp.com

Anonymous said...

Great post, yet another one that's whispering in my ear that I should become proficient (ie, more than a just-get-it-done capacity) in Python.

Can you recommend any good books that would take me in that direction? I'm looking for something that covers more than just the syntax and standard library, etc. I'd like to see these Pythonic idioms at work so I can see the problem set from a Python point of view.

At present, I too write Python from a static OO perspective, and that needs to change.

TIA.

Jason Baker said...

@Anonymous - Might I recommend the Python cookbook?

Alex Marandon said...

One coding anti-pattern that is particularly harmful in Python is writing very long functions or blocks. I've come across functions that were 800 lines long and conditional branches that were several hundred lines long. Because Python doesn't use printable characters to mark the end of blocks, this sort of coding style is even harder to read than in other languages.

That's why I think forced indentation and the lack of block termination marker are actually false good ideas. It doesn't prevent writing very bad code and it makes very bad code even harder to read. On top of that, now and again, it causes headaches when the code has been edited with an editor which was not properly configured and contains a mix of spaces and tabulations for indentation.

This is however a really minor caveat compared to all the other great features Python has to offer.

Paddy3118 said...

@Alex Marandon:
You seem to recognise that large functions are an anti-pattern, but then suggest Python change its indentation as block delimiter because you find it hard to follow in large functions.

Might you try not following the anti-pattern?

- Paddy.

Alex Marandon said...

@Paddy3118 Of course I try to keep my own functions and blocks short, but I also have to maintain other people's code.

Paddy3118 said...

@Alex,
That's great. I find it difficult to manage large functions in many languages - I've seen it in Basic, Pascal, assembler, C, VHDL, Verilog, Skill, and probably others. It doesn't help much seeing a lone end, or '}' on a page any more than a lone dedent in Python. I think it best to get programmers to split large functions where possible rather than single out Python's indentation for block delimiting as needing correction because of its handling of over-long functions.

- Paddy.

Amr Bekhit said...

Nice article - I am currently learning Python, coming from a heavy C/C++ and VB.NET background. While learning the language, I can see its elegance, but I am concerned that I will end up writing Python in a C/VB way.

Are there any resources that can help me avoid this?
