A controversial post about bad code that I observed on an unnamed Python project, in which I describe outdated idioms, Java-style code, and bad programming practices. [3 minutes]
Fun fact: an earlier version of this post (which also disparaged the programmers - not cool) made the Reddit front page after someone posted it to r/programming, granting me 15 minutes of internet fame. I take it that controversial opinions get more attention than bland facts.
The Kinds of Bad Python
I spent a day re-doing about 30% of the functionality of 8,600 lines of someone else's Java-style Python in 200 lines of cleaner Python that is also more flexible. I've observed a few major kinds of code I would call "Bad Python":
- "Bad Python" is often "Old Python": using only the conveniences available in Python before 2.4/2.5. In a fast-moving language, the old ways are often going to look bad.
- "Bad Python" is often "Java Python": Python written in Java idioms. It's poor form to write in one language using the idioms of another, forgoing the benefits of dynamic typing, first-class functions, native iterators, properties, and so forth.
- "Bad Python" is often "Bad Programming": many of the practices would be poor form in any programming language.
Specific Examples
Some specific examples of Bad Python eyesores:
- Accessor methods such as getDistance() and setDistance(), instead of using an attribute. In Python, attributes can be turned into properties later, preserving the class interface.
- Asserting the type of every argument and return value, taking up maybe 30% of the code itself and 80% of the unit test code. Such checking is usually pointless, because the interpreter itself will let you know if some duck didn't .quack() like a duck, and it makes the code less flexible.
- Using super-private attributes for everything, going so far as to use double underscores on each side, which are supposed to be reserved for language features.
- Dozens of customising parameters in constructors, such as reversed and strip and maxlen - when passing in a general transform function would be so much more elegant and could do so much more than just reverse the strings that the class works with.
- Using delegates where a first-class function will do.
- Wrapping things in classes that offer no more behavior than a dict or a tuple, while adding a whole lot of intermediate code. namedtuple could help there.
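Two of the points above, sketched in code. This is only an illustration: the Route and Point names are hypothetical and not taken from the project in question. It shows a plain attribute later turned into a property without changing how callers use the class, and a namedtuple standing in for a hand-written wrapper class.

```python
from collections import namedtuple


class Route:
    """Hypothetical class: no getDistance()/setDistance() pair needed."""

    def __init__(self, distance):
        self.distance = distance  # goes through the property setter below

    @property
    def distance(self):
        return self._distance

    @distance.setter
    def distance(self, value):
        # Validation added later; callers still write route.distance = ...
        if value < 0:
            raise ValueError("distance must be non-negative")
        self._distance = value


# A record with named fields, instead of a class that only holds data.
Point = namedtuple("Point", ["x", "y"])

route = Route(12.5)
route.distance = 20       # plain attribute syntax, still validated
origin = Point(x=0, y=0)  # behaves like a tuple, with readable field names
```

If Route had started life with a bare self.distance attribute, adding the property afterwards would not break any calling code, which is why the accessor methods buy nothing up front.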
Java Patterns in Python
Some Bad Python comes from using certain Design Patterns that in Python can be expressed as one-line idioms, not worth writing a chapter about. Singleton Pattern? Write a module. Iterator? It's fundamental to the language. Factory Pattern? Write a function make_foo, and substitute it with make_dummy_foo in tests. Flyweight Objects and Command Dispatch? Use a dictionary. A good resource is Python Patterns, for patterns specific to the language. I think it's too common to assume that the Gang-of-Four Design Patterns apply to every language, not just Java/C++/C#.
Bad Programming
One can of course write bad code in any language, Python included, by doing:
- Vague and misleading identifiers (topic of future post)
- Massive 'god' classes
- Source files having no discernable structure
- Awkward decompositions of function, which:
  - force similar logic to be repeated in dozens of places
  - prevent parts from being reused, e.g. in unit tests
  - prevent dependencies from being stubbed out, e.g. in unit tests
- Mixing logic with orthogonal aspects like error-handling and logging for a harder-to-maintain mess.
Further Reading
- Python is not Java (a related rant about using Java idioms in Python)
- Idiomatic Python (if you want examples of idioms in Python)
- Python Cookbook (preview online at Safari Books)
- How not to write FORTRAN in any language especially on readability and how a language can help and hinder good design.
P.S. you might also like my opinions on code generation versus metaclasses, or see how to show the current branch in your Bash prompt when coding.
This could be cool to have a document (such as "Patterns in Python") for those problems you highlight here.
Something like "Bad Python - Good Python".
I agree, if you are not part of the solution then you are part of the problem. Let's not just whine about it; show some concrete examples for those of us who are learning Python after Java/C#/C++. Everyone knows that breaking bad habits is a very hard thing to do. But the first step is recognizing the bad habit.
I am one of those who write Python as if it were C++. I recognize myself in the problems you exposed.
I agree with james and Jonathan: what we need is a document that shows the Pythonic ways of doing things to programmers used to a more OO paradigm.
I've also seen people doing needless imports. Ex:
import threading
import threading.Thread
thread=threading.Thread(target=foo)
This seems like a holdover from Java, which (iirc) requires you to write a separate import statement for every class you want to import. This shows a fundamental misunderstanding of modules.
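A minimal sketch of the single-import version, with a hypothetical foo standing in for the commenter's worker function. Note that `import threading.Thread` as written would actually raise an ImportError, because threading is a module rather than a package; importing the module once is all that is needed.

```python
import threading


def foo():
    """Hypothetical worker, standing in for the foo in the comment above."""
    print("running in a thread")


# One import of the module is enough; Thread is reached as an attribute.
thread = threading.Thread(target=foo)
thread.start()
thread.join()
```

Writing `from threading import Thread` is the other common spelling, if the shorter name is preferred.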
Philip Eby (the guy who wrote the WSGI spec) has a good article along these lines: http://dirtsimple.org/2004/12/python-is-not-java.html
@Jonathan Ballet: Indeed, I have not seen a comprehensive document about "writing good python" beyond the Python Style Guide (PEP 8 and PEP 257).
@james ronic: I didn't want to implicate anyone. Go to http://sourceforge.net/projects/febrl and judge for yourself. I replaced dataset.py and indexing.py due to their inflexibility - the compactness and style is secondary.
@Mathiew Page: I'll look into writing something on "Pitfalls to avoid when moving from statically typed languages to Python"
@honey monstor: Yes, by "static" I mean statically typed. I also mean that coding in C++ and Java especially feels "static" as in stodgy because you can't fling arbitrary objects and transformation functions around freely, instead having to write so much code to achieve so little.
@TAO: The problem is indeed bad programmers. I do however claim that other languages make it harder than necessary to write good code. It requires a lot more skill and discipline to write structured Perl or PHP than to write structured Python.
So, TAO, if you can write beautiful Perl (and I know people who do), hats off to you.
@Jason Baker: Thank you, I remember the "Python is Not Java" article now. Read it a long time ago. I especially agree with "XML is not the answer". To me, XML is for data interchange - you should not be introducing XML to do internal things.
"So, TAO, if you can write beautiful Perl (and I know people who do), hats off to you."
I guess beauty lies in the eye of the beholder, or whoever created it. For example, I consider 80% of my ruby code beautiful. However, I only consider 20% of my very OLD ruby code beautiful. In other words, my own definition of beauty constantly changes.
It is not necessarily that I get better (but I do think I improve a little bit over time), it is mostly because some aspects change...
Like, in the past I thought very terse code to be hard to read. This is true - but as long as the code works, there is simply no reason against making it as short as possible. And when I realized this, I started to like terse short code (if it does not confuse me)
In general I have found "the shortest way to good code" to be a very GOOD general rule.
What I have however noticed is that there are not many websites which focus on beauty and illustrate it with examples.
I guess patterns of beauty are not that easy to describe. :)
A commenter on reddit shared a link to Code Like a Pythonista: Idiomatic Python by David Goodger.
Could you post some good examples of transform function usage?
When you say you can swap out a function at will, do you mean monkeypatching? That isn't really good advice. Use it when you must, but avoid it if you can. Using a factory class would enable you to cleanly override whatever you want. (There was a recent debate about this at San Francisco Bay Area Python User's group meeting and I think Alex Martelli advocated using classes for more maintainable and robust code. My memory is a bit hazy on this, though.)
@Anonymous: More often than not you don't care about type, you just want compatible interface. In the rare cases when type does matter, then of course you need to explicitly test for it.
@Anonymous: I find I rarely need to explicitly check for a type in Python. Often I avoid it with a cast: float(x) means my function will work even if x is an integer. Otherwise much of my code is "generic", taking iterables of pairs of hashables, or streams of formatted text, and not caring about much else.
I mainly use type-checking to make hyperpolymorphic functions. "for x in fields: getfield(record, x)". x might be a getitem string or instance attribute, or an integer index, or a function operating on the record: but my getfield function will still return whatever it is that x refers to. It's a little bit of Perlish DWIM (Do What I Mean), but means transformation code will work with records that are tuples, dicts or namedtuples, and with advanced forms of field lookup.
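A rough sketch of what such a getfield could look like. The original implementation isn't shown in this thread, so the lookup order and the Person record below are assumptions made for illustration only.

```python
from collections import namedtuple


def getfield(record, field):
    """Return whatever `field` refers to in `record` (a DWIM-style lookup)."""
    if callable(field):
        return field(record)               # function operating on the record
    if isinstance(field, int):
        return record[field]               # positional index (tuple, list)
    if isinstance(field, str):
        try:
            return record[field]           # getitem string, e.g. a dict key
        except (TypeError, KeyError):
            return getattr(record, field)  # instance attribute, e.g. namedtuple
    raise TypeError(f"unsupported field specifier: {field!r}")


# Hypothetical record type and field specifiers, for illustration only.
Person = namedtuple("Person", ["name", "age"])
fields = ["name", 1, lambda rec: getfield(rec, "name").upper()]

from_tuple = [getfield(Person("ann", 30), x) for x in fields]
from_dict = [getfield({"name": "bob", "age": 41}, x) for x in ("name", "age")]
```

The type checks here are about the field specifier, not the record: the record can be anything that supports indexing or attribute access.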
A function doesn't need to assert the type if it can accept both ducks and geese. The interpreter will throw an exception if someone passes it a goat, but that's just a bug in the code that should be caught by the unit tests.
@Heikki Toivonen: by transform function, I was thinking in particular of transforming database record fields prior to comparing or encoding them. Instead of having options to reverse or truncate the field, just accept a transform to do anything.
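Something along these lines, as a minimal sketch. The FieldComparator class and its field names are hypothetical, invented for the example rather than taken from the code I replaced.

```python
class FieldComparator:
    """Hypothetical class comparing one field of two records.

    Instead of options such as reversed, strip and maxlen, it accepts a
    single transform callable applied to each value before comparison.
    """

    def __init__(self, field, transform=None):
        self.field = field
        # Default to the identity function when no transform is given.
        self.transform = transform if transform is not None else (lambda v: v)

    def matches(self, record_a, record_b):
        value_a = self.transform(record_a[self.field])
        value_b = self.transform(record_b[self.field])
        return value_a == value_b


# The old options become ordinary functions that callers can compose.
by_surname = FieldComparator("surname", transform=lambda s: s.strip().lower()[:4])
reversed_code = FieldComparator("code", transform=lambda s: s[::-1])

same = by_surname.matches({"surname": " Smithers "}, {"surname": "smith"})
```

The class no longer needs to know about truncation, case folding or reversal; anything that maps a field value to a comparable value will do.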
@Heikki: re factories, I mean you can pass a makewidget abstract factory function to the constructor normally, but pass makedummywidget for unit-testing. By "swap out" I mean dependency injection of the needed factory, not monkey-patching of the class.
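To make that concrete, here is a small sketch. Only the makewidget and makedummywidget names come from the paragraph above; Widget, DummyWidget and Dashboard are hypothetical stand-ins.

```python
class Widget:
    """Hypothetical real dependency (imagine it talks to a database)."""

    def render(self):
        return "real widget"


class DummyWidget:
    """Stand-in for unit tests: same interface, no side effects."""

    def render(self):
        return "dummy widget"


def makewidget():
    return Widget()


def makedummywidget():
    return DummyWidget()


class Dashboard:
    """Hypothetical consumer that receives its factory as an argument."""

    def __init__(self, widget_factory=makewidget):
        # The factory is injected here; nothing is monkey-patched.
        self._make_widget = widget_factory

    def show(self):
        return self._make_widget().render()


production = Dashboard()
under_test = Dashboard(widget_factory=makedummywidget)
```

The class definition is never modified at runtime; the test simply passes a different function to the constructor.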
There's something like this for PHP. Check out:
http://badphp.com
Great post, yet another one that's whispering in my ear that I should become proficient (ie, more than a just-get-it-done capacity) in Python.
Can you recommend any good books that would take me in that direction? I'm looking for something that covers more than just the syntax and standard library, etc. I'd like to see these Pythonic idioms at work so I can see the problem set from a Python point of view.
At present, I too write Python from a static OO perspective, and that needs to change.
TIA.
@Anonymous - Might I recommend the Python cookbook?
One coding anti-pattern that is particularly harmful in Python is writing very long functions or blocks. I've come across functions that were 800 lines long and conditional branches that were several hundred lines long. Because Python doesn't use printable characters to mark the end of blocks, this sort of coding style is even harder to read than in other languages.
That's why I think forced indentation and the lack of a block termination marker are actually false good ideas. They don't prevent writing very bad code, and they make very bad code even harder to read. On top of that, now and again, they cause headaches when the code has been edited with an editor which was not properly configured and contains a mix of spaces and tabs for indentation.
This is however a really minor caveat compared to all the other great features Python has to offer.
@Alex Marandon:
You seem to recognise that large functions are an anti-pattern, but then suggest Python change its use of indentation as block delimiter because you find it hard to follow in large functions.
Might you try not following the anti-pattern?
- Paddy.
@Paddy3118 Of course I try to keep my own functions and blocks short, but I also have to maintain other people's code.
@Alex,
That's great. I find it difficult to manage large functions in many languages - I've seen it in Basic, Pascal, assembler, C, VHDL, Verilog, Skill, and probably others. It doesn't help much seeing a lone end, or '}' on a page any more than a lone dedent in Python. I think it best to get programmers to split large functions where possible rather than single out Python's indentation for block delimiting as needing correction because of its handling of over-long functions.
- Paddy.
Nice article - I am currently learning Python, coming from a heavy C/C++ and VB.NET background. While learning the language, I can see its elegance, but I am concerned that I will end up writing Python in a C/VB way.
Are there any resources that can help me avoid this?