Programming Style
Provide sound justifications for basic rules of coding style. Refactor one-page programs to make them more readable and justify the changes. Use Python community coding standards (PEP-8).
This page is adapted from the Software Carpentry Python Gapminder lesson, Copyright (c) The Carpentries. The original material is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Changes made: Content has been modified and expanded by Innovations for Poverty Action (IPA) to include IPA-specific examples and context relevant to research data analysis.
Original citation: Achterberg, et al. (2024). swcarpentry/python-novice-gapminder: Software Carpentry: Plotting and Programming in Python (v2024.6.27.1). Zenodo. https://doi.org/10.5281/zenodo.12167686
Coding style
A consistent coding style helps others (including our future selves) read and understand code more easily. Code is read much more often than it is written, and as the Zen of Python states, “Readability counts”. The Python community proposed a standard style in one of its earliest Python Enhancement Proposals (PEPs), PEP8.
Some points worth highlighting:
- document your code and ensure that assumptions, internal algorithms, expected inputs, expected outputs, etc., are clear
- use clear, semantically meaningful variable names
- use spaces, not tabs, to indent lines; PEP8 recommends four spaces per indentation level (tabs can cause problems across different text editors, operating systems, and version control systems)
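As a small illustration of the first two points, the sketch below documents its assumptions in a docstring and uses names that say what the values mean. The function name, the parameter names, and the numbers are hypothetical and chosen only for this example.

```python
def response_rate(completed_surveys, attempted_surveys):
    """Return the share of attempted surveys that were completed.

    Assumes both arguments are non-negative counts and that
    attempted_surveys is greater than zero.
    """
    return completed_surveys / attempted_surveys


# A terse version such as `def f(a, b): return a / b` computes the
# same number but hides what the inputs mean and what is assumed.
rate = response_rate(completed_surveys=180, attempted_surveys=200)
print(rate)
```

Calling the function with keyword arguments, as above, is optional, but it can make the call site read almost like a sentence.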
Follow standard Python style in your code
- PEP8: a style guide for Python that discusses topics such as how to name variables, how to indent your code, how to structure your import statements, etc. Adhering to PEP8 makes it easier for other Python developers to read and understand your code, and to understand what their contributions should look like.
- To check your code for compliance with PEP8, you can use modern tools like Ruff, which combines linting and formatting in a single fast tool. Ruff can automatically format your code to conform to PEP8 and can fix many common issues automatically. It works with both regular Python files and Jupyter notebooks.
- Some groups and organizations follow different style guidelines besides PEP8. For example, the Google style guide on Python makes slightly different recommendations. Modern formatters like Ruff can be configured to follow different style preferences through their configuration files.
- For IPA projects, we use Ruff for both linting and formatting. See the Python guide for setup instructions and configuration details. You can run ‘ruff check’ to find issues and ‘ruff format’ to automatically format your code.
- With respect to coding style, the key is consistency. Choose a style for your project, be it PEP8, the Google style, or something else, and do your best to ensure that you and anyone else you are collaborating with stick to it. Consistency within a project is often more impactful than the particular style used. A consistent style will make your software easier to read and understand for others and for your future self.
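To make the layout conventions concrete, here is a minimal sketch of a short script written in PEP8 style: imports at the top of the file, a module-level constant in capitals, snake_case names, two blank lines around the function definition, and spaces around operators. The constant, the function, and the household data are hypothetical and serve only as an illustration.

```python
import statistics

# Module-level constants are written in capitals.
MAX_HOUSEHOLD_SIZE = 20


def mean_household_size(household_sizes):
    """Return the mean household size, ignoring implausible values."""
    plausible = [n for n in household_sizes if 0 < n <= MAX_HOUSEHOLD_SIZE]
    return statistics.mean(plausible)


sizes = [4, 6, 3, 250, 5]  # 250 is an implausible entry and is skipped
print(mean_household_size(sizes))
```

A formatter handles most of the spacing and blank-line rules automatically; naming and structure remain the author's responsibility.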
Use assertions to check for internal errors
Assertions are a simple but powerful method for making sure that the context in which your code is executing is as you expect.

def calc_bulk_density(mass, volume):
    '''Return dry bulk density = powder mass / powder volume.'''
    assert volume > 0
    return mass / volume

If the assertion is False, the Python interpreter raises an AssertionError runtime exception. The source code for the expression that failed will be displayed as part of the error message. To ignore assertions in your code, run the interpreter with the ‘-O’ (optimize) switch. Assertions should contain only simple checks and never change the state of the program. For example, an assertion should never contain an assignment.
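The behaviour described above can be seen by calling the function with an invalid volume. This is a minimal sketch: the optional message after the comma is standard assert syntax, but the wording of the message and the sample values are made up for illustration.

```python
def calc_bulk_density(mass, volume):
    '''Return dry bulk density = powder mass / powder volume.'''
    # The text after the comma becomes the AssertionError message.
    assert volume > 0, 'volume must be positive'
    return mass / volume


print(calc_bulk_density(10.0, 4.0))

try:
    calc_bulk_density(10.0, 0.0)
except AssertionError as error:
    caught_message = str(error)
    print('AssertionError:', caught_message)
```

When the interpreter is run with the optimize switch, the assert line is skipped, so the second call would instead fail with a ZeroDivisionError. This is one reason assertions suit internal sanity checks better than validation of user input.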
Use docstrings to provide builtin help
If the first thing in a function is a character string that is not assigned directly to a variable, Python attaches it to the function, accessible via the builtin help function. This string that provides documentation is also known as a docstring.

def average(values):
    "Return average of values, or None if no values are supplied."
    if len(values) == 0:
        return None
    return sum(values) / len(values)

help(average)

Help on function average in module __main__:

average(values)
    Return average of values, or None if no values are supplied.

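The help function is not the only way to reach a docstring. As a brief sketch, the same text is stored on the function's __doc__ attribute, and the standard-library function inspect.getdoc returns it with any leading indentation cleaned up.

```python
import inspect


def average(values):
    "Return average of values, or None if no values are supplied."
    if len(values) == 0:
        return None
    return sum(values) / len(values)


# The docstring as stored on the function object.
print(average.__doc__)

# The same text retrieved through the inspect module.
print(inspect.getdoc(average))
```

Both calls print the single line of documentation, which is useful when a script needs the text itself and not the formatted help page.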
Multiline Strings
Multiline strings are often used for documentation. They start with three quote characters (either single or double) and end with three matching characters.

"""This string spans
   multiple lines.

Blank lines are allowed."""

Highlight the lines in the code below that will be available as online help. Are there lines that should be made available, but won’t be? Will any lines produce a syntax error or a runtime error?

"Find maximum edit distance between multiple sequences."
# This finds the maximum distance between all sequences.

def overall_max(sequences):
    '''Determine overall maximum edit distance.'''

    highest = 0
    for left in sequences:
        for right in sequences:
            '''Avoid checking sequence against itself.'''
            if left != right:
                this = edit_distance(left, right)
                highest = max(highest, this)

    # Report.
    return highest

Use comments to describe and help others understand potentially unintuitive sections or individual lines of code. They are especially useful to whoever may need to understand and edit your code in the future, including yourself.
Use docstrings to document the acceptable inputs and expected outputs of a method or class, its purpose, assumptions and intended behavior. Docstrings are displayed when a user invokes the builtin help method on your method or class.
Turn the comment in the following function into a docstring and check that help displays it properly.

def middle(a, b, c):
    # Return the middle value of three.
    # Assumes the values can actually be compared.
    values = [a, b, c]
    values.sort()
    return values[1]

def middle(a, b, c):
    '''Return the middle value of three.
    Assumes the values can actually be compared.'''
    values = [a, b, c]
    values.sort()
    return values[1]

- Read this short program and try to predict what it does.
- Run it: how accurate was your prediction?
- Refactor the program to make it more readable. Remember to run it after each change to ensure its behavior hasn’t changed.
- Compare your rewrite with your neighbor’s. What did you do the same? What did you do differently, and why?

n = 10
s = 'et cetera'
print(s)
i = 0
while i < n:
    # print('at', j)
    new = ''
    for j in range(len(s)):
        left = j-1
        right = (j+1)%len(s)
        if s[left]==s[right]: new = new + '-'
        else: new = new + '*'
    s=''.join(new)
    print(s)
    i += 1

Here’s one solution.

def string_machine(input_string, iterations):
    """
    Takes input_string and generates a new string with -'s and *'s
    corresponding to characters that have identical adjacent characters
    or not, respectively. Iterates through this procedure with the resultant
    strings for the supplied number of iterations.
    """
    print(input_string)
    input_string_length = len(input_string)
    old = input_string
    for i in range(iterations):
        new = ''
        # iterate through characters in previous string
        for j in range(input_string_length):
            left = j-1
            # ensure right index wraps around
            right = (j+1) % input_string_length
            if old[left] == old[right]:
                new = new + '-'
            else:
                new = new + '*'
        print(new)
        # store new string as old
        old = new

string_machine('et cetera', 10)

et cetera
*****-***
----*-*--
---*---*-
--*-*-*-*
**-------
***-----*
--**---**
*****-***
----*-*--
---*---*-
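
One further refactoring worth considering, offered here as a suggestion and not as part of the original solution, is to separate the computation from the printing. In the sketch below, the helper name next_pattern is hypothetical. Returning the new string instead of printing it makes each step easy to check with an assertion.

```python
def next_pattern(text):
    """Return a string marking each position of text with '-' or '*'.

    A position gets '-' if the characters on either side of it are
    identical (wrapping around at the ends) and '*' otherwise.
    """
    length = len(text)
    marks = []
    for j in range(length):
        left = text[j - 1]              # index -1 wraps to the last character
        right = text[(j + 1) % length]  # wrap past the end to the start
        marks.append('-' if left == right else '*')
    return ''.join(marks)


def string_machine(input_string, iterations):
    """Print input_string, then each of the next `iterations` patterns."""
    print(input_string)
    current = input_string
    for _ in range(iterations):
        current = next_pattern(current)
        print(current)


# The first two steps match the output shown above.
assert next_pattern('et cetera') == '*****-***'
assert next_pattern('*****-***') == '----*-*--'
```

Calling string_machine('et cetera', 10) with this version prints the same eleven lines as the solution above, while next_pattern can be tested on its own without capturing printed output.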
- Follow standard Python style in your code.
- Use docstrings to provide builtin help.