I first started using Python some 4 years ago, and soon after I fell in love with it. I understood Python easily enough but, more importantly, it understood me. However, the way I write scripts has changed drastically since then, and for the better. This article summarizes the lessons I learned for making whatever I write in Python more efficient and more readable. You might find it helpful too.
1. Use comments
You never know when a script will need to be revisited. Well-commented code makes it easy to remember what is happening in the script, or to help someone else understand it. I learned this the hard way.
Put each comment on the line above the code it explains, like this:
```python
# This is a good way to leave comments: comments and code are separated
# Also, the comment tells what the upcoming statement(s) do
# Imports beautiful soup
import bs4

# This is not a good way to leave comments
# A comment should precede the statement it explains
import bs4
# imports beautiful soup

# This is also not a good way to leave comments
# Modifying the statement gets difficult because of the comment
import bs4  # Imports beautiful soup.
```
2. Use conventions and guidelines
Conventions dictate the coding practice of an organization or a team and help maintain consistency when working with others. In particular, follow the PEP 8 guidelines.
Trust me, when you have to work on a script somebody else has already worked on and you see variables declared and used in different styles in the same script, it gets on your nerves.
```python
# Follow the same convention within a script
# and, if possible, throughout the project
python_variable = "some_text"
PYTHON_CONSTANT = 100

# In the same script do not use other formats like
anotherVariable = "new_format"
AnotherVariable = "yet_another_format"
Python_Constant = "this_is_getting_ridiculous"
AnotherCONSTANT = "dude_seriously_stop_it"
```
3. Avoid Magic Numbers
Let's look at the following script:
```python
"""Script to take PTR values and calculate Simple Interest.
All values will be shown up to 2 decimal places."""

# Get the user input
p = round(float(input("Principal (P): ")), 2)
t = round(float(input("Time (T): ")), 2)
r = round(float(input("Rate (R): ")), 2)

# Display input parameters
print("You entered: P={}, T={}, R={}".format(p, t, r))

# Calculate interest
i = round((p * t * r) / 100, 2)

# Display the calculated interest to the user
print("Interest (I) is: {}".format(i))
```
Now suddenly you need to display the results up to 4 decimal places instead of 2. With the code in this state, you need to change the value 2 to 4 wherever the round() function is used, like this:
```python
"""Script to take PTR values and calculate Simple Interest.
All values will be shown up to 4 decimal places."""

# Get the user input
p = round(float(input("Principal (P): ")), 4)
t = round(float(input("Time (T): ")), 4)
r = round(float(input("Rate (R): ")), 4)

# Display input parameters
print("You entered: P={}, T={}, R={}".format(p, t, r))

# Calculate interest
i = round((p * t * r) / 100, 4)

# Display the calculated interest to the user
print("Interest (I) is: {}".format(i))
```
When you have to change this kind of setting frequently, or when such values are used extensively, there is a high probability that something will be missed when updating the script. Here the precision value of 2 was used 4 times in a script with only 6 lines of code, comments excluded. Just think what happens if the script is 1000 lines long.
To avoid this, give the value a name and use that name everywhere, like this:
```python
"""Script to take PTR values and calculate Simple Interest.
All values will be shown up to 4 decimal places."""

# Declare precision value
PRECISION = 4

# Get the user input
p = round(float(input("Principal (P): ")), PRECISION)
t = round(float(input("Time (T): ")), PRECISION)
r = round(float(input("Rate (R): ")), PRECISION)

# Display input parameters
print("You entered: P={}, T={}, R={}".format(p, t, r))

# Calculate interest
i = round((p * t * r) / 100, PRECISION)

# Display the calculated interest to the user
print("Interest (I) is: {}".format(i))
```
Now, if needed, the change can be made on just one line and it is reflected across the whole script.
4. Use docstrings
If we go back to the basics, the use of docstrings goes a long way when working with a team. There are tools available that create documentation from docstrings and inline comments.
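As a minimal sketch of what a docstring looks like in practice, here is the interest calculation from section 3 wrapped in a function. The function name `simple_interest` and its parameter names are my own picks for illustration, not something from the earlier scripts.

```python
def simple_interest(principal, time, rate):
    """Return the simple interest for the given principal, time and rate.

    The rate is a percentage, so a rate of 5 means 5% per unit of time.
    """
    return (principal * time * rate) / 100


# The docstring is stored on the function itself; help() displays the same text
print(simple_interest.__doc__)

# Hypothetical sample values, only to show the function being used
interest = simple_interest(1000, 2, 5)
print("Interest (I) is: {}".format(interest))
```

Anyone on the team can now run `help(simple_interest)` in an interpreter and read what the function expects without opening the source file.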
5.1 Use String formatting instead of string concatenation
Look at line 13 of the last code block in section 3:
```python
print("You entered: P={}, T={}, R={}".format(p, t, r))
```
This is written using Python's string formatting. It could have been written with string concatenation as:
```python
print("You entered: P=" + str(p) + ", T=" + str(t) + ", R=" + str(r))
```
Firstly, string formatting increases the readability of the code. Also, string formatting converts any data type to its string representation, so you don't have to worry about data types. The variables p, t and r here are floats. String concatenation, on the contrary, only works with strings, and you have to explicitly convert other data types to strings. And most importantly, standard Python libraries such as the logging module use string formatting for their own purposes.
Using string formatting can be a very good habit to develop.
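Here is a small sketch of the difference, using made-up values for p, t and r. The f-string at the end is an extra I am adding on the assumption that you are on Python 3.6 or newer; it is the same idea as format() with less typing.

```python
# Hypothetical sample values: two floats and an int
p, t, r = 1000.0, 2.5, 7

# format() converts each value to its string representation for us
formatted = "You entered: P={}, T={}, R={}".format(p, t, r)
print(formatted)

# Concatenating a float to a string without str() raises a TypeError
concat_error = None
try:
    broken = "You entered: P=" + p
except TypeError as error:
    concat_error = str(error)
print("Concatenation failed:", concat_error)

# f-strings (assumes Python 3.6+) do the same conversion inline
fstring = f"You entered: P={p}, T={t}, R={r}"
print(fstring)
```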
5.2 Avoid nested loops as much as possible
When I can't find any other way to avoid nested loops, I get nightmares. First of all, figuring out how to break out of nested loops can be tricky, which is a big headache in itself. When you also account for the time complexity a nested loop adds to the script, these are roads best not traveled. Two nested loops over n items usually have a complexity of O(n²), which put simply means that the time needed for completion grows with the square of the number of items: ten times the items means roughly a hundred times the work. This is kind of obvious, if you really think about it.
```python
# A nested for loop: the inner body runs n * n times
n = 3
for i in range(n):
    for j in range(n):
        pass  # do something
```
Let's not make this complicated and just look at the concept. The outer loop is executed n times. For each value of i, the inner loop gets executed n times. So the body runs n*n = n² times in total. As the value of n increases, the time needed for execution rises tremendously.
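To make this concrete, here is a sketch of one common case where a nested loop can be avoided: finding the items two lists have in common. The function names and sample lists are made up for illustration, and the set-based version assumes the items are hashable (numbers, strings, tuples and so on).

```python
def common_items_nested(first, second):
    """Nested loops: up to len(first) * len(second) comparisons."""
    common = []
    for a in first:
        for b in second:
            if a == b:
                common.append(a)
                break  # only leaves the inner loop
    return common


def common_items_set(first, second):
    """One pass over each list, using a set for fast membership checks."""
    lookup = set(second)
    return [a for a in first if a in lookup]


first = [1, 2, 3, 4, 5]
second = [4, 5, 6, 7]
nested_result = common_items_nested(first, second)
set_result = common_items_set(first, second)
print(nested_result, set_result)
```

Both functions return the same list, but the second one never compares every item of one list against every item of the other.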
5.3 Avoid recursive functions
Recursive functions are functions that call themselves. They are like wild animals: tempting, exotic but dangerous. They are best viewed on TV screens and in tutorials. Let me tell you about something that happened at my previous employer.
I can't go into the details (confidentiality, duh!!) but in general, we had an API, and it kept getting killed every 5 to 10 minutes. After analyzing the logs, we finally caught the culprit. As you might have guessed, it turned out to be a recursive function that returned its result via the API. Because the function had millions of data points to process, it ended up calling itself about as many times. As a result, Python experienced a stack overflow (that's right, I came across stack overflow in action, not just on the internet), which brought our entire API down and made our service inaccessible.
Unless you know exactly what you are doing, I advise against using recursive functions.
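The function from that story is not something I can show, so here is a toy sketch of the same kind of failure. It sums a list once recursively and once with a plain loop; the function names are invented for this example. Python normally stops runaway recursion by raising a RecursionError once the depth passes the interpreter's recursion limit, which `sys.getrecursionlimit()` reports.

```python
import sys


def recursive_sum(values, index=0):
    """Add up the values with one function call per item."""
    if index == len(values):
        return 0
    return values[index] + recursive_sum(values, index + 1)


def iterative_sum(values):
    """Add up the values with a plain loop: no extra call depth."""
    total = 0
    for value in values:
        total += value
    return total


# Use more items than the interpreter allows nested calls
data = list(range(sys.getrecursionlimit() * 2))

try:
    recursive_sum(data)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

print("Recursive version failed:", recursion_failed)
print("Iterative version result:", iterative_sum(data))
```

The loop handles the same data without complaint, and its memory use does not grow with the number of items.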
5.4 Sleep or Pass?
Which of the following two do you reckon is more efficient?
```python
# Case 1
while some_variable is False:
    pass

# Case 2
from time import sleep
while some_variable is False:
    sleep(5)
```
Both cases do nothing, but which one is better? I ran into this problem when I had to keep script #1 running until a shared variable was set to True by script #2. First I used case #1 and found that 25% of the CPU was always busy. Then I changed to case #2, and CPU usage was about 1%. This is because in case #1 the while condition is checked continuously, but in case #2 it is checked only once every 5 seconds.
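If you want to see the difference for yourself, here is a rough sketch that counts how often each style of loop checks its condition. It is a stand-in for my two-script setup, not a copy of it: a timer thread flips the flag after a fraction of a second, and the helper name `count_checks` and the timings are arbitrary choices for the demo.

```python
import threading
import time


def count_checks(wait_between_checks, run_for=0.2):
    """Count how often the loop condition is checked before the flag flips."""
    state = {"done": False}

    def set_done():
        state["done"] = True

    # Stand-in for "script #2": flips the flag after run_for seconds
    timer = threading.Timer(run_for, set_done)
    timer.start()

    checks = 0
    while state["done"] is False:
        checks += 1
        if wait_between_checks:
            time.sleep(wait_between_checks)
    timer.join()
    return checks


busy_checks = count_checks(wait_between_checks=0)       # like case 1
sleepy_checks = count_checks(wait_between_checks=0.05)  # like case 2
print("pass-style loop checks:", busy_checks)
print("sleep-style loop checks:", sleepy_checks)
```

The exact numbers depend on the machine, but the pass-style loop should report far more checks than the sleep-style one, and every one of those checks is CPU time spent doing nothing useful.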
Well, that’s the end of part one, keep an eye out for part two.