A Python source code cleanup utility
P. Lutus
Copyright © 2010, P. Lutus
I may have mentioned that I don't take a language seriously unless I can create a beautifier for its source files, preferably written in the language itself. My Ruby beautifier has become quite popular, and writing it helped me learn many of that language's traits. I even wrote a beautifier for Bash scripts when I was writing a lot of those, but I decided against trying to write it as a Bash shell script.
I had resisted taking up Python for a long time because of one of its less desirable characteristics — whitespace is syntactically significant. I regard this as an abomination, but over time I got involved with some projects that relied on Python (Sage and Blender among others). I eventually weakened and started writing in Python, and I have decided it's worth its defects.
I wouldn't emphasize the whitespace issue except that PyBeautify has to work around its implications. Unlike beautifiers for other languages, PyBeautify can only change the overall indentation of a program (and a few other things). It can't use the language's block syntax tokens to control the indentation, because Python's block syntax is controlled by indentation, not by tokens.
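To make the point concrete, here is a minimal sketch (not part of PyBeautify) of two programs that differ only in the indentation of their last line. The variable names and the `run` helper are made up for illustration.

```python
# Two programs that differ only in how far the last line is indented.
inside_loop = "total = 0\nfor n in range(3):\n    total += n\n    total *= 2\n"
after_loop = "total = 0\nfor n in range(3):\n    total += n\ntotal *= 2\n"

def run(source):
    """Execute source in a fresh namespace and return the final value of total."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"]

print(run(inside_loop))  # the doubling happens on every pass through the loop
print(run(after_loop))   # the doubling happens once, after the loop
```

The two calls return different totals, which is why a tool may rescale Python indentation uniformly but cannot safely rearrange it.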
Here is what PyBeautify does:
In pass one, PyBeautify scans a source file and determines which indentation the file uses — one or more spaces.
In pass two, PyBeautify indents the program based on either PyBeautify's default indentation of two spaces or a user-entered specification. This feature can be used to reliably change a file's overall indentation from one standard to another, and any indentation between 1 and 64 spaces can be specified.
PyBeautify also checks the program's indentation for consistency. The assumption is that a program has one basic indentation, say four spaces, and that every line's indentation is a multiple of that value.
If PyBeautify finds any indentation inconsistencies, for each one it prints a warning with a file name and a line number, but it doesn't try to correct the line. In the program listing, a flagged line is still rescaled in proportion like any other line.
PyBeautify also expands all tabs to eight-column tab stops, so each leading tab becomes eight spaces. I think it's generally accepted that tabs should be removed from the world of computing. PyBeautify does its little part.
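The steps above can be sketched as a single function that works on a string instead of a file. This is a simplified illustration for Python 3, not the program itself: the name `reindent` and its `new_indent` parameter are hypothetical, and warnings are returned as a list of line numbers rather than printed.

```python
def reindent(text, new_indent=2):
    """Return (rescaled_text, warning_line_numbers) for Python source in a string."""
    # Expand tabs to eight-column tab stops and drop trailing whitespace.
    lines = [line.expandtabs().rstrip() for line in text.splitlines()]

    # Pass one: the smallest nonzero indent is taken as the file's basic unit.
    widths = [len(line) - len(line.lstrip()) for line in lines if line]
    unit = min([w for w in widths if w > 0], default=0)

    # Pass two: rescale each indent and note lines that are off the grid.
    output, warnings = [], []
    for number, line in enumerate(lines, start=1):
        content = line.lstrip()
        width = len(line) - len(content)
        if unit and width % unit != 0:
            warnings.append(number)  # not a multiple of the basic unit
        scaled = width * new_indent // unit if unit else 0
        output.append(" " * scaled + content)
    return "\n".join(output) + "\n", warnings
```

For example, `reindent("def f():\n    return 1\n", 2)` yields the same function indented by two spaces and an empty warning list. A line indented six spaces in a four-space file would appear in the warning list and still be rescaled in proportion.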
Here is what PyBeautify won't do:
Make your source files beautiful (the program's name is more a tradition than a description), unless you regard removal of tabs as a move toward beauty (as I do).
Correct the indentation of lines it thinks are errors. It will print a warning message for each one, but any fixes are up to you.
Here's how to use PyBeautify:
- Use as a stream filter:
    ./pybeautify.py - < input.py > output.py
- Specify an indentation other than two spaces:
    ./pybeautify.py 4 - < input.py > output.py
- Replace a file in place, specifying an indentation of 4 spaces (makes a backup copy):
    ./pybeautify.py 4 input.py
- Process all Python files in a directory in the same way:
    ./pybeautify.py 4 *.py
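After an in-place run, the backup copy sits next to the original with a trailing "~" in its name (as the program listing shows). One way to review what changed is a short diff script such as the sketch below. The helper name `diff_against_backup` is hypothetical and not part of PyBeautify.

```python
import difflib

def diff_against_backup(path):
    """Return a unified diff between the backup (path + '~') and the rewritten file."""
    with open(path + "~") as fh:
        before = fh.readlines()
    with open(path) as fh:
        after = fh.readlines()
    diff = difflib.unified_diff(before, after, fromfile=path + "~", tofile=path)
    return "".join(diff)
```

Calling `print(diff_against_backup("input.py"))` after `./pybeautify.py 4 input.py` should show only whitespace changes at the start of lines. An empty result means the file was already indented as requested.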
Licensing, Source
PyBeautify is released under the GNU General Public License. Here is the plain-text source file without line numbers.
Revision History
- Version 1.0 12/01/2010. Initial Public Release.
Program Listing
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    # Version 1.0 12/01/2010

    # ***************************************************************************
    # *   Copyright (C) 2010, Paul Lutus                                        *
    # *                                                                         *
    # *   This program is free software; you can redistribute it and/or modify  *
    # *   it under the terms of the GNU General Public License as published by  *
    # *   the Free Software Foundation; either version 2 of the License, or     *
    # *   (at your option) any later version.                                   *
    # *                                                                         *
    # *   This program is distributed in the hope that it will be useful,       *
    # *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
    # *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the          *
    # *   GNU General Public License for more details.                          *
    # *                                                                         *
    # *   You should have received a copy of the GNU General Public License     *
    # *   along with this program; if not, write to the                         *
    # *   Free Software Foundation, Inc.,                                       *
    # *   59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.              *
    # ***************************************************************************

    import re, sys, shutil

    class PyBeautify:

      def __init__(self):
        self.default_indent = 2

      # split line into indent and content
      def parse_line(self,s):
        indent,content = re.search(r'^(\s*)(.*)$',s).groups()
        return indent,len(indent),content

      def parse_stream(self,stream,path,indv):
        # expand tabs and strip trailing whitespace from every line
        lines = [line.expandtabs().rstrip() for line in stream.readlines()]

        # pass 1: find the minimum indent
        mi = 1000000
        for line in lines:
          if(re.search(r'\S',line)): # only non-blank lines
            indent,li,content = self.parse_line(line)
            if(li > 0 and li < mi): mi = li

        # pass 2: create output string with specified indentation
        output = []
        for n,line in enumerate(lines):
          indent,li,content = self.parse_line(line)
          if(li % mi != 0): # if indentation is not a multiple of mi
            sys.stderr.write("Warning: inconsistent indentation in line %d of file \"%s\".\n" \
              % (n+1,path))
          # create indent value; integer division keeps it an int on Python 3
          iv = li * indv // mi
          output.append("%s%s" % (' ' * iv,content))
        return '\n'.join(output) + '\n'

      def parse_file(self,path,indv):
        if (path == '-'): # stdin, stdout
          # the result already ends with a newline, so write it unchanged
          sys.stdout.write(self.parse_stream(sys.stdin,path,indv))
        else: # it's a file
          try: # making a backup copy
            shutil.copyfile(path,path+"~")
          except Exception: # backup failed
            sys.stderr.write("Error: unable to create backup copy of file \"%s\", quitting.\n" \
              % path)
            sys.exit(1)
          with open(path) as fh: # read the file
            output = self.parse_stream(fh,path,indv)
          with open(path,'w') as fh: # write the result
            fh.write(output)

      def process(self):
        sys.argv.pop(0) # drop program name
        if (not sys.argv): # no program arguments
          sys.stderr.write("Usage: [indent default %d] filenames or \"-\" for stream\n" \
            % self.default_indent)
          sys.exit(0)
        else:
          try: # is the first argument a number?
            indent = int(sys.argv[0])
            sys.argv.pop(0) # drop the number
          except ValueError: # not a number, probably a file name
            indent = self.default_indent
          if(indent <= 0 or indent > 64): # test of acceptable indentations
            sys.stderr.write("Error: bad indent entry value: \"%d\", quitting.\n" \
              % indent)
            sys.exit(1)
          for path in sys.argv:
            self.parse_file(path,indent)


    PyBeautify().process()
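Because pass 2 rescales the leading whitespace of every line, it also touches lines inside multi-line string literals. A cautious user might compare the syntax trees of the file before and after a run. The sketch below is one way to do that on Python 3.8 or later. It is not part of PyBeautify, and the name `same_structure` is hypothetical.

```python
import ast

def same_structure(before, after):
    """True if both source strings parse to identical syntax trees.

    ast.dump() omits line and column numbers by default, so a pure
    indentation change leaves the dump unchanged, while altered string
    contents do not. ast.parse() raises SyntaxError on invalid source.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
```

Reading `input.py~` and `input.py` into two strings and passing them to `same_structure` gives a quick check. A `False` result usually points at a triple-quoted string whose inner lines were rescaled.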