Coding conventions

If you want to contribute to the development of PythonCAD, Art wrote very interesting guidelines for coding here.

Other good ideas are in: http://wiki.inkscape.org/wiki/index.php/DeveloperManual

A number of conventions have emerged while writing the Python code for PythonCAD. These conventions, unfortunately, are not consistent throughout the entire code base, as they tend to arise after recognizing shortcomings in code written in what was considered the previous standard. The newer the code is, the more likely it is to adhere to the current standards, though fixing older code will bring it up to current practices. This page is an effort to list out how the code is organized, what the current conventions are for writing code, and a brief bit on how things came to be this way.

The conventions for things like variable naming and function/method names are likely to change as the project evolves. As multiple developers begin adding or modifying code, their style of coding will be reflected in the patches or files they bring to the project. It is certain that some of this input will be demonstrably better than whatever the current practices are, so the standards below should be considered somewhat fluid and likely to change. Regardless of the current rules for variable, function, and method names, there are a few things that should always be striven for in the code.

  • Correctness
  • Maintainability
  • Readability
  • Speed

Please note that optimizing for speed falls below the highest priorities in writing the code. Correct code is unquestionably the topmost requirement, and maintainability and readability are next. Obviously a fast program is better than a slow one, but a fast buggy program does not do much good for anyone. A fast, correct program, though, is a work of art.

Code Organization

An overall goal is to keep the front-end interface code as separate from the back-end code as can be achieved. Using a high-level language like Python helps immensely in moving towards this goal, as Python provides basic data structures like lists, tuples, and dictionaries (hashes) as built-in components of the language. There is no need to hand-roll these data structures so that all the code, both generic and interface-specific, can make use of these essential building blocks of a program. As both the interface and back-end code have these structures available by the nature of the language, the task of keeping the interface code and generic code separated is much simpler.

All code that comprises the core objects in the program, and anything else that is interface neutral, is kept in the Generic subdirectory. The interface code is kept in an Interface directory, and the code for a specific interface is kept in a subdirectory below that. The initial release has a Gtk subdirectory, and it is hoped that there will be several companion subdirectories eventually.

Generic Directory

There are presently no subdirectories in this directory. It is in this directory that the code for things like points, segments, layers, etc., is located. Also, there is code in here for things such as calculating intersections of the objects, compressed file reading and writing, saving and loading of files, and various utility functions. There will probably be several subdirectories in here at some point, and files performing certain functions will be moved as needed, but these changes will depend on how the program evolves. All code that finds its way into this directory should never rely on any Python modules outside of the standard set of Python modules.

Interface Directory

Here is where the Gtk subdirectory sits, and in that directory is all the code for presenting the user interface. It is this code that relies on the PyGTK module. So far that module is the only third party module needed for running PythonCAD.

As more interfaces are added, their code should be placed in this directory. All code in the Interface directory uses code in the Generic directory to whatever extent is needed. If a particular interface requires some specific functionality in the drawing entities, consider moving that functionality into the entity itself, rather than adding code in this directory, so that other interfaces may also benefit. By enriching the basic functionality of the core objects, the interface code can be kept to a minimum, and interface-specific object functionality will be reduced. This should produce a more robust core code base.

Code Conventions

In the months prior to the release of the first development version of PythonCAD, it was obvious from looking at the code how recently any particular bit of code had been written. Even over the course of a single month, one could see how a particular preference in variable names could change, or functions and methods would gain or lose default parameters. As development progressed, these changes have somewhat slowed, so what is listed below are the current guidelines as to how any new code should be written.

Classes

Python is an object-oriented language, and one of its strengths is in how simply a programmer can design classes for their project. Nearly all the generic code for the back end is designed so that the drawing entities are all objects of a class, and each entity has certain methods suitable for whatever it is supposed to do.

Here is a modified version of the basic 2-D point class. The real code is kept in the Generic/point.py file. What is below is meant just for example purposes.

class _Point(object):
    """The basic point class.

    A _Point has the following methods:

    getCoords(): Return the location of the _Point.
    """
    def __init__(self, x, y):
        """Initialize a _Point.

        _Point(x, y)

        x: The x-coordinate
        y: The y-coordinate

        Both values should be floats.
        """
        _x = x
        if not isinstance(_x, float):
            _x = float(x)
        _y = y
        if not isinstance(_y, float):
            _y = float(y)
        self.__x = _x
        self.__y = _y

    def __str__(self):
        return "(%g, %g)" % (self.__x, self.__y)

    def getx(self):
        """Return the x-coordinate.

        getx()
        """
        return self.__x

    def setx(self, x):
        """Set the x-coordinate.

        setx(x)

        The argument x should be a float.
        """
        _x = x
        if not isinstance(_x, float):
            _x = float(x)
        self.__x = _x

    # The property is defined at class level, after its getter and setter.
    x = property(getx, setx, None, "X-Coordinate.")

    def clone(self):
        """Make an identical copy of a _Point.

        clone()
        """
        return _Point(self.__x, self.__y)

    def getCoords(self):
        """Get the location of a _Point.

        getCoords()

        The results are returned in a tuple.
        """
        return (self.__x, self.__y)

Here is what should be noted from the sample class:

Doc strings are provided for the class definition, and should list all the methods of the class.

Doc strings are provided for many of the class methods, and should list the method call, the arguments, and maybe a brief description of what the method returns.

Derive base classes from object. This was a new feature in Python 2.2, and provides some advantages over what are now called classic classes (base classes not derived from object).

In the __init__ method, efforts should be made to ensure that all arguments are of the correct type. In the example above, the arguments were checked to see if they were floats, and if not, were converted to floats. Note that if the float() call failed, an exception would be raised, but there is no effort made to catch it in a try/except block. The right place to catch the error is higher up the calling chain, so it is pointless to place a try/except block around some code, and then simply raise the exception again.
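
A rough sketch of that division of labour follows. It uses a cut-down stand-in for the sample class and a hypothetical caller named point_from_text(), which is not part of PythonCAD. The constructor lets float() fail, and the caller that knows where the values came from decides what to do:

```python
class _Point(object):
    """Cut-down stand-in for the sample class above."""
    def __init__(self, x, y):
        # float() raises ValueError or TypeError on bad input;
        # there is deliberately no try/except at this level.
        self.__x = float(x)
        self.__y = float(y)

    def getCoords(self):
        return (self.__x, self.__y)


def point_from_text(xtext, ytext):
    """Hypothetical caller that is in a position to deal with the error."""
    try:
        return _Point(xtext, ytext)
    except (ValueError, TypeError):
        # This level knows the values were typed in, so it picks the response.
        return None
```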

The variables in the object are prefixed with __ (two underscores), which in Python hides them (to an extent). This should be done for all objects, as it encourages the creation of methods to manipulate these values. Python 2.2 also brought a new feature called properties to the language. With properties, the hidden internal fields of an object can be safely presented as attributes of the object, while access to and modification of their values is controlled more simply than was possible earlier with __getattr__ and __setattr__ overrides.

The use of properties is strongly recommended, as they present an apparent direct access to the object's attributes yet really ensure that these variables are guarded against invalid values should they be modified.

Method names start with a lower-case letter, and names that are phrases should capitalize the first letter of each subsequent word. So, methods like clone() and getCoords() are fine, but names like Stomp() or long_method_name() are bad.

Method names that are meant to be private to the class can begin with an underscore _. These methods may have less error testing regarding their arguments than the public methods, or they may make use of assertions. If a method is becoming very large, it will often make sense to split it into several private methods. The code for layer objects does this, and for now can be used as example code in using private methods. At some point there may be a better way to create and use private methods in Python, but for now this naming convention works.

Variables local to a method should be prefixed with _. This is something that, regrettably, has not been done in many places in the code. In many places, methods composed of only a handful of lines do follow this convention. For larger methods, variable names are hit-and-miss.

Functions and methods should be as short as possible. This is an obvious statement, but in places the code clearly does not follow it. Some methods and functions are many lines long, and almost any method or function falling into this trap needs cleanup. Some of these large methods and functions are the product of just trying to figure out how to write the code to do the job, so the code length was irrelevant. Once it became clear(er) what to do, then the code could be condensed. In other cases the method or function grew because new objects were added to the program, and they needed to be dealt with. There is no maximum line count that defines what is too big, but if a function does not fit within one or two screenfuls of whatever editor you use, it is probably getting too large. For examples of overly large functions and methods, look at the draw_layer() function in Interface/Gtk/gtkimage.py. In future releases these large methods will be split up, as was done with the update() method for layer objects.

More later …

There have not been any definite patterns established regarding functions or methods with named or optional arguments. Sometimes an optional argument is set to None, and this argument is checked in the method or function, and sometimes the argument is set to some default value. Also, similar functions or methods do not always share common default or optional values. This variability is a weakness with the code, and will be addressed in future revisions.
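
The two styles mentioned here can be seen side by side in this small sketch. The function make_label() and its arguments are invented for illustration:

```python
def make_label(text, color=None, size=12):
    """Hypothetical function showing both styles of optional argument.

    make_label(text[, color, size])

    color: Checked against None inside the function.
    size: Given a plain default value in the signature.
    """
    _color = color
    if _color is None:
        _color = "black"  # the fallback is chosen when the call is made
    return "%s [%s, %d]" % (text, _color, size)
```

The None form is the safer choice when the default would be a mutable object such as a list, because a default written in the signature is created once and shared between calls.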

Functions

Not everything in PythonCAD is coded as objects with method calls. Much of the code dealing with the interface is written in a more procedural style of coding. As such, it is somewhat specific to the GTK interface, and may reflect a bias towards this graphical library that someone writing a KDE interface should not necessarily follow. That question will be handled when the code for that interface shows up. The items below are what has been established in writing this type of code.

Functions are named with all lower-case letters, and names that are phrases have each word separated with an _. A function named have_some_stuff() is good; OutOfStuff() is not. Functions that start with an underscore are meant only for use by other functions or object method calls within the file. The analogy is a static function in C. When writing these functions, doc strings are optional, and error checking should probably rely more on assert statements when the code is testing that the correct type of object is being passed as an argument. The usage of try/except blocks, though, is fine where appropriate.
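
A short sketch of these rules, with made-up names (neither function exists in PythonCAD): a public function with a doc string, and a file-private helper that skips the doc string and relies on an assert.

```python
def _check_scale(scale):
    # File-private helper: no doc string, and an assert for the type test.
    assert isinstance(scale, float), "scale must be a float"
    return scale > 0.0


def scale_coords(coords, scale):
    """Return a list of (x, y) tuples multiplied by scale.

    scale_coords(coords, scale)
    """
    _scale = float(scale)
    if not _check_scale(_scale):
        raise ValueError("Invalid scale: %g" % _scale)
    return [(_x * _scale, _y * _scale) for (_x, _y) in coords]
```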

More later …

Error Handling

This is one aspect of coding in Python which has changed markedly over the initial development. Python supports try/except blocks, where the except code block can attempt to handle the error, or simply raise it or a new error up to the caller. There are very few of these blocks in the code in the Generic subdirectory. Most of the error handling is done in the interface-specific code, because through that code the error can be relayed to the user. It is an acknowledged weakness in the code that right now this relaying is not done well, and only a message is printed out. Many places in the code need a better means of transmitting to the user the fact that some error has occurred.

Python also supports try/finally blocks, where the code in the finally block is guaranteed to be executed whether or not an exception occurs in the try block. This type of code block should be placed around code that must ensure some state is resolved before either proceeding or raising an error. There are try/finally blocks around the file reading and writing code, as it is vital that the file handles are closed after reading or writing operations, even if an error occurs during those operations.

Translation / Internationalization

I notice a directory called po in the PythonCAD source that seems to be for translation of messages into Spanish. How does this work? How are these files used in combination with the Python source code?

The files in the 'po' directory are for internationalization. The file 'PythonCAD.pot' is the template file listing all the strings in the code that have been marked as translatable. If you read this file you'll see each translatable string as a 'msgid', along with the various files in which the string was found. This file, btw, is generated with the 'xgettext' program.

Someone wanting to add a translation to PythonCAD would copy the '.pot' file into a new file named 'PythonCAD.${locale}.po', where '${locale}' is the standard abbreviation for the language you are translating into. The 'PythonCAD.es.po' is the Spanish translation. I suppose that someone translating into a slightly different dialect of Spanish, such as one spoken in Mexico, would use a ${locale} like 'es_mx' or something similar.

After the translator finishes adding the translated strings, the '.po' file is processed with the 'msgfmt' program to produce a file ending with the '.mo' extension. It is this file that is used by the gettext library to substitute translated strings for the English text in the code.

I'd like to modify the English messages in the source to make it clearer what the user is supposed to do. Will this affect the translation files?

Yes. Ideally what happens is all the code changes affecting translatable strings are completed several days prior to a release, and then the people providing translations have a couple of days to update their '.po' files.

My comments above are just a rough summary of how the localization of programs using gettext works. The complete details can be found in the gettext documentation.

Miscellaneous Ramblings

So why try and write a CAD package in Python?

Just what is wrong with writing in C/C++/Perl/Java/…? Where is the KDE/Gnome interface? How come it does not read any common CAD file formats? Will it? If a question like that had popped into your head while investigating just what PythonCAD does, then this is the place for you. What is below are answers to various questions that may or may never be asked.

Why try and write an open-source CAD program?

The short answer is I wanted to see a new CAD package for Linux. I've wanted to have a project like this to work on for some time, and in doing so try and contribute a new program back to the many open-source developers who have given me so much. Also, I've wanted to try and write something that could become the basis for a business.

There seems to be a shortage of this type of program for open-source software users. I have heard of several projects like this starting, but I have not seen them come to fruition. There are probably more projects out there similar to this one, and it would be good if several different projects emerged that provided open-source users with a good CAD package. A program like this is something I think many, many people would find useful, and also bring open-source software into many new places.

A number of commercial CAD packages for Linux have been announced over the last few years, and I think that is great, but these programs will almost certainly not be open-source. The fact that these programs are even available on platforms like Linux and BSD Unix is definitely a good development and will increase the areas in which open-source software is used. That trend is something I like to see happening.

Why Python? Why not ${LANGUAGE}...?

I like Python, having used it at a former job. I also like Perl and C; C++ is lower on my list of languages. But, why Python? I felt that choosing Python would be the quickest and easiest way to get this project moving, would be a good way for me to improve my Python coding skill, and would be the simplest way to get started doing some GUI programming (thanks to the PyGTK module) that I've wanted to learn to do.

I had fiddled with PyGTK a little bit before, and so had some familiarity with it. Also I really like and have followed the development of GTK from its start as the Motif replacement for the Gimp, through its use as the cornerstone of GNOME, up to today. GTK also is less taxing to compile and run than the C++ code of KDE. I spend my days on a 200 MHz Pentium machine (really), so code that a more modern machine could compile in hours takes literally days to build here. A newer machine is definitely in the future, but developing the code on a slow machine is a great incentive to make it run well here; the code should fly on newer hardware.

As of January 2003, I have KDE-3.1 on my machine, and also have installed PyQt. I can now try my hand at a PyQt front end, as well as test out any patches sent my way for creating a Qt-based PythonCAD interface. It will take some time before I am able to do much with these new packages, so anyone wanting to see PythonCAD with a Qt interface will have to help in writing the code. Once PyKDE for this latest KDE release becomes available, I will install it as well. The same caveat regarding PyQt and my experience with it applies to this package as well, unfortunately. I am hopeful, though, that there are developers out there who will work with me in adding an interface using one or both of these Python modules.

I do not have the GTK/Perl bindings installed, so that eliminated the option of using it for initial PythonCAD development. Incidentally, I have used Perl for over 7 years, and think it is a fantastic language. I am looking forward to Perl 6 and Parrot when they arrive. If Parrot can deliver on its ability to run Python code, and run it faster than the Python interpreter does, then the possibility of PythonCAD running on Parrot presents itself. This is a long way off yet, and maybe pieces of PythonCAD will migrate to Perl, and a name change to ParrotCAD will be required, but for now we stick with what works.

<soapbox> As for C, I do not want to spend my time endlessly fighting memory leaks, buffer overflows, core dumps, and many of the other issues that plague large-scale software development with what I believe is a language more suited for smaller projects. Writing things like low-level libraries or bits of code where speed is critical are things C excels at. As the lines of code increase, it becomes harder and harder to keep track of details like memory allocations for variables and when they need to be freed, or allocating the correct amount of space for strings, and many other common problems that have led to the development of tools such as Valgrind. Valgrind or a similar program is a definite must have for anyone doing C development. Also, as the number of lines of code increases, an increase in developers is needed to just maintain the code base, let alone advance the program. These problems can be overcome, but for PythonCAD I wanted to try and avoid them in the future by choosing a language which does not have these common pitfalls of C development.

C++ has the same problems, plus the added difficulties of dealing with compiler and platform limitations. My job at a former employer was in release management, and we built some C++ code on a variety of platforms. It was incredibly difficult to get the code to compile on all platforms due to the various compilers we had to use. There were code issues to be sure, but the compiler/platform issues were just another hurdle to jump before the programs were ready. I want to avoid repeating this problem, and Python lets me do that.

Finally, Java is just not something I am currently interested in using to write software. That may change in time, but right now it does not appeal to me at all. </soapbox>


coding_conventions.txt · Last modified: 2010/03/16 15:44 (external edit)