File format reference

File Format Information

PythonCAD uses XML for storing files. As text files can get rather large, the files are compressed with the zlib module when they are saved. Compression is always applied, and there is currently no way to turn it off. If such functionality is eventually called for, the code will be changed to allow saving in uncompressed text.
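
The compression step described above can be sketched in a few lines. This is only an illustration, not PythonCAD's actual save code: the helper names compress_xml and decompress_xml are hypothetical, and the sketch assumes the whole document is stored as a single zlib stream of UTF-8 text.

```python
import zlib


def compress_xml(xml_text):
    """Return the zlib-compressed bytes for a string of XML (hypothetical helper)."""
    return zlib.compress(xml_text.encode("utf-8"))


def decompress_xml(data):
    """Return the XML text held in zlib-compressed bytes (hypothetical helper)."""
    return zlib.decompress(data).decode("utf-8")


sample = '<Point id="1" x="0.0" y="0.0"/>' * 100
packed = compress_xml(sample)
print(len(sample), len(packed))  # repetitive XML compresses well
```

Writing the returned bytes to disk in binary mode, and reading them back the same way, completes the round trip.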

Using XML for the file format provides several advantages over creating a new format. A short summary of reasons why XML was chosen is given below.

Open Standard

As the XML format is being maintained by W3C, it is open and available to anyone. The open nature of the format means everyone has access to the basic information giving the file format description.

Python Support

Python ships with a standard module for dealing with XML files, so there was no need to create a new module to deal with the file. Additionally, as the module comes with Python, it benefits from being used by many other people and projects, and this wide usage tends to identify and repair bugs more quickly than would happen in a single-use module.

Validation

With the use of either a DTD or a Schema, the layout of the file as well as the data the file is storing can be validated. This validation step can serve as both a check on the code to ensure that the data is being written out in the correct form, as well as ensure that all the needed bits of data are being saved. There are plans to create a Schema for the PythonCAD file format, as the Schema approach is able to define the format much more precisely than a DTD can. Currently there is no Schema available for checking the file, as the format has been changing due to the early development state of the program. The first schema will be created after some feedback has been generated with regards to the file format, and will likely see several revisions before the initial format is decided.

Buzzword Compatibility

There is a little of this in the choice to use XML as well. If nothing else, just using XML in a program may get more people to take a look at what the program does, and hopefully a few people will be interested enough to want to help improve the overall design of the PythonCAD file format and the PythonCAD program itself. The format is in its first incarnation, and as such most likely has various weaknesses, some more apparent than others. While simpler than creating an entirely new file format, defining a good XML file is an iterative process. The overall structure of the file is regulated by XML requirements, but deciding on what to call the tags, what should be attributes, how things should be arranged, etc., are all left to the designers.

A couple of obvious improvements can be made to the format initially. Much of the following will not be clear unless you've spent some time working with PythonCAD and looked at one of its saved files.

  • Don't save empty tags. They increase the file size slightly, and yet keep no useful information in the saved file. [This problem is addressed in release DS1-R2.]
  • Only save colors, linetypes, styles, etc. that are actually used in the drawing. Currently all of the loaded things in the drawing are saved, but if there is only one or two of each used, only those should be saved. [This problem is addressed in release DS1-R4.]
  • Some of the attribute names are inconsistent. Overriding style attributes do not have the same names as those in the style - color versus cid. Dimension color and font bits are also inconsistent, and the indirect storage of these objects as objects contained in a separate list instead of directly storing the object itself is confusing.
  • Should some of the attributes become children tags? This seems more like a question of style, as what is most important is storing the information.
  • The use of the Python minidom library to format the XML file also creates difficulties in file storage choices due to the way it formats values kept between opening and closing tags.

As development work continues on PythonCAD, the shortcomings in the file format will be removed as much as possible. At such time as the first stable release is made, the file format will be frozen for that release. It is almost a certainty that the file format will need some changes after that point, so the format will evolve as the application does.

File Tags

Here are the tags used for saving a file. The set of tags will change as the file format is modified, as will the attributes each tag may contain. Also, some of the tags represent objects that are not yet fully functional. As development progresses, the program will eventually fully support all the objects below. Each object is listed below, with the XML tag given in brackets.

Drawing Properties

When working in a drawing, the active layer is where newly created entities are stored. No layer will have multiple equivalent copies of any entity, in order to save space in both memory and file size. What would be the purpose in having a layer with one million identical instances of a point at (0,0)? The objects stored in a layer will all have a unique id, but the id is only unique in that layer. Several different layers can have a Point with an id of 1, but no layer will ever have two points with the same id.

Dimension objects are, so far, the most complex objects in a drawing. A dimension can refer to objects (points, circles, arcs, etc.) that are stored on a layer different from that of the dimension. This complexity comes with the price of storing more bits of information for each dimension than other drawing entities. As dimensions are the most complex objects in a drawing, they are also the least implemented in the code. The display of horizontal, vertical, linear, radial and angular dimensions all works, and each dimension type can be stored and retrieved. The interactive modification of dimensions is still lacking in PythonCAD though. This shortcoming will be addressed in future releases.

Color

A color is defined in terms of RGB values. Each value is an integral value between 0 and 255. The <Color> tag is as follows.

id: The unique identifier for the color. 
r: The red value of the color. 
g: The green value of the color. 
b: The blue value of the color. 
color: The color expressed as a hexadecimal string like #xxxxxx.

Attributes r, g and b were the original attributes used to store the color values. Attribute color was added to show the color in a hexadecimal format as is commonly seen for colors defined in web pages. At some point in the future the color storage will probably change from using attributes of the tag to placing the value as a text string within the tag like <image:Color>#xxxxxx</image:Color>.
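
As a rough illustration of how the r, g, b and color attributes relate, the sketch below builds a <Color> element with Python's standard ElementTree module. The helper name make_color_element is hypothetical, and the sketch leaves out any namespace prefix that a real PythonCAD file may carry on its tags.

```python
import xml.etree.ElementTree as ET


def make_color_element(color_id, r, g, b):
    """Build a <Color> element holding both the r/g/b and hexadecimal forms."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("RGB values must be between 0 and 255")
    return ET.Element("Color", {
        "id": str(color_id),
        "r": str(r),
        "g": str(g),
        "b": str(b),
        # The same color written as a web-style hexadecimal string.
        "color": "#%02x%02x%02x" % (r, g, b),
    })


orange = make_color_element(3, 255, 165, 0)
print(ET.tostring(orange, encoding="unicode"))
```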

Linetype

A linetype is used to define a dashed or a solid line. The <Linetype> tag is as follows.

id: The unique identifier for the linetype. 
name: The name of the linetype. 
pattern: The dash pattern of the linetype.

The pattern can be None, thus making a solid line. Otherwise it must be a Python list defining the on/off bit pattern for the dashed line. Each item in the list must be an integral value, and the length of the list must be an even number.
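
The rules for the pattern value can be checked with a short function. This is a sketch under an assumption: it supposes the attribute text is the Python literal form of the value, such as None or [4, 2], which this reference does not spell out. The name parse_pattern is hypothetical.

```python
import ast


def parse_pattern(text):
    """Parse a linetype pattern attribute and check the rules given above."""
    pattern = ast.literal_eval(text)  # safely evaluates literals only
    if pattern is None:
        return None  # a solid line
    if not isinstance(pattern, list):
        raise ValueError("pattern must be None or a list")
    if len(pattern) % 2 != 0:
        raise ValueError("pattern must have an even number of items")
    for item in pattern:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(item, bool) or not isinstance(item, int):
            raise ValueError("pattern items must be integers")
    return pattern
```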

Style

A style holds references to a particular color, linetype, and line thickness. The idea behind the style is to predefine a particular combination for each of these items, and when this combination is needed, the program will use the values in the style for new objects. The <Style> tag is as follows.

id: The unique identifier for the style. 
cid: The color id used in this style. 
ltid: The linetype id used in this style. 
name: The name of the style. 
thickness: The line thickness (for segments, circles, etc) this style defines. 

The values defined in a style can be overridden on a per-object basis, so a segment using a particular style may have a different color, linetype, or thickness than what the style defines.

Font Families

Each font used in the drawing must have a <FontFamily> tag associated with it. The tag itself is very compact and listed below.

id: The unique identifier for the font. 
name: The font family name 

The handling of fonts in a drawing is not robust, and will be an area in which developer time will eventually be spent. In the actual drawings, the display of fonts is still in its earliest stages, so there are many unresolved issues yet to be dealt with before the text display works well.

Units

A drawing must define what the basic length units are for that drawing. The dimensions can display different units than what this value is set to, but selecting what units an x/y coordinate pair represents is still necessary. The <Units> tag has the following attribute:

unit: The basic unit of length for the drawing 

There are several choices for units at the present. It is likely that one or two of these will be removed and replaced with some other choice before the first release is made. Millimeters are the default unit of length. Angular units are degrees, though adding an additional angular measurement unit is probable at some point. When an additional angular measurement unit is added, a new tag defining the default angular dimension will be added.

  • Millimeters
  • Micrometers
  • Meters
  • Kilometers
  • Inches
  • Feet
  • Yards
  • Miles

The unit attribute is a text string stating which unit is the default unit. Prior to development release seven, the attribute was an integer value that corresponds to a list of choices in the Generic/units.py file. As it is clearer to simply store a string like inches instead of some integer value, this tag was changed. The loading of a drawing with the attribute as an integer is still supported.

Dimension Styles

Dimension styles are an effort to contain a standard set of values for dimensioning attributes, such as line color, font size, text placement, etc. Setting and saving these values in a style gives drawings a uniform appearance and makes the program easier to use. Who would want to set many different values repeatedly when the collection of these values can be saved off and activated when needed? As dimensions are complex objects with many different attributes, however, a dimension style is also complex. Possibly the complexity can be reduced as development advances. The <DimStyle> tag has the following attributes:

id: The unique identifier for the dimension style.
name: The dimension style name.

Here is where storing a dimension style is somewhat different from many other objects in a drawing. The <DimStyle> tag contains <DimOpt> children tags, and each of those tags holds one part of the dimension style values. A DimStyle definition is an arbitrary number of these tags, and the value associated with the tag overrides the default value given in the table below.

As the development of PythonCAD has advanced, the DimStyle class has seen several modifications. The following table contains the valid options that can be used to define a DimStyle, the type of value the option represents, and the default value of the option.

Option                          Type            Default Value
DIM_PRIMARY_FONT_FAMILY         String          Sans
DIM_PRIMARY_FONT_SIZE           Integer         12
DIM_PRIMARY_FONT_WEIGHT         String          normal
DIM_PRIMARY_FONT_STYLE          String          normal
DIM_PRIMARY_FONT_COLOR          String          #ffffff (White)
DIM_PRIMARY_PREFIX              Unicode string  [Null]
DIM_PRIMARY_SUFFIX              Unicode string  [Null]
DIM_PRIMARY_PRECISION           Integer         3
DIM_PRIMARY_UNITS               String          millimeters
DIM_PRIMARY_LEADING_ZERO        Boolean         True
DIM_PRIMARY_TRAILING_DECIMAL    Boolean         True
DIM_SECONDARY_FONT_FAMILY       String          Sans
DIM_SECONDARY_FONT_SIZE         Integer         12
DIM_SECONDARY_FONT_WEIGHT       String          normal
DIM_SECONDARY_FONT_STYLE        String          normal
DIM_SECONDARY_FONT_COLOR        String          #ffffff (White)
DIM_SECONDARY_PREFIX            Unicode string  [Null]
DIM_SECONDARY_SUFFIX            Unicode string  [Null]
DIM_SECONDARY_PRECISION         Integer         3
DIM_SECONDARY_UNITS             String          millimeters
DIM_SECONDARY_LEADING_ZERO      Boolean         True
DIM_SECONDARY_TRAILING_DECIMAL  Boolean         True
DIM_OFFSET                      Float           1.0
DIM_EXTENSION                   Float           1.0
DIM_COLOR                       String          #ffa500 (Orange)
DIM_POSITION                    String          split
DIM_ENDPOINT                    String          None
DIM_ENDPOINT_SIZE               Float           1.0
DIM_DUAL_MODE                   Boolean         False
RADIAL_DIM_PRIMARY_PREFIX       Unicode string  None
RADIAL_DIM_PRIMARY_SUFFIX       Unicode string  [Null]
RADIAL_DIM_SECONDARY_PREFIX     Unicode string  [Null]
RADIAL_DIM_SECONDARY_SUFFIX     Unicode string  [Null]
RADIAL_DIM_DIA_MODE             Boolean         False
ANGULAR_DIM_PRIMARY_PREFIX      Unicode string  [Null]
ANGULAR_DIM_PRIMARY_SUFFIX      Unicode string  [Null]
ANGULAR_DIM_SECONDARY_PREFIX    Unicode string  [Null]
ANGULAR_DIM_SECONDARY_SUFFIX    Unicode string  [Null]

Here are the attributes for a <DimOpt> tag.

opt: The name of the dimension option. 
value: The value of the dimension option. 

As stated above, a DimStyle object is composed of any number of the entities in the list above. The DimStyle will use the default value if one is not provided in the list of options used to set the DimStyle when it is created.

For dimension options that specify colors or fonts, there isn't a value attribute; instead there is either cid which represents a color id value, or fid, which holds a font id. In the long term these different attributes may be removed, and the standard value attribute used for all <DimOpt> tags.
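
The way <DimOpt> children override the defaults can be sketched as follows. This is not PythonCAD's loader: the names DIMSTYLE_DEFAULTS, convert_value and read_dimstyle are hypothetical, only a handful of the defaults from the table are included, and the sketch assumes Boolean values are written as the text True or False. Options that carry cid or fid instead of value are skipped here.

```python
import xml.etree.ElementTree as ET

# A few defaults copied from the option table; the full table has more entries.
DIMSTYLE_DEFAULTS = {
    "DIM_PRIMARY_FONT_SIZE": 12,
    "DIM_PRIMARY_PRECISION": 3,
    "DIM_OFFSET": 1.0,
    "DIM_POSITION": "split",
    "DIM_DUAL_MODE": False,
}


def convert_value(text, default):
    """Convert attribute text to the same type as the default value."""
    if isinstance(default, bool):
        return text == "True"  # assumed text form of Boolean values
    return type(default)(text)


def read_dimstyle(element):
    """Return the defaults, overridden by any <DimOpt> children present."""
    options = dict(DIMSTYLE_DEFAULTS)
    for child in element.findall("DimOpt"):
        name = child.get("opt")
        value = child.get("value")
        if name in options and value is not None:
            options[name] = convert_value(value, options[name])
    return options


SAMPLE = """
<DimStyle id="1" name="Example">
  <DimOpt opt="DIM_OFFSET" value="2.5"/>
  <DimOpt opt="DIM_DUAL_MODE" value="True"/>
</DimStyle>
"""
options = read_dimstyle(ET.fromstring(SAMPLE))
print(options)
```

Only the two options named in the sample differ from the defaults; every other option keeps its default value.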

Text Styles

Text styles are similar to dimension styles in that they contain a set of common values for displaying text. A text style stores attributes such as font family name, font size, font weight, font style, and color as a single object, and any block of text that shares these attributes can reference the text style. Sharing the common attribute set reduces memory requirements, and it reduces the saved file size by removing redundant copies of identical values.

The manipulation of text in PythonCAD is very new, so it is almost guaranteed that the fields comprising a TextStyle entity will change in future releases. As this entity changes, efforts will be made to ensure that text styles saved by older versions can be read by newer versions. Saving new style entities in a form that older PythonCAD releases can read is unlikely, at least until the first official release is made. The <TextStyle> tag has the following attributes:

id: The unique identifier for the textstyle. 
name: The text style name.
cid: The id of the color used in this textstyle. 
fid: The id of the font family used in this textstyle. 
size: The font size of the textstyle. 
style: The font style of the textstyle. 
weight: The font weight of the textstyle. 

The sixth development release of PythonCAD stored the weight and style attributes as integer values, but it became clear quickly that doing so was needlessly confusing. With development release seven, these attributes are stored as text strings. It is so much simpler to understand an attribute when the value is a string like bold or italic than some number. Until the text handling features in PythonCAD become stable, it is very likely that changes of this nature between one release and another will be made.

Points

This tag stores the data for a 2-D point. The <Point> tag has the following attributes.

id: The unique identifier for the point. 
x: The x-coordinate of the point. 
y: The y-coordinate of the point. 

Segments

This tag stores the data for a 2-D line segment. The <Segment> tag has the following attributes:

id: The unique identifier for the segment. 
p1: The point id for the first endpoint of the segment. 
p2: The point id for the second endpoint of the segment. 
style: The style id for the segment. 

Any overridden style attribute will be present as an attribute in the segment.
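
Because a segment stores point ids rather than coordinates, reading one means looking up its endpoints first. The sketch below shows that lookup with ElementTree; segment_length is a hypothetical helper, and the points dictionary stands in for whatever lookup the layer provides.

```python
import math
import xml.etree.ElementTree as ET


def segment_length(segment, points):
    """Length of a <Segment>, given a dict mapping point id to (x, y)."""
    x1, y1 = points[segment.get("p1")]
    x2, y2 = points[segment.get("p2")]
    return math.hypot(x2 - x1, y2 - y1)


point_elements = [
    ET.fromstring('<Point id="1" x="0.0" y="0.0"/>'),
    ET.fromstring('<Point id="2" x="3.0" y="4.0"/>'),
]
# Attribute values are text, so the ids stay strings and coordinates are converted.
points = {p.get("id"): (float(p.get("x")), float(p.get("y")))
          for p in point_elements}
segment = ET.fromstring('<Segment id="1" p1="1" p2="2" style="1"/>')
print(segment_length(segment, points))
```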

Circles

A <Circle> tag has the following attributes:

id: The unique identifier for the circle. 
cp: The point id for the center point of the circle. 
r: The circle's radius. 
style: The style id for the circle. 

Any overridden style attribute will be present as an attribute in the circle.

Arcs

An <Arc> tag has the following attributes:

id: The unique identifier for the arc. 
cp: The point id for the center point of the arc. 
r: The arc's radius. 
sa: The starting angle of the arc. 
ea: The ending angle of the arc. 
style: The style id for the arc. 

The start angle and end angle of an arc are given in degrees, and will be a value between 0.0 and 360.0. All arcs are defined in a counter-clockwise manner, so an arc with a start angle of 0.0 and an end angle of 45.0 is a 45.0 degree arc. If the start angle is 45.0, and the end angle is 0.0, then the arc is a 315.0 degree arc.
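
The counter-clockwise rule amounts to a subtraction modulo 360. A minimal sketch, with arc_sweep as a hypothetical name:

```python
def arc_sweep(start_angle, end_angle):
    """Counter-clockwise sweep in degrees from start_angle to end_angle."""
    return (end_angle - start_angle) % 360.0


print(arc_sweep(0.0, 45.0))
print(arc_sweep(45.0, 0.0))
```

Equal start and end angles give 0.0 with this formula; how PythonCAD treats that case is not covered here.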

Any overridden style attribute will be present as an attribute in the arc.

Horizontal Construction Lines

A horizontal construction line represents an infinitely long horizontal line through some point. Construction lines will not be shown on printed output. An <HCLine> tag has the following attributes:

id: The unique identifier for the horizontal construction line. 
location: The point id for the point through which the line passes. 

The appearance of a construction line is not overrideable. All construction lines and circles have the same color, linetype, and thickness.

Vertical Construction Lines

A vertical construction line represents an infinitely long vertical line through some point. Construction lines will not be shown on printed output. A <VCLine> tag has the following attributes:

id: The unique identifier for the vertical construction line. 
location: The point id for the point through which the line passes. 

The appearance of a construction line is not overrideable. All construction lines and circles have the same color, linetype, and thickness.

Angled Construction Lines

An angled construction line represents an infinitely long line passing through some point at a user-defined angle. Construction lines will not be shown on printed output. An <ACLine> tag has the following attributes:

id: The unique identifier for the angled construction line. 
location: The point id for the point through which the line passes. 
angle: The angle at which the construction line lies. 

The angle for an angled construction line is defined in degrees, and will be a value between -90.0 and 90.0. An angled construction line having a value of -90.0 or 90.0 is the same as a vertical construction line, and one with an angle of 0.0 is essentially a horizontal construction line.

The appearance of a construction line is not overrideable. All construction lines and circles have the same color, linetype, and thickness.

Two-point Construction Lines

A two-point construction line represents an infinitely long line passing through two distinct points. Construction lines will not be shown on printed output. A <CLine> tag has the following attributes:

id: The unique identifier for the two-point construction line. 
p1: The point id of the first key-point of the construction line. 
p2: The point id of the second key-point of the construction line. 

The two points of the construction line must be different. If both points have equal x-coordinate values, then the two-point construction line is the same as a vertical construction line. Likewise, if both points have equal y-coordinate values, the construction line is the same as a horizontal construction line.
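
The equivalences described above can be expressed as a small classifier. This is a sketch only: classify_cline is a hypothetical name, and the angle it reports uses the same -90.0 to 90.0 range as the angled construction line.

```python
import math


def classify_cline(p1, p2):
    """Return (kind, angle in degrees) for a line through points p1 and p2."""
    x1, y1 = p1
    x2, y2 = p2
    if (x1, y1) == (x2, y2):
        raise ValueError("the two points must be different")
    if x1 == x2:
        return ("vertical", 90.0)
    if y1 == y2:
        return ("horizontal", 0.0)
    # atan of the slope always lies strictly between -90 and 90 degrees.
    return ("angled", math.degrees(math.atan((y2 - y1) / (x2 - x1))))


print(classify_cline((0.0, 0.0), (0.0, 5.0)))
print(classify_cline((0.0, 0.0), (5.0, 0.0)))
```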

The appearance of a construction line is not overrideable. All construction lines and circles have the same color, linetype, and thickness.

Construction Circles

A construction circle is essentially the same as a standard circle, except that, as a construction entity, it will not be shown on printed output. A <CCircle> tag has the following attributes:

id: The unique identifier for the construction circle. 
cp: The point id of the construction circle center point. 
r: The construction circle radius. 

The appearance of a construction circle is not overrideable. All construction lines and circles have the same color, linetype, and thickness.

Fillets

A fillet is a smooth, round joint between two segments. At the present, only segments can be connected with a fillet, though support for joining arcs is planned. A <Fillet> tag has the following attributes:

id: The unique identifier for the fillet. 
radius: The radius of the fillet. 
s1: The first segment the fillet connects to. 
s2: The second segment the fillet connects to. 
style: The style id for the fillet. 

Chamfers

A chamfer is a straight connection between two segments. Like a fillet, a chamfer can only join segments. This will probably not change in future releases. A <Chamfer> tag has the following attributes.

id: The unique identifier for the chamfer. 
length: The length of the chamfer. 
s1: The first segment the chamfer connects to.
s2: The second segment the chamfer connects to.
style: The style id for the chamfer.

The chamfer length is the distance subtracted from the original segment length at the endpoint where the chamfer is placed. It does not represent the distance between the two segment endpoints where the chamfer is drawn.
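
The difference between the chamfer length and the length of the chamfer line itself is easiest to see with numbers. The sketch below assumes the two segments share an endpoint at the corner being chamfered; chamfer_endpoints is a hypothetical helper, not part of PythonCAD.

```python
import math


def chamfer_endpoints(corner, far1, far2, length):
    """Return the chamfer endpoints for segments corner-far1 and corner-far2."""
    def trim(far):
        dx, dy = far[0] - corner[0], far[1] - corner[1]
        seg_len = math.hypot(dx, dy)
        if length >= seg_len:
            raise ValueError("chamfer length must be shorter than the segment")
        # Move from the corner toward the far endpoint by the chamfer length.
        return (corner[0] + dx * length / seg_len,
                corner[1] + dy * length / seg_len)
    return trim(far1), trim(far2)


# Two 10-unit segments meeting at a right angle, chamfer length 2.0.
e1, e2 = chamfer_endpoints((0.0, 0.0), (10.0, 0.0), (0.0, 10.0), 2.0)
gap = math.hypot(e2[0] - e1[0], e2[1] - e1[1])
print(e1, e2, gap)
```

Each segment is shortened by 2.0, but the chamfer line joining the new endpoints is longer than 2.0 at a right-angle corner.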

Leaders

This tag stores the data for a leader line. A leader line is typically used to act as a visual pointer in a drawing. The <Leader> tag has the following attributes:

id: The unique identifier for the leader. 
p1: The point id for the first endpoint of the leader. 
p2: The point id for the second endpoint of the leader. 
p3: The point id for the final endpoint of the leader. 
style: The style id for the leader. 
size: The size of the leader arrow pointer. 

Any overridden style attribute will be present as an attribute in the leader.

Polylines

This tag defines the storage of Polylines. A Polyline is essentially a connected set of segments. There is no limit to the length of the polyline. The Polyline object was added to PythonCAD in an effort to provide an entity similar to the polyline entities found in other CAD programs. As such, the fields comprising a Polyline may change to provide additional functionality or simpler conversion to/from the behavior of this type of object in other CAD programs. The <Polyline> tag has the following attributes:

id: The unique identifier for the polyline. 
points: A list of point references that define the vertices of the polyline. 
style: The style id for the polyline. 

The points attribute is a Python list, similar to that used to store linetype entities. The list must hold at least two references to point objects stored in the layer.

Any overridden style attribute will be present as an attribute in the polyline.

TextBlocks

The TextBlock tag stores the text strings in the file. Expect the TextBlock tag to change somewhat until the first official release as text handling within PythonCAD improves. The <TextBlock> tag currently has the following attributes:

id: The unique identifier for the textblock. 
tsid: A reference to the TextStyle used by the textblock. 
angle: The angle of the textblock. 
x: The x-coordinate where the textblock should be placed. 
y: The y-coordinate where the textblock should be placed. 

The angle attribute is currently ignored, as PythonCAD does not yet have the ability to display rotated text. There is also a current implicit assumption that all text is left justified, so the x and y attributes give the upper left-hand coordinate where the text begins. An attribute storing text justification will be added to provide for the storage of center or right justified text, and other attributes will be added as the handling of text in PythonCAD becomes more complete. As text in PythonCAD is stored as Unicode objects, attributes that indicate the text as right-to-left text or multibyte characters may be added as well.

The storage of the text itself is handled in child entities of a TextBlock entity. The initial layout of these child entities was arrived at after failing to resolve a problem in the Python minidom library with the storage of TextNodes. See the comments in the Generic/imageio.py for a short description of the problem. If there is a solution to the problem described in this note please send it to the PythonCAD developer.

Each line of text is stored in a <TextLine> entity. This entity is defined with a single attribute text that stores the text for the line. There should be a <TextLine> entity for every line of text in the textblock, even empty lines. Once the minidom problem is resolved, the text string will be stored as a TextNode child of the <TextLine>, not as an attribute.
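
The current layout, with one <TextLine> child per line and the text held in an attribute, can be sketched with ElementTree as follows. The helper names make_textblock and read_textblock are hypothetical, and the sketch omits any namespace prefix a real file may use.

```python
import xml.etree.ElementTree as ET


def make_textblock(block_id, tsid, x, y, text, angle=0.0):
    """Build a <TextBlock> with one <TextLine> child per line of text."""
    block = ET.Element("TextBlock", {
        "id": str(block_id),
        "tsid": str(tsid),
        "angle": str(angle),
        "x": str(x),
        "y": str(y),
    })
    for line in text.split("\n"):
        # Empty lines still get a <TextLine>, with an empty text attribute.
        ET.SubElement(block, "TextLine", {"text": line})
    return block


def read_textblock(block):
    """Rebuild the text of a <TextBlock> from its <TextLine> children."""
    return "\n".join(child.get("text", "") for child in block.findall("TextLine"))


block = make_textblock(1, 1, 10.0, 20.0, "First line\n\nThird line")
print(ET.tostring(block, encoding="unicode"))
```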

Linear Dimensions

A linear dimension measures the absolute distance between two separate points. The code for storing linear dimensions works, but the code for drawing them has not been completed, so these dimensional entities will not be displayed. An <LDim> tag has the following attributes:

id: The unique identifier for the linear dimension. 
ds: The dimension style identifier used by the linear dimension. 
l1: The layer identifier containing the first point of the linear dimension. 
p1: The point identifier for the first point of the linear dimension. 
l2: The layer identifier containing the second point of the linear dimension. 
p2: The point identifier for the second point of the linear dimension. 
x: The x-coordinate around which the dimension text should be centered. 
y: The y-coordinate around which the dimension text should be centered. 

Please refer to the brief discussion about dimension objects in the Drawing Properties section above.

Horizontal Dimensions

A horizontal dimension measures the horizontal distance between two separate points. The code for storing horizontal dimensions works, as well as the code for drawing them. The code is not very robust, however, so it is quite easy to create a horizontal dimension that is drawn oddly. Future releases will address these shortcomings. An <HDim> tag has the following attributes:

id: The unique identifier for the horizontal dimension. 
ds: The dimension style identifier used by the horizontal dimension. 
l1: The layer identifier containing the first point of the horizontal dimension. 
p1: The point identifier for the first point of the horizontal dimension. 
l2: The layer identifier containing the second point of the horizontal dimension. 
p2: The point identifier for the second point of the horizontal dimension. 
x: The x-coordinate around which the dimension text should be centered. 
y: The y-coordinate around which the dimension text should be centered. 

Please refer to the brief discussion about dimension objects in the Drawing Properties section above.

Vertical Dimensions

A vertical dimension measures the vertical distance between two separate points. The code for storing vertical dimensions works, as well as the code for drawing them. The code is not very robust, however, so it is quite easy to create a vertical dimension that is drawn oddly. Future releases will address these shortcomings. A <VDim> tag has the following attributes:

id: The unique identifier for the vertical dimension. 
ds: The dimension style identifier used by the vertical dimension. 
l1: The layer identifier containing the first point of the vertical dimension. 
p1: The point identifier for the first point of the vertical dimension. 
l2: The layer identifier containing the second point of the vertical dimension. 
p2: The point identifier for the second point of the vertical dimension. 
x: The x-coordinate around which the dimension text should be centered. 
y: The y-coordinate around which the dimension text should be centered. 

Please refer to the brief discussion about dimension objects in the Drawing Properties section above.
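
The l1/p1 and l2/p2 attribute pairs used by the <LDim>, <HDim> and <VDim> tags exist because a point id is only unique within its layer. The sketch below shows the lookup and the three measurements; the function name measure and the nested-dictionary layout are assumptions made for illustration.

```python
import math


def measure(points_by_layer, l1, p1, l2, p2):
    """Return (linear, horizontal, vertical) distances between two points.

    points_by_layer maps layer id -> {point id -> (x, y)}.
    """
    x1, y1 = points_by_layer[l1][p1]
    x2, y2 = points_by_layer[l2][p2]
    dx, dy = abs(x2 - x1), abs(y2 - y1)
    return math.hypot(dx, dy), dx, dy


# Both layers hold a point with id 1; the layer id tells them apart.
points_by_layer = {
    1: {1: (0.0, 0.0)},
    2: {1: (3.0, 4.0)},
}
print(measure(points_by_layer, 1, 1, 2, 1))
```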

Radial Dimensions

A radial dimension measures only circles and arcs. The dimension can be set to measure diameters as well. The code to create and display radial dimensions is in place, though the drawing code is in the earliest stages. An <RDim> tag has the following attributes:

id: The unique identifier for the radial dimension. 
ds: The dimension style identifier used by the radial dimension. 
l: The layer identifier containing the measured circle or arc 
c: The measured circle or arc identifier. 
x: The x-coordinate around which the dimension text should be centered. 
y: The y-coordinate around which the dimension text should be centered. 

Please refer to the brief discussion about dimension objects in the Drawing Properties section above.

Angular Dimensions

An angular dimension measures some angle defined by three distinct points. If measuring an arc, the vertex point is the center point of the arc, and the other two points are the arc endpoints. There is essentially no code written for the drawing of an angular dimension, and only the barest code written for the creation of one. For the first release, and probably for the next few development releases, angular dimensions will not work. An <ADim> tag has the following attributes:

id: The unique identifier for the angular dimension. 
ds: The dimension style identifier used by the angular dimension. 
l1: The layer identifier containing the vertex point of the angular dimension.
p1: The point identifier for the vertex point of the angular dimension. 
l2: The layer identifier containing the first endpoint of the angular dimension. 
p2: The point identifier for the first endpoint of the angular dimension. 
l3: The layer identifier containing the second endpoint of the angular dimension. 
p3: The point identifier for the second endpoint of the angular dimension. 
x: The x-coordinate around which the dimension text should be centered. 
y: The y-coordinate around which the dimension text should be centered. 

Please refer to the brief discussion about dimension objects in the Drawing Properties section above.

Layers

Layers are the containers for all the drawing entities. A drawing consists of at least one layer, with an unlimited number of children layers below the top-most parent layer. Except for the top layer, any layer can be moved from one parent to another. All layers can be hidden or displayed, and any layer with children layers can hide all its children, as well as display them all. The organization of the layers in a drawing is shown on the left-hand side of the interface. Presently this display allows for easy layer renaming, creation of children layers, and the hiding or displaying of each layer. Re-parenting a layer via drag-and-drop does not work yet. A <Layer> tag has the following attributes:

id: The unique identifier for the layer. 
name: The name of the layer. 
scale: The scale factor to use for dimensioning entities contained in that layer.

Every layer in a drawing can have the same name, though that would be incredibly confusing. The scale factor is not used yet, as the code for dimensioning is in such an early state. It is believed that the ability to set different scale factors for different layers is a desirable feature, though this may end up not being the case.

When a layer is saved, the entities stored in that layer are saved as children tags of the layer. More correctly they are grand-children, as each entity type is stored below a tag indicating what sort of things are being grouped together. So, a set of <Point> objects will be stored as children of a <Points> tag, <Segment> tags will be within <Segments>, etc.
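
The grandchild arrangement can be read back with two nested loops. The sample layer below is made up for illustration, the helper name entities_by_type is hypothetical, and namespace prefixes are omitted.

```python
import xml.etree.ElementTree as ET

SAMPLE_LAYER = """
<Layer id="1" name="outline" scale="1.0">
  <Points>
    <Point id="1" x="0.0" y="0.0"/>
    <Point id="2" x="5.0" y="0.0"/>
  </Points>
  <Segments>
    <Segment id="1" p1="1" p2="2" style="1"/>
  </Segments>
</Layer>
"""


def entities_by_type(layer):
    """Map each entity tag name to the list of entity elements in a layer."""
    found = {}
    for group in layer:        # grouping tags such as <Points>, <Segments>
        for entity in group:   # the entities themselves
            found.setdefault(entity.tag, []).append(entity)
    return found


layer = ET.fromstring(SAMPLE_LAYER)
entities = entities_by_type(layer)
print(sorted(entities))
```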

Missing Entities

A quick glance over the above defined entities will make the astute reader ask about hatching and other entities that any serious CAD package should offer. The short answer is these features are not available yet. The first development release of the program concentrated on getting the basic drawing code working, providing a few dimensioning abilities, and the initial implementation of a user interface. Each following release has added new entities and improved the functionality of the existing entities, and future releases will hopefully continue the trend. PythonCAD will not have an official release until, at the very least, hatching and printing functions are implemented.


file_format_reference.txt · Last modified: 2010/03/16 15:44 (external edit)