3.4. Multidimensional table cells and automatic sanity checks

Now it's time for a more real-life example (i.e. with errors in the code). We will create two groups that branch directly from the root node, Particles and Events. Then, we will put three tables in each group. In Particles we will put tables based on the Particle descriptor, and in Events, tables based on the Event descriptor.

Afterwards, we will fill the tables with a number of records. Finally, we will read the newly-created table /Events/TEvent3 and select some values from it, using a list comprehension.

Look at the next script (you can find it in examples/tutorial2.py). It appears to do all of the above, but it contains some small bugs. Note that this Particle class is not directly related to the one defined in the previous tutorial; this class is simpler (note, however, the multidimensional columns called pressure and temperature).

We also introduce a new way to describe a Table: as a dictionary, as you can see in the Event description. See section 4.2.2 about the different kinds of descriptor objects that can be passed to the createTable() method.


from numarray import *
from tables import *

# Describe a particle record
class Particle(IsDescription):
    name        = StringCol(length=16) # 16-character String
    lati        = IntCol()             # integer
    longi       = IntCol()             # integer
    pressure    = Float32Col(shape=(2,3)) # array of floats (single-precision)
    temperature = FloatCol(shape=(2,3))   # array of doubles (double-precision)

# Another way to describe the columns of a table
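Event = {
    "name"        : StringCol(length=16), # 16-character String
    "TDCcount"    : UInt8Col(),           # unsigned byte
    "ADCcount"    : UInt16Col(),          # unsigned short integer
    "xcoord"      : Float32Col(),         # float (single-precision)
    "ycoord"      : Float32Col(),         # float (single-precision)
    }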

# Open a file in "w"rite mode
fileh = openFile("tutorial2.h5", mode = "w")
# Get the HDF5 root group
root = fileh.root
# Create the groups:
for groupname in ("Particles", "Events"):
    group = fileh.createGroup(root, groupname)
# Now, create and fill the tables in the Particles group
gparticles = root.Particles
# Create 3 new tables
for tablename in ("TParticle1", "TParticle2", "TParticle3"):
    # Create a table
    table = fileh.createTable("/Particles", tablename, Particle,
                           "Particles: "+tablename)
    # Get the record object associated with the table:
    particle = table.row
    # Fill the table with data for 257 particles
    for i in xrange(257):
        # First, assign the values to the Particle record
        particle['name'] = 'Particle: %6d' % (i)
        particle['lati'] = i
        particle['longi'] = 10 - i
        ########### Detectable errors start here. Play with them!
        particle['pressure'] = array(i*arange(2*3), shape=(2,4))  # Incorrect
        #particle['pressure'] = array(i*arange(2*3), shape=(2,3))  # Correct
        ########### End of errors
        particle['temperature'] = (i**2)     # Broadcasting
        # This injects the Record values
        particle.append()
    # Flush the table buffers
    table.flush()

# Now Events:
for tablename in ("TEvent1", "TEvent2", "TEvent3"):
    # Create a table in the Events group
    table = fileh.createTable(root.Events, tablename, Event,
                           "Events: "+tablename)
    # Get the record object associated with the table:
    event = table.row
    # Fill the table with data on 257 events
    for i in xrange(257):
        # First, assign the values to the Event record
        event['name']  = 'Event: %6d' % (i)
        event['TDCcount'] = i % (1<<8)   # Correct range
        ########### Detectable errors start here. Play with them!
        #event['xcoord'] = float(i**2)   # Correct spelling
        event['xcoor'] = float(i**2)     # Wrong spelling
        event['ADCcount'] = i * 2        # Correct type
        #event['ADCcount'] = "sss"          # Wrong type
        ########### End of errors
        event['ycoord'] = float(i)**4
        # This injects the Record values
        event.append()

    # Flush the buffers
    table.flush()

# Read the records from table "/Events/TEvent3" and select some values
table = root.Events.TEvent3
e = [ p['TDCcount'] for p in table
      if p['ADCcount'] < 20 and 4 <= p['TDCcount'] < 15 ]
print "Last record ==>", p
print "Selected values ==>", e
print "Total selected records ==> ", len(e)
# Finally, close the file (this also will flush all the remaining buffers)
fileh.close()
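
The selection at the end of the script can be previewed without PyTables. The following is a minimal sketch, written for a current Python 3 interpreter rather than the Python 2 used by the script; it assumes the marked errors have been corrected and uses plain dictionaries (the name rows is purely illustrative) in place of the rows of /Events/TEvent3.

```python
# Stand-in for the rows of /Events/TEvent3 once the script's bugs are fixed:
# plain dictionaries carrying only the two fields used by the selection.
# (Assumption: these are not PyTables row objects, just illustrative data.)
rows = [{"TDCcount": i % (1 << 8), "ADCcount": i * 2} for i in range(257)]

# Same condition as in the script's list comprehension.
selected = [r["TDCcount"] for r in rows
            if r["ADCcount"] < 20 and 4 <= r["TDCcount"] < 15]

print("Selected values ==>", selected)
print("Total selected records ==>", len(selected))
```

Only the records with i from 4 to 9 satisfy both conditions, so six values are selected.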
	

3.4.1. Shape checking

If you look at the code carefully, you'll see that it won't work. You will get the following error:


$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 53, in ?
    particle['pressure'] = array(i*arange(2*3), shape=(2,4))  # Incorrect
  File  "/usr/local/lib/python2.2/site-packages/numarray/numarraycore.py",
 line 281, in array
  a.setshape(shape)
  File "/usr/local/lib/python2.2/site-packages/numarray/generic.py",
 line 530, in setshape
    raise ValueError("New shape is not consistent with the old shape")
ValueError: New shape is not consistent with the old shape
	

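This error indicates a shape problem with the array that is being assigned to a table cell. Looking at the source, we see that we were trying to assign an array of shape (2,4) to a pressure element, which was defined with the shape (2,3). In this particular case the traceback shows that numarray itself raises the exception while building the array, because the six elements produced by arange(2*3) cannot be arranged in a (2,4) shape.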

In general, these kinds of operations are forbidden, with one valid exception: when you assign a scalar value to a multidimensional column cell, all the cell elements are populated with the value of the scalar. For example:


        particle['temperature'] = (i**2)    # Broadcasting
	  

The value i**2 is assigned to all the elements of the temperature table cell. This capability is provided by the numarray package and is known as broadcasting.
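
The two rules just described (reject a mismatched shape, broadcast a scalar) can be sketched in plain Python. This is only an illustration under stated assumptions: the helpers shape_of, fill and assign_cell are hypothetical names that belong neither to numarray nor to PyTables, nested lists stand in for arrays, and the code targets a current Python 3 interpreter.

```python
def shape_of(value):
    """Return the shape of a rectangular, non-empty nested list.

    A scalar (anything that is not a list) has shape ().
    """
    if not isinstance(value, list):
        return ()
    return (len(value),) + shape_of(value[0])


def fill(shape, scalar):
    """Build a nested list of the given shape with every element set to scalar."""
    if not shape:
        return scalar
    return [fill(shape[1:], scalar) for _ in range(shape[0])]


def assign_cell(cell_shape, value):
    """Hypothetical helper: return what would be stored in a cell of cell_shape."""
    if shape_of(value) == ():
        return fill(cell_shape, value)       # scalar: broadcast to every element
    if shape_of(value) != cell_shape:
        raise ValueError("shape %r does not match cell shape %r"
                         % (shape_of(value), cell_shape))
    return value


print(assign_cell((2, 3), 9))
# prints [[9, 9, 9], [9, 9, 9]]

try:
    assign_cell((2, 3), [[0, 1, 2, 3], [4, 5, 6, 7]])   # shape (2, 4)
except ValueError as exc:
    print("rejected:", exc)
```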

3.4.2. Field name checking

After fixing the previous error and rerunning the program, we encounter another error:


$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 74, in ?
    event['xcoor'] = float(i**2)     # Wrong spelling
  File "src/hdf5Extension.pyx",
 line 1812, in hdf5Extension.Row.__setitem__
    raise KeyError, "Error setting \"%s\" field.\n %s" % \
KeyError: Error setting "xcoor" field.
 Error was: "exceptions.KeyError: xcoor"
	  

This error indicates that we are attempting to assign a value to a non-existent field in the event record object. By looking carefully at the keys of the Event description, we see that we misspelled the xcoord field (we wrote xcoor instead). This is unusual behavior for Python: normally, when you assign a value to a non-existent instance variable, Python creates a new variable with that name. Such a feature can be dangerous when dealing with an object that contains a fixed list of field names. PyTables checks that the field exists and raises a KeyError if the check fails.
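
The contrast between ordinary Python assignment and a record with a fixed field list can be sketched as follows. Both classes are hypothetical stand-ins written for this illustration (they are not PyTables classes and say nothing about how PyTables implements the check), and the code targets a current Python 3 interpreter.

```python
class Plain(object):
    """An ordinary Python object: assignment creates attributes on demand."""


class StrictRow(object):
    """Hypothetical stand-in for a record with a fixed set of field names."""

    def __init__(self, fieldnames):
        self._values = dict.fromkeys(fieldnames)

    def __setitem__(self, name, value):
        if name not in self._values:
            raise KeyError(name)             # unknown field: refuse the assignment
        self._values[name] = value

    def __getitem__(self, name):
        return self._values[name]


plain = Plain()
plain.xcoor = 4.0                  # typo goes unnoticed: a new attribute appears
print(hasattr(plain, "xcoord"))    # prints False

row = StrictRow(["name", "TDCcount", "ADCcount", "xcoord", "ycoord"])
row["xcoord"] = 4.0                # correct spelling is accepted
try:
    row["xcoor"] = 4.0             # misspelled field is rejected
except KeyError as exc:
    print("unknown field:", exc)
```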

3.4.3. Data type checking

Finally, in order to test type checking, we will change the following line:


        event['ADCcount'] = i * 2        # Correct type


to read:


        event['ADCcount'] = "sss"          # Wrong type


This modification will cause the following exception to be raised when the script is executed. Note that the record object reports it as a KeyError whose message carries the underlying TypeError:


$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 76, in ?
    event['ADCcount'] = "sss"          # Wrong type
  File "src/hdf5Extension.pyx",
 line 1812, in hdf5Extension.Row.__setitem__
    raise KeyError, "Error setting \"%s\" field.\n %s" % \
KeyError: Error setting "ADCcount" field.
 Error was: "exceptions.TypeError: NA_setFromPythonScalar: bad value type."
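
Because both the misspelled field and the wrong value type surface as a KeyError in the tracebacks above, code that wants to tell them apart has to look at the message. The sketch below imitates that reporting style with a hypothetical TypedRow class (an assumption made for illustration, not the PyTables implementation) and collects the messages in a list named messages; it targets a current Python 3 interpreter.

```python
class TypedRow(object):
    """Hypothetical stand-in for a record whose fields have fixed Python types."""

    def __init__(self, fieldtypes):
        self._types = dict(fieldtypes)
        self._values = {}

    def __setitem__(self, name, value):
        try:
            expected = self._types[name]            # KeyError if unknown field
            if not isinstance(value, expected):
                raise TypeError("bad value type for %r" % (value,))
            self._values[name] = value
        except (KeyError, TypeError) as exc:
            # Report both failures the same way: a KeyError whose message
            # names the field and the original exception.
            raise KeyError('Error setting "%s" field.\n Error was: "%s: %s"'
                           % (name, type(exc).__name__, exc))


event = TypedRow({"ADCcount": int, "xcoord": float})
event["ADCcount"] = 4                               # accepted

messages = []
for field, value in (("ADCcount", "sss"), ("xcoor", 1.0)):
    try:
        event[field] = value
    except KeyError as exc:
        messages.append(exc.args[0])                # keep the full message

for message in messages:
    print(message)
```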
	  

You can see the structure created with this (corrected) script in figure 3.4. In particular, note the multidimensional column cells in table /Particles/TParticle2.

Figure 3.4. Table hierarchy for tutorial 2.