Now it's time for a more realistic example (that is, one with errors in the code). We will create two groups that branch directly from the root node, Particles and Events. Then, we will put three tables in each group: in Particles, tables based on the Particle descriptor, and in Events, tables based on the Event descriptor.
Afterwards, we will fill the tables with a number of records. Finally, we will read the newly created table /Events/TEvent3 and select some values from it using a list comprehension.
Look at the next script (you can find it in examples/tutorial2.py). It appears to do all of the above, but it contains some small bugs. Note that this Particle class is not directly related to the one defined in the last tutorial; this class is simpler (note, however, the multidimensional columns called pressure and temperature).
We also introduce a new way to describe a Table, namely as a dictionary, as you can see in the Event description. See section 4.2.2 about the different kinds of descriptor objects that can be passed to the createTable() method.
from numarray import *
from tables import *

# Describe a particle record
class Particle(IsDescription):
    name        = StringCol(length=16)    # 16-character String
    lati        = IntCol()                # integer
    longi       = IntCol()                # integer
    pressure    = Float32Col(shape=(2,3)) # array of floats (single-precision)
    temperature = FloatCol(shape=(2,3))   # array of doubles (double-precision)

# Another way to describe the columns of a table
Event = {
    "name"     : StringCol(length=16),
    "TDCcount" : UInt8Col(),
    "ADCcount" : UInt16Col(),
    "xcoord"   : Float32Col(),
    "ycoord"   : Float32Col(),
    }

# Open a file in "w"rite mode
fileh = openFile("tutorial2.h5", mode = "w")
# Get the HDF5 root group
root = fileh.root

# Create the groups:
for groupname in ("Particles", "Events"):
    group = fileh.createGroup(root, groupname)

# Now, create and fill the tables in the Particles group
gparticles = root.Particles
# Create 3 new tables
for tablename in ("TParticle1", "TParticle2", "TParticle3"):
    # Create a table
    table = fileh.createTable("/Particles", tablename, Particle,
                              "Particles: "+tablename)
    # Get the record object associated with the table:
    particle = table.row
    # Fill the table with data for 257 particles
    for i in xrange(257):
        # First, assign the values to the Particle record
        particle['name'] = 'Particle: %6d' % (i)
        particle['lati'] = i
        particle['longi'] = 10 - i
        ########### Detectable errors start here. Play with them!
        particle['pressure'] = array(i*arange(2*3), shape=(2,4))  # Incorrect
        #particle['pressure'] = array(i*arange(2*3), shape=(2,3))  # Correct
        ########### End of errors
        particle['temperature'] = (i**2)    # Broadcasting
        # This injects the Record values
        particle.append()
    # Flush the table buffers
    table.flush()

# Now Events:
for tablename in ("TEvent1", "TEvent2", "TEvent3"):
    # Create a table in the Events group
    table = fileh.createTable(root.Events, tablename, Event,
                              "Events: "+tablename)
    # Get the record object associated with the table:
    event = table.row
    # Fill the table with data on 257 events
    for i in xrange(257):
        # First, assign the values to the Event record
        event['name'] = 'Event: %6d' % (i)
        event['TDCcount'] = i % (1<<8)   # Correct range
        ########### Detectable errors start here. Play with them!
        #event['xcoord'] = float(i**2)   # Correct spelling
        event['xcoor'] = float(i**2)     # Wrong spelling
        event['ADCcount'] = i * 2        # Correct type
        #event['ADCcount'] = "sss"       # Wrong type
        ########### End of errors
        event['ycoord'] = float(i)**4
        # This injects the Record values
        event.append()
    # Flush the buffers
    table.flush()

# Read the records from table "/Events/TEvent3" and select some
table = root.Events.TEvent3
e = [ p['TDCcount'] for p in table
      if p['ADCcount'] < 20 and 4 <= p['TDCcount'] < 15 ]
print "Last record ==>", p
print "Selected values ==>", e
print "Total selected records ==> ", len(e)

# Finally, close the file (this also will flush all the remaining buffers)
fileh.close()
If you look at the code carefully, you'll see that it won't work. You will get the following error:
$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 53, in ?
    particle['pressure'] = array(i*arange(2*3), shape=(2,4))  # Incorrect
  File "/usr/local/lib/python2.2/site-packages/numarray/numarraycore.py", line 281, in array
    a.setshape(shape)
  File "/usr/local/lib/python2.2/site-packages/numarray/generic.py", line 530, in setshape
    raise ValueError("New shape is not consistent with the old shape")
ValueError: New shape is not consistent with the old shape
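This error indicates that you are trying to assign an array with an incompatible shape to a table cell. Looking at the source, we see that we were trying to assign an array of shape (2,4) to a pressure element, which was defined with the shape (2,3). As the traceback shows, the exception is actually raised by numarray while it builds the array, because the six values produced by arange(2*3) cannot fill a (2,4) shape.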
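The shape mismatch can be reproduced outside of PyTables. The following is a minimal sketch that uses NumPy as a stand-in for numarray (an assumption; the script itself uses numarray, and the name CELL_SHAPE is introduced here only for illustration):

```python
import numpy as np

# Cell shape declared for the 'pressure' column in the Particle description.
CELL_SHAPE = (2, 3)

values = 5 * np.arange(2 * 3)   # six elements, as in i*arange(2*3)

# Six elements fit the declared (2, 3) cell.
good = values.reshape(CELL_SHAPE)

# They cannot be rearranged into (2, 4), which needs eight elements.
try:
    values.reshape((2, 4))
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("rejected:", exc)
```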
In general, these kinds of operations are forbidden, with one valid exception: when you assign a scalar value to a multidimensional column cell, all the cell elements are populated with the value of the scalar. For example:
particle['temperature'] = (i**2) # Broadcasting
The value i**2 is assigned to all the elements of the temperature table cell. This capability is provided by the numarray package and is known as broadcasting.
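Broadcasting of a scalar into a multidimensional cell can be sketched as follows. NumPy is used here in place of numarray (an assumption), and the array named cell only imitates a (2,3) temperature cell; it is not a PyTables object:

```python
import numpy as np

i = 7

# A (2, 3) array of doubles, imitating one 'temperature' cell.
cell = np.zeros((2, 3), dtype=np.float64)

# Assigning a scalar fills every element of the cell with that value.
cell[...] = i ** 2

print(cell)
```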
After fixing the previous error and rerunning the program, we encounter another error:
$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 74, in ?
    event['xcoor'] = float(i**2)     # Wrong spelling
  File "src/hdf5Extension.pyx", line 1812, in hdf5Extension.Row.__setitem__
    raise KeyError, "Error setting \"%s\" field.\n %s" % \
KeyError: Error setting "xcoor" field.
 Error was: "exceptions.KeyError: xcoor"
This error indicates that we are attempting to assign a value to a non-existent field in the event table object. By looking carefully at the keys of the Event description, we see that we misspelled the xcoord field (we wrote xcoor instead). This is unusual behavior for Python, as normally when you assign a value to a non-existent instance variable, Python creates a new variable with that name. Such a feature can be dangerous when dealing with an object that contains a fixed list of field names. PyTables checks that the field exists and raises a KeyError if the check fails.
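The contrast between ordinary attribute assignment and a checked, fixed set of fields can be illustrated with plain Python. This is only a sketch: PlainRecord and FixedFieldRow are hypothetical classes written for this example and are not how PyTables implements its row object.

```python
class PlainRecord(object):
    """Ordinary Python object: any attribute name is accepted."""
    pass


class FixedFieldRow(object):
    """Hypothetical stand-in for a row with a fixed set of field names."""

    def __init__(self, field_names):
        self._values = dict.fromkeys(field_names)

    def __setitem__(self, name, value):
        # Reject names that are not part of the description.
        if name not in self._values:
            raise KeyError('Error setting "%s" field.' % name)
        self._values[name] = value

    def __getitem__(self, name):
        return self._values[name]


plain = PlainRecord()
plain.xcoor = 4.0   # the typo silently creates a new attribute

row = FixedFieldRow(["name", "TDCcount", "ADCcount", "xcoord", "ycoord"])
row["xcoord"] = 4.0     # known field: accepted
try:
    row["xcoor"] = 4.0  # misspelled field: rejected
    typo_caught = False
except KeyError:
    typo_caught = True
```

With the plain object the misspelling goes unnoticed, while the checked container reports it at the point of assignment.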
Finally, in order to test type checking, we will change the following line:
event['ADCcount'] = i * 2 # Correct type
to read:
event['ADCcount'] = "sss" # Wrong type
This modification will cause the following exception to be raised when the script is executed. Note that it is reported as a KeyError whose message wraps the underlying TypeError:
$ python tutorial2.py
Traceback (most recent call last):
  File "tutorial2.py", line 76, in ?
    event['ADCcount'] = "sss"          # Wrong type
  File "src/hdf5Extension.pyx", line 1812, in hdf5Extension.Row.__setitem__
    raise KeyError, "Error setting \"%s\" field.\n %s" % \
KeyError: Error setting "ADCcount" field.
 Error was: "exceptions.TypeError: NA_setFromPythonScalar: bad value type."
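A typed array rejects a value of the wrong type in a similar way. The sketch below uses NumPy instead of numarray (an assumption), and the one-element array adc merely imitates an unsigned 16-bit ADCcount field. The exception class may differ from the one numarray reports above, so both TypeError and ValueError are caught:

```python
import numpy as np

# One unsigned 16-bit slot, imitating the 'ADCcount' field.
adc = np.zeros(1, dtype=np.uint16)

adc[0] = 3 * 2          # an integer is accepted

try:
    adc[0] = "sss"      # a non-numeric string is rejected
    rejected = False
except (TypeError, ValueError) as exc:
    rejected = True
    print("rejected:", type(exc).__name__)
```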
You can see the structure created with this (corrected) script in figure 3.4. In particular, note the multidimensional column cells in table /Particles/TParticle2.
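The selection step at the end of the script can also be tried without an HDF5 file. In this sketch, a list of dictionaries (the hypothetical name records) stands in for the rows of /Events/TEvent3, filled with the same TDCcount and ADCcount values that the corrected script assigns:

```python
# Stand-in rows carrying the values assigned in the corrected script.
records = [
    {"TDCcount": i % (1 << 8), "ADCcount": i * 2}
    for i in range(257)
]

# Same condition as the list comprehension in the script.
selected = [
    r["TDCcount"]
    for r in records
    if r["ADCcount"] < 20 and 4 <= r["TDCcount"] < 15
]

print("Selected values ==>", selected)
print("Total selected records ==>", len(selected))
```

Under these assumptions the comprehension keeps the TDCcount values 4 through 9, that is, six records.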