From brideout at haystack.mit.edu Mon Apr 4 14:17:34 2005
From: brideout at haystack.mit.edu (William Rideout)
Date: Mon Oct 23 17:53:09 2006
Subject: [OpenMadrigal-developers] Creating/editting Cedar files with python
Message-ID: <425184BE.6050207@haystack.mit.edu>

As the final feature of Madrigal 2.4, I plan to add a class that will
allow easy creation and editing of Cedar files from python. This python
class is meant to make working with Cedar files easier by hiding the
following details of the Cedar format from the user:

1. The underlying use of 16-bit integers as a storage format, along
   with the underlying use of "additional increment" parameters. Users
   will store and retrieve all data as doubles, and the python API will
   be responsible for converting to 16-bit integers, and using any
   "additional increment" parameters if they exist. Trying to input a
   value that overflows the limited dynamic range set by the Cedar
   format will raise an exception.

2. All parameters may be referred to through either Cedar mnemonics or
   Cedar codes.

3. Ordering of parameters (and other details) within a Cedar file will
   be hidden from users.

4. Warnings will be raised if a user tries to include any time
   parameters in a Cedar record that directly conflict with prolog time
   parameters. While strictly legal according to the Cedar standard,
   including time information in two independent fields leads to an
   "Alice in Wonderland" data format of "data means just what I intend
   it to mean", and should be avoided.

Here's my suggested API:

***********************************

MadrigalCedarFile(fullFilename, createFlag=False)

The class initializer takes a fullFilename as argument. This is either
the existing Cedar file (in any allowed Cedar format), or a file to be
created. The second argument, createFlag, tells whether this is a file
to be created. If False and fullFilename cannot be read, an error is
raised.
If True and fullFilename already exists, or fullFilename cannot be
created, an error is raised.

If createFlag == False, then the initializer reads the entire Cedar file
into memory, and creates a list of MadrigalCatalogRecords,
MadrigalHeaderRecords, and MadrigalDataRecords (described below).

The MadrigalCedarFile will be derived from the python list class, so its
public methods are exactly those of a python list. The only limitation
will be the natural one - any object added to the list must be either a
MadrigalCatalogRecord, a MadrigalHeaderRecord, or a MadrigalDataRecord.

MadrigalCedarFile will have one additional public method:

    write(format="Madrigal", newFilename=None)

which will persist the object to file in a Cedar format. The default
format is Madrigal, but also allowed will be "BlockedBinary",
"UnblockedBinary", "Cbf", and "Ascii". The default newFilename is None,
which means write to the file originally opened or created; if
newFilename is given, write to newFilename instead.

***********************************

MadrigalCatalogRecord(kinst, modexp, startTimestamp, endTimestamp)

The MadrigalCatalogRecord initializer takes the following arguments:

    kinst - the kind of instrument code
    modexp - the mode of the experiment identifier
    startTimestamp - start of experiment in seconds since 1/1/1970
                     (int or double)
    endTimestamp - end of experiment in seconds since 1/1/1970
                   (int or double)

A warning will be raised if kinst is not an instrument listed in
instTab.txt. An exception will be raised if
startTimestamp > endTimestamp.

MadrigalCatalogRecord has the following public methods/attributes:

    getKinst()
    getModexp()
    getStartTimestamp()
    getEndTimestamp()
    lines

lines is a list of lines of ascii text, each 80 characters or less.
This list may be changed to modify or create a MadrigalCatalogRecord.
The only limitation is that each new line added must be ascii text of
80 characters or less.
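
To make the two ideas above concrete, here is a minimal sketch of a
list-derived container that only accepts record objects, together with a
catalog record that checks its timestamps. It is an illustration under
stated assumptions, not the planned implementation: the class names
CatalogRecordSketch and RecordListSketch, the knownKinsts argument (a
stand-in for the instTab.txt lookup), and the checkLines() helper are
all hypothetical, and file reading and write() are left out.

```python
import warnings


class CatalogRecordSketch:
    """Illustrative stand-in for the proposed MadrigalCatalogRecord."""

    def __init__(self, kinst, modexp, startTimestamp, endTimestamp,
                 knownKinsts=None):
        if startTimestamp > endTimestamp:
            raise ValueError("startTimestamp is later than endTimestamp")
        # Assumption: knownKinsts stands in for a lookup in instTab.txt,
        # whose format is not described in this message.
        if knownKinsts is not None and kinst not in knownKinsts:
            warnings.warn("kinst %s is not a known instrument" % kinst)
        self._kinst = kinst
        self._modexp = modexp
        self._start = startTimestamp
        self._end = endTimestamp
        self.lines = []  # ascii text, 80 characters or less per line

    def getKinst(self):
        return self._kinst

    def getModexp(self):
        return self._modexp

    def getStartTimestamp(self):
        return self._start

    def getEndTimestamp(self):
        return self._end

    def checkLines(self):
        """Hypothetical helper: validate the user-editable lines list."""
        for line in self.lines:
            if len(line) > 80:
                raise ValueError("line longer than 80 characters")
            line.encode("ascii")  # raises a ValueError subclass if not ascii


class RecordListSketch(list):
    """Illustrative stand-in for the list-derived MadrigalCedarFile."""

    # Header and data record classes would be added to this tuple.
    allowedTypes = (CatalogRecordSketch,)

    def _check(self, item):
        if not isinstance(item, self.allowedTypes):
            raise TypeError("only Cedar record objects may be stored")
        return item

    def append(self, item):
        list.append(self, self._check(item))

    def insert(self, index, item):
        list.insert(self, index, self._check(item))

    def extend(self, items):
        list.extend(self, [self._check(i) for i in items])

    def __setitem__(self, index, item):
        if isinstance(index, slice):
            item = [self._check(i) for i in item]
        else:
            self._check(item)
        list.__setitem__(self, index, item)

    # Other mutators (for example +=) would need the same treatment.


# Usage with arbitrary example values:
cedarFile = RecordListSketch()
cedarFile.append(CatalogRecordSketch(31, 1, 0, 3600))
```

Overriding each mutating method keeps the "exactly those of a python
list" interface while enforcing the one stated restriction on what may
be stored.
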
***********************************

MadrigalHeaderRecord(kinst, kindat, startTimestamp, endTimestamp, jpar, mpar)

The MadrigalHeaderRecord initializer takes the following arguments:

    kinst - the kind of instrument code
    kindat - the kind of data code
    startTimestamp - start of experiment in seconds since 1/1/1970
                     (int or double)
    endTimestamp - end of experiment in seconds since 1/1/1970
                   (int or double)
    jpar - number of 1D parameters in following records
    mpar - number of 2D parameters in following records

A warning will be raised if kinst is not an instrument listed in
instTab.txt. An exception will be raised if
startTimestamp > endTimestamp.

MadrigalHeaderRecord has the following public methods/attributes:

    getKinst()
    getKindat()
    getStartTimestamp()
    getEndTimestamp()
    getJpar()
    getMpar()
    lines

lines is a list of lines of ascii text, each 80 characters or less.
This list may be changed to modify or create a MadrigalHeaderRecord.
The only limitation is that each new line added must be ascii text of
80 characters or less.

***********************************

MadrigalDataRecord(kinst, kindat, startTimestamp, endTimestamp,
                   oneDList, twoDList, nrow)

The MadrigalDataRecord initializer takes the following arguments:

    kinst - the kind of instrument code
    kindat - the kind of data code
    startTimestamp - start of record in seconds since 1/1/1970
                     (int or double)
    endTimestamp - end of record in seconds since 1/1/1970
                   (int or double)
    oneDList - list of one-dimensional parameters in record
               (mnemonic or code)
    twoDList - list of two-dimensional parameters in record
               (mnemonic or code)
    nrow - number of rows of 2D data to create

Until set, all values default to missing.

MadrigalDataRecord has the following public attributes/methods:

    getKinst()
    getKindat()
    getStartTimestamp()
    getEndTimestamp()
    getOneDParmCodes()
    getOneDParmMnemonics()
    getNrow()
    set(parm, row, value) - parm is mnemonic or code, row starts at 0,
        value is double. Value may also be 'missing'.
        If parm is an error parameter, value may also be 'assumed' or
        'knownbad'.
    get(parm, row) - parm is mnemonic or code, row starts at 0 -
        returns a double.

***********************************

In fact this API is meant to abstract the Cedar Data Model away from the
Cedar database format. The essence of the Cedar Data Model is simply
that a file is an ordered list of records, where each record has the
required fields (startTimestamp, endTimestamp, kinst, kindat, and lists
of 1D and 2D parameters), along with values for the 1D and 2D
parameters. It is possible that a future version of this API would allow
writing to and reading from a non-Cedar format.

Bill

-- 
Bill Rideout
MIT Haystack Observatory
Email: brideout@haystack.mit.edu
Phone: 781 981-5624