Program: beautify-unl.pl
Purpose: Make .unl file more readable
Author:  Jacob Salomon
         jakesalomon@yahoo.com
Release: 1.20 2010-07-25

The Problem:

In the process of debugging many programs, we frequently unload the
results of a query to a flat file in order to examine the data by eye.
This is a daunting task simply by virtue (vice? ;-) of the width of
each line. If this is not enough to discourage you, try navigating
down the uneven columns in the .unl output. Here is a cut-out box from
one such .unl file:

1|F96|305|||R|05/31/96|1161|235||0.0|L|M|O|BF|Z|C|P|M|002045|06/05/96|
1|F96|305|||L|05/31/96|189|235||0.0|L|M|O|BF|Z|C|P|M|002045|06/05/96|
1|F96|305|||L|06/30/96|180|235||0.0|L|M|O|BF|Z|C|P|M|002045|07/15/96|
1|F96|305|||R|06/30/96|1020|235||0.0|L|M|O|BF|Z|C|P|M|002045|07/15/96|
1|F96|041|||R|06/30/96|1232|235||0.0|L|M|O|BF|Z|C|P|M|002045|07/15/96|
1|F96|041|||L|06/30/96|218|235||0.0|L|M|O|BF|Z|C|P|M|002045|07/15/96|
1|F96|010|||R|07/30/96|1125|235||33.29|L|M|O|BF|Z|C|P|M|002045|07/30/|
1|F96|010|||L|07/30/96|125|235||33.29|L|M|O|BF|Z|C|P|M|002045|07/30/9|
1|F96|041|||R|07/30/96|1275|235||33.29|L|M|O|BF|Z|C|P|M|002045|07/30/|
1|F96|305|||L|05/31/96|301|235||0.0|L|M|O|BF|Z|C|B|M|002046|06/25/96|

Not a pretty sight, is it? It would be so much more readable if only
the columns were aligned.

The Original Solution: beautify-unl.sh

The shell script beautify-unl.sh, originally written in 1998, does
exactly this. Historical information about it can be obtained by
downloading that product from the IIUG repository. And it still works
well. It scans down the lines, noting the greatest width of each
column in the .unl output, and then prints the data out again, giving
each column the greatest width it needed in the original .unl file.
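
The two-pass scan described above can be sketched in a few lines of
Perl. This is only an illustration under stated assumptions, not the
code of beautify-unl.sh or UNLbeautifier.pm: the helper name
align_lines is hypothetical, every line is assumed to end with a
delimiter, and alignment by the decimal point is not attempted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper for illustration only.
# Takes a delimiter and a list of raw lines; returns the aligned lines.
sub align_lines {
    my ($delim, @lines) = @_;
    my (@rows, @width);

    # Pass 1: split every line and note the widest value in each column.
    for my $line (@lines) {
        # The -1 limit keeps empty fields; the last piece is the empty
        # string after the closing delimiter, so drop it.
        my @fields = split /\Q$delim\E/, $line, -1;
        pop @fields if @fields && $fields[-1] eq '';
        for my $i (0 .. $#fields) {
            $width[$i] = max($width[$i] // 0, length $fields[$i]);
        }
        push @rows, \@fields;
    }

    # Pass 2: pad every value to the width recorded for its column.
    my @out;
    for my $row (@rows) {
        my $text = '';
        for my $i (0 .. $#$row) {
            my $value = $row->[$i];
            # Numeric values go right, everything else goes left.
            my $fmt = $value =~ /^-?\d+(?:\.\d+)?$/ ? '%*s' : '%-*s';
            $text .= sprintf($fmt, $width[$i], $value) . $delim;
        }
        push @out, $text;
    }
    return @out;
}

print "$_\n" for align_lines('|',
    '1|F96|R|1161|0.0|',
    '1|F96|L|189|33.29|',
);
# Prints:
# 1|F96|R|1161|  0.0|
# 1|F96|L| 189|33.29|
```

The whole input has to be held in memory (or read twice), because no
line can be printed until the widest value of every column is known.
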
The aligned output has the appearance of neat columns, like the
following:

1|F96|305| ||R |05/31/96|1161|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|305| ||L |05/31/96| 189|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|305| ||L |06/30/96| 180|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|305| ||R |06/30/96|1020|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|041| ||R |06/30/96|1232|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|041| ||L |06/30/96| 218|235|| 0.00|L|M|O |BF|Z|C|P |M |
1|F96|010| ||R |07/30/96|1125|235||33.29|L|M|O |BF|Z|C|P |M |
1|F96|010| ||L |07/30/96| 125|235||33.29|L|M|O |BF|Z|C|P |M |
1|F96|041| ||R |07/30/96|1275|235||33.29|L|M|O |BF|Z|C|P |M |
1|F96|305| ||R |07/30/96|2508|235||33.29|L|M|O |BF|Z|C|P |M |

(Note - in this case, the lines were truncated to fit into a 72-column
line.)

The lines may still be ugly and long, but at least you can trace the
value of a column down a straight path instead of snaking your way
down the page. Note one more nicety: Numeric values are right-justified
in their columns while non-numeric values are left-justified.
(Alignment by the decimal point was implemented later.)

For more information about how to invoke beautify-unl.sh, please
download beautify-unl.shar from the IIUG software library.

The New Solution: beautify-unl.pl and Package UNLbeautifier.pm

Most shell scripts that I subsequently wrote piped their output to
beautify-unl.sh as their last step. Since I plan to rewrite most of
them in Perl for better performance, I wanted them to handle their own
output as well, rather than pipe it to another program. A Perl class
would do the job very well. More on the class requirements later.

To invoke beautify-unl.pl, simply type the command and specify a .unl
file and the character used as the delimiter:

  $ beautify-unl.pl [-d input delimiter] [-D output delimiter] [yutz.unl]

As you can see, all options are, well, optional:

o Omit the input delimiter and it will default to the '|' pipe, the
  Informix default for unloads.
o Omit the output delimiter and it will default to the input
  delimiter.
o If you omit a file, it will default to stdin.

As with the shell script, beautify-unl.pl is a filter; it sends its
data out to stdout.

Note that some delimiters need to be escaped for the sake of the
shell. For example, if you wished to specify the pipe character, you
would need to run the command as:

  $ beautify-unl.pl -d'|'     # or -d \|

to prevent the shell from invoking the pipe mechanism.

The <blank> character as a delimiter:
=====================================

Many files - like the output of onstat commands - are space delimited.
This poses a nasty problem: How do I tell beautify-unl.sh that the
delimiter is a blank? With a '-d ' parameter? This is doable but
awkward; it's too easy to forget the quote marks, and some shell
implementations are too brain-damaged to recognize this correctly.

Solution: Specify the blank parameter as -db, e.g.:

  onstat -d|beautify-unl.pl -db

One caveat with this parameter: If the delimiter is the [default] pipe
symbol or a comma, then a pair of successive delimiters is interpreted
as enclosing a null column value. If the delimiter is a space, then
successive spaces will be folded by the package and treated as one
delimiter. Fortunately, this is normally what we humans expect with
blank delimiters.

Delimiters Within Columns:
==========================

While not desirable, it is sometimes unavoidable: The data in a column
contains the delimiter, e.g. address="333 Pickle Street|Yechupetz".
When such data is unloaded, the unl file will present it as:

  ..|333 Pickle Street\|Yechupetz|..

This escape is recognized by the load command and the dbload utility.
However, Perl is not that clever - a simple split on the delimiter
will separate this into two fields: "333 Pickle Street\" and
"Yechupetz". (It may also be argued that I was not clever enough to
come up with a pattern to recognize this. I won't disagree.)
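
For what it is worth, one candidate for such a pattern is a negative
lookbehind, which lets split ignore any delimiter that is preceded by
a backslash. The sketch below is not what the package does, and the
helper name split_unescaped is hypothetical. It also assumes the data
never contains an escaped backslash immediately before a real
delimiter (as in ..\\|..), which this pattern would misread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of
# UNLbeautifier.pm.
sub split_unescaped {
    my ($delim, $line) = @_;

    # (?<!\\) is a negative lookbehind: the delimiter matches only
    # when the character just before it is not a backslash.
    my @fields = split /(?<!\\)\Q$delim\E/, $line, -1;

    # Drop the empty piece that follows the final delimiter.
    pop @fields if @fields && $fields[-1] eq '';
    return @fields;
}

my @fields = split_unescaped('|', '17|333 Pickle Street\|Yechupetz|NY|');
print scalar(@fields), " fields\n";
print "[$_]\n" for @fields;
# Prints:
# 3 fields
# [17]
# [333 Pickle Street\|Yechupetz]
# [NY]
```

The escape is left in place inside the field, so the value can be
written back out without re-escaping it.
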
To compensate for the premature split, the package checks for a
backslash at the end of every field and reunites the separated data.
Note that I have not tested the escape-handling code for the blank
delimiter.

Planned Features:
=================

o Column Headers and Lines Per Page
  ---------------------------------
  One plan on the back burner is an option to designate the first N
  lines of the input file as column headers, to be repeated at the top
  of every page (e.g. -r55). In that case, an arbitrary page size of
  56 rows would be set. But once I include that, it demands that I
  allow a parameter for the line count of a page, so that you might
  specify -l 40 if you plan to send the output to a landscape printer.

o Limiting Column Width
  ---------------------
  Another feature I'd like to see solves a problem I had with my
  script fragments.sh: The fragmentation expression was very long; I
  would rather have displayed it as a column 40 bytes wide and have it
  break in the same column over multiple rows. And since I may wish to
  do this on several columns, I would need to modify the way I parse
  the command line options, since I would have to allow the same
  option to be used more than once. For example, to limit column 3 to
  20 characters and column 6 to 38 characters, the user would specify:

    -w3,20 -w6,38

  on the command line. Of course, such output would be entirely unfit
  for loading into a database. But it would fit on a sheet and be more
  readable. This issue also has a comment in the pod for
  UNLbeautifier.pm.

Possible bug:
=============

As I write this, I realize that I have overlooked a possible error in
the output: Suppose I accept the default | input delimiter but specify
a comma for the output delimiter. Now suppose a column already
contains a comma, e.g. a complete address. The result would have a
delimiter embedded within the column. I have made no provision to
escape the embedded delimiter.
Hence, if the output file is subsequently used to load another table,
it would have the incorrect number of columns. Truth be known, it's
not totally straightforward; what if it is already escaped? By adding
another escape, I have effectively un-escaped the delimiter by
escaping the escape.

Perl Issues
===========

Now that we have a package with a client, the program is divided into
two separate files:
- beautify-unl.pl  - The command that the user calls
- UNLbeautifier.pm - The package that defines the classes and methods
                     called by the client.

Although written as a set of related Perl classes, any Perl guru
eyeballing my code would recognize that it comes from the fingers of a
C programmer. Sorry about that; I am working on the expertise.
However, there was one problem I was unable to overcome: There are
some places in the code where I was forced to qualify a subroutine
with the class name, e.g. UNLbeautifier::check_meta, when it seemed
that it should have been visible to the "compiler" without this
qualification. When I have solved this I will issue a new release.

Another issue is the help page. To get some brief help text on the use
of beautify-unl.pl, type:

  $ beautify-unl.pl -h

There is a fairly wordy pod for the UNLbeautifier package. To view
that (once you have the .pm in your INC path or current directory),
type:

  $ perldoc UNLbeautifier

-- Jacob Salomon
   jakesalomon@yahoo.com
========================================================================