This file contains a collection of articles posted to comp.databases.informix and informix-list that are related to converting comma-quote delimited ASCII files into Informix unload format. The original articles have been edited to remove included lines and extraneous header lines. Walt Hultgren walt@rmy.emory.edu ============================================================================== From: root@TeamInfinity.com (root) Date: 14 Apr 1995 08:05:44 -0400 Subject: HOW in the HECK can we import comma quote delimted ASCII !!!???? Message-ID: <3mloeo$8eg@dialin01.teaminfinity.com> I cannot find out how to import comma quote delimited text into Informix if my life depended on it. We looked at dbload and got it to work ONLY after converting our comma quote delimited ascii text file to single character limited ascii text with much labor. comma quote delimited ascii is very much a standard data format. There MUST be a way to import comma quote delimited text into informix, but I sure could not find it ! See we are trying to import dbaseIII and paradox files into informix by first converting them to the WORLD STANDARD of comma quote delimited ascii text, but informix's dbload seems to only allow importation of ascii text that is delimited with a SINGLE character. Please tell us this is NOT so. Thanks in advance, root@TeamInfinity.com (Eric Sean Webber) ============================================================================== From: johnl@informix.com (Jonathan Leffler) Date: 18 Apr 1995 12:23:04 -0400 Message-ID: <3n0p18$dl5@cssun.mathcs.emory.edu> X-Informix-List-ID: The version of SQLCMD which should be available in the c.d.i archives has load/unload modes which accommodate this 'world standard' format. I would suggest that the comma/quote delimited format is highly operating system specific -- namely for DOS and DOS-derivative systems. This is not the same as 'world standard', despite various people's assumption that it is, and desptie the undoubted popularity and wide-spread use of these O/S. To the best of my knowledge (flaky), ISO has not created any world standard for unload formats -- if it has, please quote the standard number. Most mainframe unload formats, for example, do not have any delimiters between fields (and sometimes not between the records either!), and often have implied decimal points. Yours, Jonathan Leffler (johnl@informix.com) #include ============================================================================== From: johnb@ux1.cso.uiuc.edu (John Blomberg) Date: 21 Apr 1995 13:49:02 GMT Message-ID: <3n8d4e$6qr@vixen.cso.uiuc.edu> For those of you with PERL, this script takes a file delimited "," and changes it to the delimiter of your choice: #!/usr/bin/perl $delimiter = '|' ; while (<>) { chop ; #change "," delimitation to $delimiter $str = 0 ; undef $line ; while ( $str < length($_) ) { #go through the line $char = substr($_,$str,1) ; # get the current character @ offset $str if ( $char eq $delimiter ) { print "\n", $_, "\n" ; die "ERROR: Delimiter found in file\n" ; } if ( $char eq '"' ) { # if a quote mark $quote = $quote ? 0 : 1 ; # toggle quote } if ( ! $quote && $char eq ',' ) { # if not in quoted material $line .= $delimiter ; # and char is comma, set delimiter } else { # else put the char back $line .= $char if ( $char ne '"' ) ; } $str++ ; # increment offset } # another problem: some programs chop ASCII output to 80 chars # so we put the lines back together if ( substr($line,-1,1) eq '+' ) { chop $line ; $bigline .= $line ; } else { $bigline .= $line ; print $bigline, "$delimiter\n" ; undef $bigline ; } } ============================================================================== From: ddx85qr@boots.bell-atl.com (Dave Snyder) Date: 24 Apr 1995 11:28:03 -0400 Message-ID: <3ngg23$m8n@cssun.mathcs.emory.edu> X-Informix-List-ID: Well, here's my stab at it... Here --------- Snip Here --------- Snip Here --------- Snip Here --------- Snip /* * asc2infmx.c * Convert comma-delimited (w/ optional quotes) records to Informix "unload" * format. */ #include main(argc, argv) int argc; char *argv[]; { void perror(); if (!*++argv) process(); else do { if (!freopen(*argv, "r", stdin)) { perror(*argv); continue; } process(); } while (*++argv); return(0); } process() { register int c, in_a_quote = 0; while ((c = getchar()) != EOF) switch(c) { case ',': (void)putchar((in_a_quote) ? c : '|'); break; case '"': in_a_quote = !in_a_quote; break; case '\n': (void)putchar('|'); default: (void)putchar(c); break; } } Here --------- Snip Here --------- Snip Here --------- Snip Here --------- Snip DAS -- Dave Snyder @ Bell Atlantic - Philadelphia, PA WEB: http://www.ece.vill.edu/~dave/index.html EMAIL: (w) ddx85qr@boots.bell-atl.com (h) dave@das13.snide.com ============================================================================== From: rob@dssmktg.com (Robert Minter) Date: 24 Apr 1995 12:23:04 -0400 Message-ID: <3ngj98$nv9@cssun.mathcs.emory.edu> X-Informix-List-ID: You can try this. It will get ALMOST all situations. # usage: undoit in-file out-file # << stuff removed for file testing >> cat $1 | sed -e "s/\",\"/|/g" | \ sed -e "s/,\"/|/g" | \ sed -e "s/\",/|/g" | \ sed -e "s/\([0-9]\),\([0-9]\)/\1|\2/g" | \ sed -e "s/^\"//g" | \ sed -e "s/$/|/g" > $2 # The first sed changes "," to | # The second sed changes ," to | # The third sed changes ", to | # The fourth sed changes [0-9],[0-9] to [0-9]|[0-9] # The fifth sed removes ^" # The sixth sed changes $ to | Robert Minter |Data Systems Support| \\\_/// Programmer, Software Development | Orange, CA | ( _ _ ) rob@dssmktg.com | Tel: 714.771.0454 | (| ^ |) De Colores - Emmaus OC-13 | Fax: 714.771.3028 | \`-'/ #include SURF'S UP \_/ ============================================================================== From: mike@lintel.demon.co.uk (Mike Aubury) Date: Mon, 24 Apr 1995 11:35:42 +0000 Message-ID: <655152672wnr@lintel.demon.co.uk> Here are a couple of awk scripts : # c2ascii # converts comma delimited files with optional quotes to std informix unload # assumes no "|" symbols in text ! { for (a=1;a<=length($0);) if ((ch=substr($0,a++,1))=="\"") isquote=isquote?0:1 else printf("%s",(ch==","&&isquote==0)?"|":ch) print "" } # c2informix # converts unload format files to comma delimited quoted output files # all lines must end in a "|" { z="" for (b=1;b newfile.unl awk -f c2ascii myfile.unl > newfile.asc Note: You may need to use nawk on some machines! -- Mike Aubury Senior Technical Consultant ============================================================================== From: nswdot@sydney.DIALix.oz.au (NSW Department of Transport) Date: 27 Apr 1995 11:37:46 +1000 Message-ID: <3nmsha$s8d$1@sydney.DIALix.oz.au> You could try the following nawk script: BEGIN { instring = 0 } { lineout = ""; for (i = 1; i <= length($0); ++i) { onechar = substr($0, i, 1); if (onechar == "|") onechar = "~"; if ((onechar == ",") && (!instring)) onechar = "|"; if (onechar == "\"") if (instring) instring = 0 else instring = 1 else lineout = lineout onechar; } print lineout "|" } It works for us ============================================================================== From: rob@dssmktg.com (Robert Minter) Date: 27 Apr 1995 17:53:03 -0400 Message-ID: <3np3nv$gkk@cssun.mathcs.emory.edu> X-Informix-List-ID: Here is a Quote-Comma file converter. There is awk, c, and whatever else versions out there, also. Just for someone's convenience. And, no, there is no cost for this. (Just playing it safe. Got to walk around here like there is egg shells everywhere.) --v-- SNIP --v-- SNIP --v-- SNIP --v-- SNIP --v-- SNIP --v-- SNIP --v-- # newdelim - quote-comma to delimiter converter # # This will convert MOST, not ALL, of the situations of quote-comma delimited # files. Which ones it won't get, I'm not sure, yet. if test $# -ne 3 ; then echo "usage: newdelim delimiter-to-use in-file out-file" exit 1 fi DELIM=$1 IN_FILE=$2 OUT_FILE=$3 if ! test -r $IN_FILE ; then echo "cannot open file: $IN_FILE" exit 1 fi # # First, backslash any delimiter in the data. Will not backslash already # backslashed chars. # sed -e "s/\([^\]\)${DELIM}/\1\\\\${DELIM}/g" \ # # Next, change "," to delimiter char. # -e "s/\",\"/${DELIM}/g" \ # # Then, change ," to delimiter char. # -e "s/,\"/${DELIM}/g" \ # # Change ", to delimiter char. # -e "s/\",/${DELIM}/g" \ # # Then, change commas between two numeric fields to delimiter char. # -e "s/\([0-9]\),\([0-9]\)/\1${DELIM}\2/g" \ # # Next, remove any quotes at beginning of line. # -e "s/^\"//g" \ # # Almost done, remove any quotes at end of line. # -e "s/\"$//g" \ # # Lastly, add delimiter char at end of line. # -e "s/$/${DELIM}/g" < $IN_FILE > $OUT_FILE # eof --^-- SNIP --^-- SNIP --^-- SNIP --^-- SNIP --^-- SNIP --^-- SNIP --^-- Robert Minter Data Systems Support \\\_/// Senior Software Engineer A Client Technologies Company ( _ _ ) E-Mail: rob@dssmktg.com Tel: 714.771.0454 (| ^ |) #include Fax: 714.771.3028 \`-'/ De Colores - Emmaus OC-13 SURF'S UP \_/ ============================================================================== From: john@mks.com (John Postma) Date: Fri, 28 Apr 95 12:00:15 -0400 Message-ID: <04-28-1995.43216@ia-ics> Very simple to convert your double quote delimited, comma separated file to a single character delimited file with SED (standard Unix stream editor, also avaiable in MKS Toolkit for DOS, OS/2, NT platforms) create file 'convert.sed', only three simply lines s/","/|/g; - this one changes all "," to | s/^"//; - this one removes " at beginning of each line s/"$//; - this one removes " at end of each line Then run sed like this: sed -f convert.sed input_file > output_file Your output file will be ready to import with either dbload or an SQL load statement. -- John Postma Standard disclaimer - comments are my own, not my employers, etc... etc... ==============================================================================