Over the last five years, I've written quite a bit of material about processing IFS files from an RPG program. Most of the material that I've written demonstrates using the open() and read() APIs to read text from an IFS file.
This technique works great for many IFS files, but it can be awkward when you're processing a text file. A text file is a file in which the records aren't always the same length. Instead of having a fixed length, each record ends where the carriage return and line feed (CRLF) characters are found. Unfortunately, this doesn't work so well with the open() and read() APIs, because finding the CRLF means reading the file one character at a time, which is not very efficient! This article demonstrates a different API that makes reading text file records easy without sacrificing efficiency.
The read() API can't do this type of reading efficiently because it goes back to the disk each time you attempt to read a byte. That's no problem when you read large amounts at once, because the data is transferred from disk to memory only once. However, when you read a file one byte at a time, every call to the read() API has to transfer data from disk to memory. The overhead of doing that repeatedly for only one byte at a time causes poor performance.
Fortunately, an alternative exists. ILE C has a set of APIs different from the familiar open(), read(), and close() ones. Because ILE languages can call subprocedures written in other ILE languages, you can use these ILE C APIs from your RPG program.
These alternative APIs are useful because they load data from disk more efficiently. They calculate the optimal size of data to be read at once from the file, and then they load that much from disk at a time. The data, once loaded, is kept in memory in a buffer. Each time you try to read the file, the API first reads from the buffer and goes back to disk only once the buffer is used up.
This disk buffering technique is similar in concept to the "record blocking" that we're used to using in RPG, except that it works on stream files rather than traditional record-based files.
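To make the buffering idea concrete, here is a minimal sketch written in standard C rather than RPG, since these are C runtime functions. It reads a file one byte at a time with fgetc(), yet on a typical C runtime most of those calls are answered from the stream's in-memory buffer. The function name count_lines is hypothetical and is not part of the article's download.

```c
#include <stdio.h>

/* Count newline characters in a file, reading one byte at a time.
 * fgetc() works on a buffered stream, so most calls are typically
 * served from memory instead of going back to the file system.
 * Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    long lines = 0;
    int ch;                        /* int, not char, so EOF stays distinct */
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            lines++;
        }
    }

    fclose(fp);
    return lines;
}
```

The loop looks like the inefficient byte-by-byte approach, but the cost per call is much lower because the stream layer does the blocking behind the scenes.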
The ILE C stream file access APIs are as follows:
- fopen(): Open a file.
- fclose(): Close a file.
- fread(): Read bytes from a file.
- fwrite(): Write bytes to a file.
- fgets(): Get a line of text from a CRLF-delimited text file.
- fputs(): Write a line of text to a CRLF-delimited text file.
- fseek(): Seek (i.e., move the file cursor) to a particular position in the file.
- ftell(): Return the current position in the file.
I've written a copybook that contains all the prototypes and constants needed to use these APIs from ILE RPG. I named this copybook STDIO_H, and it's included in the code download for this article.
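As a small illustration of fseek() and ftell() from the list above, the following standard C sketch reports a file's size by moving the cursor to the end and asking for the current position. The function name file_size is hypothetical, and the sketch assumes a runtime where seeking to the end of a binary stream is supported, which is the common case.

```c
#include <stdio.h>

/* Return the size of a file in bytes, or -1 on error.
 * The file is opened in binary mode so that ftell() reports
 * a plain byte offset. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;
    }

    long size = -1;
    if (fseek(fp, 0L, SEEK_END) == 0) {   /* move the cursor to the end */
        size = ftell(fp);                 /* cursor position = byte count */
    }

    fclose(fp);
    return size;
}
```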
The most interesting of these APIs is the fgets() API. It takes care of all the work of reading the data from the stream file until a CRLF delimiter is found. You no longer have to code a loop that reads the file byte-by-byte looking for the CRLF characters! Woo hoo!
Before you can call fgets(), you have to open the file by calling the fopen() API. Here's code that does that:
     D file            s                   like(pFILE)
       .
       .
       filename = '/home/scott/testfile.txt';
       file = fopen(%trimr(filename): 'r');
       if (file = *NULL);
          // error occurred! Check errno!
       endif;
The first parameter to the fopen() API is the IFS path name of the file that you want to open. You should always use %trimr() (or a comparable method) to remove any trailing spaces from the file name.
The second parameter tells the system how you want to open the file. It can have the following values:
- r Open the file for reading only.
- r+ Open the file for reading and writing.
- w Create the file if it does not exist, or clear the file if it does exist. Then open it for writing only.
- w+ Same as "w," except that the file is opened for reading and writing.
- a Open the file for writing only and create it if it doesn't exist. All data is always written at the end of the file. You cannot overwrite existing data in the file.
- a+ Same as "a," except that the file is opened for reading and writing.
The preceding options are alternatives to one another. You can specify only one of them.
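The difference between "w" and "a" is easy to see in a short experiment. The sketch below is standard C, and the helper names write_line and read_all are hypothetical. It covers only the plain mode letters, not the ILE-specific keywords.

```c
#include <stdio.h>

/* Write one string to a file using the given fopen() mode.
 * Returns 0 on success, -1 on failure. */
int write_line(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        return -1;
    }
    int rc = (fputs(text, fp) == EOF) ? -1 : 0;
    if (fclose(fp) != 0) {
        rc = -1;
    }
    return rc;
}

/* Read up to size-1 bytes of a file into buf and null-terminate it.
 * Returns the number of bytes read, or -1 on failure. */
long read_all(const char *path, char *buf, size_t size)
{
    if (size == 0) {
        return -1;
    }
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, size - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return (long)n;
}
```

Calling write_line() with "w" and then with "a" leaves both lines in the file, while a later call with "w" clears the file before writing.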
Some optional modifiers can also be added to enable special processing of the IFS file. These modifiers are added to the end of the option that you choose. The modifiers are:
- b This can be added to any of the preceding values to open the file in "binary" mode. In binary mode, data is not translated from one character set to another (e.g., ASCII to EBCDIC translations do not occur).
- o_ccsid=xxxxx This keyword has to be separated from the preceding ones by a comma. It specifies the Coded Character Set Identifier (CCSID) of the data that you intend to write to the file. If the file doesn't exist, it gets tagged with this CCSID. If the file does exist, it assumes that this CCSID is the CCSID of the data that you're providing to the fwrite() or fputs() APIs, and it converts from that CCSID to the one that the file was originally tagged with.
- crln=N This specifies whether each line of text in the file ends with CRLF or only with LF. When this parameter is set to Y (the default), the API looks for CRLF. When it is set to N, the API uses LF only.
For example, the following statement creates a new file and assigns it CCSID 819:
     D file            s                   like(pFILE)
       .
       .
       filename = '/home/scott/newfile.txt';
       file = fopen(%trimr(filename): 'w,o_ccsid=819');
       if (file = *NULL);
          // error occurred! Check errno!
       endif;
The fopen() API returns a pointer that the API uses internally to keep track of which file it has open, its position in the file, and the status of the buffering. You don't have to worry about what this pointer is set to. All you need to do is keep it in a variable and pass that variable to the other APIs so that they know which file to read from or write to.
If the fopen() API returns *NULL, an error has occurred, and you should check the errno value to see what went wrong. This is the same errno used with the IFS APIs and socket APIs. If you're unfamiliar with this concept, please see the following article: http://www.systeminetwork.com/article.cfm?id=19312
Or on my web site at the following link: http://www.scottklement.com/rpg/ifs_ebook/errors.html
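For comparison, this is roughly what the same error check looks like in standard C. It is a sketch under two assumptions: the runtime sets errno when fopen() fails (POSIX systems do), and the function name try_open is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open a file for reading. On failure, print the reason to
 * stderr and return the errno value; on success, close the file
 * and return 0. */
int try_open(const char *path)
{
    errno = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int err = errno;           /* save errno before another call changes it */
        fprintf(stderr, "fopen(%s) failed: %s\n", path, strerror(err));
        return err;
    }
    fclose(fp);
    return 0;
}
```

Saving errno immediately matters because later library calls are allowed to overwrite it.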
After the file is opened, you can call the fgets() API to read data as lines in a text file. The fgets() API takes care of all the buffering and searching for CRLF for you.
     D p_line          s               *
     D rddata          s           8000a
     D line            s           8000a   varying
       .
       .
       p_line = fgets(%addr(rddata): %size(rddata): file);
       dow (p_line <> *null);
          line = %str(p_line);
          // now the "line" variable contains one line
          // of text from the IFS file! Insert code here
          // to use that line for whatever you need to
          // use it for...
          p_line = fgets(%addr(rddata): %size(rddata): file);
       enddo;
The fgets() API accepts three parameters: a pointer to a buffer, the size of the buffer, and the file pointer to read from. The first parameter should be the address of a variable in your program. This is where the fgets() API places the data that it reads. The second parameter is the length (in bytes) of the variable that you specified in the first parameter. The last parameter is the value that you received when you called the fopen() API, and it tells the fgets() API which file to read from.
The fgets() API returns a pointer to the data that it has read if it's successful. In other words, it returns a pointer to the variable that you specified in the first parameter! If you reach the end of the file, a *NULL pointer is returned instead.
The data that the fgets() API loads into the variable (rddata in the preceding example) is a null-terminated string, like the ones usually used in C programs. To convert it to an RPG-style alphanumeric string, you call the %str() built-in function (BIF). In my example, I use the %str() BIF to convert the C-style string in the rddata variable to an RPG-style string in the line variable. After that's done, the line variable contains one line of text from the stream file and is ready for you to use in your program.
The preceding sample code calls fgets() again in a loop and continues to read the file until fgets() returns *NULL, indicating that the end of the file was reached.
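The same read loop can be sketched in standard C, which shows what fgets() does underneath the RPG prototype. One detail worth knowing is that, in standard C, fgets() leaves the line terminator in the buffer, so this sketch strips it. The function name longest_line is hypothetical, and the 8000-byte buffer simply mirrors the size used in the RPG example.

```c
#include <stdio.h>
#include <string.h>

/* Read a text file line by line and return the length of the longest
 * line, not counting the line terminator. Returns -1 if the file
 * cannot be opened. Lines longer than the buffer arrive in pieces. */
long longest_line(const char *path)
{
    char rddata[8000];
    long longest = 0;

    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    /* fgets() returns NULL at end of file, which ends the loop. */
    while (fgets(rddata, sizeof rddata, fp) != NULL) {
        /* Cut the text off at the first CR or LF, if there is one. */
        size_t len = strcspn(rddata, "\r\n");
        rddata[len] = '\0';
        if ((long)len > longest) {
            longest = (long)len;
        }
    }

    fclose(fp);
    return longest;
}
```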
After you're done with the file, you have to close it by calling the fclose() API. The only parameter that you have to pass is the file pointer that you received when you called the fopen() API. For example:
       fclose(file);
To demonstrate the entire process, I've written an RPG program that reads a pipe-delimited text file. It reads the whole file, one line at a time, using the fgets() API. For each record, it uses a subprocedure that I wrote, called gettok(), to break each record up into fields. With minor modifications, this program could search for commas or tabs instead of pipes.
You can download this sample program and all the copybooks from the following link: http://www.pentontech.com/IBMContent/Documents/article/53867_157_PipeDelim.zip
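The source of gettok() is not reproduced in this article, so the following is only a hypothetical stand-in, written in standard C, that shows one way to break a pipe-delimited record into fields. The name split_fields and its parameters are assumptions and are not taken from the download. It assumes the delimiter is an ordinary character such as a pipe, comma, or tab.

```c
#include <string.h>

/* Split a line in place on the given delimiter character.
 * Stores pointers to at most max_fields fields and returns how many
 * were stored. Unlike strtok(), empty fields ("a||c") are kept as
 * empty strings. */
int split_fields(char *line, char delim, char *fields[], int max_fields)
{
    int count = 0;
    char *start = line;

    while (count < max_fields) {
        fields[count++] = start;
        char *end = strchr(start, delim);
        if (end == NULL) {
            break;                 /* last field reached */
        }
        *end = '\0';               /* terminate this field */
        start = end + 1;           /* next field begins after the delimiter */
    }
    return count;
}
```

Passing ',' or '\t' as the delimiter gives the comma or tab variants mentioned above. Keeping empty fields matters for delimited data, where a missing value should not shift the columns that follow it.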
Posted on 19-01-2007 at 00:00
written by Qmma