introduction

This file describes what I had to do to convert a large program (ROFF) from RATFOR to C. Most of it was written just after I finished the project, so it is quite complete.

general changes

Comments must be changed from RATFOR style to C style. RAT4 comments start with # (outside strings) and continue to the end of the line. C comments start with /* and continue to */. The program rat2c converts RAT4 comments to C comments.

Multiple spaces may be replaced with tabs. The routines entab() and detab() are only marginally useful for this.

The RATFOR define statement must be replaced by the C #define. The RATFOR define looks like:

    define (a,b)

The C define looks like:

    #define a b

Note that the C define must start in column 1. No arithmetic is allowed in the C define.

RAT4 had the ifdef and ifnotdef pseudo ops. These may have to be changed in some versions of C. They can be replaced by just commenting out code or by creating runtime global variables corresponding to the compile time variables tested by the ifdef.

RAT4 uses includes inside each subprogram to bring in declarations of variables. In C, these #includes appear once at the start of each file. The includes inside each RAT4 subprogram must be commented out and replaced by a single set at the start of the C file.

Printable character constants can be eliminated from C programs. I think that RAT4 uses them to get around problems with character sets. At the present time I can see no reason to have constants for printable characters. I may be wrong though.

RAT4 common blocks must be replaced by global variables declared in a header file. Array dimensions must appear in the variable declarations in C rather than in the common block declaration.

changes to declarations

The RAT4 keywords function and subroutine must be eliminated where they appear.

RAT4 groups parameter declarations with declarations of local variables. In C, these two kinds of declarations are separated by a {: parameter declarations come before it and local declarations come after it.
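The declaration changes described so far can be sketched in one small C fragment. This is only an illustration under stated assumptions: the names MAXLINE, outp, outbuf and length are hypothetical and not taken from the ROFF source, and the function is written in the prototype style that current compilers expect, with the older layout shown in a comment.

```c
/* RAT4: define(MAXLINE,80)
 * C:    #define, starting in column 1. */
#define MAXLINE 80

/* A RAT4 common block becomes global variables. The array dimension
 * now appears in the declaration itself. In a real conversion these
 * would live in a header file (hypothetical here). */
int  outp;                 /* illustrative: next free slot in outbuf */
char outbuf[MAXLINE];      /* RAT4 would write outbuf(MAXLINE) */

/* RAT4 (roughly):
 *     integer function length(str)
 *     character str(MAXLINE)
 *     integer i
 *
 * Older C layout, parameters declared before the {, locals after it:
 *     length(str) char str[]; { int i; ... }
 *
 * The prototype form below is the equivalent that modern compilers
 * accept. */
int length(const char str[])
{
    int i;                 /* local variable, declared after the { */

    for (i = 0; str[i] != '\0'; i++)
        ;                  /* count characters up to the terminator */
    return i;
}
```

Calling length("roff") from any other routine returns 4. Note that the keywords function and integer are gone, and that the parameter is declared with square brackets.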

The RAT4 keywords integer and character must be changed to int and char.

The RAT4 keyword pointer indicates (probably) that the variable is an int whose address will be passed (via &) as an actual parameter. It is not clear whether the RAT4 usage of pointer is consistent with good C coding practice.

As mentioned before, include statements inside RAT4 subroutines must be eliminated.

The RAT4 data statement cannot be simulated directly without C initializers. An alternative is to create global variables which are initialized at run time.

The names of functions which are used by a RAT4 subroutine appear in the declarations of that routine. These declarations must be eliminated from C programs. (The C compiler thinks local variables are being declared.)

RAT4 uses parens to declare arrays. C uses square brackets.

changes to executable statements

The biggest problem is probably subscripts. (See below for a more semantic problem with subscripts.) RAT4 uses parens for both actual parameter lists and array subscripts. C uses parens for parameter lists, but square brackets for subscripts. Clearly, whether to convert to square brackets depends on knowing whether an identifier is a function or not.

In C, all statements must end in semicolons. Once again, a full RAT4 parser is needed in order to know where to put the semicolons.

RAT4 assignment statements like:

    i = i + 1

can be changed to:

    i++;

In RAT4 functions, the name of the function is used as a variable which denotes the value returned by the function. In C, the name of a function used inside its own body denotes a recursive call. In C, a new local variable must be declared inside the function and the function must return its value with a return(value) statement.

My version of C does not have a repeat statement. The RAT4 statement:

    repeat {
        ...
    } until ( )

must be replaced by the C statement:

    while (1) {
        ...
        if ( ) break;
    }

The RAT4 statement next must be replaced by the C statement continue.

RAT4 uses a call statement. C does not; the call keyword is simply dropped.
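The rewrites of repeat/until, next, and assignment to the function name can be sketched as follows. The routines ndigits, ndigits_do and count_nonblank are hypothetical examples, not code from ROFF. The second routine uses do/while, which standard C provides as a closer match for repeat/until; it is shown as an alternative, on the assumption that the compiler at hand supports it.

```c
/* RAT4 (roughly):
 *     integer function ndigits(n)
 *     integer n
 *     ndigits = 0
 *     repeat {
 *         ndigits = ndigits + 1
 *         n = n / 10
 *     } until (n == 0)
 *     return
 *     end
 */
int ndigits(int n)
{
    int count = 0;         /* replaces assignments to the function name */

    while (1) {            /* RAT4: repeat { */
        count++;           /* RAT4: ndigits = ndigits + 1 */
        n = n / 10;        /* n is a private copy in C */
        if (n == 0)        /* RAT4: } until (n == 0) */
            break;
    }
    return count;          /* RAT4 returned the value held in ndigits */
}

/* Same loop written with do/while; the until-condition is negated. */
int ndigits_do(int n)
{
    int count = 0;

    do {
        count++;
        n = n / 10;
    } while (n != 0);
    return count;
}

/* RAT4 next becomes C continue. */
int count_nonblank(const char s[])
{
    int i;
    int n = 0;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == ' ')
            continue;      /* RAT4: next */
        n++;
    }
    return n;
}
```

Both loop forms run the body at least once, which is the property of repeat/until that a plain while (cond) loop would lose.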

RAT4 calls with no arguments do not need actual parameter lists. In C, a call to a function with no arguments still needs ().

In RAT4 all arrays start at 1. In C all arrays start at 0. This means that array declarations will have to be changed, as well as all code that refers to the limits of arrays. In practice, a semantic analysis of how all variables of a program are used is required. Almost every part of a program might need to be changed: initialization of variables, for loop indices, actual parameters, etc.

RAT4 uses call by reference. C uses call by value. This means that if a RAT4 subroutine changes any of its arguments, the caller of the C routine will have to pass a pointer to a variable, rather than the variable itself. The C routine will have to change the variable via indirection. This also means that C programs do not have to protect their arguments from change like RAT4 programs do.

RAT4 uses & and | to indicate AND and OR. C uses && and || to indicate boolean AND and OR. C can use & and | only if each relational clause joined by & and | is fully parenthesized.

For best portability RAT4 uses the construction:

    junk = func()

even when junk is not going to be used. This is not needed in C; the call can stand alone as a statement.
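A minimal sketch of the last few points, again with hypothetical routines (bump and index_of are not from ROFF): a changed argument becomes a pointer, a loop over an array runs from 0 to n-1 instead of 1 to n, and & in a condition becomes &&.

```c
/* RAT4 (roughly):
 *     subroutine bump(i)
 *     integer i
 *     i = i + 1
 * The caller sees the change because RAT4 passes by reference.
 * In C the caller passes the address and the routine uses indirection. */
void bump(int *ip)
{
    (*ip)++;
}

/* RAT4 (roughly):
 *     for (i = 1; i <= n & a(i) != key; i = i + 1)
 *         ;
 * In C the subscripts run 0..n-1, and && is used instead of &.
 * && stops as soon as i < n fails, so a[n] is never read; with &
 * both operands would be evaluated. */
int index_of(const int a[], int n, int key)
{
    int i = 0;

    while (i < n && a[i] != key)
        i++;
    return (i < n) ? i : -1;   /* -1 means "not found" (a choice made here) */
}
```

A caller would write bump(&x); with no junk variable to receive a result. With int a[3] = {4, 5, 6}, index_of(a, 3, 5) returns 1, where the RAT4 version would have reported position 2.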