1.4 Internal Representation of Characters

As with floating-point numbers, the internal representation of characters is also system-dependent. CHGLIB and CHKLIB are packages that standardize system-dependent character manipulation.

FORTRAN grammar defines the following FORTRAN character set, and a FORTRAN program must be written using only these characters. (Comments and the contents of character-type data are exceptions to this rule.)

Roman characters:	ABCDEFGHIJKLMNOPQRSTUVWXYZ
Numbers:	0123456789
Special characters:	blank ' ( ) * + , - . / : = currency sign

Therefore, strictly speaking, it is grammatically incorrect to write a FORTRAN program using lower-case letters.
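
As a rough illustration of this rule, the following Python sketch flags characters that lie outside the FORTRAN character set. The names `FORTRAN_CHARSET` and `non_fortran_chars` are hypothetical, and `$` is assumed to stand in for the currency sign. The check is deliberately naive: it does not exempt comments or the contents of character-type data, which the rule above allows.

```python
# The 49 characters of the FORTRAN character set listed above.
# Assumption: "$" stands in for the currency sign.
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"
SPECIALS = " '()*+,-./:=$"
FORTRAN_CHARSET = set(LETTERS + DIGITS + SPECIALS)


def non_fortran_chars(line):
    """Return, in order, the characters of `line` outside the FORTRAN set."""
    return [ch for ch in line if ch not in FORTRAN_CHARSET]


print(non_fortran_chars("      X = Y + 1.0"))   # []
print(non_fortran_chars("      x = Y + 1.0"))   # ['x']
```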

However, the FORTRAN standard does not specify how characters, including these FORTRAN characters, are represented internally. As with floating-point numbers, there are two main internal representations of characters: EBCDIC, the IBM standard, and ASCII, the American standard code.

EBCDIC
The Extended Binary Coded Decimal Interchange Code (EBCDIC) was defined by IBM. True to its name, it is an extension of the BCD (binary coded decimal) code. The BCD code does not use the hexadecimal digits A, B, C, D, E, and F; it uses only 0-9 to represent decimal numbers, with four bits corresponding to a single decimal digit.
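
The BCD idea can be sketched in a few lines of Python. This is only an illustration of packing one decimal digit into each group of four bits; the helper names `to_bcd` and `from_bcd` are hypothetical, and packing two digits per byte is an assumption of this sketch.

```python
def to_bcd(n):
    """Encode a non-negative integer as packed BCD, two decimal digits per byte."""
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits          # pad to an even number of digits
    pairs = zip(digits[0::2], digits[1::2])
    return bytes((int(hi) << 4) | int(lo) for hi, lo in pairs)


def from_bcd(data):
    """Decode packed BCD, rejecting the unused 4-bit values A-F."""
    value = 0
    for byte in data:
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("4-bit values A-F are not valid BCD digits")
        value = value * 100 + hi * 10 + lo
    return value


print(to_bcd(1994).hex())               # 1994
print(from_bcd(bytes([0x19, 0x94])))    # 1994
```

Written in hexadecimal, the encoded bytes read exactly like the decimal number, which is the point of the code.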

A single character in EBCDIC is expressed by 8 bits. For digits, the upper 4 bits are F and the lower 4 bits are the BCD code of the digit. (Hence the name EBCDIC.) This code is also implemented on the so-called IBM-compatible general-purpose machines, such as those of Fujitsu and Hitachi, but there are subtle differences in each company's definition.
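
The structure of the digit codes can be checked with Python's standard codecs. The sketch below assumes that the `cp037` codec (IBM EBCDIC for the US and Canada) is an acceptable stand-in for the vendor-specific variants discussed here; the helper name `nibbles` is hypothetical.

```python
def nibbles(ch, codec="cp037"):
    """Split the one-byte code of `ch` into its upper and lower 4 bits."""
    code = ch.encode(codec)[0]
    return code >> 4, code & 0x0F


for digit in "0123456789":
    high, low = nibbles(digit)
    print(digit, format(high, "X"), low)
# prints "0 F 0" through "9 F 9"
```

For comparison, `nibbles("7", "ascii")` gives `(3, 7)`: ASCII also keeps the digit value in the lower 4 bits, but with 3 instead of F in the upper 4 bits.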

Although all companies use the same codes for the FORTRAN characters, other characters are strongly system-dependent. In particular, code systems are often modified to accommodate the katakana used in Japanese. There are two main approaches. The first replaces the lower-case letters of the alphabet with katakana. Of course, with this method katakana and lower-case letters cannot be used at the same time. This method is adopted by the Fujitsu M series (although conditions may differ among computer centers). The second replaces the lower-case letters with katakana, as in the first method, but also squeezes the lower-case letters into unused codes. This is called the EBCDIK code (the K standing for kana) and is adopted by the Hitachi M series. Thus, under both methods the upper-case letters and katakana have the same codes, but the lower-case letters are not interchangeable.

ASCII
The ASCII code is a character code system defined by the American National Standards Institute (ANSI) and adopted by UNIX and MS-DOS. In Japan, JIS X0201, which is almost the same as ASCII, is defined. An ASCII code is represented by 7 bits, with the highest bit of the byte being 0. However, JIS also defines an 8-bit code table that uses all 8 bits to accommodate katakana.
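
The difference between the 7-bit and 8-bit tables shows up in the highest bit of each byte. The following Python sketch assumes that the single-byte katakana codes of the `shift_jis` codec coincide with the 8-bit JIS X0201 table; the helper name `top_bit` is hypothetical.

```python
def top_bit(byte):
    """Return the highest bit (0 or 1) of an 8-bit code."""
    return byte >> 7


# Every ASCII character fits in 7 bits, so the highest bit is always 0.
print({top_bit(b) for b in "FORTRAN 77".encode("ascii")})   # {0}

# Assumption: shift_jis single-byte katakana match the 8-bit JIS X0201 table.
ka = "\uff71".encode("shift_jis")    # half-width katakana "a"
print(ka.hex(), top_bit(ka[0]))      # b1 1
```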

As shown above, character codes are system-dependent, even for the FORTRAN characters. Therefore, although the ICHAR function, which returns the character code, is a standard built-in function of FORTRAN, its return value is system-dependent. The functions LGE, LGT, LLE, and LLT compare character codes free of any system dependency: they follow the order of the ASCII codes, regardless of the code system used by the computer.
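
The contrast between ICHAR and the lexical comparison functions can be emulated on a single machine. The Python sketch below is not FORTRAN: `ichar` and `llt` are hypothetical stand-ins for the intrinsics of the same name, and `cp037` is assumed to represent an EBCDIC system.

```python
def ichar(ch, codec):
    """Emulate ICHAR: the native code of one character under `codec`."""
    return ch.encode(codec)[0]


def llt(a, b):
    """Emulate LLT: compare in ASCII order, padding the shorter string with blanks."""
    width = max(len(a), len(b))
    return a.ljust(width).encode("ascii") < b.ljust(width).encode("ascii")


for codec in ("ascii", "cp037"):
    print(codec, ichar("A", codec), ichar("a", codec), ichar("0", codec))
# ascii 65 97 48
# cp037 193 129 240

# Sorting by native code gives a different order on each system ...
print(sorted("a0A", key=lambda c: ichar(c, "ascii")))   # ['0', 'A', 'a']
print(sorted("a0A", key=lambda c: ichar(c, "cp037")))   # ['a', 'A', '0']

# ... whereas the ASCII-order comparison gives the same answer everywhere.
print(llt("A", "a"))   # True
```

A program that compares `ICHAR` values directly would therefore rank 'A' before 'a' on an ASCII system and after it on an EBCDIC system, while a comparison through LLT does not change.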

Table of EBCDIC Codes

	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
0	NUL	DLE			SP	&	-						{	}	$	0
1	SOH	DC1					/		a	j	~		A	J		1
2	STX	DC2	FS	SYN					b	k	s		B	K	S	2
3	ETX	DC3							c	l	t		C	L	T	3
4									d	m	u		D	M	U	4
5	HT		LF						e	n	v		E	N	V	5
6		BS	ETB						f	o	w		F	O	W	6
7	DEL		ESC	EOT					g	p	x		G	P	X	7
8		CAN							h	q	y		H	Q	Y	8
9		EM							i	r	z		I	R	Z	9
A					※	※	※	:
B	VT				.		,	#
C	FF			DC4	<	*	%	@
D	CR	GS	ENQ	NAK	(	)	_	'
E	SO	RS	ACK		+	;	>	=
F	SI	US	BEL	SUB	※	※	?	"

The column heading gives the upper hexadecimal digit of the code and the row heading gives the lower digit.
For the definitions of the control codes, see the Table of Control Codes below.
Fujitsu has control codes other than those defined here.
For Hitachi EBCDIK, the codes for the lower-case letters are different.
The definitions of the special characters marked ※ may differ between Fujitsu and Hitachi.
Code 5B is the currency symbol, which is $ in the U.S. In that case, the $ at E0 becomes \ (backslash).
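
A code table of this kind is read by combining the column heading (upper hexadecimal digit) with the row heading (lower hexadecimal digit). The Python sketch below shows the arithmetic and, assuming once more that the `cp037` codec is a fair stand-in for EBCDIC, that the letters do not occupy consecutive codes. The helper names are hypothetical.

```python
def table_code(column, row):
    """Code of the cell at hexadecimal `column` (upper digit) and `row` (lower digit)."""
    return (int(column, 16) << 4) | int(row, 16)


def ebcdic_char(column, row):
    """Character at that table position (assumption: cp037 as the EBCDIC variant)."""
    return bytes([table_code(column, row)]).decode("cp037")


print(format(table_code("C", "1"), "02X"), ebcdic_char("C", "1"))   # C1 A
print(ebcdic_char("C", "9"), ebcdic_char("D", "1"))                 # I J
print(table_code("D", "1") - table_code("C", "9"))                  # 8
```

Because I and J are 8 codes apart rather than 1, code that steps through the alphabet by adding 1 to a character code works in ASCII but not in EBCDIC.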

Table of ASCII Codes

	0	1	2	3	4	5	6	7
0	NUL	DLE	SP	0	@	P	`	p
1	SOH	DC1	!	1	A	Q	a	q
2	STX	DC2	"	2	B	R	b	r
3	ETX	DC3	#	3	C	S	c	s
4	EOT	DC4	$	4	D	T	d	t
5	ENQ	NAK	%	5	E	U	e	u
6	ACK	SYN	&	6	F	V	f	v
7	BEL	ETB	'	7	G	W	g	w
8	BS	CAN	(	8	H	X	h	x
9	HT	EM	)	9	I	Y	i	y
A	LF	SUB	*	:	J	Z	j	z
B	VT	ESC	+	;	K	[	k	{
C	FF	FS	,	<	L	`\`	l	`|`
D	CR	GS	-	=	M	]	m	}
E	SO	RS	.	>	N	`^`	n	`~`
F	SI	US	/	?	O	_	o	DEL

In JIS X0201, the \ and ~ in the above table become ¥ (yen sign) and ‾ (overline), respectively.
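
Since JIS X0201 differs from ASCII in the 7-bit range only at codes 5C and 7E, where it assigns the yen sign and the overline, a decoder for it can be sketched as ASCII plus two overrides. The names `JIS_ROMAN_OVERRIDES` and `decode_jis_roman` below are hypothetical.

```python
# The two 7-bit code positions where JIS X0201 differs from ASCII.
JIS_ROMAN_OVERRIDES = {0x5C: "\u00a5", 0x7E: "\u203e"}   # yen sign, overline


def decode_jis_roman(data):
    """Decode 7-bit codes as JIS X0201 Roman characters."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError("not a 7-bit code")
        chars.append(JIS_ROMAN_OVERRIDES.get(byte, chr(byte)))
    return "".join(chars)


raw = bytes([0x41, 0x5C, 0x7E])
print(raw.decode("ascii"))      # A\~
print(decode_jis_roman(raw))    # A¥‾
```

The same bytes thus display differently depending on which of the two tables the terminal uses.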

Table of Control Codes

Symbol	Name
`NUL`	null
`SOH`	start of heading
`STX`	start of text
`ETX`	end of text
`EOT`	end of transmission
`ENQ`	enquiry
`ACK`	acknowledge
`BEL`	bell
`BS`	backspace
`HT`	horizontal tabulation
`LF`	line feed
`VT`	vertical tabulation
`FF`	form feed
`CR`	carriage return
`SO`	shift out
`SI`	shift in
`DLE`	data link escape

Symbol	Name
`DC1`	device control 1
`DC2`	device control 2
`DC3`	device control 3
`DC4`	device control 4
`NAK`	negative acknowledge
`SYN`	synchronous idle
`ETB`	end of transmission block
`CAN`	cancel
`EM`	end of medium
`SUB`	substitute character
`ESC`	escape
`FS`	file separator
`GS`	group separator
`RS`	record separator
`US`	unit separator
`SP`	space
`DEL`	delete

Correspondence Between EBCDIC of Fujitsu and Hitachi and ASCII

Code	Fujitsu	Hitachi	ASCII
4A	c*	[	[
4F	`|`	!	!
5A	!	]	]
5B	\	\	$
5F	¬	`^`	`^`
6A		`|`	`|`
E0	$	$	`\`

Note 1: For Fujitsu, 4A is a c with a vertical bar (the cent sign).
Note 2: The correspondence with ASCII is given for the codes normally output at ASCII terminals; some codes may be converted into codes not shown in the table.
