2.2 Lexical Elements, Separators, and Delimiters

25 Dec

Static Semantics

1 {text of a program} The text of a program consists of the texts of one or more compilations. {lexical element} {token: See lexical element} The text of each compilation is a sequence of separate lexical elements. Each lexical element is formed from a sequence of characters, and is either a delimiter, an identifier, a reserved word, a numeric_literal, a character_literal, a string_literal, or a comment. The meaning of a program depends only on the particular sequences of lexical elements that form its compilations, excluding comments.

2/2 The text of a compilation is divided into {line} lines. {end of a line} In general, the representation for an end of line is implementation defined. However, a sequence of one or more format_effectors other than the character whose code position is 16#09# (CHARACTER TABULATION) signifies at least one end of line.

2.a Implementation defined: The representation for an end of line.

3/2 {separator} [In some cases an explicit separator is required to separate adjacent lexical elements.] A separator is any of a separator_space, a format_effector, or the end of a line, as follows:

  • 4/2 A separator_space is a separator except within a comment, a string_literal, or a character_literal.

  • 5/2 The character whose code position is 16#09# (CHARACTER TABULATION) is a separator except within a comment.

  • 6 The end of a line is always a separator.

7 One or more separators are allowed between any two adjacent lexical elements, before the first of each compilation, or after the last. At least one separator is required between an identifier, a reserved word, or a numeric_literal and an adjacent identifier, reserved word, or numeric_literal.
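
The rule in paragraph 7 can be read as a predicate on two adjacent lexical elements. The following is a minimal sketch, not part of the standard; the kind names and the function `separator_required` are hypothetical and chosen only for illustration.

```python
# Hypothetical kind names for the lexical elements that paragraph 7
# says may not be written back to back without a separator.
NEEDS_SEPARATOR = {"identifier", "reserved_word", "numeric_literal"}


def separator_required(left_kind: str, right_kind: str) -> bool:
    """Return True when at least one separator must appear between
    two adjacent lexical elements of the given kinds."""
    return left_kind in NEEDS_SEPARATOR and right_kind in NEEDS_SEPARATOR
```

Under this reading, a reserved word followed by an identifier needs a separator, while an identifier followed by a delimiter does not.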

8/2 {delimiter} A delimiter is either one of the following characters:

9   &    '    (    )    *    +    ,    -    .    /    :    ;    <    =    >    |

10 {compound delimiter} or one of the following compound delimiters each composed of two adjacent characters: [1]

11   =>    ..    **    :=    /=    >=    <=    <<    >>    <>

12 Each of the special characters listed for single character delimiters is a single delimiter except if this character is used as a character of a compound delimiter, or as a character of a comment, string_literal, character_literal, or numeric_literal.
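
One way to apply paragraph 12 in a scanner is to try the two-character compound delimiters before the single-character ones. The sketch below is an illustration under stated assumptions, not the method of any particular compiler: the function `scan_delimiter` is hypothetical, and it assumes the caller has already ruled out comments, string_literals, character_literals, and numeric_literals (so, for example, it would report `-` for the first character of a `--` comment).

```python
# Single-character delimiters from paragraph 9.
SINGLE = set("&'()*+,-./:;<=>|")
# Compound delimiters from paragraph 11.
COMPOUND = {"=>", "..", "**", ":=", "/=", ">=", "<=", "<<", ">>", "<>"}


def scan_delimiter(text: str, pos: int):
    """Return the delimiter starting at text[pos], preferring a compound
    delimiter over a single one, or None if no delimiter starts there."""
    pair = text[pos:pos + 2]
    if pair in COMPOUND:
        return pair
    if pos < len(text) and text[pos] in SINGLE:
        return text[pos]
    return None
```

For instance, scanning `X := 1` at the position of the colon yields `:=`, whereas scanning `A : B` at the colon yields `:` alone.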

13 The following names are used when referring to compound delimiters: 

delimiter   name
=>          arrow
..          double dot
**          double star, exponentiate
:=          assignment (pronounced: “becomes”)
/=          inequality (pronounced: “not equal”)
>=          greater than or equal
<=          less than or equal
<<          left label bracket
>>          right label bracket
<>          box

Implementation Requirements

14 An implementation shall support lines of at least 200 characters in length, not counting any characters used to signify the end of a line. An implementation shall support lexical elements of at least 200 characters in length. The maximum supported line length and lexical element length are implementation defined.

14.a Implementation defined: Maximum supported line length and lexical element length.

14.b Discussion: From URG recommendation.

Wording Changes from Ada 95

14.c/2 The wording was updated to use the new character categories defined in the preceding clause.

 

[aada]

Alexis Ada

In regard to 2/2:

The characters 16#0D# (CARRIAGE RETURN), 16#0A# (LINE FEED), and 16#85# (NEXT LINE) are treated as line separators. Each occurrence of one of these characters increases the line number by one, with one exception:

  • When a 16#0A# immediately follows a 16#0D#, the pair counts as a single line separator, so the line counter is increased by one rather than two. This supports files that use Microsoft Windows line endings.
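
The counting rule above, including the exception for a 16#0D# followed by a 16#0A#, could be sketched as follows. This is an illustration only and not the compiler's actual code; the function name `count_lines` is hypothetical, and it assumes counting starts at line 1.

```python
# 16#0D#, 16#0A# and 16#85# as Python string escapes.
LINE_SEPARATORS = {"\r", "\n", "\x85"}


def count_lines(text: str) -> int:
    """Return the line number reached at the end of text, starting from 1."""
    line = 1
    previous = ""
    for ch in text:
        if ch in LINE_SEPARATORS:
            # A line feed right after a carriage return belongs to the
            # same line ending, so it is not counted a second time.
            if not (ch == "\n" and previous == "\r"):
                line += 1
        previous = ch
    return line
```

Only the order 16#0D# then 16#0A# is merged; the reverse order counts as two separators in this sketch.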

The beginning of a file is line 1, page 1. The number of lines per page determines when the page number is incremented; the default is 72 lines [TODO: adjust as necessary]. A pragma can be used to change that number.

Note that this compiler accepts lines of any length. The only limit is the amount of available memory, and even then the compiler does not hold an entire line in memory at once, because each token is returned as soon as it is discovered. However, long lines are not wrapped in any way, so the compiler always counts each one as a single line on its page. A pragma [TODO: define this pragma] can request that the compiler generate a warning or an error for lines longer than a given length. The pragma allows you to specify a length of less than 200 characters (see 14).

The 16#0C# character can be used to increment the page number immediately.

The 16#0B# character (LINE TABULATION) advances the line number to the next multiple of 8. A pragma can be used to change this interval from 8 to another number of lines [TODO: define this pragma].
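
The arithmetic for the 16#0B# adjustment might look like the sketch below. It assumes that "next multiple" means the smallest multiple strictly greater than the current line number, which the text does not state explicitly; the function name `after_vertical_tab` and its `step` parameter are hypothetical.

```python
def after_vertical_tab(line: int, step: int = 8) -> int:
    """Return the line number after a 16#0B# character, assuming the
    result is the smallest multiple of step strictly greater than line."""
    return (line // step + 1) * step
```

Under that assumption, a 16#0B# on line 1 moves the counter to line 8, and one on line 8 moves it to line 16.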

A lexical element is limited to 4096 characters. The limit exists mainly because a lexical element that large is almost certainly a mistake in the first place (see 14 and 14.a).

  • 1. Correction: I removed the word "special" and added a colon, as in 8/2.