There are several kinds of tokens: identifiers, keywords, literals, operators, and punctuators. White space and comments are not tokens, though they may act as separators for tokens.
token:
identifier
keyword
integer-literal
real-literal
character-literal
string-literal
operator-or-punctuator
A Unicode character escape sequence represents a Unicode character. Unicode character escape sequences are processed in identifiers (2.4.2), character literals (2.4.4.4), and regular string literals (2.4.4.5). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).
unicode-escape-sequence:
\u hex-digit hex-digit hex-digit hex-digit
\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit
A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode characters in characters and string values, a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using two Unicode surrogate characters in a string literal. Unicode characters with code points above 0x10FFFF are not supported.
Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to "\u005C" rather than "\". (The Unicode value \u005C is the character "\".)
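The two rules above (surrogate pairs for code points above U+FFFF, and no second translation pass) can be checked with a short program. This is only an illustrative sketch: the class and method names are hypothetical, and U+1F600 is an arbitrary example code point that does not appear in the specification text.

```csharp
using System;

// Hypothetical demo class; the names are not part of the specification.
static class UnicodeEscapeDemo
{
    // One code point above U+FFFF is stored as two UTF-16 surrogate code units.
    public static string Supplementary()
    {
        return "\U0001F600";
    }

    // \u005C becomes a backslash; the "u005C" that follows is ordinary text
    // and is not scanned again as a second escape sequence.
    public static string NoSecondPass()
    {
        return "\u005Cu005C";
    }

    static void Main()
    {
        Console.WriteLine(Supplementary().Length);                   // 2
        Console.WriteLine(char.IsHighSurrogate(Supplementary()[0])); // True
        Console.WriteLine(NoSecondPass().Length);                    // 6
        Console.WriteLine(NoSecondPass() == @"\u005C");              // True
    }
}
```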
The example
class
Class1
}
shows several uses of \u0066, which is the character escape sequence for the letter "f". The program is equivalent to
class
Class1
}
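The bodies of the two listings above did not survive in this copy. As a stand-in, the following sketch shows the kind of program the text describes, assuming a method that spells its parameter with \u0066 and also uses the escape in a character literal. The member names and the method body are assumptions, not the original listing.

```csharp
using System;

class Class1
{
    // The parameter name is written with an escape: \u0066 is the letter f.
    public static string Test(bool \u0066)
    {
        char c = '\u0066';             // same value as 'f'
        return f ? c.ToString() : "";  // the parameter can be referred to as f
    }

    static void Main()
    {
        Console.WriteLine(Test(true)); // f
    }
}
```

Writing `f` in place of every `\u0066` produces an equivalent program.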
The rules for identifiers given in this section correspond exactly to those recommended by the Unicode 3.0 standard, Technical Report 15, Annex 7, except that underscore is allowed as an initial character (as is traditional in the C programming language), Unicode escape characters are permitted in identifiers, and the "@" character is allowed as a prefix to enable keywords to be used as identifiers.
identifier:
available-identifier
@ identifier-or-keyword
available-identifier:
An identifier-or-keyword that is not a keyword
identifier-or-keyword:
identifier-start-character identifier-part-characters_opt
identifier-start-character:
letter-character
_ (the underscore character)
identifier-part-characters:
identifier-part-character
identifier-part-characters identifier-part-character
identifier-part-character:
letter-character
decimal-digit-character
connecting-character
combining-character
formatting-character
letter-character:
A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
A unicode-escape-sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl
combining-character:
A Unicode character of classes Mn or Mc
A unicode-escape-sequence representing a character of classes Mn or Mc
decimal-digit-character:
A Unicode character of the class Nd
A unicode-escape-sequence representing a character of the class Nd
connecting-character:
A Unicode character of the class Pc
A unicode-escape-sequence representing a character of the class Pc
formatting-character:
A Unicode character of the class Cf
A unicode-escape-sequence representing a character of the class Cf
Examples of legal identifiers include "identifier1", "_identifier2", and "@if".
The prefix "@" enables the use of keywords as identifiers, which is useful when interfacing with other programming languages. The character @ is not actually part of the identifier, so the identifier might be seen in other languages as a normal identifier, without the prefix. Use of the @ prefix for identifiers that are not keywords is permitted, but strongly discouraged as a matter of style.
The example:
class
@class
}
class
Class1
}
defines a class named "class" with a static method named "static" that takes a parameter named "bool". Note that since Unicode escapes are not permitted in keywords, the token "cl\u0061ss" is an identifier, and is the same identifier as "@class".
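The listing above lost its body in this copy. A minimal sketch consistent with the description might look like the following; the method body and the return value are assumptions made for illustration.

```csharp
using System;

// A class literally named "class"; the @ prefix is not part of the name.
class @class
{
    public static string @static(bool @bool)
    {
        return @bool ? "true" : "false";
    }
}

class Class1
{
    static void Main()
    {
        // cl\u0061ss cannot be a keyword because it contains an escape,
        // so it is an identifier, and the same identifier as @class.
        Console.WriteLine(cl\u0061ss.st\u0061tic(true)); // true
    }
}
```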
Two identifiers are considered the same if they are identical after the following transformations are applied, in order:
The prefix "@", if used, is removed.
Each unicode-escape-sequence is transformed into its corresponding Unicode character.
Identifiers containing two consecutive underscore characters are reserved for use by the implementation. For example, an implementation may provide extended keywords that begin with two underscores.
A keyword is an identifier-like sequence of characters that is reserved, and cannot be used as an identifier except when prefaced by the @ character.
keyword: one of
abstract as base bool break
byte case catch char checked
class const continue decimal default
delegate do double else enum
event explicit extern false finally
fixed float for foreach goto
if implicit in int interface
internal is lock long namespace
new null object operator out
override params private protected public
readonly ref return sbyte sealed
short sizeof stackalloc static string
struct switch this throw true
try typeof uint ulong unchecked
unsafe ushort using virtual void
volatile while
In some places in the grammar, specific identifiers have special meaning, but are not keywords. For example, within a property declaration, the "get" and "set" identifiers have special meaning (10.6.2). An identifier other than get or set is never permitted in these locations, so this use does not conflict with a use of these words as identifiers.
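As an illustration of the point about "get" and "set", the sketch below uses both words as accessor markers inside a property and as ordinary local variable names elsewhere. The class and member names are hypothetical.

```csharp
using System;

class Counter
{
    private int count;

    public int Value
    {
        get { return count; }   // "get" has special meaning here
        set { count = value; }  // and so does "set"
    }

    public static int Demo()
    {
        int get = 1;            // outside an accessor position these
        int set = 2;            // are ordinary identifiers
        Counter c = new Counter();
        c.Value = get + set;
        return c.Value;
    }

    static void Main()
    {
        Console.WriteLine(Demo()); // 3
    }
}
```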
A literal is a source code representation of a value.
literal:
boolean-literal
integer-literal
real-literal
character-literal
string-literal
null-literal
There are two boolean literal values: true and false.
boolean-literal:
true
false
The type of a boolean-literal is bool.
Integer literals are used to write values of types int, uint, long, and ulong. Integer literals have two possible forms: decimal and hexadecimal.
integer-literal:
decimal-integer-literal
hexadecimal-integer-literal
decimal-integer-literal:
decimal-digits integer-type-suffix_opt
decimal-digits:
decimal-digit
decimal-digits decimal-digit
decimal-digit: one of
0 1 2 3 4 5 6 7 8 9
integer-type-suffix: one of
U u L l UL Ul uL ul LU Lu lU lu
hexadecimal-integer-literal:
0x hex-digits integer-type-suffix_opt
0X hex-digits integer-type-suffix_opt
hex-digits:
hex-digit
hex-digits hex-digit
hex-digit: one of
0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
The type of an integer literal is determined as follows:
If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
If the literal is suffixed by L or l, it has the first of these types in which its value can be represented: long, ulong.
If the literal is suffixed by UL, Ul, uL, ul, LU, Lu, lU, or lu, it is of type ulong.
If the value represented by an integer literal is outside the range of the ulong type, an error occurs.
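The typing rules above can be observed by boxing a literal and asking for its run-time type. This is a small sketch; the helper name TypeOf is hypothetical.

```csharp
using System;

static class IntegerLiteralTypes
{
    // Boxing preserves the literal's compile-time type, so GetType() reveals it.
    public static string TypeOf(object boxed)
    {
        return boxed.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(TypeOf(1));                   // Int32
        Console.WriteLine(TypeOf(2147483648));          // UInt32 (too large for int)
        Console.WriteLine(TypeOf(4294967296));          // Int64  (too large for uint)
        Console.WriteLine(TypeOf(9223372036854775808)); // UInt64 (too large for long)
        Console.WriteLine(TypeOf(1U));                  // UInt32
        Console.WriteLine(TypeOf(1L));                  // Int64
        Console.WriteLine(TypeOf(1UL));                 // UInt64
        Console.WriteLine(TypeOf(0xFFFFFFFF));          // UInt32
    }
}
```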
As a matter of style, it is suggested that "L" be used instead of "l" when writing literals of type long, since it is easy to confuse the letter "l" with the digit "1".
To permit the smallest possible int and long values to be written as decimal integer literals, the following two rules exist:
When a decimal-integer-literal with the value 2147483648 (2^31) and no integer-type-suffix appears as the token immediately following a unary minus operator token (7.6.2), the result is a constant of type int with the value −2147483648 (−2^31). In all other situations, such a decimal-integer-literal is of type uint.
When a decimal-integer-literal with the value 9223372036854775808 (2^63) and no integer-type-suffix appears as the token immediately following a unary minus operator token (7.6.2), the result is a constant of type long with the value −9223372036854775808 (−2^63). In all other situations, such a decimal-integer-literal is of type ulong.
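A short sketch of these two special cases follows; the class and method names are hypothetical.

```csharp
using System;

static class MinValueLiterals
{
    // Legal only because the literal immediately follows a unary minus.
    public static int SmallestInt()
    {
        return -2147483648;
    }

    public static long SmallestLong()
    {
        return -9223372036854775808;
    }

    static void Main()
    {
        Console.WriteLine(SmallestInt() == int.MinValue);   // True
        Console.WriteLine(SmallestLong() == long.MinValue); // True

        // Without the leading minus, the same digits form a uint literal.
        object o = 2147483648;
        Console.WriteLine(o.GetType().Name);                // UInt32
    }
}
```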
Real literals are used to write values of types float, double, and decimal.
real-literal:
decimal-digits . decimal-digits exponent-part_opt real-type-suffix_opt
. decimal-digits exponent-part_opt real-type-suffix_opt
decimal-digits exponent-part real-type-suffix_opt
decimal-digits real-type-suffix
exponent-part:
e sign_opt decimal-digits
E sign_opt decimal-digits
sign: one of
+ -
real-type-suffix: one of
F f D d M m
If no real type suffix is specified, the type of the real literal is double. Otherwise, the real type suffix determines the type of the real literal, as follows:
A real literal suffixed by F or f is of type float. For example, the literals 1f, 1.5f, 1e10f, and 123.456F are all of type float.
A real literal suffixed by D or d is of type double. For example, the literals 1d, 1.5d, 1e10d, and 123.456D are all of type double.
A real literal suffixed by M or m is of type decimal. For example, the literals 1m, 1.5m, 1e10m, and 123.456M are all of type decimal.
If the specified literal cannot be represented in the indicated type, then a compile-time error occurs.
The value of a real literal is determined by using the IEEE "round to nearest" mode.
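The effect of the suffix on a real literal's type can be seen with the same boxing trick used for integers. This is a sketch, and the helper name TypeOf is hypothetical.

```csharp
using System;

static class RealLiteralTypes
{
    public static string TypeOf(object boxed)
    {
        return boxed.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(TypeOf(1.5));   // Double (no suffix)
        Console.WriteLine(TypeOf(1e10));  // Double (exponent, no suffix)
        Console.WriteLine(TypeOf(1f));    // Single
        Console.WriteLine(TypeOf(1.5d));  // Double
        Console.WriteLine(TypeOf(1.5m));  // Decimal
        Console.WriteLine(TypeOf(1));     // Int32 (an integer literal, not a real literal)
    }
}
```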
A character literal represents a single character, and usually consists of a character in quotes, as in 'a'.
character-literal:
' character '
character:
single-character
simple-escape-sequence
hexadecimal-escape-sequence
unicode-escape-sequence
single-character:
Any character except ' (U+0027), \ (U+005C), and new-line-character
simple-escape-sequence: one of
\' \" \\ \0 \a \b \f \n \r \t \v
hexadecimal-escape-sequence:
\x hex-digit hex-digit_opt hex-digit_opt hex-digit_opt
A character that follows a backslash character (\) in a character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.
A simple escape sequence represents a Unicode character encoding, as described in the table below.
Escape sequence | Character name | Unicode encoding |
\' | Single quote | 0x0027 |
\" | Double quote | 0x0022 |
\\ | Backslash | 0x005C |
\0 | Null | 0x0000 |
\a | Alert | 0x0007 |
\b | Backspace | 0x0008 |
\f | Form feed | 0x000C |
\n | New line | 0x000A |
\r | Carriage return | 0x000D |
\t | Horizontal tab | 0x0009 |
\v | Vertical tab | 0x000B |
A hexadecimal escape sequence represents a single Unicode character, with the value formed by the hexadecimal number following "\x".
If the value represented by a character literal is greater than U+FFFF, an error occurs.
A Unicode character escape sequence (2.4.1) in a character literal must be in the range U+0000 to U+FFFF. Unicode characters in the range U+10000 to U+10FFFF are only permitted in string literals and are encoded as two Unicode "surrogate" characters.
The type of a character-literal is char.
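The different spellings of a character literal can be compared directly. In this sketch the class and method names are hypothetical, and the numeric values printed are the encodings for horizontal tab, backslash, and null.

```csharp
using System;

static class CharLiteralDemo
{
    // The same character written plainly, as a hexadecimal escape,
    // and as a Unicode escape.
    public static bool AllSpellingsOfA()
    {
        return 'A' == '\x41' && 'A' == '\u0041';
    }

    static void Main()
    {
        Console.WriteLine(AllSpellingsOfA()); // True
        Console.WriteLine((int)'\t');         // 9
        Console.WriteLine((int)'\\');         // 92
        Console.WriteLine((int)'\0');         // 0
    }
}
```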
C# supports two forms of string literals: regular string literals and verbatim string literals.
A regular string literal consists of zero or more characters enclosed in double quotes, as in "hello", and may include simple escape sequences (such as \t for the tab character), hexadecimal escape sequences, and Unicode escape sequences.
A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. A simple example is @"hello". In a verbatim string literal, the characters between the delimiters are interpreted verbatim, the only exception being a quote-escape-sequence. In particular, simple escape sequences, hexadecimal escape sequences, and Unicode character escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.
string-literal:
regular-string-literal
verbatim-string-literal
regular-string-literal:
" regular-string-literal-characters_opt "
regular-string-literal-characters:
regular-string-literal-character
regular-string-literal-characters regular-string-literal-character
regular-string-literal-character:
single-regular-string-literal-character
simple-escape-sequence
hexadecimal-escape-sequence
unicode-escape-sequence
single-regular-string-literal-character:
Any character except " (U+0022), \ (U+005C), and new-line-character
verbatim-string-literal:
@" verbatim-string-literal-characters_opt "
verbatim-string-literal-characters:
verbatim-string-literal-character
verbatim-string-literal-characters verbatim-string-literal-character
verbatim-string-literal-character:
single-verbatim-string-literal-character
quote-escape-sequence
single-verbatim-string-literal-character:
any character except "
quote-escape-sequence:
""
A character that follows a backslash character (\) in a regular-string-literal-character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.
The example
string a = "hello, world"; // hello, world
string b = @"hello, world"; // hello, world
string c = "hello \t world"; // hello      world
string d = @"hello \t world"; // hello \t world
string e = "Joe said \"Hello\" to me"; // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me"; // Joe said "Hello" to me
string g = "\\\\server\\share\\file.txt"; // \\server\share\file.txt
string h = @"\\server\share\file.txt"; // \\server\share\file.txt
string i = "one\ntwo\nthree";
string j = @"one
two
three";
shows a variety of string literals. The last string literal, j, is a verbatim string literal that spans multiple lines. The characters between the quotation marks, including white space such as newline characters, are preserved verbatim.
Since a hexadecimal escape sequence can have a variable number of hex digits, the string literal "\x123" contains a single character with hex value 123. To have a string containing the two characters with hex values 12 and 3, respectively, one could write "\x00123" or "\x12" + "3" instead.
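The variable-length behaviour of hexadecimal escapes can be confirmed by checking string lengths. This is a sketch with hypothetical names.

```csharp
using System;

static class HexEscapeDemo
{
    // \x consumes up to four hex digits, so all of "123" belongs to the escape.
    public static string OneChar()
    {
        return "\x123";
    }

    // Four digits (0012) are consumed; the trailing '3' is a separate character.
    public static string TwoChars()
    {
        return "\x00123";
    }

    // Ending the literal also ends the escape.
    public static string Concat()
    {
        return "\x12" + "3";
    }

    static void Main()
    {
        Console.WriteLine(OneChar().Length);       // 1
        Console.WriteLine((int)OneChar()[0]);      // 291 (0x123)
        Console.WriteLine(TwoChars().Length);      // 2
        Console.WriteLine(TwoChars() == Concat()); // True
    }
}
```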
The type of a string-literal is string.
Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator (7.9.7) appear in the same assembly, these string literals refer to the same string instance. For instance, the output of the program
class Test
}
is True because the two literals refer to the same string instance.
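The body of the listing above is missing from this copy. A sketch that matches the description compares two equal literals through object references, so that the == operator tests reference identity rather than string contents. The member names are assumptions.

```csharp
using System;

class Test
{
    public static bool SameInstance()
    {
        object a = "hello";
        object b = "hello";
        // Both operands have static type object, so == compares references.
        return a == b;
    }

    static void Main()
    {
        Console.WriteLine(SameInstance()); // True
    }
}
```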
null-literal:
null
The type of a null-literal is the null type.
There are several kinds of operators and punctuators. Operators are used in expressions to describe operations involving one or more operands. For example, the expression a + b uses the + operator to add the two operands a and b. Punctuators are for grouping and separating.
operator-or-punctuator: one of
{ } [ ] ( ) . , : ;
+ - * / % & | ^ ! ~
= < > ? ++ -- && || << >>
== != <= >= += -= *= /= %= &=
|= ^= <<= >>= ->