man*/: srcfix (Use .P instead of .PP or .LP)

diff --git a/man7/utf-8.7 b/man7/utf-8.7

index 015d4b746b0aff5ae9439473fa492aa0349bb06b..2ea14b2e41ed383dc79f758828ad6ca67bb84a3a 100644
--- a/man7/utf-8.7
+++ b/man7/utf-8.7
@@ -27,7 +27,7 @@ The ISO/IEC 10646 Universal Character Set (UCS),
  a superset of Unicode, occupies an even larger code
  space\[em]31\ bits\[em]and the obvious
  UCS-4 encoding for it (a sequence of 32-bit words) has the same problems.
-.PP
+.P
  The UTF-8 encoding of Unicode and UCS
  does not have these problems and is the common way in which
  Unicode is used on UNIX-style operating systems.
@@ -110,14 +110,14 @@ The sequence to be used depends on the UCS code number of the character:
  .RI 10 xxxxxx
  .RI 10 xxxxxx
  .RI 10 xxxxxx
-.PP
+.P
  The
  .I xxx
  bit positions are filled with the bits of the character code number in
  binary representation, most significant bit first (big-endian).
  Only the shortest possible multibyte sequence
  which can represent the code number of the character can be used.
-.PP
+.P
  The UCS code values 0xd800\[en]0xdfff (UTF-16 surrogates) as well as 0xfffe and
  0xffff (UCS noncharacters) should not appear in conforming UTF-8 streams.
  According to RFC 3629 no point above U+10FFFF should be used,
@@ -125,44 +125,44 @@ which limits characters to four bytes.
  .SS Example
  The Unicode character 0xa9 = 1010 1001 (the copyright sign) is encoded
  in UTF-8 as
-.PP
+.P
  .RS
  11000010 10101001 = 0xc2 0xa9
  .RE
-.PP
+.P
  and character 0x2260 = 0010 0010 0110 0000 (the "not equal" symbol) is
  encoded as:
-.PP
+.P
  .RS
  11100010 10001001 10100000 = 0xe2 0x89 0xa0
  .RE
  .SS Application notes
  Users have to select a UTF-8 locale, for example with
-.PP
+.P
  .RS
  export LANG=en_GB.UTF-8
  .RE
-.PP
+.P
  in order to activate the UTF-8 support in applications.
-.PP
+.P
  Application software that has to be aware of the used character
  encoding should always set the locale with for example
-.PP
+.P
  .RS
  setlocale(LC_CTYPE, "")
  .RE
-.PP
+.P
  and programmers can then test the expression
-.PP
+.P
  .RS
  strcmp(nl_langinfo(CODESET), "UTF-8") == 0
  .RE
-.PP
+.P
  to determine whether a UTF-8 locale has been selected and whether
  therefore all plaintext standard input and output, terminal
  communication, plaintext file content, filenames, and environment
  variables are encoded in UTF-8.
-.PP
+.P
  Programmers accustomed to single-byte encodings such as US-ASCII or ISO 8859
  have to be aware that two assumptions made so far are no longer valid
  in UTF-8 locales.
@@ -178,7 +178,7 @@ Library functions such as
  and
  .BR wcswidth (3)
  should be used today to count characters and cursor positions.
-.PP
+.P
  The official ESC sequence to switch from an ISO 2022
  encoding scheme (as used for instance by VT100 terminals) to
  UTF-8 is ESC % G