# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{d800}\=offset=3
No match
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}
Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}
Failed: error -26: UTF-16 error: isolated low surrogate at offset 2
XX\x{dfff}\=no_utf_check
0: X
XX\x{110000}
** Failed: character \x{110000} is greater than 0x10ffff and so cannot be converted to UTF-16
XX\x{d800}\x{1234}
Failed: error -25: UTF-16 error: invalid low surrogate at offset 2
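The three UTF-16 failures above (-24, -25, -26) can be reproduced with a small validity check over 16-bit code units. The following is a minimal Python sketch, not the library's implementation: the function name `check_utf16` and the tuple layout are hypothetical, and only the error numbers and messages are taken from the output above.

```python
from typing import List, Optional, Tuple

# Error codes and messages copied from the expected output above.
MISSING_LOW_AT_END = (-24, "missing low surrogate at end")
INVALID_LOW = (-25, "invalid low surrogate")
ISOLATED_LOW = (-26, "isolated low surrogate")


def check_utf16(units: List[int]) -> Optional[Tuple[int, str, int]]:
    """Return (code, message, offset) for the first bad code unit, or None."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: a low surrogate must follow immediately.
            if i + 1 == len(units):
                return MISSING_LOW_AT_END + (i,)
            if not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return INVALID_LOW + (i,)
            i += 2  # well-formed surrogate pair
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate that is not preceded by a high surrogate.
            return ISOLATED_LOW + (i,)
        else:
            i += 1  # ordinary BMP code unit
    return None


if __name__ == "__main__":
    x = ord("X")
    print(check_utf16([x, x, 0xD800]))          # subject XX\x{d800}
    print(check_utf16([x, x, 0xDC00]))          # subject XX\x{dc00}
    print(check_utf16([x, x, 0xD800, 0x1234]))  # subject XX\x{d800}\x{1234}
```

The sketch always checks the whole subject from offset 0. It does not model the `offset=3` or `no_utf_check` modifiers, which in the tests above avoid the error.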
/badutf/utf
X\xdf
No match
XX\xef
No match
XXX\xef\x80
No match
X\xf7
No match
XX\xf7\x80
No match
XXX\xf7\x80\x80
No match
/shortutf/utf
XX\xdf\=ph
No match
XX\xef\=ph
No match
XX\xef\x80\=ph
No match
\xf7\=ph
No match
\xf7\x80\=ph
No match
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
1: \xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
1: \xe1\xe1
/\x{120}\x{c1}/i,ucp,no_start_optimize
\x{121}\x{e1}
0: \x{121}\xe1
/\x{120}\x{c1}/i,ucp
\x{121}\x{e1}
0: \x{121}\xe1
/[^\x{120}]/i,no_start_optimize
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp,no_start_optimize
\= Expect no match
\x{121}
No match
/[^\x{120}]/i
\x{121}
0: \x{121}
/[^\x{120}]/i,ucp
\= Expect no match
\x{121}
No match
/\x{120}{2}/i,ucp
\x{121}\x{121}
0: \x{121}\x{121}
/[^\x{120}]{2}/i,ucp
\= Expect no match
\x{121}\x{121}
No match
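The casing results above follow one pattern: without `ucp`, caseless matching does not pair U+00C1 with U+00E1 or U+0120 with U+0121, and with `ucp` it does. A rough Python model of that behaviour is sketched below. It assumes that, without `ucp`, only ASCII letters are case-folded, and it uses `str.lower()` as a stand-in for Unicode case folding. All function names are hypothetical and none of this is the library's own code.

```python
def fold(ch: str, ucp: bool) -> str:
    """Lower-case one character; without ucp only ASCII characters are folded."""
    if ucp or ch.isascii():
        return ch.lower()
    return ch


def caseless_equal(a: str, b: str, ucp: bool) -> bool:
    """Model of a single-character caseless comparison."""
    return fold(a, ucp) == fold(b, ucp)


def negated_class_matches(excluded: str, ch: str, ucp: bool) -> bool:
    """Model of /[^excluded]/i applied to the single character ch."""
    return not caseless_equal(excluded, ch, ucp)


if __name__ == "__main__":
    # /\x{c1}/i against \x{e1}: no match without ucp.
    print(caseless_equal("\u00c1", "\u00e1", ucp=False))
    # /[^\x{120}]/i against \x{121}: matches without ucp, fails with ucp.
    print(negated_class_matches("\u0120", "\u0121", ucp=False))
    print(negated_class_matches("\u0120", "\u0121", ucp=True))
```

The negated class is the interesting case: once `ucp` makes U+0120 and U+0121 case-equivalent, `[^\x{120}]` excludes both, which is why those tests are marked `Expect no match`.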
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 0xffff and UTF-16 mode is not enabled.
** Truncation will probably give the wrong result.
No match
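The truncation warnings above appear because 0xffffffff does not fit in one 16-bit code unit. The Python sketch below shows the arithmetic, assuming truncation keeps the low-order bits (as a C conversion to an unsigned 16-bit type would). The helper name `to_code_unit` is hypothetical.

```python
from typing import Tuple


def to_code_unit(value: int, width_bits: int = 16) -> Tuple[int, bool]:
    """Return (stored value, was_truncated) for a code unit of the given width."""
    limit = (1 << width_bits) - 1   # 0xffff for 16-bit code units
    # Assumption: truncation keeps only the low-order bits.
    return value & limit, value > limit


if __name__ == "__main__":
    stored, truncated = to_code_unit(0xFFFFFFFF, 16)
    print(hex(stored), truncated)
```

Under that assumption each `\x{ffffffff}` in the subject is stored as 0xffff in 16-bit mode, so the subject no longer contains the character the 32-bit test was written for.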
# ----------------------------------------------------
# End of testinput14