# These test special UTF and UCP features of DFA matching. The output is
# different for the different widths.
#subject dfa
# ----------------------------------------------------
# These are a selection of the more comprehensive tests that are run for
# non-DFA matching.
/X/utf
XX\x{d800}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{d800}\=offset=3
Error -36 (bad UTF-8 offset)
XX\x{d800}\=no_utf_check
0: X
XX\x{da00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{da00}\=no_utf_check
0: X
XX\x{dc00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dc00}\=no_utf_check
0: X
XX\x{de00}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{de00}\=no_utf_check
0: X
XX\x{dfff}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
XX\x{dfff}\=no_utf_check
0: X
XX\x{110000}
Failed: error -15: UTF-8 error: code points greater than 0x10ffff are not defined at offset 2
XX\x{d800}\x{1234}
Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
/badutf/utf
X\xdf
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 1
XX\xef
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xef\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
X\xf7
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 1
XX\xf7\x80
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XXX\xf7\x80\x80
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 3
/shortutf/utf
XX\xdf\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
XX\xef\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 2
XX\xef\x80\=ph
Failed: error -3: UTF-8 error: 1 byte missing at end at offset 2
\xf7\=ph
Failed: error -5: UTF-8 error: 3 bytes missing at end at offset 0
\xf7\x80\=ph
Failed: error -4: UTF-8 error: 2 bytes missing at end at offset 0
# ----------------------------------------------------
# UCP and casing tests - except for the first two, these will all fail in 8-bit
# mode because they are testing UCP without UTF and use characters > 255.
/\x{c1}/i,no_start_optimize
\= Expect no match
\x{e1}
No match
/\x{c1}+\x{e1}/iB,ucp
------------------------------------------------------------------
Bra
/i \x{c1}+
/i \x{e1}
Ket
End
------------------------------------------------------------------
\x{c1}\x{c1}\x{c1}
0: \xc1\xc1\xc1
1: \xc1\xc1
\x{e1}\x{e1}\x{e1}
0: \xe1\xe1\xe1
1: \xe1\xe1
/\x{120}\x{c1}/i,ucp,no_start_optimize
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{e1}
/\x{120}\x{c1}/i,ucp
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{e1}
/[^\x{120}]/i,no_start_optimize
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\x{121}
/[^\x{120}]/i,ucp,no_start_optimize
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}
/[^\x{120}]/i
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\x{121}
/[^\x{120}]/i,ucp
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}
/\x{120}{2}/i,ucp
Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too large
\x{121}\x{121}
/[^\x{120}]{2}/i,ucp
Failed: error 134 at offset 8: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{121}\x{121}
# ----------------------------------------------------
# ----------------------------------------------------
# Tests for handling 0xffffffff in caseless UCP mode. They only apply to 32-bit
# mode; for the other widths they will fail.
/k*\x{ffffffff}/caseless,ucp
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
\x{ffffffff}
/k+\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 13: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}
/k{2}\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 15: character code point value in \x{} or \o{} is too large
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k\x{ffffffff}/caseless,ucp,no_start_optimize
Failed: error 134 at offset 12: character code point value in \x{} or \o{} is too large
K\x{ffffffff}
\= Expect no match
\x{ffffffff}\x{ffffffff}\x{ffffffff}
/k{2,}?Z/caseless,ucp,no_start_optimize,no_auto_possess
\= Expect no match
Kk\x{ffffffff}\x{ffffffff}\x{ffffffff}Z
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
** Character \x{ffffffff} is greater than 255 and UTF-8 mode is not enabled.
** Truncation will probably give the wrong result.
No match
# ----------------------------------------------------
# End of testinput14