Some potentially naughty strings

github.com

Scunthorpe Problem

Innocuous strings which may be blocked by profanity filters (https://en.wikipedia.org/wiki/Scunthorpe_problem)

Scunthorpe General Hospital

Penistone Community Church

Lightwater Country Park

Jimmy Clitheroe

Horniman Museum

shitake mushrooms

RomansInSussex.co.uk

http://www.cum.qc.ca/

Craig Cockburn, Software Specialist

Linda Callahan

Dr. Herman I. Libshitz

magna cum laude

Super Bowl XXX

medieval erection of parapets

evaluate

mocha

expression

Arsenal canal

classic

Tyson Gay

Dick Van Dyke

basement
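A minimal Python sketch of why entries like these get blocked, assuming a hypothetical two-word blocklist and hypothetical helper names (naive_flag, word_boundary_flag). Real filters differ, and word-boundary matching is only one possible mitigation: it still flags any entry where a listed term stands as a whole word.

```python
import re

# Hypothetical, deliberately tiny blocklist used only for illustration.
BLOCKLIST = ["ass", "sex"]

def naive_flag(text):
    """Flag text if any blocked term appears anywhere as a substring."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

# Same terms, but only matched when they stand alone as whole words.
WORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in BLOCKLIST) + r")\b",
    re.IGNORECASE,
)

def word_boundary_flag(text):
    """Flag text only if a blocked term appears as a whole word."""
    return WORD_RE.search(text) is not None

if __name__ == "__main__":
    for sample in ["classic", "RomansInSussex.co.uk"]:
        print(sample, naive_flag(sample), word_boundary_flag(sample))
```

Both samples are flagged by the substring check and pass the word-boundary check.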

alert(0)

<script>alert('1');</script>

123<1>alert(3)

">alert(4)

'>alert(5)

alert(6)

alert(7)

< / script >< script >alert(8)< / script >

onfocus=JaVaSCript:alert(9) autofocus

" onfocus=JaVaSCript:alert(10) autofocus

' onfocus=JaVaSCript:alert(11) autofocus

<script>alert(12)</script>

<script>alert(13)</script>

-->alert(14)

";alert(15);t="

';alert(16);t='

JavaSCript:alert(17)

;alert(18);

src=JaVaSCript:prompt(19)

">alert(20);</script x="

'>alert(21);</script x='

alert(22);

" autofocus onkeyup="javascript:alert(23)

' autofocus onkeyup='javascript:alert(24)

<script\x20type="text/javascript">javascript:alert(25);

<script\x3Etype="text/javascript">javascript:alert(26);

<script\x0Dtype="text/javascript">javascript:alert(27);

<script\x09type="text/javascript">javascript:alert(28);

<script\x0Ctype="text/javascript">javascript:alert(29);

<script\x2Ftype="text/javascript">javascript:alert(30);

<script\x0Atype="text/javascript">javascript:alert(31);

'`"><\x3Cscript>javascript:alert(32)

'`"><\x00script>javascript:alert(33)

ABC

DEF

test

`"'><img src=xxx:x \x0Aonerror=javascript:alert(118)>

`"'><img src=xxx:x \x22onerror=javascript:alert(119)>

`"'><img src=xxx:x \x0Bonerror=javascript:alert(120)>

`"'><img src=xxx:x \x0Donerror=javascript:alert(121)>

`"'><img src=xxx:x \x2Fonerror=javascript:alert(122)>

`"'><img src=xxx:x \x09onerror=javascript:alert(123)>

`"'><img src=xxx:x \x0Conerror=javascript:alert(124)>

`"'><img src=xxx:x \x00onerror=javascript:alert(125)>

`"'><img src=xxx:x \x27onerror=javascript:alert(126)>

`"'><img src=xxx:x \x20onerror=javascript:alert(127)>

"`'>\x3Bjavascript:alert(128)

"`'>\x0Djavascript:alert(129)

"`'>\xEF\xBB\xBFjavascript:alert(130)

"`'>\xE2\x80\x81javascript:alert(131)

"`'>\xE2\x80\x84javascript:alert(132)

"`'>\xE3\x80\x80javascript:alert(133)

"`'>\x09javascript:alert(134)

"`'>\xE2\x80\x89javascript:alert(135)

"`'>\xE2\x80\x85javascript:alert(136)

"`'>\xE2\x80\x88javascript:alert(137)

"`'>\x00javascript:alert(138)

"`'>\xE2\x80\xA8javascript:alert(139)

"`'>\xE2\x80\x8Ajavascript:alert(140)

"`'>\xE1\x9A\x80javascript:alert(141)

"`'>\x0Cjavascript:alert(142)

"`'>\x2Bjavascript:alert(143)

"`'>\xF0\x90\x96\x9Ajavascript:alert(144)

"`'>-javascript:alert(145)

"`'>\x0Ajavascript:alert(146)

"`'>\xE2\x80\xAFjavascript:alert(147)

"`'>\x7Ejavascript:alert(148)

"`'>\xE2\x80\x87javascript:alert(149)

"`'>\xE2\x81\x9Fjavascript:alert(150)

"`'>\xE2\x80\xA9javascript:alert(151)

"`'>\xC2\x85javascript:alert(152)

"`'>\xEF\xBF\xAEjavascript:alert(153)

"`'>\xE2\x80\x83javascript:alert(154)

"`'>\xE2\x80\x8Bjavascript:alert(155)

"`'>\xEF\xBF\xBEjavascript:alert(156)

"`'>\xE2\x80\x80javascript:alert(157)

"`'>\x21javascript:alert(158)

"`'>\xE2\x80\x82javascript:alert(159)

"`'>\xE2\x80\x86javascript:alert(160)

"`'>\xE1\xA0\x8Ejavascript:alert(161)

"`'>\x0Bjavascript:alert(162)

"`'>\x20javascript:alert(163)

"`'>\xC2\xA0javascript:alert(164)

<img \x00s
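The script-injection strings above are meant to be pasted into inputs that are later rendered as HTML. As a rough sketch, assuming the output context is HTML text or a quoted attribute value, Python's standard html.escape neutralizes two of them. It does not cover other contexts such as javascript: URLs or inline script bodies, so treat this as an illustration rather than a complete defense.

```python
import html

# Two payloads taken from the list above.
payloads = [
    "<script>alert('1');</script>",
    '" onfocus=JaVaSCript:alert(10) autofocus',
]

# html.escape replaces &, <, > and (with the default quote=True) " and '.
escaped = [html.escape(p) for p in payloads]

if __name__ == "__main__":
    for before, after in zip(payloads, escaped):
        print(before, "->", after)
```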

Special Characters

ASCII punctuation. All of these characters may need to be escaped in some contexts. Divided into three groups based on (US-layout) keyboard position.

,./;'[]-=

<>?:"{}|_+

!@#$%^&*()`~

Non-whitespace C0 controls: U+0001 through U+0008, U+000E through U+001F, and U+007F (DEL)

Often forbidden to appear in various text-based file formats (e.g. XML), or reused for internal delimiters on the theory that they should never appear in input.
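As a small sketch of the XML case, the code below rebuilds the C0 control string from the ranges stated above and shows that Python's bundled XML parser rejects a document containing U+0001. The helper name parses_as_xml is hypothetical, and only U+0001 is checked here; behavior for other characters in the set is not claimed.

```python
import xml.etree.ElementTree as ET

# U+0001..U+0008, U+000E..U+001F and U+007F, as listed above.
C0_CONTROLS = "".join(
    chr(cp) for cp in [*range(0x01, 0x09), *range(0x0E, 0x20), 0x7F]
)

def parses_as_xml(text):
    """Return True if text can be embedded verbatim in a tiny XML document."""
    try:
        ET.fromstring(f"<a>{text}</a>")
        return True
    except ET.ParseError:
        return False

if __name__ == "__main__":
    print(len(C0_CONTROLS))
    print(parses_as_xml("plain text"), parses_as_xml("\x01"))
```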

The next line may appear to be blank or mojibake in some viewers.



Non-whitespace C1 controls: U+0080 through U+0084 and U+0086 through U+009F.

Commonly misinterpreted as additional graphic characters.

The next line may appear to be blank, mojibake, or dingbats in some viewers.

€‚ƒ„†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ
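The misinterpretation mentioned above can be reproduced with a short sketch: the same bytes decode to C1 control characters under ISO-8859-1 (latin-1) but to graphic characters under Windows-1252 (cp1252). Only three byte values are used here, because Python's cp1252 codec leaves a few bytes in this range undefined.

```python
import unicodedata

# Three bytes from the 0x80..0x9F range.
raw = bytes([0x80, 0x82, 0x83])

# latin-1 maps each byte to the code point with the same number (C1 controls).
as_c1 = raw.decode("latin-1")

# cp1252 assigns graphic characters to the same byte values.
as_cp1252 = raw.decode("cp1252")

if __name__ == "__main__":
    print([f"U+{ord(c):04X} {unicodedata.category(c)}" for c in as_c1])
    print(as_cp1252)
```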

Whitespace: all of the characters with category Zs, Zl, or Zp (in Unicode version 8.0.0), plus U+0009 (HT), U+000B (VT), U+000C (FF), U+0085 (NEL), and U+200B (ZERO WIDTH SPACE), which are in the C categories but are treated as whitespace in some contexts.

This file unfortunately cannot express strings containing U+0000, U+000A, or U+000D (NUL, LF, CR).

The next line may appear to be blank or mojibake in some viewers.

The next line may be flagged for "trailing whitespace" in some viewers.

	

…
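Whether a character counts as whitespace depends on who is asking. The sketch below, which uses only the standard unicodedata module, prints the general category of a few of the characters named above next to Python's own str.isspace() verdict. Python is just one example of a context; other languages and libraries draw the line differently.

```python
import unicodedata

# A few of the code points named in the description above.
CODE_POINTS = [0x0009, 0x000B, 0x000C, 0x0020, 0x0085, 0x200B, 0x2028, 0x2029]

def describe(cp):
    """Return (code point label, general category, str.isspace() result)."""
    ch = chr(cp)
    return f"U+{cp:04X}", unicodedata.category(ch), ch.isspace()

if __name__ == "__main__":
    for cp in CODE_POINTS:
        print(*describe(cp))
```

In Python, U+200B is not whitespace, so "\u200bhi\u200b".strip() leaves the zero width spaces in place.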

Unicode additional control characters: all of the characters with general category Cf (in Unicode 8.0.0).
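One way an application might handle these invisible characters is to drop everything in category Cf before comparing or displaying text. The helper name strip_format_chars is hypothetical, and removing Cf characters is a design choice rather than a universal rule: some of them, such as the zero width joiner, are meaningful in certain scripts and emoji sequences.

```python
import unicodedata

def strip_format_chars(text):
    """Remove every character whose general category is Cf (format)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

if __name__ == "__main__":
    # U+00AD SOFT HYPHEN and U+200D ZERO WIDTH JOINER are both category Cf.
    sample = "ab\u200dc\u00ad"
    print(len(sample), len(strip_format_chars(sample)))
```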

The next line may appear to be blank or mojibake in some viewers.

­؀؁؂؃؄؅؜۝܏᠎‌‍‏‪‫‬‭‮⁠⁡⁢⁣⁤⁦⁧⁨⁩𑂽𛲠𛲡𛲢𛲣𝅳𝅴𝅵𝅶𝅷𝅸𝅹𝅺󠀁󠀠󠀡󠀢󠀣󠀤󠀥󠀦󠀧󠀨󠀩󠀪󠀫󠀬󠀭󠀮󠀯󠀰󠀱󠀲󠀳󠀴󠀵󠀶󠀷󠀸󠀹󠀺󠀻󠀼󠀽󠀾󠀿󠁀󠁁󠁂󠁃󠁄󠁅󠁆󠁇󠁈󠁉󠁊󠁋󠁌󠁍󠁎󠁏󠁐󠁑󠁒󠁓󠁔󠁕󠁖󠁗󠁘󠁙󠁚󠁛󠁜󠁝󠁞󠁟󠁠󠁡󠁢󠁣󠁤󠁥󠁦󠁧󠁨󠁩󠁪󠁫󠁬󠁭󠁮󠁯󠁰󠁱󠁲󠁳󠁴󠁵󠁶󠁷󠁸󠁹󠁺󠁻󠁼󠁽󠁾󠁿

"Byte order marks", U+FEFF and U+FFFE, each on its own line.

The next two lines may appear to be blank or mojibake in some viewers.
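A short sketch of how a leading byte order mark can leak into decoded text: Python's plain utf-8 codec keeps U+FEFF as the first character, while the utf-8-sig codec strips it. The payload "hello" is an arbitrary example value.

```python
import codecs

# UTF-8 encoded bytes that start with a byte order mark (EF BB BF).
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

kept = data.decode("utf-8")          # BOM survives as U+FEFF
stripped = data.decode("utf-8-sig")  # BOM is removed

if __name__ == "__main__":
    print(repr(kept), repr(stripped))
```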

Unicode Symbols

Strings which contain common unicode symbols (e.g. smart quotes)

Ω≈ç√∫˜µ≤≥÷

åß∂ƒ©˙∆˚¬…æ

œ∑´®†¥¨ˆøπ“‘

¡™£¢∞§¶•ªº–≠

¸˛Ç◊ı˜Â¯˘¿

ÅÍÎÏ˝ÓÔÒÚÆ☃

Œ„´‰ˇÁ¨ˆØ∏”’

`⁄€‹›fifl‡°·‚—±

⅛⅜⅝⅞

ЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя

٠١٢٣٤٥٦٧٨٩

Unicode Subscript/Superscript/Accents

Strings which contain unicode subscripts/superscripts; can cause rendering issues

⁰⁴⁵

₀₁₂

⁰⁴⁵₀₁₂

ด้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็ ด้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็ ด้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็็้้้้้้้้็็็็็้้้้้็็็็

Quotation Marks

Strings which contain misplaced quotation marks; can cause encoding errors

'

"

''

""

'"'

"''''"'"

"'"'"''''"

<foo val=`bar' />
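Quote-heavy strings like these typically break code that builds queries or markup by string concatenation. As a hedged illustration, the sketch below stores several of them in an in-memory SQLite database using parameter placeholders, so no manual quoting is involved. The table name and column name are made up for the example.

```python
import sqlite3

# Several of the strings from the list above.
SAMPLES = ["'", '"', "''", '""', "'\"'", "<foo val=`bar' />"]

def round_trip(values):
    """Insert values with placeholders and read them back in insertion order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE samples (s TEXT)")
        # The ? placeholder lets the driver pass each value separately from the SQL.
        conn.executemany("INSERT INTO samples (s) VALUES (?)", [(v,) for v in values])
        return [row[0] for row in conn.execute("SELECT s FROM samples ORDER BY rowid")]
    finally:
        conn.close()

if __name__ == "__main__":
    print(round_trip(SAMPLES) == SAMPLES)
```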

Two-Byte Characters

Strings which contain two-byte characters: can cause rendering issues or character-length issues

田中さんにあげて下さい

パーティーへ行かないか

和製漢語

部落格

사회과학원 어학연구소

찦차를 타고 온 펲시맨과 쑛다리 똠방각하

社會科學院語學研究所

울란바토르

𠜎𠜱𠝹𠱓𠱸𠲖𠳏
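The character-length issue can be sketched as follows: cutting encoded bytes at a fixed limit may split a character in the middle. The helper truncate_utf8 is a hypothetical name, and dropping the incomplete tail with errors="ignore" is one possible policy, not the only one.

```python
def truncate_utf8(text, max_bytes):
    """Cut text to at most max_bytes of UTF-8 without leaving a partial character."""
    clipped = text.encode("utf-8")[:max_bytes]
    # An incomplete trailing sequence is silently dropped.
    return clipped.decode("utf-8", errors="ignore")

if __name__ == "__main__":
    sample = "和製漢語"  # four characters, three UTF-8 bytes each
    print(len(sample), len(sample.encode("utf-8")))
    print(truncate_utf8(sample, 7))
```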

Strings which contain letters outside the Basic Multilingual Plane (two 16-bit code units each in UTF-16): can cause issues with naïve UTF-16 capitalizers which think that 16 bits == 1 character

𐐜 𐐔𐐇𐐝𐐀𐐡𐐇𐐓 𐐙𐐊𐐡𐐝𐐓/𐐝𐐇𐐗𐐊𐐤𐐔 𐐒𐐋𐐗 𐐒𐐌 𐐜 𐐡𐐀𐐖𐐇𐐤𐐓𐐝 𐐱𐑂 𐑄 𐐔𐐇𐐝𐐀𐐡𐐇𐐓 𐐏𐐆𐐅𐐤𐐆𐐚𐐊𐐡𐐝𐐆𐐓𐐆
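The following sketch counts the units involved for the first letter of the string above, U+1041C (a Deseret capital letter). Python strings index by code point, so the UTF-16 view is obtained by encoding. The helper name utf16_units is hypothetical.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

LETTER = "\U0001041C"  # first character of the Deseret string above

if __name__ == "__main__":
    print(len(LETTER), utf16_units(LETTER), len(LETTER.encode("utf-8")))
    # Case mapping works on the whole code point, not on each 16-bit half.
    print(f"U+{ord(LETTER.lower()):05X}")
```

A capitalizer that processes each 16-bit unit on its own sees two unpaired surrogates here and cannot find the case mapping.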

Special Unicode Characters Union

A super string recommended by the VMware Inc. Globalization Team: effective at triggering rendering or character-length issues when validating product globalization readiness.

表 CJK_UNIFIED_IDEOGRAPHS (U+8868)

ポ KATAKANA LETTER PO (U+30DD)

あ HIRAGANA LETTER A (U+3042)

A LATIN CAPITAL LETTER A (U+0041)

鷗 CJK_UNIFIED_IDEOGRAPHS (U+9DD7)

Œ LATIN CAPITAL LIGATURE OE (U+0152)

é LATIN SMALL LETTER E WITH ACUTE (U+00E9)

Ｂ FULLWIDTH LATIN CAPITAL LETTER B (U+FF22)

逍 CJK_UNIFIED_IDEOGRAPHS (U+900D)

Ü LATIN CAPITAL LETTER U WITH DIAERESIS (U+00DC)

ß LATIN SMALL LETTER SHARP S (U+00DF)

ª FEMININE ORDINAL INDICATOR (U+00AA)

ą LATIN SMALL LETTER A WITH OGONEK (U+0105)

ñ LATIN SMALL LETTER N WITH TILDE (U+00F1)

丂 CJK_UNIFIED_IDEOGRAPHS (U+4E02)

㐀 CJK Ideograph Extension A, First (U+3400)

𠀀 CJK Ideograph Extension B, First (U+20000)

表ポあA鷗ŒéＢ逍Üßªąñ丂㐀𠀀

Changing length when lowercased

Characters which increase in length (2 to 3 bytes) when lowercased

Credit: https://twitter.com/jifa/status/625776454479970304

Ⱥ

Ⱦ
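The byte-length change can be checked directly with the two characters above. The sketch assumes UTF-8 as the encoding, which is what the "2 to 3 bytes" figure refers to. The helper name utf8_len is hypothetical.

```python
def utf8_len(text):
    """Length of text in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))

CAPITALS = ["\u023A", "\u023E"]  # the two characters listed above

if __name__ == "__main__":
    for upper in CAPITALS:
        lower = upper.lower()
        print(f"U+{ord(upper):04X} -> U+{ord(lower):04X}", utf8_len(upper), utf8_len(lower))
```

Code that sizes a buffer from the original string and then lowercases in place can overflow or truncate on inputs like these.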

Japanese Emoticons

Strings which consist of Japanese-style emoticons which are popular on the web

ヽ༼ຈل͜ຈ༽ノ ヽ༼ຈل͜ຈ༽ノ

(。◕ ∀ ◕。)

`ィ(´∀`∩

__ロ(,_,*)

・( ̄∀ ̄)・:*:

゚・✿ヾ╲(。◕‿◕。)╱✿・゚

,。・::・゜’( ☻ ω ☻ )。・::・゜’

(╯°□°)╯︵ ┻━┻)

(ノಥ益ಥ)ノ ┻━┻

┬─┬ノ( º _ ºノ)

( ͡° ͜ʖ ͡°)

¯_(ツ)_/¯

test⁧

Zalgo Text

Strings which contain "corrupted" text. The corruption will not appear in non-HTML text, however. (via http://www.eeemo.net)

Ṱ̺̺̕o͞ ̷i̲̬͇̪͙n̝̗͕v̟̜̘̦͟o̶̙̰̠kè͚̮̺̪̹̱̤ ̖t̝͕̳̣̻̪͞h̼͓̲̦̳̘̲e͇̣̰̦̬͎ ̢̼̻̱̘h͚͎͙̜̣̲ͅi̦̲̣̰̤v̻͍e̺̭̳̪̰-m̢iͅn̖̺̞̲̯̰d̵̼̟͙̩̼̘̳ ̞̥̱̳̭r̛̗̘e͙p͠r̼̞̻̭̗e̺̠̣͟s̘͇̳͍̝͉e͉̥̯̞̲͚̬͜ǹ̬͎͎̟̖͇̤t͍̬̤͓̼̭͘ͅi̪̱n͠g̴͉ ͏͉ͅc̬̟h͡a̫̻̯͘o̫̟̖͍̙̝͉s̗̦̲.̨̹͈̣

̡͓̞ͅI̗̘̦͝n͇͇͙v̮̫ok̲̫̙͈i̖͙̭̹̠̞n̡̻̮̣̺g̲͈͙̭͙̬͎ ̰t͔̦h̞̲e̢̤ ͍̬̲͖f̴̘͕̣è͖ẹ̥̩l͖͔͚i͓͚̦͠n͖͍̗͓̳̮g͍ ̨o͚̪͡f̘̣̬ ̖̘͖̟͙̮c҉͔̫͖͓͇͖ͅh̵̤̣͚͔á̗̼͕ͅo̼̣̥s̱͈̺̖̦̻͢.̛̖̞̠̫̰

̗̺͖̹̯͓Ṯ̤͍̥͇͈h̲́e͏͓̼̗̙̼̣͔ ͇̜̱̠͓͍ͅN͕͠e̗̱z̘̝̜̺͙p̤̺̹͍̯͚e̠̻̠͜r̨̤͍̺̖͔̖̖d̠̟̭̬̝͟i̦͖̩͓͔̤a̠̗̬͉̙n͚͜ ̻̞̰͚ͅh̵͉i̳̞v̢͇ḙ͎͟-҉̭̩̼͔m̤̭̫i͕͇̝̦n̗͙ḍ̟ ̯̲͕͞ǫ̟̯̰̲͙̻̝f ̪̰̰̗̖̭̘͘c̦͍̲̞͍̩̙ḥ͚a̮͎̟̙͜ơ̩̹͎s̤.̝̝ ҉Z̡̖̜͖̰̣͉̜a͖̰͙̬͡l̲̫̳͍̩g̡̟̼̱͚̞̬ͅo̗͜.̟

̦H̬̤̗̤͝e͜ ̜̥̝̻͍̟́w̕h̖̯͓o̝͙̖͎̱̮ ҉̺̙̞̟͈W̷̼̭a̺̪͍į͈͕̭͙̯̜t̶̼̮s̘͙͖̕ ̠̫̠B̻͍͙͉̳ͅe̵h̵̬͇̫͙i̹͓̳̳̮͎̫̕n͟d̴̪̜̖ ̰͉̩͇͙̲͞ͅT͖̼͓̪͢h͏͓̮̻e̬̝̟ͅ ̤̹̝W͙̞̝͔͇͝ͅa͏͓͔̹̼̣l̴͔̰̤̟͔ḽ̫.͕

Z̮̞̠͙͔ͅḀ̗̞͈̻̗Ḷ͙͎̯̹̞͓G̻O̭̗̮
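The "corruption" in these strings comes from stacked combining marks. A minimal sketch of one way to undo it, assuming it is acceptable to lose legitimate accents as well: decompose with NFD and drop every character in a mark category (Mn, Mc, Me). The helper name strip_marks is hypothetical, and this approach would damage text in scripts that rely on combining marks.

```python
import unicodedata

def strip_marks(text):
    """Decompose text and drop all combining marks (categories Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.category(ch).startswith("M"))

if __name__ == "__main__":
    # U+0336, U+0301 and U+0489 are combining marks layered onto plain letters.
    sample = "Z\u0336a\u0301l\u0489go"
    print(len(sample), strip_marks(sample))
```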

Unicode Upsidedown

Strings which contain unicode with an "upsidedown" effect (via http://www.upsidedowntext.com)

˙ɐnbᴉlɐ ɐuƃɐɯ ǝɹolop ʇǝ ǝɹoqɐl ʇn ʇunpᴉpᴉɔuᴉ ɹodɯǝʇ poɯsnᴉǝ op pǝs 'ʇᴉlǝ ƃuᴉɔsᴉdᴉpɐ ɹnʇǝʇɔǝsuoɔ 'ʇǝɯɐ ʇᴉs ɹolop ɯnsdᴉ ɯǝɹo˥

00˙Ɩ$-

Unicode font

Strings which contain bold/italic/etc. versions of normal characters

The quick brown fox jumps over the lazy dog

𝐓𝐡𝐞 𝐪𝐮𝐢𝐜𝐤 𝐛𝐫𝐨𝐰𝐧 𝐟𝐨𝐱 𝐣𝐮𝐦𝐩𝐬 𝐨𝐯𝐞𝐫 𝐭𝐡𝐞 𝐥𝐚𝐳𝐲 𝐝𝐨𝐠

𝕿𝖍𝖊 𝖖𝖚𝖎𝖈𝖐 𝖇𝖗𝖔𝖜𝖓 𝖋𝖔𝖝 𝖏𝖚𝖒𝖕𝖘 𝖔𝖛𝖊𝖗 𝖙𝖍𝖊 𝖑𝖆𝖟𝖞 𝖉𝖔𝖌

𝑻𝒉𝒆 𝒒𝒖𝒊𝒄𝒌 𝒃𝒓𝒐𝒘𝒏 𝒇𝒐𝒙 𝒋𝒖𝒎𝒑𝒔 𝒐𝒗𝒆𝒓 𝒕𝒉𝒆 𝒍𝒂𝒛𝒚 𝒅𝒐𝒈

𝓣𝓱𝓮 𝓺𝓾𝓲𝓬𝓴 𝓫𝓻𝓸𝔀𝓷 𝓯𝓸𝔁 𝓳𝓾𝓶𝓹𝓼 𝓸𝓿𝓮𝓻 𝓽𝓱𝓮 𝓵𝓪𝔃𝔂 𝓭𝓸𝓰

𝕋𝕙𝕖 𝕢𝕦𝕚𝕔𝕜 𝕓𝕣𝕠𝕨𝕟 𝕗𝕠𝕩 𝕛𝕦𝕞𝕡𝕤 𝕠𝕧𝕖𝕣 𝕥𝕙𝕖 𝕝𝕒𝕫𝕪 𝕕𝕠𝕘

𝚃𝚑𝚎 𝚚𝚞𝚒𝚌𝚔 𝚋𝚛𝚘𝚠𝚗 𝚏𝚘𝚡 𝚓𝚞𝚖𝚙𝚜 𝚘𝚟𝚎𝚛 𝚝𝚑𝚎 𝚕𝚊𝚣𝚢 𝚍𝚘𝚐

⒯⒣⒠ ⒬⒰⒤⒞⒦ ⒝⒭⒪⒲⒩ ⒡⒪⒳ ⒥⒰⒨⒫⒮ ⒪⒱⒠⒭ ⒯⒣⒠ ⒧⒜⒵⒴ ⒟⒪⒢
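Styled letters like these are separate code points that carry compatibility mappings to the plain letters. As a sketch, NFKC normalization from the standard unicodedata module folds them back, which can matter for search or filtering. Normalizing is a lossy design choice, and the helper name fold_styled is hypothetical.

```python
import unicodedata

def fold_styled(text):
    """Apply NFKC so that compatibility variants map to their plain forms."""
    return unicodedata.normalize("NFKC", text)

BOLD_THE = "\U0001D413\U0001D421\U0001D41E"  # mathematical bold "The"
PAREN_T = "\u24AF"                           # parenthesized small letter t

if __name__ == "__main__":
    print(fold_styled(BOLD_THE), fold_styled(PAREN_T))
```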
