Scenario: Recently a person copy-pasted an FB post into a textarea in a form on our website, and emojis were part of the original post. This resulted in some rather strange-looking characters in the output. Emoji support was something I had never planned for.
What I’m looking for is a way for users to copy and paste these little images into form fields. I don’t want to clutter the form itself with emojis to choose from (maybe in the distant future), because it is not a social media site. I just want to make it possible for a client to copy and paste them in. I’m processing the form with Perl, if that means anything.
Any and all help would be greatly appreciated.
Thanx ~ Dave
Hi Dave - welcome to the forums
Are the emojis showing up correctly in your textarea when pasted from Facebook? If so, is the question around how to ensure the emojis are retained when processed by Perl?
The emojis are showing up great when I paste them into the textarea but after that… Just to see what would happen, I pasted 3 as the first three chars in this post. So, if it's a Perl issue, where do I go now?
I will admit that my Perl skills are almost non-existent these days. I remember writing a bunch of Perl back in the cgi-bin days of the early 2000s.
Can you check whether the data is being decoded as UTF-8 by Perl? I found this link that may help here: https://www.perlmonks.org/?node_id=788911
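For anyone following along, a check of this kind can be sketched roughly as below. This is a minimal sketch and not the script from the PerlMonks node. It assumes the form data arrives as raw UTF-8 bytes, uses U+1F642 as a stand-in emoji, and the flag() helper is a name made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as a form handler would typically receive them: the UTF-8
# encoding of U+1F642 (slightly smiling face), written as byte escapes.
my $input = "\xF0\x9F\x99\x82";

# Bytes -> characters. FB_CROAK dies on malformed input; LEAVE_SRC
# stops decode() from modifying $input in place.
my $decoded = decode('UTF-8', $input, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Characters -> bytes again, as you would do before writing output.
my $encoded = encode('UTF-8', $decoded);

# Hypothetical helper: report Perl's internal UTF8 flag as Yes/No.
sub flag { utf8::is_utf8($_[0]) ? 'Yes' : 'No' }

printf "Input:   length %d (IS UTF8? %s)\n", length $input,   flag($input);
printf "Decoded: length %d (IS UTF8? %s)\n", length $decoded, flag($decoded);
printf "Encoded: length %d (IS UTF8? %s)\n", length $encoded, flag($encoded);
```

The byte strings have length 4 and the decoded string has length 1, which is the difference between "four bytes" and "one emoji character" that the flag is hinting at.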
The results are in. In the first test I pasted in a smiley:
Input: (IS UTF8? No)
Decoded: (IS UTF8? Yes)
Encoded: ð (IS UTF8? No)
Entities input: 🙂
Entities decoded:
Entities encoded: 🙂
And the following is for plain text input (I also tried a colon and right parenthesis, with the same results):
Input: UTF-8 (IS UTF8? No)
Decoded: UTF-8 (IS UTF8? Yes)
Encoded: UTF-8 (IS UTF8? No)
Entities input: UTF-8
Entities decoded: UTF-8
Entities encoded: UTF-8
Does this mean I have to switch to PHP? It passes the emojis through, which is nice, but I have a great bad-word filter in Perl that tells the user exactly which words they need to remove before proceeding, and so far I haven’t been able to get that to work in PHP.
Is there a regex for PHP that could match the functionality of a Perl one? I didn’t write the regex, but at the same time I have never seen another one like it.
btw…I forgot earlier, thanks much for the reply
We need to ensure the input is UTF8 in Perl, and that should solve it for you. Does the suggestion in this thread help? "Perl and HTML: UTF8 does not work in forms" (Stack Overflow)
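As a rough illustration of what "ensure the input is UTF8" can look like, here is a minimal sketch using only core modules. It is not taken from the Stack Overflow thread. The decode_params() helper and the "comment" field name are made up for the example, and it assumes the browser submits the form as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw form bytes into Perl character strings.
sub decode_params {
    my (%raw) = @_;
    my %chars;
    for my $name (keys %raw) {
        # Die on malformed UTF-8 instead of silently passing it along.
        $chars{$name} = decode('UTF-8', $raw{$name},
                               Encode::FB_CROAK | Encode::LEAVE_SRC);
    }
    return %chars;
}

# Stand-in for the submitted form: "Hi " plus the UTF-8 bytes of U+1F642.
my %form = decode_params(comment => "Hi \xF0\x9F\x99\x82");

# Encode character strings back to UTF-8 bytes on the way out, and tell
# the browser which charset to expect.
binmode STDOUT, ':encoding(UTF-8)';
print "Content-Type: text/plain; charset=utf-8\n\n";
print "$form{comment}\n";
```

The idea is to decode once at the edge, work with character strings in the middle (so a regex-based word filter sees real characters), and encode once on output.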
Well Buddy, I think I got it working, and the script you pointed me to on PerlMonks held the final key (although it took me a while to figure it out). Encoding HTML entities to prevent SQL injections was the “hidden” culprit (if it had been a snake it would have bitten me). Way back when TVs were all b&w I placed the HTML::Entities::decode($var) call at the top, and when it came to outputting a copy I was using a regex to undo the encoding for quote marks.
You’ll never know how much I appreciate your help. I’m pretty much bald to begin with, but I would have pulled all the hair out of a wig before I got this. Even with the answer it took an extra day, but success it is. Cheers.
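For readers who hit the same thing, here is one plausible way entity encoding can mangle an emoji, sketched under assumptions: this is not Dave's actual script, and it needs HTML::Entities from CPAN (part of the HTML-Parser distribution). Calling encode_entities() on undecoded bytes with its default character set escapes each byte separately, while decoding first and escaping only the HTML-significant characters leaves the emoji alone.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# UTF-8 bytes of U+1F642, as they arrive from the form.
my $bytes = "\xF0\x9F\x99\x82";

# Entity-encoding the raw bytes with the default character set treats
# each byte as a separate Latin-1 character, so one emoji turns into
# four unrelated entities.
my $mangled = encode_entities($bytes);

# Decoding first, then escaping only HTML-significant characters,
# keeps the emoji as a single character.
my $text = decode('UTF-8', "<b>Hi</b> $bytes",
                  Encode::FB_CROAK | Encode::LEAVE_SRC);
my $safe = encode_entities($text, q{<>&"'});

binmode STDOUT, ':encoding(UTF-8)';
print "Mangled: $mangled\n";
print "Safe:    $safe\n";
```

The first byte, 0xF0, is the Latin-1 character "ð", which lines up with the stray "ð" in the test output earlier in the thread.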
Glad you got it working! Encoding issues and sanitizing inputs are some of the most challenging problems you will run into when processing untrusted/3rd party data, and you totally rocked it with a working solution!