Alright, inspired by the other thread, I wrote a regular expression, wrapped in a Python function, that returns a boolean indicating whether an email address is valid.
I followed the RFC (http://www.faqs.org/rfcs/rfc822.html) word for word. As far as I know, it matches the RFC's grammar exactly, except for the handling of the ASCII control characters 000-037 and 177 (octal). Those are special characters such as EOF and backspace, which should be taken care of outside the script.
import re

def match_email(address):
    # The (?#...) groups are inline regex comments naming the RFC 822 rule each part implements.
    mail_re = re.compile(r'(?#addr-spec)((?#local-part)((^|\.)((?#word)((?#quoted-string)"((?#quoted-pair)\\.|(?#qtext)[^"\\])+")|((?#atom)[^()<>@,;:\\".\[\] ]+)))+)@((?#domain)(?<=@|\.)((?#sub-domain)((?#domain-ref)((?#atom)[^()<>@,;: \\".\[\]]+))|((?#domain-literal)\[((?#dtext)[^\[\]\\]|(?#quoted-pair)\\.)+\])+)+($|\.)){2,}$')
    return mail_re.match(address) is not None
# Use like this:
print(match_email(r'jeff@nokrev.com'))
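Since the expression leaves the control characters (ASCII 0-31 and 127, i.e. octal 000-037 and 177) to be handled outside the script, here is a rough sketch of how that pre-check might look. The names has_control_chars and precheck are made up for illustration and are not part of the function above; precheck takes whatever validator you pass in (match_email, for instance).

```python
import re

# ASCII control characters: 0x00-0x1f and 0x7f (octal 000-037 and 177).
_CONTROL_CHARS = re.compile(r'[\x00-\x1f\x7f]')

def has_control_chars(address):
    """Return True if the address contains any ASCII control character."""
    return _CONTROL_CHARS.search(address) is not None

def precheck(address, validator):
    """Reject control characters first, then defer to the given validator."""
    if has_control_chars(address):
        return False
    return bool(validator(address))
```

Something like precheck(address, match_email) would then run the control-character screen before the regular expression is ever tried.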
Some sample valid email addresses (believe it or not… just look at the RFC linked above):
"@ad \f".he."s\"".llo."@"@site.c[\]]m
"\"".hey." "@domain.doma["i"]n.domain
"\""."more".hey."@ @"@doma["i"]n.co.uk!
And some invalid ones:
"@ad \f"he."s\"".llo."@"@site.c[\]m
""".hey.""@domain.doma[]n.domain
@.com
[whisper]Sorry for the bad samples… hehe.[/whisper]