Item5352: Improve Email address validator to be more correct.

Priority: Low
Current State: Closed
Released In: 1.0.1
Target Release: patch
Applies To: Engine
Reported By: TWiki:Main.CullenNewsom
Waiting For:
Last Change By: KennethLavrsen
I think your email address validation scheme is broken. Here is why:

During configuration of a new twiki, the following error is encountered:

### Warning: I don't recognise this as a valid email address.

The email address in question is in the format: I believe twiki doesn't like the plus "+" sign. Please see the following links for information about the use of the plus "+" sign in an email address:



It may also be broken for other legal (but little-used) atoms. I did not test. Sorry if this should have been posted elsewhere; please tell me if this is the case.

-- TWiki:Main/CullenNewsom - 11 Feb 2008

True enough. Curiously enough the address regex used in the checker is different to that used in

The relevant part of the spec is:
   An addr-spec is a specific Internet identifier that contains a
   locally interpreted string followed by the at-sign character ("@",
   ASCII value 64) followed by an Internet domain.  The locally
   interpreted string is either a quoted-string or a dot-atom.  If the
   string can be represented as a dot-atom (that is, it contains no
   characters other than atext characters or "." surrounded by atext
   characters), then the dot-atom form SHOULD be used and the
   quoted-string form SHOULD NOT be used.
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
atom            =       [CFWS] 1*atext [CFWS]

dot-atom        =       [CFWS] dot-atom-text [CFWS]

dot-atom-text   =       1*atext *("." 1*atext)
qtext           =       NO-WS-CTL /     ; Non white space controls

                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-126        ;  or the quote character

qcontent        =       qtext / quoted-pair

quoted-string   =       [CFWS]
                        DQUOTE *([FWS] qcontent) [FWS] DQUOTE
FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS

ctext           =       NO-WS-CTL /     ; Non white space controls

                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-126        ;  ")", or "\"

ccontent        =       ctext / quoted-pair / comment

comment         =       "(" *([FWS] ccontent) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS) 

As you can see, it's non-trivial to do properly. Most commentators don't bother to parse addresses properly, and simply use the RE that TWiki uses (with the missing '+'; that is definitely missing).
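
For illustration only, here is a minimal Python sketch of a check built directly from the atext and dot-atom-text rules quoted above. It is not the RE that TWiki uses. It deliberately ignores quoted-string local parts, comments and folding white space (CFWS), and it assumes a plain dot-separated hostname for the domain, which is stricter than the spec. The names ATEXT, DOT_ATOM_TEXT, DOMAIN, ADDR_SPEC and looks_like_addr_spec are made up for this example.

```python
import re

# atext: ALPHA / DIGIT plus the punctuation listed in the grammar
# (RFC 2822 also includes "~" in this set).
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# Assumption: the domain is limited to dot-separated hostname labels.
# This is a simplification, not the spec's full domain rule.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
DOMAIN = rf"{LABEL}(?:\.{LABEL})*"

ADDR_SPEC = re.compile(rf"{DOT_ATOM_TEXT}@{DOMAIN}")


def looks_like_addr_spec(address: str) -> bool:
    """Return True if address is a dot-atom local part, "@", and a hostname."""
    return ADDR_SPEC.fullmatch(address) is not None


if __name__ == "__main__":
    print(looks_like_addr_spec("user+tag@example.com"))    # True
    print(looks_like_addr_spec("user..name@example.com"))  # False
```

Even this simplified form accepts the '+' sign in the local part, because '+' is one of the atext characters. Handling quoted-string and CFWS would need a real parser rather than a single RE.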

-- CrawfordCurrie - 12 Feb 2008

