Item13134: HTML::Tidy fails to validate html5
Current State: Confirmed
Released In: n/a
Target Release: minor
HTMLValidation tests currently fail, mostly because HTML::Tidy cannot handle HTML5.
<meta charset="utf-8" />
line 5 column 19 - Warning: <meta> proprietary attribute "charset"
line 5 column 19 - Warning: <meta> lacks "content" attribute
... which is obviously wrong.
Similarly any custom
as introduced by HTML5 will cause validation warnings.
See also Item10739. Paul's approach was to filter tidy's output and ignore those warnings.
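The filtering idea can be sketched roughly as follows. This is only an illustration working on tidy's message text, not the code actually used in the test suite: the pattern list is an assumption built from the two warnings quoted above, the third sample message is made up, and `filter_tidy_messages` is a hypothetical helper name.

```python
import re

# Warnings treated as HTML5 false alarms. Assumed list, taken from the
# two messages quoted in this report; a real list would likely be longer.
HTML5_FALSE_ALARMS = [
    re.compile(r'<meta> proprietary attribute "charset"'),
    re.compile(r'<meta> lacks "content" attribute'),
]


def filter_tidy_messages(lines):
    """Return the tidy messages that match none of the known false alarms."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in HTML5_FALSE_ALARMS)
    ]


messages = [
    'line 5 column 19 - Warning: <meta> proprietary attribute "charset"',
    'line 5 column 19 - Warning: <meta> lacks "content" attribute',
    # Hypothetical genuine warning, included to show it survives the filter.
    'line 9 column 1 - Warning: missing </div>',
]

remaining = filter_tidy_messages(messages)
print(remaining)
```

Matching on the message text rather than on line and column keeps the filter independent of where the offending markup appears in a page.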
Validating HTML5 (under perl) doesn't seem to be so easy. Alternatives are:
Note that there are more reasons to look into replacing HTML::Tidy with another HTML validator. For example, building HTML::Tidy using
requires tidyp, yet another fork of the original tidy project. However, tidyp isn't shipped by distributions,
so you have to download and compile tidyp yourself before cpanm-ing HTML::Tidy.
- 05 Dec 2014
The best approach at the moment is to filter out these false alarms. I have downloaded the HTML5-aware tidy version to check that no problems persist.
- 13 Dec 2014
Task::HTML5 could be an option
- 26 Feb 2016
Task::HTML5 saw its last release in 2011, so I assume it is no longer maintained.
The latest tool is called HTML::Tidy5 as per https://github.com/petdance/html-tidy5
However, it does not compile out of the box: https://github.com/petdance/html-tidy5/issues/3
as it requires tidy-5.6.0, yet Debian and Ubuntu only ship 5.2.0.
The command-line tool seems to be fine, though.
- 02 Jun 2019
HTML::Tidy5 is probably the correct route. It is from the same author as HTML::Tidy and depends on htmltidy, both of which appear to be actively maintained.
- 23 Feb 2020