
Item13134: HTML::Tidy fails to validate html5

Priority: Normal
Current State: Confirmed
Released In: n/a
Target Release: minor
Applies To: Extension
Component: UnitTestContrib
Branches: master Release02x01
Reported By: MichaelDaum
Waiting For:
Last Change By: TimothyLegge
HTMLValidation tests currently fail, mostly due to HTML::Tidy's inability to deal with HTML5.

Example:

<meta charset="utf-8" />

... produces

line 5 column 19 - Warning: <meta> proprietary attribute "charset"
line 5 column 19 - Warning: <meta> lacks "content" attribute

... which is wrong: <meta charset="utf-8" /> is valid HTML5.

Similarly, any custom data-* attribute, as introduced by HTML5, causes validation warnings.

See also Item10739. Paul's approach was to filter tidy's output and ignore those warnings.

Validating HTML5 (under perl) doesn't seem to be so easy. Alternatives are:

Note that there are more reasons to look into replacing HTML::Tidy with another HTML validator. For example, building HTML::Tidy using cpanm requires tidyp, yet another fork of the original tidy project. However, tidyp isn't shipped by distributions, so you have to download and compile tidyp yourself before installing HTML::Tidy with cpanm.

-- MichaelDaum - 05 Dec 2014

The best approach at the moment is to filter out these false alarms. I have downloaded the HTML5-aware tidy version to check that no problems persist.
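A rough, language-neutral sketch of such a filter, written in Python rather than Perl and not taken from the actual test code. It assumes messages have the "line N column N - Warning: ..." shape quoted in the report. The wording matched for data-* attributes is an assumption, and all names (filter_messages, CHARSET_RE, ...) are hypothetical. The "lacks content" warning is dropped only when it sits at the same position as a charset warning, so a genuinely incomplete <meta> is still reported.

```python
import re

# Assumed message shape, based on the two warnings quoted in the report.
MESSAGE_RE = re.compile(r"^line (\d+) column (\d+) - (Warning|Error): (.*)$")

# Warning texts treated as HTML5 false alarms (hypothetical ignore rules).
CHARSET_RE = re.compile(r'^<meta> proprietary attribute "charset"$')
CONTENT_RE = re.compile(r'^<meta> lacks "content" attribute$')
# Assumption: data-* attributes are reported as "proprietary attribute" too.
DATA_ATTR_RE = re.compile(r'^<[^>]+> proprietary attribute "data-[^"]+"$')


def parse(line):
    """Split a message into ((line, column), kind, text), or None if unrecognized."""
    match = MESSAGE_RE.match(line.strip())
    if match is None:
        return None
    position = (int(match.group(1)), int(match.group(2)))
    return position, match.group(3), match.group(4)


def filter_messages(lines):
    """Return the messages left after dropping known HTML5 false alarms."""
    parsed = [parse(line) for line in lines]

    # Positions of <meta charset> elements, found via the charset warning.
    charset_positions = {
        p[0] for p in parsed
        if p is not None and p[1] == "Warning" and CHARSET_RE.match(p[2])
    }

    kept = []
    for line, p in zip(lines, parsed):
        if p is None:
            # Unrecognized output is kept so that nothing is hidden silently.
            kept.append(line)
            continue
        position, kind, text = p
        if kind == "Warning":
            if CHARSET_RE.match(text) or DATA_ATTR_RE.match(text):
                continue
            # Only ignore the missing "content" warning for <meta charset>.
            if CONTENT_RE.match(text) and position in charset_positions:
                continue
        kept.append(line)
    return kept
```

Errors are never filtered, only warnings, so the ignore list cannot mask markup that tidy considers broken.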

-- MichaelDaum - 13 Dec 2014

Task::HTML5 could be an option

-- MichaelDaum - 26 Feb 2016

Task::HTML5 saw its last release in 2011, so I assume it is not maintained anymore. The latest tool is called HTML::Tidy5, as per https://github.com/petdance/html-tidy5 and http://www.html-tidy.org/

However, it does not compile out of the box (https://github.com/petdance/html-tidy5/issues/3), as it requires tidy-5.6.0, yet Debian and Ubuntu only have 5.2.0.

The tidy commandline tool seems to be fine though.

-- MichaelDaum - 02 Jun 2019

HTML::Tidy5 is probably the correct route. It is from the same author as HTML::Tidy and depends on htmltidy; both appear to be actively maintained.

-- TimothyLegge - 23 Feb 2020
 
Topic revision: r9 - 23 Feb 2020, TimothyLegge
