You are here: Foswiki>Tasks Web>Item11312 (11 Apr 2012, GeorgeClark)Edit Attach

Item11312: TinyMCE Corrupting HTML Tables

pencil
Priority: Urgent
Current State: Closed
Released In: 1.1.5
Target Release: patch
Applies To: Extension
Component: WysiwygPlugin
Branches: Release01x01 trunk
Reported By: SteveLin
Waiting For:
Last Change By: GeorgeClark
TinyMCE 1.2.2 (experimental as of Nov 30th)

I created a table with the following:
<table border="1" cellpadding="0" cellspacing="1"> 
   <tbody> 
      <tr> 
         <td>A</td> 
         <td>B
         
            C
         </td>  
         <td>D</td> 
      </tr>   
   </tbody> 
</table>

When I go to tinymce, the markup is changed due to the newlines between "B" and "C" in the 2nd cell. Happens on IE9 and firefox 8

-- SteveLin - 30 Nov 2011

Probably caused by work in Item2174. trunk-only

-- PaulHarvey - 30 Nov 2011

it appears to corrupt the html on the way to tinyMCE - added ROUNDTRIP unit test to show it

but this test fails the same way in 1.1.4

so i guess 1.1.4 had code in js that fixed / covered up the problem?

-- SvenDowideit - 02 Apr 2012

The problem is that TML2HTML doesn't understand HTML markup as input. It's happily opening a paragraph where it sees a couple of newlines, but then never gets a signal to close it. So we get invalid HTML, which for some reason browsers happily coped with okay on 1.1.4, but the new whitespace preservation meta-markup in 1.1.5-RC1+ is apparently too much (span/para content not contained in <td>) and causing them to puke.

Here's the erroneous output, newlines and indenting manually added for readability
<!--$WYSIWYG_SECRET-->
<p class="foswikiDeleteMe">&nbsp;</p>
<p><table border="1" cellpadding="0" cellspacing="1">
  <span style="{encoded:'ns3'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
  <tbody>
    <span style="{encoded:'ns6'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
    <tr>
      <span style="{encoded:'ns9'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
      <td>A</td>
      <span style="{encoded:'ns9'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
      <td>B
        </p>
        <p>
          <span style="{encoded:'s12'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
          C
          <span style="{encoded:'ns9'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
      </td>
      <span style="{encoded:'s2'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
      <span style="{encoded:'ns9'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
      <td>D</td>
      <span style="{encoded:'ns6'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
    </tr>
    <span style="{encoded:'s3'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
    <span style="{encoded:'ns3'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
  </tbody>
  <span style="{encoded:'n'}" class="WYSIWYG_HIDDENWHITESPACE"> </span>
</table>
</p>

My thoughts are that we should do a HTML2TML first, then TML2HTML - but I would appreciate Crawford's guidance on this.

-- PaulHarvey - 02 Apr 2012

Most of the bogus spans disappear if you comment
 706                 # $line = $this->_hideWhitespace($whitespace) . $line
 707                   # if length($whitespace);

- but that's only because we have this situation where TML2HTML thinks it's still inside a paragraph

-- PaulHarvey - 02 Apr 2012

Nope, that won't work: I'm an idiot for not realising that HTML2TML would break all the line-sensitive TML markup. Damn.

-- PaulHarvey - 02 Apr 2012

So the fundamental problem is that WYSIWYG is designed to convert either HTML, or TML, but when TML input is mixed with complex HTML like tables and lists, it gets way too hairy.

-- PaulHarvey - 02 Apr 2012

I've checked in a small fix / workaround for this for 1.1.5, however the underlying issue is not resolved, so the task cannot be closed. The WYSIWYG_EXLUDE now supports "table" as an option. It has been added to the default setting, which becomes Set WYSIWYG_EXCLUDE = script,style,table.

This task will be marked fixed in 1.1.5, and a new task opened for a more complete solution in 1.2.

-- GeorgeClark - 02 Apr 2012

Note that the work around to disable WYSIWYG editing when it sees an HTML table is unacceptable.

A very common use of the WYSIWYG editor is to paste in tables from Word, Excel, EMails etc. They are always so full of "stuff" that they remain HTML tables. People live happily with that editing the tables in the WYSIWYG editor.

Disabling WYSIWYG editing with HTML tables will create a scream from our users.

This bug needs to be fixed before 1.1.5 is released.

-- KennethLavrsen - 02 Apr 2012

I've checked in a fix. It needs careful review. When the TML2HTML conversion runs, any <table> markup is passed through unmodified... except for converting blank lines to empty paragraphs. No trailing white space processing is done.

This is a rather brute force hack rather than trying to fix the state machine for line-by-line processing of the HTML in TML.

-- GeorgeClark - 02 Apr 2012

Tests broken by this change:
  • TranslatorTests::testTML2HTML_centeredTableItem5955 (missing <p> bookend)
  • TranslatorTests::testTML2HTML_ttTableNewlineCorruptionItem11312 (removed foswiki specific markup)

-- GeorgeClark - 02 Apr 2012

Hi George, I'm tempted to say that we should consider backing out to the 1.1.4 TMCE/WYSIWYG, at least to get 1.1.5 out the door.

But if we are determined to release the new WYSIWYG, then I had another idea which was to make the TML line-state keep track of opened/closed tags as it runs along. The idea would be that if it encounters a closing tag which should force the paragraph to close early, it can do so.

But perhaps what you've done is sufficient too.

-- PaulHarvey - 02 Apr 2012

What I've run into is that html markup, like the <table> tag causes the code to automatically open a new paragraph, then the first blank line closes it. I tried to address it a couple of different ways, both of which caused more issues, and didn't fix the issue. I ended up either with rendering order issues (row C & D appearing ahead of A & B, or something like that ), or the TML converter had conversion errors and fell back to plain text editing. What I've done had the most success.

-- GeorgeClark - 03 Apr 2012

Looks good to me; why aren't you closing this? At the end of the day mixing HTML and TML will never be 100%, and heuristics addressing individual cases will always be required; there is no general solution (IMHO)

-- CrawfordCurrie - 04 Apr 2012

I think Clark was waiting for me to do some testing. Which I did and it seems OK. So we can set it waiting for release.

But I found a new release blocker during testing.

Paul Harvey I give you the final word if we can set this to Waiting For Release.

-- KennethLavrsen - 04 Apr 2012

Latest fix adds a $inHTMLTable state, and makes sure that open/close paragraphs are properly handled. Also ensure that any list is closed by a table or cell close. This still needs another careful test before going ahead with 1.1.5, but the unit tests and manual testing I've done all seem successful.

-- GeorgeClark - 05 Apr 2012
 
Topic revision: r32 - 11 Apr 2012, GeorgeClark
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy