RegisterSectionTagHandler

Allow macros to automatically be passed text between %MACRO% and %/MACRO% tags

I'm new to the dev community, so this thinking may violate some core concepts that I'm not familiar with yet, but hey, this is brainstorming! :-)

I've noticed that there are a number of plugins out there - and some built-in macros - that use a sort of extended syntax like this:

   %MACRO{params}% Potentially multi-line text that is affected by this macro %ENDMACRO%

If all these macros need to do is "wrap" the intervening content with special formatting (like the various color macros plus ENDCOLOR), then standard macros handle this nicely. But if you actually need to process the intervening text, this approach doesn't work so well, and plugins have to resort to commonTagsHandler, which, according to the documentation, is pretty inefficient and should be avoided.

What if, instead of using commonTagsHandler, a plugin could use a new function called registerSectionHandler() to handle a tag like this?

%MACRO{params}%
Text for
MACRO
to operate
on
%/MACRO% 
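
As a rough illustration of the idea (a Python sketch rather than Foswiki's Perl; `register_section_handler` and `expand_sections` are made-up names, not an existing API), the core would look up each registered name, find the text between the opening and closing tags, and pass the parameters and the body to the handler. The sketch assumes sections of the same name are not nested:

```python
import re

# Hypothetical registry: macro name -> handler(params, body) returning replacement text.
SECTION_HANDLERS = {}

def register_section_handler(name, handler):
    """Register a handler for %NAME{params}% ... %/NAME% (assumed API, not in Foswiki)."""
    SECTION_HANDLERS[name] = handler

def expand_sections(text):
    """Replace each registered section with its handler's output."""
    for name, handler in SECTION_HANDLERS.items():
        pattern = re.compile(
            r"%" + re.escape(name) + r"(?:\{(.*?)\})?%"   # opening tag, optional {params}
            r"(.*?)"                                       # body, possibly multi-line
            r"%/" + re.escape(name) + r"%",                # closing tag
            re.DOTALL,
        )
        # The handler only ever sees its own parameters and body, never the whole page.
        text = pattern.sub(lambda m: handler(m.group(1) or "", m.group(2)), text)
    return text

# Toy handler for illustration: upper-case the enclosed text.
register_section_handler("UPPER", lambda params, body: body.upper())
print(expand_sections("before %UPPER{x}%shout\nthis%/UPPER% after"))
```

Text outside the tags is never handed to the plugin, which is the efficiency and safety argument made under Advantages.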

Advantages

  • As a plugin developer, all you'd need to do would be to register a section tag handler for your MACRO and the core would take care of calling your handler, passing you any parameters and passing you the text you need to operate on.
  • More efficient - You wouldn't need to use commonTagsHandler and process the whole page. You'd only be passed the parts of the page that apply to you.
  • Safer? Easier to debug? - The plugin developer could only mess up their area of the page - not the whole page.
  • Could add an option to protect the intervening text from WYSIWYG reformatting. Lots of times you'll want to preserve the spacing inside those tags. For example, SyntaxHighlightingPlugin would be a lot easier to use in WYSIWYG mode if there were no need to mess with "protect" styles.
  • It's similar to XML and HTML syntax, which lots of people are familiar with.

Disadvantages

  • Probably lots that I'm not thinking about, including potential complications in macro expansion.

Alternative approaches

  • Continue using commonTagsHandler for plugins that want to use this kind of extended syntax.
  • Allow XML style tags instead?

Note

I originally proposed this topic as allowing XML style tags, but have seen the error of my ways, so I've revised it to be more in line with standard Foswiki MACRO syntax. Early comments reflect the XML style macro proposal.


LeilaPearson

Comments?

Welcome Leila, great idea. You've identified something that we've all been muttering about since registerTagHandler() was invented, I suspect.

Traditionally I think introducing new syntax has met a lot of resistance :-) Personally I'm just concerned that an XML tag would complicate the macro parser (but I'm not familiar with the parser, so maybe I'm just being paranoid). For example:
  • A fundamental rule of foswiki %macros% is that they are always expanded inside-out, left-to-right. Do we treat XML tags in a special way - delay the inner macros before handing off to the XML tag handler - or do we expand those macros first? Maybe XML tag handlers want to specify whether the TML parser should process the content of an XML tag or not?
  • Maybe we just treat XML tags like we do STARTSECTION and ENDSECTION, i.e. they aren't really macros at all but simply mark up sections of a topic. This means, for example, that START/ENDSECTION can only be used in topic text; they do absolutely nothing if they are expanded dynamically by some other macro (if a SEARCH returns text containing START/ENDSECTION, that doesn't define any new sections on the SEARCHing topic).
  • I would be very upset if we defined another section syntax which didn't allow us to address the content of those sections, using e.g. %INCLUDE{section="foo"}%.
So, we already have STARTSECTION and ENDSECTION, which already perform nearly the exact functionality we seek. Can we just make that better instead of using XML tags? Let's see.

Foswiki::Func::registerTagHandler handlers already get a %param hash where every key is a parameter that was passed to the macro. We could start a new registerSectionHandler() API, but instead of registering a tag it registers a pair of START/ENDFOO macros that behave just like START/ENDSECTION. My point is that we should be able to address macro sections the same way we do include sections. For example, if I register a new section handler like this:

Foswiki::Func::registerSectionHandler('CODE', \&foo);

then STARTCODE and ENDCODE should automatically be known to Foswiki as forms of STARTSECTION and ENDSECTION (enforcing macro naming consistency).

Then if we use it:

%STARTCODE{"example1" syntax="javascript" gutter="on"}%
jQuery(document).ready(
  function ($) {
    alert('example');
  }
);
%ENDCODE{"example1"}%

Then I should be able to re-use it as an ordinary topic section:
%ADDTOZONE{"script" text="
  <script type='text/javascript'>
    $percntINCLUDE{section=\"example1\"}$percnt
  </script>"
}%
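
To make the "addressable by name" requirement concrete, here is a small Python sketch (not Foswiki code; `find_named_sections` is a hypothetical helper, and it assumes the closing tag repeats the section name exactly as in the example above). It collects the raw body of every START/END pair for a registered tag, keyed by name, which is the lookup an INCLUDE{section="..."} would need:

```python
import re

def find_named_sections(text, tag):
    """Return {section name: raw body} for %START<tag>{"name" ...}% ... %END<tag>{"name"}% pairs."""
    pattern = re.compile(
        r'%START' + re.escape(tag) + r'\{"([^"]+)"[^}]*\}%\n?'   # opening tag with a quoted name
        r'(.*?)'                                                   # raw body
        r'%END' + re.escape(tag) + r'\{"\1"\}%',                   # closing tag repeating the name
        re.DOTALL,
    )
    return {m.group(1): m.group(2) for m in pattern.finditer(text)}

topic = (
    '%STARTCODE{"example1" syntax="javascript"}%\n'
    "alert('example');\n"
    '%ENDCODE{"example1"}%\n'
)
sections = find_named_sections(topic, "CODE")
print(sections["example1"])
```

The same table could feed both the section handler and any later include of that section.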

Finally, I know this doesn't address your concern of making WysiwygPlugin protect these sections, but we can easily make it do that if the registerSectionHandler says so (or preferably, a separate call to WysiwygPlugin::protectSection('CODE') perhaps)

-- PaulHarvey - 23 Sep 2010

The other point is that in any topic which makes use of lots of <real xml="tags">, the parser could spend more time ignoring "real" XML tags than processing macro-XML-tags.

-- PaulHarvey - 23 Sep 2010

Hi Paul. Thanks for giving this idea some serious thought :-)

I guess a little background on why I'm suggesting XML tags might be in order.

Besides the "it's a syntax people already know" thing, I have recently been working with MathModePlugin and LatexModePlugin, and converting over Mediawiki pages that use the <math> and </math> tags. MathModePlugin uses <latex> and </latex> tags currently to delimit latex, so that's one thing that got me started thinking this way.

The other thing was my less than fun experience using SyntaxHighlightingPlugin in combination with WYSIWYG, so I asked Support.Question644, which you actually were kind enough to answer, and in your answer you mentioned XML-style tags as one option there. So, those things are what got me thinking...

Anyway, let's say for the moment at least that we avoid introducing XML-style macro syntax as a core supported feature, and instead use something more like your suggestion. If so, I'd want to make the macro names a bit shorter to type if possible. I'm thinking about something more like:

%CODE% and %/CODE%

We'd obviously need to change the core to recognize the "/" in the "end tag" version of the macro, but it should be doable I think.

Finally, we could also consider, as part of the same plugin registration, providing the option of supporting %CODE/% - meaning without parameters or section text.

We'd also, of course, support:

%CODE{"params"}% %/CODE% and %CODE{"params"}/% forms.

This should give plugin developers lots of flexibility, plus an easy to type and easily parsed syntax. When doing macro expansion, the core just needs to know what type of plugin it's dealing with to handle it appropriately. If it's a standard tags plugin, it doesn't need to look ahead for any section end macro tag. If it's a RegisterSectionTagHandler macro, then it knows to look ahead when it comes across %CODE% but not when it comes across %CODE/%.
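
A quick Python sketch of how the three tag forms could be told apart (purely illustrative; `classify_tags` and the regular expression are my assumptions, not how the Foswiki parser works). Only an "open" tag would make the core look ahead for a matching end tag:

```python
import re

# %NAME%, %NAME{params}%, %/NAME%, %NAME/% and %NAME{params}/%
TAG = re.compile(r"%(/?)([A-Z][A-Z0-9_]*)(?:\{(.*?)\})?(/?)%")

def classify_tags(text):
    """Return a list of (kind, name, params); kind is 'open', 'close' or 'empty'."""
    result = []
    for m in TAG.finditer(text):
        leading, name, params, trailing = m.groups()
        if leading and not trailing and params is None:
            kind = "close"     # %/CODE%
        elif trailing and not leading:
            kind = "empty"     # %CODE/% or %CODE{...}/%: no section text follows
        elif not leading:
            kind = "open"      # %CODE% or %CODE{...}%: look ahead for %/CODE%
        else:
            continue           # malformed, e.g. %/CODE/% or a close tag with params
        result.append((kind, name, params or ""))
    return result

print(classify_tags('%CODE{"js"}% x %/CODE% %CODE/% %CODE{"p"}/%'))
```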

What do you think?

-- LeilaPearson - 23 Sep 2010

Re-jigged the main proposal to be aligned with my comment above.

-- LeilaPearson - 23 Sep 2010

Hi Leila!

Great Idea! I liked your proposal very much, except the %/CODE% part. The %STARTTAG% %ENDTAG%, like Paul suggested, is more consistent with the existing syntax and it's easier to read (it clearly states it's about some kind of section).

It would make it easier to write plugins, and it would help remove the ugly, performance-killing commonTagsHandler.

-- GilmarSantosJr - 23 Sep 2010

I don't want to sound too dismissive of the <xml>...</xml> style tag syntax - for example, DirectedGraphPlugin provides <dot>..</dot>. Perhaps we should invite feedback from other Foswiki developers on this.

I agree with Gilmar that %/CODE% doesn't add much readability over %ENDCODE%, especially given the % symbol next to it ( </xml> is much nicer to my eyes than %/CODE%). Foswiki is full of inconsistencies like this; it'd be a shame to introduce new ones.

-- PaulHarvey - 23 Sep 2010

Yes, I don't mind ENDCODE. But I don't much like STARTCODE. It's pretty long, and also most macros of this style don't seem to use START for the opening MACRO from what I've seen. I think I would prefer just CODE.

I agree that the slash next to the percent sign doesn't show up as well as the xml equivalent. One reason I thought the slash might be good would be to avoid any name clashes with existing macros. Slash is illegal in macro names currently so there should be no clashes.

All that said, I still kind of like the xml syntax. I just think if it's offered a more standard macro syntax should probably be offered too.

I'm still okay with XML instead, or in addition.

-- LeilaPearson - 24 Sep 2010

Maybe a double percent would work and be more obvious? Or an underscore? Dash? Something else? %%CODE% %_CODE% %-CODE%. I'm open to suggestions...

-- LeilaPearson - 24 Sep 2010

Maybe %CODE_% to start a section and %_CODE% to end one?

-- LeilaPearson - 24 Sep 2010

A registerSectionHandler definitely is a missing feature. However, this needs more close inspection of the way the normal inside-out-left-to-right parser processes those sections.

First I'd prefer to explicitly define start and ending macros while registering a new section type. E.g.:
Foswiki::Func::registerSectionHandler("SQL", "ENDSQL", \&handleInlineSQL);

IMHO, any extra magic that automatically adds %START_ and %END_ is too much. Is it just to spare an additional parameter to registerSectionHandler()? I'd prefer to have the freedom to name the start and end tags as I want.

Back to how a section is parsed. When is it parsed? That is: is it taken out before the normal parser processes all other tags, and inserted back in when it finishes? Probably not. But that's how any solution based on commonTagsHandler() would currently do it. Better to find a way to let handled sections participate in an inside-out-left-to-right parsing order in a natural way. But wait, that means that first the inner part of a section is processed and its result is then passed over to the registered section handler. This isn't what most authors of a section handler would want. Maybe some do and others don't. This, however, is the core of the problem: to what extent do section boundaries change the normal inside-first rule of parsing TML? Can we parametrize that, using something like:

package Foswiki;   # constant.pm rejects package-qualified names, so declare them inside the package
use constant PARSE_INSIDEOUT => 0;
use constant PARSE_OUTSIDEIN => 1;
use constant PARSE_PROTECTED => 3;

...

Foswiki::Func::registerSectionHandler("SQL", "ENDSQL", \&handleInlineSQL, Foswiki::PARSE_PROTECTED);

In this example the SQL section is flagged to be protected from any processing by the normal TML parser; that is, the parser will completely take out the marked section and hand over the enclosed TML content as is. The section handler itself will then have to take care of expanding (or not expanding) any common tags produced after the SQL has been handed over to the database. In this case the complete section is treated as one atomic blob that the normal TML parser doesn't open at all. The blob as a complete entity should participate in the normal parsing flow.
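
To illustrate the difference between the two modes that are clearly defined here, a toy Python sketch follows (not Foswiki code; `expand_macros`, `expand_section` and the USER macro are invented for the example, and PARSE_OUTSIDEIN is left out because its meaning is still open):

```python
import re

PARSE_INSIDEOUT = 0   # expand inner macros first, then call the section handler
PARSE_PROTECTED = 3   # hand the raw body to the handler untouched

MACROS = {"USER": lambda: "guest"}   # toy parameterless macros (illustrative only)

def expand_macros(text):
    """Expand toy %NAME% macros; unknown names are left as they are."""
    return re.sub(
        r"%([A-Z]+)%",
        lambda m: MACROS[m.group(1)]() if m.group(1) in MACROS else m.group(0),
        text,
    )

def expand_section(text, start, end, handler, mode):
    """Replace %start% ... %end% with handler(body), honouring the parse mode."""
    pattern = re.compile("%" + start + "%(.*?)%" + end + "%", re.DOTALL)

    def replace(m):
        body = m.group(1)
        if mode == PARSE_INSIDEOUT:
            body = expand_macros(body)   # inner macros are resolved before the handler runs
        return handler(body)

    return pattern.sub(replace, text)

show = lambda body: "[" + body + "]"
src = "%SQL%select %USER%%ENDSQL%"
print(expand_section(src, "SQL", "ENDSQL", show, PARSE_PROTECTED))   # handler sees %USER%
print(expand_section(src, "SQL", "ENDSQL", show, PARSE_INSIDEOUT))   # handler sees guest
```

In protected mode the handler is free to call the expander itself on whatever it returns.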

Here are some cases mixing tags and sections. Question: When is MACRO1, 2 and 3 and STARTCODE-ENDCODE expanded?

Case 1: (proper linear order, properly nested)
%MACRO1{....}%

%STARTCODE{"..."}%

  %MACRO2{"..."}%

%ENDCODE%

%MACRO3{...}%

Case 2: (sections as parameters to tags)
%MACRO1{"

  param="%STARTCODE{param="..."}%

                     %MACRO2{"..."}%

                %STOPCODE%
  "

}%

Case 3: (sections as parameters to tags, escaped)
%MACRO1{"
  "$percntSTARTCODE{param=\"...\"}$percnt

          %MACRO2{"..."}%

   $percntSTOPCODE$percnt"
   format=" ...."
}%

Case 4: (escaped sections as parameters to tags with more escaped tags inside)
%MACRO{"
  "$percntSTARTCODE{param=\"...\"}$percnt

          $percntMACRO2{\"...\"}$percnt

   $percntSTOPCODE$percnt"
   format=" ...."
}%

Case 5: (escaped sections as parameters to tags with double escaped tags inside)
%MACRO{"
  "$percntSTARTCODE{param=\"...\"}$percnt

    $dollarpercntMACRO2{\\"...\\"}$dollarpercnt

   $percntSTOPCODE$percnt"
   format=" ...."
}%

Case 6: (properly nested sections)
%STARTCODE1%

    %STARTCODE2%

    %ENDCODE2%

%ENDCODE1%

Case 7: (crossing sections)
%STARTCODE1%

    %STARTCODE2%

%ENDCODE1%

    %ENDCODE2%

Case 8: (properly nested sections with tags inside)
%STARTCODE1%
 
    %MACRO1%

    %STARTCODE2%

        %MACRO2%

    %ENDCODE2%

    %MACRO3%

%ENDCODE1%

Case 9: (crossing sections with tags inside)
%STARTCODE1%

    %MACRO1%

    %STARTCODE2%

       %MACRO2%

%ENDCODE1%

       %MACRO3%

    %ENDCODE2%
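
For cases 6 to 9, whatever the expansion order turns out to be, the core would at least need to tell properly nested sections from crossing ones. A minimal Python sketch of a stack-based check (illustrative only; `check_nesting` is a made-up name and it only looks at parameterless %STARTx% / %ENDx% tags):

```python
import re

def check_nesting(text):
    """Classify %STARTx% / %ENDx% usage as 'nested', 'crossing' or 'unclosed'."""
    stack = []
    for kind, name in re.findall(r"%(START|END)([A-Z0-9]+)%", text):
        if kind == "START":
            stack.append(name)
        elif not stack or stack.pop() != name:
            return "crossing"   # an END that does not match the innermost open START
    return "nested" if not stack else "unclosed"

case6 = "%STARTCODE1% %STARTCODE2% %ENDCODE2% %ENDCODE1%"
case7 = "%STARTCODE1% %STARTCODE2% %ENDCODE1% %ENDCODE2%"
print(check_nesting(case6), check_nesting(case7))
```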

-- MichaelDaum - 24 Sep 2010

Excellent elaboration. Firstly, making Foswiki do START/END for you is to just enforce consistency amongst all macros. It is frustrating enough remembering START/STOPINCLUDE, let alone START/ENDSECTION (and that's even for the same functionality!).

But I won't lose too much sleep if we'd rather have freedom.

Secondly, in my mind, I was thinking about the 'hidden' sections that aren't implemented yet (AddHideOptionToSTARTSECTION).

The section would be removed by the parser and the content handed off to the sectionHandler (always equivalent to your Foswiki::PARSE_PROTECTED). If the sectionHandler wants to expand macros, they can do that for themselves by calling expandMacros (equivalent to your Foswiki::PARSE_INSIDEOUT).

I'm not sure what Foswiki::PARSE_OUTSIDEIN would look like.

-- PaulHarvey - 24 Sep 2010

And I still maintain that it is highly desirable to be able to address sections with INCLUDE. Basically, IMHO it would be nice to just (ab)use an AddHideOptionToSTARTSECTION form of STARTSECTION to do what we want here.

-- PaulHarvey - 24 Sep 2010

I think this is a good idea, but rather than introduce new syntax, would it be acceptable to extend the range of section types, and support a handler? For example,
%STARTSECTION{type="mytype"}%
...
%ENDSECTION%
combined with registerSectionHandler("mytype", \&myFunction).

I appreciate it looks a bit clumsy, but it keeps the syntax small and if we combined it with:
   * Set STARTCODE{language style href} = %STARTSECTION{type="code" language="%language%" ...}%
   * Set ENDCODE = %ENDSECTION{type="code"}%
then we should have a clean, core-independent way to extend the section syntax.

Foswiki.pm already implements STARTSECTION...ENDSECTION and IMHO we should keep the expansion order that implements (whatever it is).
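
A sketch of the type-based dispatch, in Python for brevity (not the Foswiki.pm implementation; `register_section_handler`, `expand_typed_sections` and the "reverse" type are assumptions made for the example). Sections without a registered type are left exactly as they are, so existing STARTSECTION usage would be unaffected:

```python
import re

TYPE_HANDLERS = {}   # hypothetical: section type -> handler(body)

def register_section_handler(section_type, handler):
    TYPE_HANDLERS[section_type] = handler

SECTION = re.compile(
    r'%STARTSECTION\{([^}]*)\}%(.*?)%ENDSECTION(?:\{[^}]*\})?%', re.DOTALL)

def expand_typed_sections(text):
    """Run the handler registered for each section's type; other sections are left alone."""
    def replace(m):
        type_match = re.search(r'\btype="([^"]*)"', m.group(1))
        handler = TYPE_HANDLERS.get(type_match.group(1)) if type_match else None
        return handler(m.group(2)) if handler else m.group(0)
    return SECTION.sub(replace, text)

register_section_handler("reverse", lambda body: body[::-1])
print(expand_typed_sections('%STARTSECTION{type="reverse"}%abc%ENDSECTION%'))
```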

-- CrawfordCurrie - 24 Sep 2010

I like the idea of extending the type of sections.

One thing that has been bothering me for quite some time is this rather typical way to write sections and protect them:

<verbatim class="foswikiHidden">
%STARTSECTION{"mysection"}%
...
%ENDSECTION{"mysection"}%
</verbatim>

So when we are about to extend the way STARTSECTION works, can we add a display or render parameter to it as well, with something like:

  • display="on": default ... that's what we've got w/o another verbatim wrapper
  • display="escape": same as adding a verbatim wrapper
  • display="hidden" : same as verbatim-class-hidden
  • display="off": removed from the final markup, even no hidden <pre> tag
This would make wiki apps consisting of lots of sections in a single page a lot easier to read and write.
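
The four values could map onto output roughly like this (a Python sketch under the assumption that a verbatim wrapper ends up as an escaped <pre> block; `render_section` is a hypothetical name, not an existing function):

```python
import html

def render_section(body, display="on"):
    """Render a section body according to the proposed display parameter."""
    if display == "on":
        return body                                      # rendered as normal topic text
    if display == "escape":
        return "<pre>" + html.escape(body) + "</pre>"    # shown literally, like verbatim
    if display == "hidden":
        return '<pre class="foswikiHidden">' + html.escape(body) + "</pre>"
    if display == "off":
        return ""                                        # nothing emitted at all
    raise ValueError("unknown display mode: " + display)

print(render_section("<b>%X%</b>", "hidden"))
```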

-- MichaelDaum - 24 Sep 2010

Hence AddHideOptionToSTARTSECTION

-- PaulHarvey - 24 Sep 2010

I saw this discussion this morning and scribbled few notes, thinking I must login and add these after my client's work is done. A few hours later all my important points and more have been raised.

I am left with a small suggestion and that's:
   %DO{"my_macro"}%
   ...
   %END{"my_macro"}%

That just feels more user friendly and less geeky to me (or maybe more geeky, as geeks like short names). Yes, it would potentially mean two section types, but only two, and I think that there are some distinctions. DO would be primarily to extend macro capabilities for plugins. To elaborate a little more on these distinctions:

The full STARTSECTION equivalent would need to be this:
  %STARTSECTION{type="my_macro"}%
  ...
  %ENDSECTION{type="my_macro"}%

That makes me think that DO may just be syntactic sugar where 'type' is the positional parameter and 'name' must be explicit, whereas with STARTSECTION it's vice versa. I note that STARTSECTION already generates a name if not explicitly given.

In which case, will this be sufficient (borrowing from above)?
   * Set DO{language style href} = %STARTSECTION{type="%type%" language="%language%" ...}%
   * Set END{language style href} = %ENDSECTION{type="%type%"}%
BTW: Do we have parameterised Set statements yet?

When we have something like this in place, are we going to recommend replacing existing XML style syntax (i.e. deprecate this syntax)? For example, update DirectedGraphPlugin to use %DO{"dot"}% instead of <dot> etc.? Similarly, are we going to resist further introduction of Foswiki-facing XML syntax? (XML embedded between %DO{}% ... %END% would be OK as it's not dealt with by Foswiki core.)

I appreciate that having two ways of saying the same thing could be confusing. However, I definitely have a sense that there is a distinction here: STARTSECTION is generally placed elsewhere and some form of indirect processing is done to retrieve it, whereas in the general MACRO case it's embedded and processed within the original topic. I need to think more about how distinct these really are.

-- JulianLevens - 24 Sep 2010

So I've been quiet for a few days while I think a bit more.

Regarding the freedom to choose your own start and end tags, as Paul mentioned, the motivation was to introduce some sort of consistency. That said, I don't mind freedom either, and it may be best to provide this since it would allow existing plugins to "upgrade" to the new tags handler function and keep their existing start and end tags. But if we go that way, I do think it would be good to establish a naming convention and clearly document that convention so going forward, new Plugins can use that convention. Also, existing plugins that don't follow the convention would be advised to handle the new convention in addition to their existing "non-standard" syntax.

So all this discussion about start and end tag names will still be useful :-)

Now I just need to think some more about sections, includes, and macro expansion.

-- LeilaPearson - 27 Sep 2010

A rather typical candidate for section tags diverging from the proposed START-END standard is IFDEFINEDTHEN .... FIDEFINED from IfDefinedPlugin. What makes it worse is
%IFDEFINEDTHEN{...}%

%ELSIFDEFINEDTHEN{...}%

%ELSEDEFINED%

%FIDEFINED%

where the end of a section marks the start of the next one. As far as I see this can't be covered by the registerSectionHandler() as specified up to now.

-- MichaelDaum - 27 Sep 2010

This might not be a problem. If IFDEFINEDTHEN is registered as the start tag, and FIDEFINED is registered as the end tag, and the core just passes the text that's between those two tags to the Plugin's section handler function without expanding any other macros, then I guess the plugin's section handler function could handle the inner complexity without the core having to know anything about it.
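
Here is roughly what I mean, as a Python sketch (not IfDefinedPlugin's actual code; `split_branches` is a made-up helper). The core hands over everything between IFDEFINEDTHEN and FIDEFINED unexpanded, and the handler splits it at the inner marker tags itself:

```python
import re

def split_branches(body):
    """Split an IFDEFINEDTHEN body into branches at the inner marker tags."""
    parts = re.split(r"%(ELSIFDEFINEDTHEN\{[^}]*\}|ELSEDEFINED)%", body)
    # re.split with one capture group yields: text, marker, text, marker, text, ...
    branches = [("IFDEFINEDTHEN", parts[0].strip())]
    for marker, text in zip(parts[1::2], parts[2::2]):
        branches.append((marker, text.strip()))
    return branches

body = 'one %ELSIFDEFINEDTHEN{"B"}% two %ELSEDEFINED% three'
for marker, text in split_branches(body):
    print(marker, "->", text)
```

The handler would then evaluate each branch's condition in order and return only the text of the first one that holds.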

I still have to think through all the cases you listed earlier though. And get a better understanding of all the steps involved and the order of macro expansion in getting from the raw TML to the fully rendered topic.

Up to now at least, I've been thinking of registerSectionHandler as basically a commonTagsHandler that operates only on a section of a topic, as delimited by the specified tags. The basic idea would be that these handlers would be called at the same time as (or rather immediately after) the commonTagsHandlers. At least that's my initial thinking without having done the in-depth investigation yet.

-- LeilaPearson - 28 Sep 2010

To move from the current simple preprocessing of sections to a full commonTagsHandler approach requires section content to be re-embedded into a text context after modification. This takes us back in the direction of HereDocumentSyntaxForMacros, and the early/late eval discussion.

-- CrawfordCurrie - 28 Sep 2010

 
Topic revision: r17 - 28 Sep 2010, CrawfordCurrie