Feature Proposal: Demand-parsing of tables would solve several outstanding problems

Motivation

This is a core technical proposal which will have minimal impact on end users. It will, however, have significant impact on plugin authors (though existing plugins should continue to work).

We have long wanted to centralise and simplify table parsing; by doing this we regain control over table syntax and positioning and ensure consistent support.

Description and Documentation

Several plugins, and even some core functions, use tables in topic text to source data. There are three requirements:
  1. Parse tables statically defined in topic text, e.g. to support editing of the topic (EditRowPlugin is an example).
  2. Parse tables already generated in the topic for further processing, e.g. the output of SEARCH.
  3. Provide a common parsing tool to support tables used by core (e.g. form schema definitions).

The first of these requirements can be addressed by re-use of an existing table parser; the parser in EditRowPlugin already creates an abstract table representation, and could be extended to support HTML tables as well (if appropriate). So the first proposal is to re-use this parser to create a "table-view parser" that will generate a structure when run on a topic. This parser would be made visible through Func.
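As a rough illustration of what such a "table-view parser" might return, here is a minimal sketch. It is written in Python purely for illustration (Foswiki itself is Perl), and the names `Table` and `parse_tables` are hypothetical; they are not the EditRowPlugin parser or any Func API. It assumes the simplest form of TML table, where each row is a line that starts and ends with `|`, and it ignores cell spans, HTML tables, and the EDITTABLE and TABLE macros.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """One TML table: the (0-based) line it starts on and its rows of cells."""
    first_line: int
    rows: list = field(default_factory=list)


def parse_tables(topic_text):
    """Return every simple TML table in topic_text, in document order."""
    tables = []
    current = None
    for number, line in enumerate(topic_text.splitlines()):
        stripped = line.strip()
        if len(stripped) >= 2 and stripped.startswith("|") and stripped.endswith("|"):
            if current is None:
                # First row of a new table.
                current = Table(first_line=number)
                tables.append(current)
            # Drop the outer bars, then split on the inner ones.
            current.rows.append([c.strip() for c in stripped[1:-1].split("|")])
        else:
            # Any non-table line ends the table being collected.
            current = None
    return tables
```

A caller such as a table editor could then work on `parse_tables(text)[n].rows` rather than re-scanning the raw topic text itself.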

The second requirement is more challenging. The problem is characterised by the question "how can I do CALC as a true macro" (rather than in a commonTagsHandler). The problem is that CALC has to operate on the results of other macro expansions - either (1) a named table or (2) the immediately preceding table in the topic. CALC is currently very difficult to use, because the flow of processing is not intuitive, and it can be very difficult to predict the state of the topic when it is called. This is because it is implemented using the commonTagsHandler, which is extremely general, but from a user's perspective is called unpredictably. To be understood by end users, macro expansion has to follow predictable, natural, simple rules.

At any given time during macro expansion, a topic being processed has three parts: (1) content that has been fully expanded, (2) content that is in processing, and (3) content that has not started processing. Content that is in processing is the current content of the processing stack, i.e. the recursive context of the currently-processed macro.

Looking briefly back at commonTagsHandler, this handler is called during processing immediately after processing of registered macros. Thus the content passed to the handler has all registered macros expanded. A CALC is currently able to assume that all tables in the topic that are a result of macro expansions (such as INCLUDE) are available at the time the CALC is invoked. However, CALC does not need this; because it only supports table operations on the immediately-preceding table, it in fact only needs access to that table (i.e. the last table expanded immediately prior to the CALC call). Remember that at this stage we are talking only about TML tables; pure HTML tables are not handled by CALC.

The conclusion of this is that CALC does not in fact need to be run from the commonTagsHandler. CALC can be run from a registered tag handler so long as it has access to the most recent table. Such access can be made available by demand parsing the content that has been fully expanded (part (1) above, available from the macro processing stack). The same parser described for requirement (1) can be used to provide this support. Access to the most recent table
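To make the idea of demand parsing concrete, the following is a minimal sketch in Python (for illustration only; Foswiki is Perl). The function names `last_table` and `calc_sum_column` are hypothetical, and the column sum stands in for whatever a CALC-like handler would actually compute. It assumes the handler is given the fully expanded text that precedes it, and that table rows are lines starting and ending with `|`. The point is that nothing is parsed until the handler asks, and the scan stops as soon as the most recent table has been read.

```python
def is_table_row(line):
    """True for a line that looks like a simple TML table row."""
    s = line.strip()
    return len(s) >= 2 and s.startswith("|") and s.endswith("|")


def last_table(expanded_text):
    """Scan backwards and parse only the most recent table, on demand."""
    rows = []
    for line in reversed(expanded_text.splitlines()):
        if is_table_row(line):
            rows.append([c.strip() for c in line.strip()[1:-1].split("|")])
        elif rows:
            break  # we have walked past the top of the most recent table
    rows.reverse()  # restore top-to-bottom row order
    return rows


def calc_sum_column(expanded_text, column):
    """Hypothetical CALC-like handler: sum the numeric cells of one column."""
    total = 0.0
    for row in last_table(expanded_text):
        try:
            total += float(row[column])
        except (ValueError, IndexError):
            pass  # skip header cells, non-numeric cells and short rows
    return total
```

Because the handler only ever looks backwards through content that is already fully expanded, its result does not depend on the order in which other plugins happen to be called.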

So how does this help wiki application developers? Discussions elsewhere have focused on efficient processing of topics by recognising the flow of processing through the topic viz. macros are expanded in inside-out-left-right order. This ordering is simple and natural, and relatively easy to explain.
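The inside-out-left-right order can be sketched as follows. This is a simplified Python illustration, not the Foswiki macro expander: the `MACRO` pattern, the `expand` function, and the `handlers` and `trace` parameters are all assumptions made for the example. Macro names are restricted to upper-case letters, and a macro only becomes eligible for expansion once its arguments contain no further macros.

```python
import re

# A macro whose arguments contain no nested macro, i.e. an innermost macro.
MACRO = re.compile(r"%([A-Z]+)(?:\{([^%{}]*)\})?%")


def expand(text, handlers, trace=None):
    """Expand the leftmost innermost macro repeatedly until none remain.

    handlers maps a macro name to a function taking the argument string.
    Unknown macros are left in the text untouched. Handler output is
    rescanned, so a handler that emits its own macro would loop forever.
    """
    pos = 0
    while True:
        match = MACRO.search(text, pos)
        if match is None:
            return text
        name, args = match.group(1), match.group(2) or ""
        if name not in handlers:
            pos = match.end()  # skip over the unknown macro
            continue
        if trace is not None:
            trace.append(name)  # record the order of expansion
        text = text[:match.start()] + handlers[name](args) + text[match.end():]
        pos = 0  # an enclosing macro may now be innermost, so rescan
```

For `%A{%B%}% %C%` this expands B first, then A (whose arguments are now plain text), then C, which is the order a wiki application developer would expect when reading the topic.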

Finally, note that some usages of CALC might not work. These are usages that have been coded to rely on the fact that the commonTagsHandler is run in a convoluted and unpredictable way; they usually relate to issues with plugin execution ordering.

Notes on the EditRowPlugin table parser:
  • The parser extracts EDITTABLE and TABLE macros, as well as TML tables.
  • The parser can be made to operate either in a sequential event based style, or in a parse tree output style.
  • The output of the parser is currently tailored to table editing, and as such some of the cell addressing mechanisms are rather heavyweight. However, by passing clues as to the application of the parser, this can be optimised.
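
The difference between the two output styles mentioned above can be sketched like this. The code is an illustrative Python sketch, not the EditRowPlugin parser; the event names (`open_table`, `row`, `close_table`, `text`) and the functions `table_events` and `build_tree` are hypothetical, and EDITTABLE and TABLE macros are ignored for brevity.

```python
def table_events(text):
    """Sequential, event-based style: yield (event, payload) tuples."""
    in_table = False
    for line in text.splitlines():
        s = line.strip()
        if len(s) >= 2 and s.startswith("|") and s.endswith("|"):
            if not in_table:
                in_table = True
                yield ("open_table", None)
            yield ("row", [c.strip() for c in s[1:-1].split("|")])
        else:
            if in_table:
                in_table = False
                yield ("close_table", None)
            yield ("text", line)
    if in_table:
        # The text ended while still inside a table.
        yield ("close_table", None)


def build_tree(events):
    """Parse-tree style: fold the event stream into a list of nodes."""
    tree = []
    for event, payload in events:
        if event == "open_table":
            tree.append({"type": "table", "rows": []})
        elif event == "row":
            tree[-1]["rows"].append(payload)
        elif event == "text":
            tree.append({"type": "text", "line": payload})
        # close_table needs no action: the next node starts afresh.
    return tree
```

In this arrangement the tree builder is just one consumer of the event stream, so a lightweight client can listen for events directly and skip building the heavier structure that table editing needs.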

Notes on SpreadSheetPlugin and EditRowPlugin:
  • As noted above, this change may impact some applications of SSP that are overly sensitive to inter-plugin calling orders.
  • ERP will have to be extensively recoded to adopt the new parser. However it has already been written with this abstraction in mind, so hopefully this should minimise the pain.

-- Contributors: CrawfordCurrie - 21 Apr 2012

Discussion

yes please :-)

-- SvenDowideit - 21 Apr 2012

TablePlugin is affected as well as it implements yet another table parser.

-- MichaelDaum - 23 Apr 2012

Is this planned for 1.2?

-- GeorgeClark - 07 May 2012

Yup; it's under the hood, and the EditRowPlugin uses it. I've yet to determine if any other core modules can be / need to be recoded to use it, but it should be in 1.2. It's not a user feature (yet) so should not need any external doc.

-- CrawfordCurrie - 07 May 2012

Foswiki::Form might take advantage of it

-- PaulHarvey - 14 Jun 2012
 
Topic revision: r11 - 05 Jul 2015, GeorgeClark