eZ Components 2008.2.3

Document: ezcDocumentWikiParser

Class: ezcDocumentWikiParser

Parser for wiki documents

Parents

ezcDocumentParser
   |
   --ezcDocumentWikiParser

Member Variables

protected array $conversionsArray = array(
'ezcDocumentWikiEndOfFileToken' => 'ezcDocumentWikiDocumentNode',
'ezcDocumentWikiTextLineToken' => 'ezcDocumentWikiTextNode',
'ezcDocumentWikiWhitespaceToken' => 'ezcDocumentWikiTextNode',
'ezcDocumentWikiSpecialCharsToken' => 'ezcDocumentWikiTextNode',

'ezcDocumentWikiTitleToken' => 'ezcDocumentWikiTitleNode',
'ezcDocumentWikiParagraphIndentationToken' => 'ezcDocumentWikiBlockquoteNode',
'ezcDocumentWikiQuoteToken' => 'ezcDocumentWikiBlockquoteNode',
'ezcDocumentWikiPageBreakToken' => 'ezcDocumentWikiPageBreakNode',
'ezcDocumentWikiBulletListItemToken' => 'ezcDocumentWikiBulletListItemNode',
'ezcDocumentWikiEnumeratedListItemToken' => 'ezcDocumentWikiEnumeratedListItemNode',
'ezcDocumentWikiLiteralBlockToken' => 'ezcDocumentWikiLiteralBlockNode',
'ezcDocumentWikiTableRowToken' => 'ezcDocumentWikiTableRowNode',
'ezcDocumentWikiPluginToken' => 'ezcDocumentWikiPluginNode',

'ezcDocumentWikiBoldToken' => 'ezcDocumentWikiBoldNode',
'ezcDocumentWikiItalicToken' => 'ezcDocumentWikiItalicNode',
'ezcDocumentWikiUnderlineToken' => 'ezcDocumentWikiUnderlineNode',
'ezcDocumentWikiMonospaceToken' => 'ezcDocumentWikiMonospaceNode',
'ezcDocumentWikiSubscriptToken' => 'ezcDocumentWikiSubscriptNode',
'ezcDocumentWikiSuperscriptToken' => 'ezcDocumentWikiSuperscriptNode',
'ezcDocumentWikiDeletedToken' => 'ezcDocumentWikiDeletedNode',
'ezcDocumentWikiStrikeToken' => 'ezcDocumentWikiDeletedNode',
'ezcDocumentWikiInlineQuoteToken' => 'ezcDocumentWikiInlineQuoteNode',
'ezcDocumentWikiLineBreakToken' => 'ezcDocumentWikiLineBreakNode',
'ezcDocumentWikiInlineLiteralToken' => 'ezcDocumentWikiInlineLiteralNode',

'ezcDocumentWikiSeparatorToken' => 'ezcDocumentWikiSeparatorNode',
'ezcDocumentWikiTableHeaderToken' => 'ezcDocumentWikiTableHeaderSeparatorNode',

'ezcDocumentWikiExternalLinkToken' => 'ezcDocumentWikiExternalLinkNode',
'ezcDocumentWikiInterWikiLinkToken' => 'ezcDocumentWikiInterWikiLinkNode',
'ezcDocumentWikiInternalLinkToken' => 'ezcDocumentWikiInternalLinkNode',
'ezcDocumentWikiLinkStartToken' => 'ezcDocumentWikiLinkNode',
'ezcDocumentWikiLinkEndToken' => 'ezcDocumentWikiLinkEndNode',

'ezcDocumentWikiImageStartToken' => 'ezcDocumentWikiImageNode',
'ezcDocumentWikiImageEndToken' => 'ezcDocumentWikiImageEndNode',

'ezcDocumentWikiFootnoteStartToken' => 'ezcDocumentWikiFootnoteNode',
'ezcDocumentWikiFootnoteEndToken' => 'ezcDocumentWikiFootnoteEndNode',
)

Array with token node conversions.

Token-to-node conversions are used for tokens that do not require any additional checking of the token's context. This is especially useful because the wiki tokenizer already implements much of this logic.
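The lookup described above can be sketched as follows. This is a minimal illustration, not the parser's actual code: the `Sketch*` classes and the `convertToken()` helper are hypothetical stand-ins for the real token and node classes.

```php
<?php
// Hypothetical stand-ins for the real token and node classes.
class SketchToken
{
    public $content;

    public function __construct($content)
    {
        $this->content = $content;
    }
}
class SketchBoldToken extends SketchToken {}
class SketchTextLineToken extends SketchToken {}

class SketchNode
{
    public $token;

    public function __construct(SketchToken $token)
    {
        $this->token = $token;
    }
}
class SketchBoldNode extends SketchNode {}
class SketchTextNode extends SketchNode {}

// Same shape as $conversionsArray: token class name => node class name.
$conversions = array(
    'SketchBoldToken'     => 'SketchBoldNode',
    'SketchTextLineToken' => 'SketchTextNode',
);

// Look up the node class for a token and instantiate it. No context is
// inspected, which is the point of a plain conversion table.
function convertToken(SketchToken $token, array $conversions)
{
    $tokenClass = get_class($token);
    if (!isset($conversions[$tokenClass])) {
        throw new RuntimeException("No conversion for $tokenClass");
    }
    $nodeClass = $conversions[$tokenClass];
    return new $nodeClass($token);
}

$node = convertToken(new SketchBoldToken('**'), $conversions);
echo get_class($node), "\n"; // SketchBoldNode
```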
protected array $documentStack = array()
Contains a list of detected syntax elements.

At the end of a successful parsing process this should contain only one document syntax element. During the process it may contain a list of elements that are still awaiting reduction.
Each element in the stack has to be an object extending ezcDocumentWikiNode, which may in turn contain any number of such objects. This way an abstract syntax tree is constructed.
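As a rough illustration of that life cycle, the sketch below fills a stack with nodes and then folds them into a single document node. The `Sketch*` classes are hypothetical, and appending to the end of the array is an assumption; the real parser may order its stack differently.

```php
<?php
// Hypothetical node classes; each node may hold child nodes.
class SketchNode
{
    public $nodes = array();
}
class SketchTextNode extends SketchNode
{
    public $text;

    public function __construct($text)
    {
        $this->text = $text;
    }
}
class SketchDocumentNode extends SketchNode {}

// During parsing the stack holds several elements awaiting reduction.
$documentStack = array();
$documentStack[] = new SketchTextNode('Hello');
$documentStack[] = new SketchTextNode('world');

// At the end of input a document node aggregates everything left on the
// stack, so exactly one element remains.
$document = new SketchDocumentNode();
$document->nodes = $documentStack;
$documentStack = array($document);

echo count($documentStack), "\n";           // 1
echo count($documentStack[0]->nodes), "\n"; // 2
```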
protected bool $insideLineToken = false
Flag indicating whether we are inside a line-level node.
protected array $reductions = array(
'ezcDocumentWikiTextNode' => array(
'reduceText',
),
'ezcDocumentWikiParagraphNode' => array(
'reduceParagraph',
),
'ezcDocumentWikiInvisibleBreakNode' => array(
'reduceLineNode',
),
'ezcDocumentWikiTitleNode' => array(
'reduceTitleToSection',
),
'ezcDocumentWikiSectionNode' => array(
'reduceLists',
'reduceSection',
),
'ezcDocumentWikiMatchingInlineNode' => array(
'reduceMatchingInlineMarkup',
),
'ezcDocumentWikiBlockquoteNode' => array(
'reduceBlockquoteNode',
),
'ezcDocumentWikiLinkEndNode' => array(
'reduceLinkNodes',
),
'ezcDocumentWikiImageEndNode' => array(
'reduceImageNodes',
),
'ezcDocumentWikiFootnoteEndNode' => array(
'reduceFootnoteNodes',
),
'ezcDocumentWikiBulletListItemNode' => array(
'reduceBulletListItem',
),
'ezcDocumentWikiEnumeratedListItemNode' => array(
'reduceEnumeratedListItem',
),
'ezcDocumentWikiTableRowNode' => array(
'reduceTableRow',
),
)

Array containing simplified reduce ruleset

The wiki syntax cannot be expressed as a conventional grammar in BNF. This structure implements a pseudo-grammar instead: for each detected syntax element it lists the internal methods that implement the reduction rules for that element.
    array(
        ezcDocumentWikiNode::DOCUMENT => 'reduceDocument',
        ...
    )
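One way such a ruleset could be dispatched is sketched below. The `SketchParser` class is hypothetical, and two details are assumptions rather than documented behaviour: rules are matched with `instanceof`, and a callback returning `null` means the node was fully consumed.

```php
<?php
class SketchNode {}
class SketchSectionNode extends SketchNode {}

class SketchParser
{
    public $documentStack = array();
    public $log = array();

    // Same shape as $reductions: node class => ordered list of callbacks.
    protected $reductions = array(
        'SketchSectionNode' => array('reduceLists', 'reduceSection'),
    );

    // Run all reduction callbacks registered for the node, then push
    // whatever is left onto the document stack.
    public function push(SketchNode $node)
    {
        foreach ($this->reductions as $class => $methods) {
            if (!($node instanceof $class)) {
                continue;
            }
            foreach ($methods as $method) {
                $node = $this->$method($node);
                if ($node === null) {
                    return; // assumption: null means "fully consumed"
                }
            }
        }
        $this->documentStack[] = $node;
    }

    // The callbacks only record that they ran; real rules would rewrite
    // the document stack here.
    protected function reduceLists(SketchNode $node)
    {
        $this->log[] = __FUNCTION__;
        return $node;
    }

    protected function reduceSection(SketchNode $node)
    {
        $this->log[] = __FUNCTION__;
        return $node;
    }
}

$parser = new SketchParser();
$parser->push(new SketchSectionNode());
echo implode(', ', $parser->log), "\n"; // reduceLists, reduceSection
```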
protected array $shifts = array(
'ezcDocumentWikiEscapeCharacterToken' => 'shiftEscapeToken',
'ezcDocumentWikiTitleToken' => 'shiftTitleToken',
'ezcDocumentWikiNewLineToken' => 'shiftNewLineToken',
'ezcDocumentWikiEscapeCharacterToken' => 'shiftEscapeToken',
'ezcDocumentWikiToken' => 'shiftWithTokenConversion',
)

Array containing simplified shift ruleset

The wiki syntax cannot be expressed as a conventional grammar in BNF. With the pumping lemma for context-free languages [1] you can easily prove that the language of words of the form a^n b c^n d e^n is not context-free, and this is the form title definitions take.
This structure contains an array of callbacks implementing the shift rules for all tokens. There may be multiple rules for a single token.
The callbacks themselves create syntax elements and push them onto the document stack. After each push, the reduction callbacks are called for the pushed elements.
The array should look like:
    array(
        WHITESPACE => array(
            reductionMethod,
            ...
        ),
        ...
    )
[1] http://en.wikipedia.org/wiki/Pumping_lemma_for_context-free_languages
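The default value of $shifts ends with the base class ezcDocumentWikiToken, which suggests that rules are matched by class hierarchy and that the last entry acts as a catch-all. The sketch below illustrates that reading with hypothetical `Sketch*` classes and a hypothetical `selectShift()` helper; the `instanceof` matching and first-match-wins order are assumptions.

```php
<?php
class SketchToken {}
class SketchTitleToken extends SketchToken {}
class SketchBoldToken extends SketchToken {}

// Same shape as $shifts: token class => shift callback name. The base
// class comes last so that it only catches tokens without a special rule.
$shifts = array(
    'SketchTitleToken' => 'shiftTitleToken',
    'SketchToken'      => 'shiftWithTokenConversion',
);

// Return the name of the first shift rule whose class matches the token.
function selectShift(SketchToken $token, array $shifts)
{
    foreach ($shifts as $class => $method) {
        if ($token instanceof $class) {
            return $method;
        }
    }
    return null;
}

echo selectShift(new SketchTitleToken(), $shifts), "\n"; // shiftTitleToken
echo selectShift(new SketchBoldToken(), $shifts), "\n";  // shiftWithTokenConversion
```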

Inherited Member Variables

From ezcDocumentParser:
protected  ezcDocumentParser::$options
protected  ezcDocumentParser::$properties

Method Summary

protected ezcDocumentWikiListNode mergeListRecursively( $lists )
Merge lists recursively
public ezcDocumentWikiDocumentNode parse( $tokens )
Parse token stream
protected mixed reduceBlockquoteNode( $node )
Reduce multiline blockquote nodes
protected mixed reduceBulletListItem( $node )
Reduce bullet list items to list
protected mixed reduceEnumeratedListItem( $node )
Reduce enumerated list items to list
protected mixed reduceFootnoteNodes( $node )
Reduce wiki footnotes
protected mixed reduceImageNodes( $node )
Reduce wiki image references
protected mixed reduceLineNode( $node )
Reduce line node
protected mixed reduceLinkNodes( $node )
Reduce wiki links
protected mixed reduceLists( $node )
Reduce lists
protected mixed reduceMatchingInlineMarkup( $node )
Reduce matching inline markup
protected mixed reduceParagraph( $node )
Reduce paragraph
protected void reduceSection( $node )
Reduce prior sections, if a new section has been found.
protected mixed reduceTableRow( $node )
Reduce table rows
protected mixed reduceText( $node )
Reduce text nodes
protected void reduceTitleToSection( $node )
Reduce all elements to one document node.
protected mixed shiftEscapeToken( $token, &$tokens )
Shift escape token
protected mixed shiftNewLineToken( $token, &$tokens )
Shift new line token
protected mixed shiftTitleToken( $token, &$tokens )
Shift title token
protected mixed shiftWithTokenConversion( $token, &$tokens )
Shift with token conversion

Inherited Methods

From ezcDocumentParser:
public ezcDocumentParser ezcDocumentParser::__construct()
Construct new document
protected void ezcDocumentParser::triggerError()
Trigger parser error

Methods

mergeListRecursively

ezcDocumentWikiListNode mergeListRecursively( $lists )
Merge lists recursively

Parameters

Name Type Description
$lists array  

parse

ezcDocumentWikiDocumentNode parse( array $tokens )
Parse token stream
Parse an array of ezcDocumentWikiToken objects into a wiki abstract syntax tree.

Parameters

Name Type Description
$tokens array

reduceBlockquoteNode

mixed reduceBlockquoteNode( ezcDocumentWikiBlockquoteNode $node )
Reduce multiline blockquote nodes
Reduce multiline blockquote nodes that are not already closed by line endings.

Parameters

Name Type Description
$node ezcDocumentWikiBlockquoteNode  

reduceBulletListItem

mixed reduceBulletListItem( ezcDocumentWikiBlockLevelNode $node )
Reduce bullet list items to list
Reduce list items to lists, and create new wrapping list nodes.

Parameters

Name Type Description
$node ezcDocumentWikiBlockLevelNode  

reduceEnumeratedListItem

mixed reduceEnumeratedListItem( ezcDocumentWikiBlockLevelNode $node )
Reduce enumerated list items to list
Reduce list items to lists, and create new wrapping list nodes.

Parameters

Name Type Description
$node ezcDocumentWikiBlockLevelNode  

reduceFootnoteNodes

mixed reduceFootnoteNodes( ezcDocumentWikiFootnoteEndNode $node )
Reduce wiki footnotes
Reduce inline footnotes

Parameters

Name Type Description
$node ezcDocumentWikiFootnoteEndNode  

reduceImageNodes

mixed reduceImageNodes( ezcDocumentWikiImageEndNode $node )
Reduce wiki image references
Reduce image references with all of their aggregated parameters.

Parameters

Name Type Description
$node ezcDocumentWikiImageEndNode

reduceLineNode

mixed reduceLineNode( ezcDocumentWikiInvisibleBreakNode $node )
Reduce line node
Line nodes are closed at the end of their respective line. The end is marked by an ezcDocumentWikiInvisibleBreakNode.

Parameters

Name Type Description
$node ezcDocumentWikiInvisibleBreakNode  

reduceLinkNodes

mixed reduceLinkNodes( ezcDocumentWikiLinkEndNode $node )
Reduce wiki links
Reduce links with all of their aggregated parameters.

Parameters

Name Type Description
$node ezcDocumentWikiLinkEndNode

reduceLists

mixed reduceLists( ezcDocumentWikiBlockLevelNode $node )
Reduce lists
Stack lists with higher indentation into each other and merge multiple lists of same type and indentation.

Parameters

Name Type Description
$node ezcDocumentWikiBlockLevelNode  
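The nesting and merging described for reduceLists can be illustrated with the sketch below. It works on a flat array of items rather than on the document stack, ignores the list type, and uses hypothetical `SketchList` and `SketchItem` classes, so it shows the idea only and not the method's actual implementation.

```php
<?php
class SketchList
{
    public $indentation;
    public $items = array();

    public function __construct($indentation)
    {
        $this->indentation = $indentation;
    }
}
class SketchItem
{
    public $indentation;
    public $text;
    public $nodes = array(); // nested lists end up here

    public function __construct($indentation, $text)
    {
        $this->indentation = $indentation;
        $this->text = $text;
    }
}

// Group a flat sequence of items into nested lists by indentation.
function nestLists(array $items)
{
    $roots = array();
    $open = array(); // currently open lists, innermost last
    foreach ($items as $item) {
        // Close lists that are indented deeper than the current item.
        while ($open && end($open)->indentation > $item->indentation) {
            array_pop($open);
        }
        $current = end($open);
        if ($current === false || $current->indentation < $item->indentation) {
            // Deeper indentation (or first item): open a new list.
            $list = new SketchList($item->indentation);
            if ($current !== false) {
                // Stack it into the last item of the enclosing list.
                end($current->items)->nodes[] = $list;
            } else {
                $roots[] = $list;
            }
            $open[] = $list;
            $current = $list;
        }
        // Same indentation: the item is merged into the open list.
        $current->items[] = $item;
    }
    return $roots;
}

$roots = nestLists(array(
    new SketchItem(1, 'a'),
    new SketchItem(2, 'a1'),
    new SketchItem(2, 'a2'),
    new SketchItem(1, 'b'),
));
echo count($roots), ' ', count($roots[0]->items), "\n"; // 1 2
```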

reduceMatchingInlineMarkup

mixed reduceMatchingInlineMarkup( ezcDocumentWikiMatchingInlineNode $node )
Reduce matching inline markup
Reduction rule for inline markup which is intended to have a matching counterpart in the same block level element.

Parameters

Name Type Description
$node ezcDocumentWikiMatchingInlineNode  
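A reduction of this kind could look roughly like the sketch below: when a marker arrives, the stack is searched backwards for an unclosed marker of the same class, and everything in between becomes its content. The `Sketch*` classes, the `$closed` flag and the free-standing function signature are all hypothetical.

```php
<?php
class SketchNode
{
    public $nodes = array();
    public $closed = false;
}
class SketchTextNode extends SketchNode
{
    public $text;

    public function __construct($text)
    {
        $this->text = $text;
    }
}
class SketchBoldNode extends SketchNode {}

// Returns null if the marker closed an earlier one, otherwise the node
// itself so that the caller pushes it as a new opening marker.
function reduceMatchingInlineMarkup(array &$stack, SketchNode $node)
{
    $class = get_class($node);
    for ($i = count($stack) - 1; $i >= 0; --$i) {
        if (get_class($stack[$i]) === $class && !$stack[$i]->closed) {
            // Matching counterpart found: aggregate everything after it.
            $opening = $stack[$i];
            $opening->nodes = array_splice($stack, $i + 1);
            $opening->closed = true;
            return null;
        }
    }
    return $node;
}

$stack = array();
foreach (array(new SketchBoldNode(), new SketchTextNode('hi'), new SketchBoldNode()) as $node) {
    if ($node instanceof SketchBoldNode) {
        $node = reduceMatchingInlineMarkup($stack, $node);
    }
    if ($node !== null) {
        $stack[] = $node;
    }
}
echo count($stack), ' ', $stack[0]->nodes[0]->text, "\n"; // 1 hi
```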

reduceParagraph

mixed reduceParagraph( ezcDocumentWikiParagraphNode $node )
Reduce paragraph
Paragraphs are reduced with all inline tokens that have been added to the document stack before. If there are no inline nodes, the paragraph will be omitted.

Parameters

Name Type Description
$node ezcDocumentWikiParagraphNode  

reduceSection

void reduceSection( ezcDocumentWikiSectionNode $node )
Reduce prior sections, if a new section has been found.
If a new section has been found all sections with a higher depth level can be closed, and all items fitting into sections may be aggregated by the respective sections as well.

Parameters

Name Type Description
$node ezcDocumentWikiSectionNode  
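The closing rule can be sketched as follows. The example assumes a level-0 root section at the bottom of the stack and new sections of level 1 or higher; the `SketchSection` and `SketchText` classes and the function signature are hypothetical.

```php
<?php
class SketchSection
{
    public $level;
    public $title;
    public $nodes = array();

    public function __construct($level, $title)
    {
        $this->level = $level;
        $this->title = $title;
    }
}
class SketchText {}

// Close every section at the same or a deeper level than $new, hand
// collected items to the section they follow, then push $new.
function reduceSection(array &$stack, SketchSection $new)
{
    $children = array();
    while (count($stack) > 0) {
        $item = array_pop($stack);
        if (!($item instanceof SketchSection)) {
            array_unshift($children, $item);
            continue;
        }
        // Everything collected since this section belongs to it.
        $item->nodes = array_merge($item->nodes, $children);
        $children = array();
        if ($item->level < $new->level) {
            $stack[] = $item; // stays open as parent of the new section
            break;
        }
        $children = array($item); // closed, goes to its own parent
    }
    $stack = array_merge($stack, $children);
    $stack[] = $new;
}

$root = new SketchSection(0, 'root');
$a = new SketchSection(1, 'A');
$b = new SketchSection(2, 'B');
$c = new SketchSection(1, 'C');

$stack = array($root);
reduceSection($stack, $a);
$stack[] = new SketchText();
reduceSection($stack, $b);
reduceSection($stack, $c); // closes B and A

echo count($stack), ' ', count($a->nodes), "\n"; // 2 2
```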

reduceTableRow

mixed reduceTableRow( ezcDocumentWikiTableRowNode $node )
Reduce table rows
Reduce the nodes aggregated for one table row into table cells, and merge the table rows into table nodes.

Parameters

Name Type Description
$node ezcDocumentWikiTableRowNode  

reduceText

mixed reduceText( ezcDocumentWikiTextNode $node )
Reduce text nodes
Reduce texts into single nodes, if the prior node is also a text node. This reduces the number of AST nodes required to represent texts drastically.

Parameters

Name Type Description
$node ezcDocumentWikiTextNode  
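The merge can be sketched in a few lines. The `SketchTextNode` class and the free-standing function are hypothetical, and returning `null` to signal that the node was merged is an assumption.

```php
<?php
class SketchTextNode
{
    public $text;

    public function __construct($text)
    {
        $this->text = $text;
    }
}

// Append the text to the prior node if that is a text node as well;
// otherwise hand the node back so the caller pushes it.
function reduceText(array &$stack, SketchTextNode $node)
{
    $prior = end($stack);
    if ($prior instanceof SketchTextNode) {
        $prior->text .= $node->text;
        return null; // merged into the prior node
    }
    return $node;
}

$stack = array();
foreach (array('Hello', ' ', 'world') as $text) {
    $node = reduceText($stack, new SketchTextNode($text));
    if ($node !== null) {
        $stack[] = $node;
    }
}
echo count($stack), ' ', $stack[0]->text, "\n"; // 1 Hello world
```

Three tokens end up as one AST node here, which is the saving the description refers to.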

reduceTitleToSection

void reduceTitleToSection( ezcDocumentWikiTitleNode $node )
Reduce all elements to one document node.

Parameters

Name Type Description
$node ezcDocumentWikiTitleNode  

shiftEscapeToken

mixed shiftEscapeToken( ezcDocumentWikiToken $token, &$tokens )
Shift escape token
An escape token causes the following token to lose its usual meaning. The following token is converted to plain text, and the escape token itself is removed.

Parameters

Name Type Description
$token ezcDocumentWikiToken  
&$tokens array  
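Because $tokens is passed by reference, a shift callback can consume lookahead tokens. The sketch below shows that pattern for escapes with hypothetical `Sketch*` classes; returning a text token rather than a node is a simplification.

```php
<?php
class SketchToken
{
    public $content;

    public function __construct($content)
    {
        $this->content = $content;
    }
}
class SketchEscapeToken extends SketchToken {}
class SketchBoldToken extends SketchToken {}
class SketchTextToken extends SketchToken {}

// Drop the escape token, take the next token off the stream and return
// its content as plain text.
function shiftEscapeToken(SketchToken $token, array &$tokens)
{
    $next = array_shift($tokens);
    if ($next === null) {
        return null; // trailing escape, nothing left to escape
    }
    return new SketchTextToken($next->content);
}

// Remaining stream after the escape character: "**" then "bold?".
$tokens = array(new SketchBoldToken('**'), new SketchTextToken('bold?'));
$result = shiftEscapeToken(new SketchEscapeToken('~'), $tokens);

echo get_class($result), ' ', $result->content, ' ', count($tokens), "\n";
// SketchTextToken ** 1
```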

shiftNewLineToken

mixed shiftNewLineToken( ezcDocumentWikiToken $token, &$tokens )
Shift new line token
Paragraphs are always indicated by multiple new line tokens. When these are detected we just shift a paragraph node, which will then be reduced with prior inline nodes.

Parameters

Name Type Description
$token ezcDocumentWikiToken  
&$tokens array  
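A minimal sketch of that detection follows. The `Sketch*` classes are hypothetical, and shifting an invisible break for a single new line is an assumption made for the example, not documented behaviour.

```php
<?php
class SketchToken {}
class SketchNewLineToken extends SketchToken {}
class SketchTextToken extends SketchToken {}
class SketchParagraphNode {}
class SketchInvisibleBreakNode {}

// Consume all directly following new line tokens. Two or more new lines
// in a row are treated as a paragraph boundary.
function shiftNewLineToken(SketchToken $token, array &$tokens)
{
    $newLines = 1; // the token currently being shifted
    while (isset($tokens[0]) && $tokens[0] instanceof SketchNewLineToken) {
        array_shift($tokens);
        ++$newLines;
    }
    return $newLines > 1
        ? new SketchParagraphNode()
        : new SketchInvisibleBreakNode(); // assumption for single new lines
}

// Two new lines in a row: one being shifted, one still in the stream.
$tokens = array(new SketchNewLineToken(), new SketchTextToken());
$first = shiftNewLineToken(new SketchNewLineToken(), $tokens);

// A single new line followed by text.
$rest = array(new SketchTextToken());
$second = shiftNewLineToken(new SketchNewLineToken(), $rest);

echo get_class($first), ' ', get_class($second), "\n";
// SketchParagraphNode SketchInvisibleBreakNode
```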

shiftTitleToken

mixed shiftTitleToken( ezcDocumentWikiToken $token, &$tokens )
Shift title token
Some wiki markup languages use a second title token at the end of the line instead of just a line break. If we are already inside a line token, we just shift an invisible line break.

Parameters

Name Type Description
$token ezcDocumentWikiToken  
&$tokens array  

shiftWithTokenConversion

mixed shiftWithTokenConversion( ezcDocumentWikiToken $token, &$tokens )
Shift with token conversion
Token-to-node conversions are used for tokens that do not require any additional checking of the token's context. This is especially useful because the wiki tokenizer already implements much of this logic.
The actual conversions are specified in the class property $conversionsArray.

Parameters

Name Type Description
$token ezcDocumentWikiToken  
&$tokens array  

Last updated: Mon, 11 May 2009