The Popcorn OpenMath Representation

Version: 1.0
The SCIEnce EU Project
Editors
Peter Horn, Dan Roozemond
April 2009

Abstract

This document describes the POPCORN representation of OpenMath [20]: a human readable and writable OpenMath Representation that is intended for direct user interaction. The acronym POPCORN stands for ``Possibly Only Practical Convenient OpenMath Replacement Notation'' and is written as ``Popcorn'' for typographic beauty.

In this document, we present Popcorn in a way following the structure of the OpenMath Standard.

Contents

3 OpenMath Encodings
3.4 The Popcorn OpenMath Encoding
3.4.1 A Grammar for the Popcorn Encoding
3.4.2 Description of the Grammar
3.4.3 Popcorn Infix Operators
3.4.4 Popcorn Shortcut Symbols
3.4.5 Examples of Popcorn Encoding
3.4.6 Implementation Note
3.4.7 Summary

3. OpenMath Encodings

3.4 The Popcorn Encoding

In the SCIEnce EU FP6 Project 26133 [21], OpenMath (OM) is heavily used to marshal mathematical objects so that they can be sent between Computer Algebra Systems (CASes). While developing the OM libraries for the CASes, a lot of `manual' inspection and typing of OM was necessary to streamline the code. And whenever `how-to-do' discussions came up, everybody started to use some de-XML'ed OM representation that was easy to understand but by no means standardized.

The fact that the OpenMath language allows for different representations anyway inspired us to develop an OpenMath representation that can easily be handled by humans.

Furthermore, Popcorn is a great starting point when entering the world of OM XML, since you can type your semantics in a more ``what you type is how you think'' way and convert that to ``proper'' OpenMath and vice versa.

3.4.1 A Grammar for the Popcorn Encoding

start           ::= expr
expr            ::= blockExpr
blockExpr       ::= assignExpr (';' assignExpr)*
assignExpr      ::= implExpr (':=' implExpr)?
implExpr        ::= orExpr (('==>' | '<=>') orExpr)?
orExpr          ::= andExpr ('or' andExpr)*
andExpr         ::= relExpr ('and' relExpr)*
relExpr         ::= intervalExpr (('=' | '<' | '<=' | '>' | '>=' | '!=' | '<>') intervalExpr)?
intervalExpr    ::= addExpr ('..' addExpr)?
addExpr         ::= multExpr (('-' | '+') multExpr)*
multExpr        ::= powerExpr (('/' | '*') powerExpr)*
powerExpr       ::= complexExpr ('^' complexExpr)?
complexExpr     ::= rationalExpr ('|' rationalExpr)?
rationalExpr    ::= negExpr ('//' negExpr)?
negExpr         ::= ('-' | 'not') compExpr
                  | compExpr
compExpr        ::= paraExpr
                  | call
                  | ecall
                  | attribution
                  | binding
                  | listExpr
                  | setExpr
                  | anchor
commalist       ::= EMPTYCOMMALIST
                  | expr (',' expr)*
call            ::= anchor '(' commalist ')'
ecall           ::= anchor '!' '(' commalist ')'
listExpr        ::= '[' commalist ']'
setExpr         ::= '{' commalist '}'
foreignExpr     ::= '`' FOREIGN '`'
attribution     ::= anchor '{' attributionList '}'
attributionList ::= attributionPair (',' attributionPair)*
attributionPair ::= expr '->' expr
binding         ::= anchor '[' commalist '->' expr ']'
anchor          ::= atom (':' ID)?
atom            ::= paraExpr
                  | ID
                  | symbol
                  | var
                  | intt
                  | floatt
                  | ref
                  | OMB
                  | FOREIGN
                  | STRING
                  | ifExpr
                  | whileExpr
paraExpr        ::= '(' expr ')'
ifExpr          ::= 'if' expr 'then' expr 'else' expr 'endif'
whileExpr       ::= 'while' expr 'do' expr 'endwhile'
unaryOp         ::= '-' | 'not'
ID              ::= (('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')*)
                  | ('\'' ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' | '.' | ':' | ...)* '\'')
symbol          ::= ID '.' ID
var             ::= '$' ID
ref             ::= lref | GREF
lref            ::= '#' ID
GREF            ::= '##' .+ '##'
OMB             ::= '%' ('a'..'z' | 'A'..'Z' | '0'..'9' | '=')+ '%'
intt            ::= HEXINT | DECINT
DECINT          ::= '0'..'9'+
HEXINT          ::= '0x' ('a'..'f' | 'A'..'F' | '0'..'9')+
floatt          ::= HEXFLOAT | DECFLOAT
HEXFLOAT        ::= '0f' ('a'..'f' | 'A'..'F' | '0'..'9'){8,8}
DECFLOAT        ::= '0'..'9'+ '.' '0'..'9'+ ('e' '-'? '0'..'9'+)?
STRING          ::= '"' .* '"'
FOREIGN         ::= '`' .* '<' .+ '>`'
WS              ::= (' ' | '\t' | '\n' | '\r')+
COMMENT         ::= '/*' .* '*/'
Figure 3.7 Grammar of the Popcorn encoding of OpenMath objects.

Figure 3.7 gives a grammar for the Popcorn encoding, with "start" as the start symbol. Characters enclosed in ' are literals, two literals joined by .. (as in 'a'..'z') denote a character range, and |, +, ?, ., and * have their usual regexp meaning.
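The layering of addExpr, multExpr, and powerExpr is what gives the infix operators their precedence. As an illustration only, the following Python sketch parses a tiny subset of Popcorn (decimal integers, parentheses, and the operators + - * / ^) into nested tuples. It is not the authors' ANTLR parser: the names tokenize, Parser, and parse are hypothetical, the tuple output format is invented for this example, and folding repeated operators to the left is an assumption, since the grammar itself only specifies a flat repetition.

```python
import re

# Tokens for a tiny subset: decimal integers, parentheses, and + - * / ^.
TOKEN = re.compile(r"\s*([0-9]+|[-+*/^()])")


def tokenize(text):
    """Split the input into tokens, skipping whitespace (the WS rule)."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser mirroring three rules of the grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def add_expr(self):
        # addExpr: multExpr (('-' | '+') multExpr)*
        node = self.mult_expr()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.mult_expr())  # assumption: fold left
        return node

    def mult_expr(self):
        # multExpr: powerExpr (('/' | '*') powerExpr)*
        node = self.power_expr()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.power_expr())  # assumption: fold left
        return node

    def power_expr(self):
        # powerExpr: complexExpr ('^' complexExpr)?  (at most one '^')
        node = self.atom()
        if self.peek() == "^":
            self.take()
            node = ("^", node, self.atom())
        return node

    def atom(self):
        # Stand-in for the lower grammar levels: an integer or '(' expr ')'.
        token = self.take()
        if token == "(":
            node = self.add_expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.add_expr()
    if parser.peek() is not None:
        raise ValueError("trailing input")
    return tree
```

With this sketch, parse("1 + 2 * 3") yields ("+", 1, ("*", 2, 3)), because multExpr sits below addExpr in the rule chain. An input such as "2 ^ 3 ^ 4" is rejected, since powerExpr allows at most one ^ without parentheses.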

3.4.2 Description of the Grammar

An OpenMath object is encoded as a sequence of characters, where the tree structure is defined in a rather intuitive way, very much as in programming languages, mathematics, or CASes.

Here is a description of the Popcorn encodings of every kind of OpenMath object:

Integers

can be encoded in two ways, either decimally or hexadecimally. In the decimal representation, an integer is typed simply as a sequence of '0',...,'9' with the most significant digits first.

For the hexadecimal encoding, the hex-digits are typed as '0',...,'9' and 'a',...,'f' (where case does not matter), and the whole number is prefixed with '0x' as in C. Again, the most significant nibbles come first.
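A minimal Python sketch of how the two integer forms could be recognised, following the DECINT and HEXINT rules of Figure 3.7. The function name parse_popcorn_int is hypothetical and not part of any Popcorn library. A leading minus sign is deliberately rejected here because, in the grammar, negation belongs to negExpr rather than to the integer token.

```python
import re

DECINT = re.compile(r"[0-9]+")            # DECINT: '0'..'9'+
HEXINT = re.compile(r"0x[0-9a-fA-F]+")    # HEXINT: '0x' followed by hex digits


def parse_popcorn_int(token):
    """Convert a Popcorn integer token to a Python int."""
    if HEXINT.fullmatch(token):
        return int(token[2:], 16)   # most significant nibble first
    if DECINT.fullmatch(token):
        return int(token, 10)       # most significant digit first
    raise ValueError(f"not a Popcorn integer token: {token!r}")
```

For example, parse_popcorn_int("0x1F") and parse_popcorn_int("31") both return 31.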

Symbols

are encoded as the name of the CD, immediately followed by a dot and the name of the symbol, e.g. arith1.plus. Additionally, for some symbols there is a cdname-free representation, cf. section 3.4.4.

Beyond that, for the application of some symbols there is a `canonical' infix notation available, cf. section 3.4.3.

Variables

are encoded as the name of the variable prefixed with an $.
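The symbol and variable forms can be told apart with two regular expressions derived from the symbol, var, and ID rules. The sketch below is illustrative: classify_atom is a hypothetical name, and only the unquoted form of ID is handled (the quoted form with additional characters is left out).

```python
import re

# Unquoted form of the ID rule only; the quoted form is not handled here.
ID = r"[A-Za-z_][A-Za-z0-9_]*"
SYMBOL = re.compile(rf"({ID})\.({ID})")   # symbol: ID '.' ID
VAR = re.compile(rf"\$({ID})")            # var: '$' ID


def classify_atom(token):
    """Return a tagged tuple for a symbol or variable token, else None."""
    match = SYMBOL.fullmatch(token)
    if match:
        return ("symbol", match.group(1), match.group(2))  # (cd, name)
    match = VAR.fullmatch(token)
    if match:
        return ("variable", match.group(1))
    return None
```

Here classify_atom("arith1.plus") gives ("symbol", "arith1", "plus") and classify_atom("$x") gives ("variable", "x").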

Floating-point numbers

are encoded either by the decimal representation known from the XML encoding, or by the hexadecimal encoding, which has to be prefixed with 0f.
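As a sketch of the decimal form only, the following Python function checks a token against the DECFLOAT rule of Figure 3.7 before converting it. The name parse_popcorn_decfloat is hypothetical. The hexadecimal 0f form is intentionally not covered, since its bit layout is not spelled out in this section.

```python
import re

# DECFLOAT: digits '.' digits, optionally followed by 'e', an optional '-', digits.
DECFLOAT = re.compile(r"[0-9]+\.[0-9]+(e-?[0-9]+)?")


def parse_popcorn_decfloat(token):
    """Convert a decimal Popcorn float token to a Python float."""
    if not DECFLOAT.fullmatch(token):
        raise ValueError(f"not a decimal Popcorn float token: {token!r}")
    return float(token)
```

Note that, under this rule, both the integer part and the fractional part are required, so "1." and "1e5" are not float tokens.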

Character strings

are encoded by wrapping them in ", where the usual substitutions \n, \r, \t, \\, and \" can be used.
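The substitutions listed above can be sketched as a pair of Python helpers. The names encode_popcorn_string and decode_popcorn_string are hypothetical, and rejecting any escape sequence other than the five listed is an assumption of this sketch, not something the text prescribes.

```python
# The five substitutions named in the text: \n, \r, \t, \\ and \".
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\", '"': '"'}


def decode_popcorn_string(token):
    """Strip the surrounding quotes and resolve the escape sequences."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("string token must be wrapped in double quotes")
    body, out, i = token[1:-1], [], 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Assumption: escapes outside the listed five are an error.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError("unsupported escape sequence")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        elif ch == '"':
            raise ValueError("unescaped quote inside string")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def encode_popcorn_string(text):
    """Escape a Python string and wrap it in double quotes."""
    out = text.replace("\\", "\\\\").replace('"', '\\"')
    out = out.replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t")
    return '"' + out + '"'
```

The backslash is escaped first in encode_popcorn_string so that the backslashes introduced by the later replacements are not doubled again.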

Bytearrays

are encoded as a base64 character stream wrapped in %. Whitespace is ignored here.
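A possible Python sketch of the bytearray form, using the standard library's base64 module. The helper names are hypothetical. The sketch removes all whitespace before decoding, as described above, and accepts the standard Base64 alphabet as implemented by Python, which is an assumption about the intended character set.

```python
import base64


def decode_popcorn_bytes(token):
    """Decode a %-wrapped base64 token into bytes, ignoring whitespace."""
    if len(token) < 2 or not (token.startswith("%") and token.endswith("%")):
        raise ValueError("bytearray token must be wrapped in %")
    payload = "".join(token[1:-1].split())  # drop spaces, tabs, newlines
    return base64.b64decode(payload, validate=True)


def encode_popcorn_bytes(data):
    """Encode bytes as a %-wrapped base64 token."""
    return "%" + base64.b64encode(data).decode("ascii") + "%"
```

For instance, encode_popcorn_bytes(b"hello") produces %aGVsbG8=%, and decoding that token (with or without embedded whitespace) returns b"hello".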

Foreign Objects

are encoded as an XML string wrapped in `, where the value of the encoding attribute may precede the actual XML-encoded string, e.g., `xhtml 1.0<h1>text</h1>`.
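One way to separate the optional encoding value from the XML payload is to split at the first <, which matches the shape of the FOREIGN rule in Figure 3.7. The function name split_foreign is hypothetical, and trimming whitespace around the encoding value is an assumption of this sketch.

```python
def split_foreign(token):
    """Split a backtick-wrapped foreign object into (encoding, xml)."""
    if len(token) < 2 or token[0] != "`" or token[-1] != "`":
        raise ValueError("foreign object must be wrapped in backticks")
    body = token[1:-1]
    start = body.find("<")
    if start < 0:
        raise ValueError("foreign object contains no XML element")
    # Everything before the first '<' is taken as the encoding value.
    encoding = body[:start].strip() or None
    return encoding, body[start:]
```

Applied to the example above, split_foreign("`xhtml 1.0<h1>text</h1>`") returns ("xhtml 1.0", "<h1>text</h1>").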

cdbase scopes

are not yet dealt with. They will be added later on by postfixing the cdbase URL after an @.

Applications

are encoded by postfixing the ()-parenthesized list of comma-separated arguments to the object to be applied. The argument list may of course be empty.
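To show how an application lines up with the XML encoding, here is a small Python sketch that renders the same object both as Popcorn text and as OpenMath XML (OMA, OMS, OMV, and OMI are the element names of the OpenMath 2.0 XML encoding). The tuple-based tree format and the names to_popcorn and to_xml are invented for this example, only non-negative integers are handled, and the space after each comma is a cosmetic choice that assumes whitespace between tokens is skipped.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree format used only in this sketch:
#   ("sym", cd, name)  ("var", name)  ("int", n)  ("app", head, [args])


def to_popcorn(node):
    """Render a tree as Popcorn text."""
    kind = node[0]
    if kind == "sym":
        return f"{node[1]}.{node[2]}"
    if kind == "var":
        return f"${node[1]}"
    if kind == "int":
        return str(node[1])  # assumption: non-negative integers only
    if kind == "app":
        args = ", ".join(to_popcorn(arg) for arg in node[2])
        return f"{to_popcorn(node[1])}({args})"
    raise ValueError(f"unknown node kind: {kind!r}")


def to_xml(node):
    """Render the same tree as an OpenMath XML element."""
    kind = node[0]
    if kind == "sym":
        return ET.Element("OMS", {"cd": node[1], "name": node[2]})
    if kind == "var":
        return ET.Element("OMV", {"name": node[1]})
    if kind == "int":
        element = ET.Element("OMI")
        element.text = str(node[1])
        return element
    if kind == "app":
        element = ET.Element("OMA")
        element.append(to_xml(node[1]))      # the object being applied
        for arg in node[2]:
            element.append(to_xml(arg))      # then its arguments, in order
        return element
    raise ValueError(f"unknown node kind: {kind!r}")


expr = ("app", ("sym", "arith1", "plus"), [("var", "x"), ("int", 1)])
popcorn_text = to_popcorn(expr)   # "arith1.plus($x, 1)"
xml_element = to_xml(expr)        # <OMA> with OMS, OMV, OMI children
```

An empty argument list simply renders as a pair of parentheses, so an application of a symbol a.b to no arguments becomes a.b().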

Bindings

are encoded by appending to the binding operator a []-bracketed pair consisting of the comma-separated bound variables and the bound expression. Inside the brackets, the variables and the expression are separated by ->.

Attributions

are encoded by appending a {}-braced, comma-separated list of attribution pairs. The two parts of each attribution pair are separated by ->.
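The binding and attribution forms can be illustrated with two small string builders in Python. The function names are hypothetical, the arguments are assumed to be already-rendered Popcorn fragments, and the symbol mycd.colour is a made-up name used purely for illustration.

```python
def render_binding(binder, variables, body):
    """binding: anchor '[' commalist '->' expr ']'"""
    return f"{binder}[{', '.join(variables)} -> {body}]"


def render_attribution(obj, pairs):
    """attribution: anchor '{' attributionList '}'"""
    if not pairs:
        raise ValueError("an attribution needs at least one pair")
    inner = ", ".join(f"{left} -> {right}" for left, right in pairs)
    return f"{obj}{{{inner}}}"


# A lambda binding over $x, and $x attributed with a hypothetical symbol.
lam = render_binding("fns1.lambda", ["$x"], "arith1.plus($x, 1)")
tagged = render_attribution("$x", [("mycd.colour", '"red"')])
```

Under these assumptions, lam is fns1.lambda[$x -> arith1.plus($x, 1)] and tagged is $x{mycd.colour -> "red"}.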

Errors

are encoded as an application where a ! is inserted between the error symbol and the parentheses.

Internal References

are encoded by prefixing the local name with #.

External References

are encoded by wrapping the external url in ##.

IDs

are set by postfixing the object with :theid.
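The remaining forms (errors, internal and external references, and ids) are all simple prefix or postfix decorations, as the following Python sketch shows. All function names are hypothetical, the arguments are assumed to be already-rendered Popcorn fragments, and the symbol names and URL in the usage lines are placeholders chosen for illustration.

```python
def render_error(symbol, args):
    """ecall: anchor '!' '(' commalist ')'"""
    return f"{symbol}!({', '.join(args)})"


def render_local_ref(name):
    """lref: '#' ID"""
    return f"#{name}"


def render_external_ref(url):
    """GREF: '##' .+ '##'"""
    if not url:
        raise ValueError("an external reference needs a non-empty URL")
    return f"##{url}##"


def render_with_id(obj, ident):
    """anchor: atom followed by ':' and an ID"""
    return f"{obj}:{ident}"


# Illustrative values only.
err = render_error("aritherror.DivisionByZero", ["arith1.divide($x, 0)"])
shared = render_with_id("arith1.plus($x, 1)", "sum")
back_ref = render_local_ref("sum")
remote = render_external_ref("http://example.org/objects.omobj#sum")
```

Here an object is first given the id sum and can then be referred to elsewhere in the same object as #sum.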

3.4.3 Popcorn Infix Operators

As a convenience addition, there are several infix operators defined in Popcorn:

3.4.4 Popcorn Shortcut Symbols

As another convenience addition, for some commonly used symbols there is a cdname-free version available:

3.4.5 Examples of Popcorn Encoding

To clarify the concept of Popcorn, we give a couple of examples:

3.4.6 Implementation Note

A Popcorn parser and a Popcorn renderer are implemented in the symcomp.org Java package written by the authors. The parser is built with the ANTLR v3 parser generator, so the grammar files can easily be used to produce, e.g., C output as well.

Since the library is published under an Apache 2 license, it may be freely used, although we appreciate every kind of feedback.

3.4.7 Summary

Popcorn offers an easy entry into the world of OpenMath, by allowing humans to easily read and write OpenMath objects.

Notice: We want to stress, again, that Popcorn is not intended as a 'global' replacement for the other OM representations, but merely as a more intuitive option to deal with OM whenever human interaction is required.

Appendix G
Bibliography

[20] OpenMath Consortium. OpenMath Version 2.0, June 2004.
http://www.openmath.org/standard/om20-2004-06-30/omstd20html-0.xml

[21] Symbolic Computation Infrastructure for Europe.
http://www.symbolic-computation.org