For CROHME 2013, we (R. Zanibbi and H. Mouchère) have created libraries for use by participants, and for others interested in using the CROHME data and tools for their research in the future. This year, CROHME participants may create symbol layout trees in a label graph file, and then have these automatically converted to CROHME .inkml and 'normalized' label graph formats (details are provided below). We are providing two libraries:
- LgEval (Label Graph Evaluation Library): a Python-based library for creating, comparing and visualizing label graphs (i.e. labeled adjacency matrices) representing objects (i.e. segments) and their structure. For CROHME, these are math symbols and their spatial relationships (see more on this below). LgEval uses Comma-Separated Values (CSV) files to represent label graphs and evaluation metrics. A script (evaluate) is provided for automatically evaluating label graph files, compiling metrics and visualizing errors.
- CROHMELib: a set of tools for converting between different file formats representing stroke and recognition results. CROHME uses a custom .inkml (XML) file format, with MathML used to represent symbol layout, and tags and annotations for sampled points in strokes, and the segmentation of strokes into symbols with their associated class. CROHMELib also provides grammars for the competition, a parser, and the CROHME perl-based evaluation script for CROHME .inkml files (evalInkml_v1.X.pl).
Contents
- Installation
- CROHME InkML File Format
- Label Graph File Format
- Tools
- Symbol Classes and Relationships for CROHME 2013
Installation
To install CROHMELib and LgEval on your system, follow the steps below.
- Make sure that you have python (recommended: 2.6 or 2.7) and perl installed on your system, and install the TXL programming language. All three languages are available for a wide variety of platforms, and python and perl are normally installed by default on Unix-based systems (including Linux and MacOS X). Under Windows, Cygwin provides similar command-line facilities.
- Check that GraphViz is installed on your system.
- Download the .zip archives
- CROHMELib 0.1.12 (updated May 3: small correction in the evalInkML script to handle COMMAs)
- LgEval 0.2.11 (updated June 5, 2013)
Uncompress these on your system.
To run the scripts from the command line without typing their full paths (assuming that you are using the bash shell), add the following to the .bashrc file in your home directory:
export CROHMELibDir=[path_to_CROHMELib]
export LgEvalDir=[path_to_LgEval]
export PATH=$PATH:$CROHMELibDir/bin:$LgEvalDir/bin
CROHME InkML File Format
The CROHME .inkml file format is XML-based, and has the following sections:
- Header: identifies inkml format, and annotation (<annotation>, <traceFormat>) tags provide a unique identifier for each expression along with the ground truth (i.e. target recognition results) in LaTeX format. Additional optional annotations may be provided, e.g. to describe the format used for pen samples, demographic information on the writer, etc.
- MathML: this represents the structure of the math expression in Presentation MathML format (this represents symbol layout in a manner akin to LaTeX). MathML can be rendered directly by most web browsers (e.g. as used in the CROHME InkML Viewer). The MathML used for CROHME includes annotations that refer to symbols in the symbol list at the end of the file (given as xml:id attributes in symbol tags).
- Pen Strokes: <trace> tags provide an identifier for each stroke, along with a series of (x,y) pen coordinates. Note: expressions have been collected from different labs, on different devices, and in different countries, so a variety of coordinate representations are used, including negative, floating point and integer coordinates. CROHME participants will want to carefully consider how to represent pen strokes across these representations.
- Symbol List: At the end of a CROHME .inkml file is a list of segments (i.e. symbols), each given by a symbol label and a list of pen strokes (<traceView> elements). Each symbol has an identifier that is used in the MathML structure representation (given by <annotationXML> tags). There is an outer <traceGroup> that contains a <traceGroup> for each symbol.
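As a rough illustration of the stroke and symbol sections described above, the following sketch reads strokes and symbol groupings with Python's standard XML parser. The sample document is hypothetical and heavily trimmed (it is not a real CROHME file), and the attribute names used here (type="truth", traceDataRef) and the comma-separated point syntax are assumptions that should be checked against the actual data files.

```python
import xml.etree.ElementTree as ET

INK_NS = {"ink": "http://www.w3.org/2003/InkML"}

# Hypothetical, trimmed sample for illustration only (a one-stroke '2').
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="truth">$2$</annotation>
  <trace id="0">10 20, 12.5 -3, 15 30</trace>
  <traceGroup xml:id="1">
    <traceGroup xml:id="2">
      <annotation type="truth">2</annotation>
      <traceView traceDataRef="0"/>
    </traceGroup>
  </traceGroup>
</ink>"""


def read_strokes(root):
    """Map each stroke identifier to its list of (x, y) points as floats."""
    strokes = {}
    for trace in root.findall("ink:trace", INK_NS):
        points = []
        for sample in trace.text.strip().split(","):
            fields = sample.split()
            # Coordinates may be negative, integer or floating point.
            points.append((float(fields[0]), float(fields[1])))
        strokes[trace.get("id")] = points
    return strokes


def read_symbols(root):
    """Return (label, stroke identifiers) for each inner traceGroup."""
    symbols = []
    for group in root.findall("ink:traceGroup/ink:traceGroup", INK_NS):
        label = group.find("ink:annotation", INK_NS).text
        stroke_ids = [view.get("traceDataRef")
                      for view in group.findall("ink:traceView", INK_NS)]
        symbols.append((label, stroke_ids))
    return symbols


root = ET.fromstring(SAMPLE)
strokes = read_strokes(root)
symbols = read_symbols(root)
```

Converting every coordinate to a float is one simple way to treat the mixed coordinate representations uniformly; any rescaling or resampling is left to the participant.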
Label Graph File Format
A Label Graph is a labeled adjacency matrix representation for a graph (for more on the representation, please see our paper Evaluating structural pattern recognition for handwritten math via primitive label graphs from DRR 2013). For example, consider the expression '2+2,' written using one stroke for each 2, and two strokes for the '+' (one vertical stroke, and one horizontal stroke). To describe the layout of symbols in this expression from the set of strokes, we need to define:
- How strokes are grouped into symbols (i.e. stroke segmentation), and
- Which class each symbol has, and
- How symbols are spatially arranged.
In a label graph, we use adjacency matrices to represent relationships between all stroke pairs:
- Segmentation: an asterisk (*) label is defined between all pairs of strokes belonging to a symbol. This defines an undirected edge between all strokes in a symbol.
- Classification: a symbol's class is associated with each of its strokes, represented by a self-edge in the adjacency matrix (i.e. along the main diagonal). We represent symbol classes as node rather than edge labels (see example below).
- Structure (symbol relationships): a relationship between two symbols is represented by a labeled edge from each stroke in the first symbol to every stroke in the second symbol. Note that for math notation, spatial relationships are directed (i.e. hierarchical). In our example, both strokes of the '+' have an at-right (R) edge to the stroke for the 2 on the right.
In LgEval, an 'undefined' label (e.g. no class, or no relationship between a pair of strokes) is represented by an underscore ('_'). Using stroke identifiers s1-s4, the label graph and associated adjacency matrix ('label matrix') for our '2+2' example looks like this:
\( \left[\begin{array}{cccc} 2 & R & R & R \\ \_ & + & * & R \\ \_ & * & + & R \\ \_ & \_ & \_ & 2\\ \end{array}\right] \)
Inherited Relationships: you may be wondering about the 'R' edge between the leftmost and rightmost '2.' Symbol layout in math expressions is often represented in the form of a tree (e.g. in LaTeX or MathML). So 2 + 2 can be represented by '\(2 \rightarrow + \rightarrow 2\),' the left-to-right ordering of symbols along the baseline. Note that the label graph represents the fact that both the '+' and the rightmost '2' are right of the leftmost 2, i.e. the relationship is inherited along the tree. Inheriting spatial relationships allows us, for example, to identify that if we mis-recognize \(2^{a_i}\) as \(2^{ai}\), then while the relationship between 'a' and 'i' is incorrect, 'i' is still correctly within the superscript region of the 2. You do not need to create inherited relationships in your .lg outputs for the CROHME competition (see below, under 'Tools to the rescue!').
For vertical structures such as nested fractions, the dominant operator of the sub-expression must possess the incoming and outgoing right-of edges to sub-expressions adjacent on the baseline; this is identical to the representation used for LaTeX or MathML. As an example, below is a fraction and its associated layout tree. Note: for simplicity, here the primitives (nodes) are connected components (one node per symbol). Relationships are labeled by (A)bove, (B)elow, and at (R)ight.
\(\Huge \frac{\frac{a}{2} + \frac{b}{3}}{c}\)
In the (final) label graph representation, each symbol in the symbol layout tree above will inherit all relationships from its ancestors; for example, all symbols above the widest fraction line inherit the Above ('A') relationship between the widest fraction line and the leftmost fraction line of the numerator, with an edge labeled 'A' between the widest line and all the strokes in symbols above it.
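The inheritance rule described above can be sketched at the symbol level as follows. This is only an illustration of the idea, not the code used by mergeLgCrohme or convertLgCrohme; the function name and the (parent, child, label) edge representation are assumptions made for the example.

```python
from collections import defaultdict


def inherit_relationships(tree_edges):
    """Expand layout-tree edges so descendants inherit ancestor relationships.

    tree_edges: list of (parent, child, label) tuples over symbols.
    Returns a set of (ancestor, descendant, label) tuples, in which the
    label on the edge parent -> child is copied to every descendant of child.
    """
    children = defaultdict(list)
    for parent, child, _ in tree_edges:
        children[parent].append(child)

    def descendants(node):
        found = []
        stack = list(children[node])
        while stack:
            current = stack.pop()
            found.append(current)
            stack.extend(children[current])
        return found

    inherited = set()
    for parent, child, label in tree_edges:
        inherited.add((parent, child, label))
        for node in descendants(child):
            inherited.add((parent, node, label))
    return inherited


# Layout tree for '2 + 2' along the baseline: 2 -> + -> 2.
baseline = inherit_relationships([("2_left", "+", "R"), ("+", "2_right", "R")])

# Layout tree for 2^{a_i}: 'a' is a superscript of 2, 'i' is a subscript of 'a'.
nested = inherit_relationships([("2", "a", "Sup"), ("a", "i", "Sub")])
```

In the second example, 'i' receives a Sup edge from the 2 in addition to its Sub edge from 'a', matching the superscript discussion above. To obtain a stroke-level label graph, each symbol-level edge would still need to be expanded to all stroke pairs of the two symbols.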
CSV Format (.lg files): Here is the CSV file, 2p2.lg for our '2+2' example. The file defines the labeled nodes and edges in the label graph shown above. Stroke/node labels are identified by N, and edges between strokes by E.
N, s1, 2
N, s2, +
N, s3, +
N, s4, 2
E, s2, s3, *
E, s3, s2, *
E, s1, s2, R
E, s1, s3, R
E, s1, s4, R
E, s2, s4, R
E, s3, s4, R
Labels not specified in the file are automatically interpreted as undefined by LgEval. By default LgEval will generate label weights for all nodes and edges of value '1.0,' but you do not need to provide label weights in your .lg files.
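To make the link between the 2p2.lg listing and the label matrix concrete, here is a minimal sketch that parses the N and E lines and rebuilds the matrix, filling unspecified entries with '_'. It is not LgEval's own reader; the function names are hypothetical, and only the two line types shown above are handled.

```python
import csv
import io

LG_TEXT = """N, s1, 2
N, s2, +
N, s3, +
N, s4, 2
E, s2, s3, *
E, s3, s2, *
E, s1, s2, R
E, s1, s3, R
E, s1, s4, R
E, s2, s4, R
E, s3, s4, R
"""


def parse_lg(text):
    """Return (nodes, edges) from the N and E lines of an .lg file."""
    nodes, edges = {}, {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        fields = [field.strip() for field in row]
        if not fields:
            continue  # skip blank lines
        if fields[0] == "N":
            nodes[fields[1]] = fields[2]
        elif fields[0] == "E":
            edges[(fields[1], fields[2])] = fields[3]
    return nodes, edges


def label_matrix(nodes, edges):
    """Build the label matrix: node labels on the diagonal, '_' if undefined."""
    ids = sorted(nodes)
    return [[nodes[a] if a == b else edges.get((a, b), "_") for b in ids]
            for a in ids]


nodes, edges = parse_lg(LG_TEXT)
matrix = label_matrix(nodes, edges)
for row in matrix:
    print(" ".join(row))
```

The rows are ordered by sorted stroke identifier (s1 to s4), so the printed matrix lines up with the label matrix given for the '2+2' example.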
Tools to the rescue! Participants can produce label graphs defining layout trees similar to that for the fraction example above (with additional nodes for multi-stroke symbols), and then use tools in CROHMELib (specifically, mergeLgCrohme and convertLgCrohme) to convert an .lg file containing a layout tree into a CROHME .inkml file and another .lg file in which all spatial relationships have been inherited.
Tools
Below is a brief summary of the tools available from LgEval and CROHMELib. The tools were created to make it easier to produce recognition output and view recognition results in the CROHME InkML Viewer, allow new stroke-level as well as object level metrics to be computed from label graphs, and to provide a toolset for converting between CROHME InkML, label graph, and other (e.g. from MathBrush) file formats. Additional documentation may be found by calling the scripts shown without arguments, and in the README files provided with the two libraries.
Note: if you use label graphs and/or LgEval for projects other than the CROHME 2013 competition, we would appreciate it if you would cite our paper published at Document Recognition and Retrieval XX (2013), Evaluating structural pattern recognition for handwritten math via primitive label graphs.
LgEval
LgEval is python-based. The programs in the library are described in the README included in the LgEval download.
- evaluate: takes two directories as input, one containing label graph files to be evaluated, and a second directory containing (identically named) label graph files providing ground truth. Metrics, errors, summaries, and visualizations of recognition errors are produced by the script, and stored in a new directory. Note: user .lg files should first be converted to the 'normalized' .lg format with inherited relationships using mergeLgCrohme or convertLgCrohme (batch conversion) prior to running evaluate.
- lg2dot: used to transform a label graph to a .dot (GraphViz) file, which is then rendered as a .pdf file. If two graphs are provided, the differences between the graphs will be shown in the generated graph. Different graph types are available (layout tree (default), symbol DAG, primitive segmentation graph, primitive bipartite graph).
- lg2mml: converts a label graph to MathML.
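To give a feel for what a stroke-level comparison of two label graphs involves, the sketch below counts disagreeing node labels and disagreeing directed edge labels between an output graph and ground truth defined over the same strokes. This is a simplified illustration and not the evaluate script's implementation; the function name is hypothetical, and the metrics actually reported by LgEval are described in its README and the DRR 2013 paper.

```python
def label_disagreements(nodes_out, edges_out, nodes_gt, edges_gt):
    """Count differing node labels and differing directed edge labels.

    Missing edges are treated as the undefined label '_'.
    """
    if set(nodes_out) != set(nodes_gt):
        raise ValueError("both graphs must be defined over the same strokes")
    ids = sorted(nodes_gt)
    node_errors = sum(1 for s in ids if nodes_out[s] != nodes_gt[s])
    edge_errors = sum(
        1
        for a in ids
        for b in ids
        if a != b and edges_out.get((a, b), "_") != edges_gt.get((a, b), "_")
    )
    return node_errors, edge_errors


# Ground truth for '2+2' (normalized, with inherited relationships).
truth_nodes = {"s1": "2", "s2": "+", "s3": "+", "s4": "2"}
truth_edges = {
    ("s2", "s3"): "*", ("s3", "s2"): "*",
    ("s1", "s2"): "R", ("s1", "s3"): "R", ("s1", "s4"): "R",
    ("s2", "s4"): "R", ("s3", "s4"): "R",
}

# Hypothetical output that splits the '+' into two separate symbols.
output_nodes = {"s1": "2", "s2": "1", "s3": "-", "s4": "2"}
output_edges = {
    ("s1", "s2"): "R", ("s1", "s3"): "R", ("s1", "s4"): "R",
    ("s2", "s3"): "R", ("s2", "s4"): "R", ("s3", "s4"): "R",
}

errors = label_disagreements(output_nodes, output_edges, truth_nodes, truth_edges)
print(errors)
```

Here the two mislabeled strokes give two node disagreements, and the missing segmentation edges between s2 and s3 give two edge disagreements.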
CROHMELib
Each of the programs below may be found in the bin/ directory after uncompressing the CROHMELib archive. Additional details are provided in a README.
- mergeLgCrohme: takes a label graph file and the CROHME .inkml file whose stroke data the label graph describes, and produces a CROHME .inkml file as output (complete with generated MathML and symbol segmentation and classification information). This program will also produce 'normalized' .lg output, where inherited spatial relationships are created automatically.
- crohme2lg.pl: (perl script) produces a label graph from the MathML and symbol segmentation information in a CROHME .inkml file.
- mb2crohme: converts MathBrush files with the same symbol set as CROHME 2013 to a CROHME .inkml file (MathBrush data set is available separately).
- convertLgCrohme, convertCrohmeLg and convertMathBrush are designed for batch-converting large numbers of files.
- xmlGrammar2txt converts XML-format grammars for CROHME to a human-readable format.
- There are also TXL programs to 'pretty print' CROHME .inkml and MathML files in src/ (pprintCROHME.Txl, pprintMathML.Txl).
- The perl CROHME .inkml evaluation script evalInkml_v1.X.pl is included, along with startTestPhase.pl, used for batch testing.
- tokenAndParse.pl is provided for testing the validity of CROHME .inkml files for a given CROHME XML grammar.
Symbol Classes and Relationships for CROHME 2013
Spatial Relationships
CROHME 2013's training and test sets include single expressions that do not include matrices or other tabular/grid structures. The spatial relationships include the following (the labels to be used for label graphs/LgEval are shown in parentheses):
- (R)ight
- (A)bove
- (B)elow
- (I)nside (for square root)
- (Sup)erscript
- (Sub)script
nth-Roots (update, March 25): some cube roots have been found in the training data set, e.g. \( \sqrt[3]{x} \). The representation we are using has the 'nth-root' (here the 3) as (A)bove the square root, with the contents of the radical ('root' symbol) represented using the (I)nside relationship as before.
Limits: Limits of an integral or summation (\( \int \) or \( \sum \)) should be designated as (A)bove/(Sup)erscript and (B)elow/(Sub)script, consistent with the locations of the limits relative to the operator (i.e. the location of the limits matters).
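Before submitting label graphs, it may be useful to check that every edge uses one of the labels listed above (plus '*' for segmentation). The following is a small sanity-check sketch; the function name is hypothetical, and how LgEval itself treats other labels is not specified here.

```python
# Relationship labels listed above, plus '*' for strokes in the same symbol.
ALLOWED_EDGE_LABELS = {"R", "A", "B", "I", "Sup", "Sub", "*"}


def invalid_edges(edges):
    """Return the (stroke pair, label) entries whose label is not allowed."""
    return sorted(
        (pair, label)
        for pair, label in edges.items()
        if label not in ALLOWED_EDGE_LABELS
    )


# Cube root of x with one stroke per symbol: s1 = radical, s2 = '3', s3 = 'x'.
# The index 3 is (A)bove the radical and x is (I)nside it.
cube_root_edges = {("s1", "s2"): "A", ("s1", "s3"): "I"}

# A hypothetical graph with one misspelled label.
bad_edges = {("s1", "s2"): "A", ("s1", "s3"): "Inside"}

print(invalid_edges(cube_root_edges))
print(invalid_edges(bad_edges))
```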
Symbols
Participants should use the LaTeX names for symbols as given below. There are 102 symbol classes in total (Note: this has changed from the first release). Symbols were rendered from their LaTeX codes below using MathJax.
Note: fraction lines and subtraction signs are both represented by '-'; \times and X will also be treated as synonyms. For the function names given below, all strokes belonging to the function name should be grouped into a single 'symbol' labeled by the function name. COMMA is used instead of ',' to avoid conflicts with separators in CSV files.
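The labeling conventions in the note above can be sketched as two small helpers: one that writes COMMA in place of ',' when producing .lg node lines, and one that compares classes while treating \times and X as synonyms. Both function names are hypothetical, and choosing \times as the canonical form is an arbitrary assumption made for the comparison only.

```python
def lg_label(symbol):
    """Return the label to write in an .lg file for a recognized symbol."""
    # A literal comma would clash with the CSV field separator.
    return "COMMA" if symbol == "," else symbol


def same_class(label_a, label_b):
    """Compare two class labels, treating \\times and X as synonyms."""
    def canonical(label):
        return "\\times" if label == "X" else label

    return canonical(label_a) == canonical(label_b)


# Writing node lines for three single-stroke symbols: 1 , 2
recognized = [("s1", "1"), ("s2", ","), ("s3", "2")]
lines = ["N, {}, {}".format(stroke, lg_label(symbol))
         for stroke, symbol in recognized]
print("\n".join(lines))
```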
- Digits
0-9
- Letters
a-z, A-C, E-I, L-N, P, R-T, V, X, Y
- Greek Letters
\alpha, \beta, \gamma, \lambda, \phi, \pi, \theta, \sigma, \mu, \Delta
\(\alpha, \beta, \gamma, \lambda, \phi, \pi, \theta, \sigma, \mu, \Delta\)
- Arithmetic Operators
+, -, \pm, \div, !, \times, /
\( +, -, \pm, \div, !, \times, / \)
- Logical Operators
\rightarrow, |, \forall, \exists
\(\rightarrow, |, \forall, \exists\)
- Set Operators
\in
\( \in\)
- Operators with Limits
\sum, \int
\(\sum, \int\)
- Functions and Relations
=, \neq, \lt, \leq, \gt, \geq, \log, \sin, \cos, \tan, \lim
\( =, \neq, \lt, \leq, \gt, \geq, \log, \sin, \cos, \tan, \lim \)
- Fence symbols
(, ), \{, \}, [, ]
- Other symbols
\infty, COMMA, ., \ldots, \cdots
\( \infty, COMMA, ., \ldots, \cdots\)
Expression Grammar
A text file containing the list of symbols along with the grammar defining legal symbol layout trees for the CROHME competition is included in CROHMELib, and is also available through the CROHME 2013 web pages. All training and test expressions will be consistent with the grammar provided.
Contact
If you have questions about CROHMELib or LgEval (or find bugs!), please contact Richard Zanibbi (rlaz@cs.rit.edu, Rochester Institute of Technology, USA) or Harold Mouchère (harold.mouchere@univ-nantes.fr, University of Nantes, France).
Last Updated: March 26, 2013