The content of this dataset is licensed to SIGMathLing members for research and tool development purposes.
This project aims to create a dataset for the grounding of formulae.
As a trial, this dataset consists of a single annotated long paper (20 pages in PDF):
The original XHTML file of the paper was taken from the arXMLiv:08.2018 dataset, and we manually annotated all 937 identifiers (i.e., <mi> tags) in the document with their corresponding mathematical objects (meanings).
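As a quick sanity check on the identifier count, the <mi> elements in an XHTML file can be counted with the Python standard library. The following is a minimal sketch, not part of the dataset tooling: it assumes the file is well-formed XML and that the math is in the standard MathML namespace, and the function name `count_identifiers` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard MathML namespace; assumed to be the one used in the XHTML file.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def count_identifiers(xhtml_text: str) -> int:
    """Count MathML <mi> elements in a well-formed XHTML string."""
    root = ET.fromstring(xhtml_text)
    # ElementTree addresses namespaced tags as "{namespace}localname".
    return sum(1 for _ in root.iter(f"{{{MATHML_NS}}}mi"))


# Hypothetical two-identifier sample, not taken from the annotated paper.
sample = """<html xmlns="http://www.w3.org/1999/xhtml">
<body><p>
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>+</mo><mi>y</mi></math>
</p></body></html>"""

print(count_identifiers(sample))
```

To apply this to a real file, read its contents into a string and pass it to `count_identifiers`. Files that rely on named entities without a DTD may need a more lenient parser.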
The annotation was performed with our open-source annotation tool, MioGatto, which is also suitable for viewing the data. Please refer to its documentation for details.