MathDeck is a first-of-its-kind search interface designed to make it easier to edit, reuse, annotate, share, and search formulas. The system is built around a formula 'chip and card' metaphor inspired by board games. Recently, the system was extended to search formulas extracted from PDF files in the ACL Anthology, with matches shown in place within search results.
(Funded by NSF and the Alfred P. Sloan Foundation)
SIGIR 2023 (demo page - poster - video); CHI 2021 demo video; ECIR 2020 demo video - online system (Google Chrome strongly recommended)
We have created a number of state-of-the-art formula search engines, including models based on graphs representing operations and symbol placement on writing lines (Tangent-S and Approach0), on symbol locations in rendered formulas (PHOC), and on vectors produced by neural networks (Tangent-CFT).
(Funded by NSF and the Alfred P. Sloan Foundation)
(Video overview ECIR 2021) (ARQMath web page)
ARQMath is a cooperative evaluation exercise aiming to advance math-aware search and the semantic analysis of mathematical notation and texts. The lab ran for three years, at CLEF 2020, 2021, and 2022.
ARQMath has two main tasks, one for answer retrieval and one for formula retrieval. In the third year, a task for open-domain question answering was added, where answers may come from any source or may be generated (e.g., by GPT-3). The ARQMath collection has become the largest collection of its kind, with over 200 search topics annotated for the answer retrieval and formula retrieval tasks.
(Funded by NSF and the Alfred P. Sloan Foundation)
AccessMath tracks whiteboard contents in lecture videos through time and generates keyframes. 'Ink' in keyframes can be used to jump to where that 'ink' is drawn in the video, or searched using image queries through the Tangent-V search engine. (NSF-funded project - Video Demo)
The first multi-modal equation editing prototype, with support for handwriting, typing, and image input for formulas, along with support for math + keyword search.
(NSF-funded project; Video Demo)
A prototype that supports searching lecture videos for spoken keyterms (within-speaker), using a modified Dynamic Time Warping algorithm.
(NSF-funded project; Example Results)
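For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the classic, unmodified algorithm. It is not the modified variant used in the prototype, whose details are not given here. The function name `dtw_distance`, the scalar inputs, and the absolute-difference local cost are assumptions made for illustration; a spoken-term system would more plausibly compare sequences of acoustic feature vectors.

```python
from math import inf


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW alignment cost between two sequences.

    Illustrative only: no windowing, slope constraints, or subsequence
    matching, all of which a search system might add.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: advance in a only, advance in b only,
            # or advance in both sequences.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because one element may align with several elements of the other sequence, two sequences that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, receive a cost of zero.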
Back in 2008, Kurt Kluever created the first video CAPTCHA, in which users enter three words describing a video.
(Funded by Xerox Corporation)
The dprl has participated in every CROHME competition since its inception in 2011, and began co-organizing the competition in 2013. CROHME has become a standard benchmark for handwritten formula recognition. The last CROHME the lab helped organize was held in 2019, and added a typeset formula detection task. (The next CROHME was held in 2023, for ICDAR 2023.)
ChemScraper is an online tool for extracting molecules from PDFs. Unlike previous molecule extraction tools, it locates and recognizes diagrams that were created with vector graphics (i.e., explicit drawing instructions in the PDF) or images, and generates CDXML (ChemDraw) and SMILES files as output.
The first 'alpha' version was released in June 2023, and we have continued to improve the system. ChemScraper was created through a collaborative effort between the dprl, the Denmark Lab at UIUC, and NCSA. (ICDAR 2024 journal paper preprint)
(Funded by NSF through the MMLI AI Center)
Room 70-3500, GCCIS
Dept. Computer Science
Rochester Inst. Technology
Rochester, NY, 14623-5608
USA
Email: rxzvcs@rit.edu
Phone: +1 (585) 475-4536
Fax: +1 (585) 475-4935
© Copyright 2019 - 2024 dprl@RIT - All Rights Reserved