Nougat is a visual transformer model from Meta AI that converts document images, including complex math equations, into structured text, advancing the parsing of academic papers.
You can now try out nougat-base in the Clarifai Platform and access it through the API.
Table of Contents
- Introduction
- Model Architecture
- Running Nougat model with Python
- Running Nougat model with Javascript
- Best Use Cases
Introduction
Nougat is a visual transformer model developed by researchers at Meta AI that can convert images of document pages into structured text. It takes a scanned image of a document page as input and outputs text in a lightweight markup language.
The key advantage of Nougat is that it relies solely on the document image and does not need any OCR text. This allows it to properly recover semantic structure such as math equations. It is trained on hundreds of thousands of academic papers from arXiv and PubMed to learn the patterns of research paper formatting and language.
Model Architecture
Nougat uses a visual transformer encoder-decoder architecture. The encoder uses a Swin Transformer to encode the document image into latent embeddings. The Swin Transformer processes the image hierarchically using shifted windows. The decoder then generates the output text tokens autoregressively, using self-attention over the tokens generated so far and cross-attention over the encoder outputs.
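To make the autoregressive step concrete, here is a minimal toy sketch of a decoding loop. It is not Nougat's actual implementation: the `step` function, the integer "encoder output", the token ids, and the use of greedy selection are all assumptions made for illustration. The point is only that each chosen token is appended to the sequence and fed back in to predict the next one.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step: Callable[[Sequence[int], List[int]], List[float]],
    encoder_output: Sequence[int],
    bos_id: int,
    eos_id: int,
    max_len: int = 16,
) -> List[int]:
    """Generate token ids one at a time, feeding each choice back into `step`."""
    tokens = [bos_id]
    for _ in range(max_len):
        # A real decoder would attend over `tokens` and `encoder_output` here.
        scores = step(encoder_output, tokens)
        # Greedy selection (an assumption): pick the highest-scoring token id.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_step(encoder_output: Sequence[int], tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a decoder: copy the encoder values, then stop."""
    vocab_size = 5
    position = len(tokens) - 1  # number of tokens generated after BOS
    target = encoder_output[position] if position < len(encoder_output) else 1
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


print(greedy_decode(toy_step, [2, 3, 4], bos_id=0, eos_id=1))  # [0, 2, 3, 4, 1]
```

In the real model, `step` would be the transformer decoder producing scores over the text vocabulary, and the loop would end at an end-of-sequence token or a length limit.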
Running Nougat model with Python
You can run Nougat with Clarifai's Python SDK in just a few lines of code. To get started, sign up for Clarifai and get your Personal Access Token (PAT) by following Clarifai's instructions.
Export your PAT as an environment variable:
export CLARIFAI_PAT={your personal access token}
Then call the model through the SDK to run a prediction.
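A minimal sketch of running the model with the Python SDK follows. It makes several assumptions: that the `clarifai` package is installed, that its `Model` class accepts a model URL and exposes `predict_by_url` as in the SDK versions we are aware of, and that the model lives at the URL in `MODEL_URL`. The helper name `run_nougat` is hypothetical. Check the current SDK documentation before relying on the exact signatures.

```python
import os

# Assumed model URL on the Clarifai Platform (user / app / models / model id).
MODEL_URL = "https://clarifai.com/facebook/nougat/models/nougat-base"


def run_nougat(image_url: str, model_url: str = MODEL_URL) -> str:
    """Send one page image to the hosted model and return the generated markup."""
    # The SDK reads the token from the environment, so fail early if it is missing.
    if not os.environ.get("CLARIFAI_PAT"):
        raise RuntimeError("Set the CLARIFAI_PAT environment variable first.")

    # Imported lazily so this file still loads where the SDK is not installed.
    from clarifai.client.model import Model  # pip install clarifai

    prediction = Model(model_url).predict_by_url(image_url, input_type="image")
    # The generated text is assumed to sit in the first output's raw text field.
    return prediction.outputs[0].data.text.raw
```

Calling `run_nougat(...)` with the URL of a publicly reachable page image should return the page content as lightweight markup, which you can print or write to a file.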
Running Nougat model with Javascript
You can also run it with our JavaScript client.
You can also run Nougat using other Clarifai client libraries such as Java, cURL, NodeJS, and PHP.
Model Demo in the Clarifai Platform:
Try out the Nougat model here: https://clarifai.com/facebook/nougat/models/nougat-base
Best Use Cases
The Nougat model has a wide range of applications in document understanding and extraction. Some key use cases include:
- Research Paper Parsing: Nougat can accurately parse research papers, extracting text, tables, figures, and equations from document images. This capability is crucial for making the knowledge in research papers more accessible for various applications.
- Data Extraction: The model's ability to convert document images into structured text makes it useful for extracting valuable data from academic papers, which can be used for research, analysis, and data-driven decision-making.
- Summarization: Nougat can be integrated into text summarization pipelines to extract and summarize the content of research papers automatically, saving time and effort for researchers.