Extract textual content from images
Workflow steps (13)
2 Define the characteristics of the image
3 Survey existing experiences
4 Choose engine based on the type of the content
5 Layout analysis
6 Create manual transcriptions
7 Training the model
8 Test on a subset and assess quality
9 Correct output
10 Re-train the model with corrected output
11 Produce OCR output in standardized format
12 Extract the structure information from recognized blocks
13 A few visualization options
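
As an illustration of step 8 (test on a subset and assess quality), one common way to score OCR output against the manual transcriptions from step 6 is the character error rate (CER): the edit distance between the reference and the recognized text, divided by the reference length. The workflow itself does not name a metric, so the choice of CER is an assumption here. The function names and sample strings below are hypothetical, and the sketch uses only the Python standard library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / length of the reference transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical sample: a manual transcription and a noisy OCR result.
    ground_truth = "Extract textual content"
    ocr_output = "Extract textua1 c0ntent"
    print(f"CER: {character_error_rate(ground_truth, ocr_output):.3f}")
```

A lower CER on the test subset suggests the model is ready for wider use. A high CER points back to steps 9 and 10 (correct the output, then re-train).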
The SSH Open Marketplace is maintained and will be further developed by three European Research Infrastructures - DARIAH, CLARIN and CESSDA - and their national partners. It was developed as part of the "Social Sciences and Humanities Open Cloud" (SSHOC) project, funded under the European Union's Horizon 2020 call H2020-INFRAEOSC-04-2018, grant agreement #823782.