Document extract

This is a LayoutLMv2 base model.

To use this model, you must first preprocess your data with LayoutLMv2Processor.
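As a rough illustration of that preprocessing, the sketch below encodes one image with LayoutLMv2Processor and runs token classification. It rests on several assumptions that are not stated in this card: the processor is loaded from `microsoft/layoutlmv2-base-uncased` with the `no_ocr` revision (because words and boxes come from an external OCR service), the model has a token classification head, `model_id` is a placeholder for this model's repository name, and `torch`, `transformers`, `Pillow`, and `detectron2` are installed. The helper `first_token_labels` is a hypothetical name introduced here.

```python
def first_token_labels(word_ids, token_labels):
    """Keep one label per word: the label predicted for its first sub-token.

    word_ids holds the word index of each token, or None for special tokens.
    """
    labels, seen = [], set()
    for word_id, label in zip(word_ids, token_labels):
        if word_id is None or word_id in seen:
            continue
        seen.add(word_id)
        labels.append(label)
    return labels


def predict_word_labels(image_path, words, boxes, model_id):
    """Encode one document and return a predicted label for each word.

    words: list of strings from OCR.
    boxes: one [x0, y0, x1, y1] box per word, already scaled to 0-1000.
    model_id: placeholder for this model's repository name (assumption).
    """
    # Heavy dependencies are imported lazily so the helper above stays usable alone.
    import torch
    from PIL import Image
    from transformers import LayoutLMv2ForTokenClassification, LayoutLMv2Processor

    # Assumption: the base processor without built-in OCR matches this model.
    processor = LayoutLMv2Processor.from_pretrained(
        "microsoft/layoutlmv2-base-uncased", revision="no_ocr"
    )
    model = LayoutLMv2ForTokenClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    encoding = processor(
        image, words, boxes=boxes, truncation=True, return_tensors="pt"
    )

    with torch.no_grad():
        logits = model(**encoding).logits

    token_ids = logits.argmax(-1).squeeze(0).tolist()
    token_labels = [model.config.id2label[i] for i in token_ids]
    return first_token_labels(encoding.word_ids(0), token_labels)
```

The model predicts one label per sub-token, so the helper collapses those predictions back to one label per OCR word.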

Process

  1. I trained this model on Korean-language invoice document images.
  2. I used the Naver Clova service to extract text from the images.
  3. I assigned a label (target) to each text box.
  4. I combined the image text, bounding box positions, and labels.
  5. I encoded the combined data with LayoutLMv2Processor.
  6. I ran prediction on the encoded data with this model.
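Step 4 could look roughly like the sketch below. LayoutLMv2 expects bounding boxes scaled to a 0-1000 range relative to the page size, so the pixel boxes from OCR are normalized before they are combined with the text and labels. The input field names (`text`, `box`, `label`), the label set, and the function names are hypothetical, chosen for illustration. They are not the actual Naver Clova output format or the labels used for this model.

```python
def normalize_box(box, width, height):
    """Scale a pixel box [x0, y0, x1, y1] to the 0-1000 range LayoutLMv2 expects."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    # Clamp in case an OCR box slightly exceeds the image bounds.
    return [min(1000, max(0, value)) for value in scaled]


def build_sample(ocr_items, width, height, label2id):
    """Combine text, normalized boxes, and label ids for one document image."""
    words, boxes, word_labels = [], [], []
    for item in ocr_items:
        words.append(item["text"])
        boxes.append(normalize_box(item["box"], width, height))
        word_labels.append(label2id[item["label"]])
    return {"words": words, "boxes": boxes, "word_labels": word_labels}


# Hypothetical example: two text boxes on a 1000 x 500 pixel invoice image.
label2id = {"O": 0, "TOTAL": 1}
ocr_items = [
    {"text": "합계", "box": [100, 50, 300, 100], "label": "O"},
    {"text": "10,000", "box": [400, 50, 600, 100], "label": "TOTAL"},
]
sample = build_sample(ocr_items, width=1000, height=500, label2id=label2id)
```

The resulting `words`, `boxes`, and `word_labels` lists line up with the arguments of the same names that LayoutLMv2Processor accepts alongside the image in step 5.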