380 changes: 190 additions & 190 deletions artifacts/validation/OCR/bench-ocr.json

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions artifacts/validation/OCR/summary-ocr.md
@@ -2,20 +2,20 @@

## Global
| scope | CER | Token-F1 | line_F1 | n_files |
-| Global | 0.2364 | 0.7035 | 0.3878 | 24 |
+| Global | 0.2447 | 0.6954 | 0.3794 | 24 |

## By dataset
| scope | CER | Token-F1 | line_F1 | n_files |
| FUNSD | 0.0148 | 1.0000 | 1.0000 | 4 |
-| SROIE2019 | 0.0631 | 0.8816 | 0.7352 | 4 |
-| ICDAR | 0.7248 | 0.3618 | 0.0000 | 4 |
+| ICDAR | 0.7281 | 0.3611 | 0.0000 | 4 |
| PUBTABLES | 0.2689 | 0.5715 | 0.0899 | 8 |
-| MARMOT | 0.0778 | 0.8346 | 0.4114 | 4 |
+| MARMOT | 0.0807 | 0.8319 | 0.4025 | 4 |
+| SROIE2019 | 0.1066 | 0.8367 | 0.6941 | 4 |

## Top-5 worst files
| dataset/file | cer_char | token_f1 | line_f1 | note |
| ICDAR/cTDaR_t00015 | 0.7768 | 0.3080 | 0.0000 | |
-| ICDAR/cTDaR_t00080 | 0.7516 | 0.3719 | 0.0000 | |
+| ICDAR/cTDaR_t00080 | 0.7648 | 0.3690 | 0.0000 | |
| ICDAR/cTDaR_t00016 | 0.7078 | 0.4268 | 0.0000 | |
| PUBTABLES/PMC1064082_table_0 | 0.6912 | 0.2000 | 0.0000 | |
| ICDAR/cTDaR_t00014 | 0.6631 | 0.3406 | 0.0000 | |
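
Reviewer note: the Global row appears to be the file-weighted mean of the per-dataset rows. The sketch below checks that against the numbers in this summary. It assumes that the later row of each duplicated pair is the post-change value (removed lines precede added lines in a diff) and that the weighting is by `n_files`. Neither assumption is confirmed by the benchmark code, which is not shown here, and the helper name `weighted_global` is hypothetical.

```python
# Post-change per-dataset rows copied from summary-ocr.md:
# (CER, Token-F1, line_F1, n_files).
# Assumption: the later row of each duplicated pair is the post-change value.
rows = {
    "FUNSD": (0.0148, 1.0000, 1.0000, 4),
    "SROIE2019": (0.1066, 0.8367, 0.6941, 4),
    "ICDAR": (0.7281, 0.3611, 0.0000, 4),
    "PUBTABLES": (0.2689, 0.5715, 0.0899, 8),
    "MARMOT": (0.0807, 0.8319, 0.4025, 4),
}


def weighted_global(rows):
    """Return ((CER, Token-F1, line_F1), total_files), weighting each dataset by n_files.

    Hypothetical helper: the real aggregation in the benchmark is not shown in this diff.
    """
    total_files = sum(r[3] for r in rows.values())
    means = tuple(
        sum(r[i] * r[3] for r in rows.values()) / total_files
        for i in range(3)
    )
    return means, total_files


(cer, token_f1, line_f1), n_files = weighted_global(rows)
print(f"CER={cer:.4f} Token-F1={token_f1:.4f} line_F1={line_f1:.4f} n_files={n_files}")
```

Under these assumptions the result agrees with the post-change Global row (0.2447, 0.6954, 0.3794, 24) to within rounding of the four-decimal inputs, which suggests the Global regression is driven mainly by the SROIE2019 change.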
98 changes: 50 additions & 48 deletions dataset/validation/_ocr/markitdownnet/ICDAR/cTDaR_t00080.txt
@@ -1,50 +1,52 @@
Ba otr alorrire, mn Cathe L ares ; Fg 4 es Page Le -
LE as 3 SORE ST Sera Syprr stack aite Heya,
= oe ee 9 iy A ah ; , 2A:
fy |Peoee 2 7 : : rinse a
Too i en = base GLE, : /
pth Aye — ssf
eae ON I % ia. eee
nfs] sp ak f- OY te OO, ee MPEP — —— a see.
ye WO en |e eaten | faa! ae 1 Tie
om GA f i oe ae ) i ' Bas 7
AU | v AYN tL LIP ru oY 2 spl 2m g 7 pe Sofi Hob ’
hod} mins i/ Bf. V4 Gdns ie M Al ff // :
# | Lietigtia a. é wr di iS Luh if / J <7} . ff vy 4 a
five | Lu Tap, Lom ling Mlseelors thn ym ) Faas H
na PEW tan | Peep lil, » eae
Te os > 44 ref 2 » WF Fe ile ae ; ( ots e A q i
0 seal Dp on Pkeanaby Sai. |
5 eat ey IE ADR | WY, 24 . @
: : [= | * a . oe BE bd sar jee v pod A a
be Xh > f Oy r VLnMNALBY wr REEL. “le a
| Gogh. Wy Pre! AT Ae anv pam
ay 4 prov. heb a Yow kath s ge fv br : 7 U/ zs : ;,
cl Lv" | Aix wher. _ 4 sey of ) | | —
oes Pi 8 de oe ee —— 4a
Vig ee by’ 4 pages ES oat | iB SS : :
tc 9 9t AS Z it padlox voa- fire tin soe a Wy eh
TEBE : eae a |
ee 4 a re hy is Z bp if A. Di i
Fees at = ade Let ooh flor ft |B ae
f { f j Jie f- . e 4
ipcion aig eter | ae ae . £L,. ey
4 sub. by. “LYS Px : Lt 7 : : ae : i
re ae eee) >,
| ee acs Pflomnn heed yt nto ae i — - a ‘4
ut 6 Us 4 Z Py 4 3 ef
YP hi oy , bey sit a “Ay s6Uy Sack ase: & a |
eae ter flay ay 2be / A
De nh ; 90 —— ! a
fa Hafli Rabat Bint SyhegYouyinem Be isto? @
4 SES eo, ale 2 bm «fe 7 ee, .. i
Si! eee Se ee
cig 77zZ Cache Ja 2a rocaala Igy ie hae by > A387. ;
5 ayy : al > 7 < .

ff : Ree.
b2 1G. a Aid v7 = Me
i eithe Aero ee | oe
if La 3° VA , of " : J OH 3 Rese
fe te pee [tig | ~ a Aha AP IF7 : y PLD) sa cf y ’ ¥ Bo gh Z P PES.
| spilt Bisping nt) | oe
Za ak ae Fas 4 FA EB “ age Lb, : ane ee oe =e
soy iene a Ny AOP i
Tuy é eee | ope Ke . SiN aay oy i
Tie 0 ‘ one s Ai : Pua¢t7sz2t , * g thy ad —<— FF
i. es 4 fi : BPCLLS q
rin Pose f | . 7 5 ce eee ome oe
Pov". aunaber ss A Gls Moa pi pice TN CHa fanf Vit. ee > oe
Uhm Bry. eas On / 7, é : ry Ss oe haisdd 4 , f SU rine Gy §
J Wi | v ALY sess ALAS 14 esasdit act gay, nl $f i} Y) fe ‘ gp | : ;
{wd | eh Es / il ping “4 VA a rarer / h| '/ é /, 5 i f
mas TiO» Ie Ce. , 7 — ‘it —~Mlawbei. than yr ) a i '
its pil csobetip bs, é par Arig Balbo 4’ | VG Baa) |
tay { ene —— A Serssta ot ppg ya a ae i - VA y &* ‘.
pvr. / 7 Urng~ if ay i
eas By oe TE PW GY yo .
if lon MAMA PAO. WA ‘ t, 24 fr a
Fie Porshe y gilyorud~ a.
IG BN eee eee He
bicNh > (Og {$+ W aap haee . A 1 =
LW ~ | 8: sess / bi Ba Mbt Dorn a! } “6 . Ts EBE AS Py dt i
/; | PO 7 wh o/s oe f fh) Pf, y 7 } fh i |
66 Bon i, Whine eg) prey A Tle
Us Sane io 0 a Sa ne ce a =
ee PANG 4 iil ee ee
fee on ULROS . | tind. j i oe Hi 2 |
fee 9h a LN testy p “2 ae, . oA st ieee en q
la Ii f ees gh ost a : (Poor 4 tl ;
be 4 sg i) * y he. : s - ae

poe col (eee xe ee Cy ag x ae es
EH bef. rare fbdais fF? he > an
eee LOO ff i|
F on Mason | Ghar ghee Wn gle | | Diy og I

pit Tied pele ue isto
pas Mave PT I pat on~ Gi BE a |
er ee Bee ree 9, Mg a
Feet ne AL. fed G8 fo

oe af 4
{ ne AF, ae : : E |
$= Gollah Phase) Slaw 8) F, 7 : |
tls pe §> Cof Oh 9.8 77 fj Vis A Pr ae ‘i be oe x
fe gs 1 aye ae? a 4 / ‘ © 4 : A (7 ; Ps of 58 are ; Sy 1
be, qd. We Hla i” by (Oe ) ike
: : th cnl GRA Pe pe «ft {/. » See
MS yg Set OS ae) a a
D: Ritcarlefer fod en MR ne
* hs gee flor tae YU Cc ae eens MORSE 5 Nope eg
Eee See co ee ae + A eee Se eee Se eee ee Sei i
16 changes: 8 additions & 8 deletions dataset/validation/_ocr/markitdownnet/MARMOT/10.1.1.1.2006_3.txt
@@ -1,17 +1,17 @@
09 these consume 0,99 CPU. That is, task3- misses
deadlines and the inverted pendulum! falls down, This
an explains the fact that the cost function in Figure 2 goes
a isaaise deadlines and the inverted pendulum! falls down, This
og EF c explains the fact thatthe cost function in Figure 2 goes
. to infinite. Note that under RM, the task is not
Bod schedulable
a. + EDP: EDF is « dynamic alorihm in open bop
Which assigns priorities to tasks according to their

i + EDP: EDF is « dynamic alorihm in open bop
§ Which assigns priorities to tasks according to their
i, Beavis dedine, Une BG’ holler te
schedulabiity condition is given by U = 1. For our
wg simulations, since U <1, the task set is schedulable
4 simulations, since U <1, the task set is schedulable
and the three pendulums can be controlled as it can be
A seen in Figure 2: the accumulated cost reaches a finite
a a oe a! value, which means that the deviation caused by each
ii Ail sal aed perturbation thet affected each of the three pendulums
a a oe i value, which means that the deviation caused by each
sig cic al ad perturbation thet affected each of the three pendulums
could be adequately comected. ‘The performance
with the difference in performance due 10 the we of sehieved by EDF is also given in Figure 2 in tems of
different scheduling polices, which is the objective of our tie coat BocHonsreadting a valus oF W2754 at tho
27 changes: 14 additions & 13 deletions dataset/validation/_ocr/markitdownnet/SROIE2019/X00016469670.txt
@@ -1,37 +1,38 @@
tan chay yee
ay y

88 COPY **#
*** COPY ***
OJC MARKETING SDN BHD
ROC NO: 538358-H
NO 2 & 4, JALAN BAYU 4,
BANDAR SERI ALAM,
81750 MASAI, JOHOR
Tel:07-388 2218 Fax:07-388 8218
Email: ng@ojegroup.com
Email: ng@ojcgroup.com

TAX INVOICE

"jfvelee No” peGAOSoIEs
Date + 15/01/2019 11:05:16 AM
“Ynvoice No: PEGIV-1030765..~~CS~S~S
Date : 15/01/2019 11:05:16 AM
Cashier : NG CHUAN MIN
Sales Persor : FATIN
Bill To : THE PEAK QUARRY WORKS

Address bs
Address es

Description. Oty. Price Amount
000000111 1 193,00 193.00 SR
_Description Qty. Price Amount
000000111 1 193.00 193.00 SR

KINGS SAFETY SHOES KWD 805

OT Fetal Beclide tT T9560
@ty:1 ‘Total Exclude GST: ~—«193.00
Total GST @6%: 0.00
Total Inclusive GST: 193.00
sippeaciaa Round Amtt 550.00...
sige ROUNE AMES 8-00 ..
TOTAL: 193,00

9000000000004 31.8
Approval Code:000 VJ a
camcacuecniae (Te
100000000004 3:18 (iu.
Approval Code:000 4 a)

Goods Sold Are Not Returnable & Refundable
***4Thank You. Please Come Again.****
*+*+Thank You. Please Come Again.****
21 changes: 11 additions & 10 deletions dataset/validation/_ocr/markitdownnet/SROIE2019/X00016469671.txt
@@ -11,23 +11,24 @@ Email: ng@ojegroup.com

Cash Bill

invoice No" PEGI 03085
“Tavoice No”: PEGIV-1030531 =~:
Date 02/01/2019 2:47:14 PM
Cashier : RHYS TAN
Sales Persor : FATIN

2 Deseription | Qty. Price Amount
000000111 1 170,00 = 170.00
KINGS SAFETY ‘POO4
SHOES KWD 805
eOROGR, RY Fe Arne
000000111 1 170,00 170.00
KINGS SAFETY Pood
SHOES KWO 805

“Gey T Total Tem Biscount! G60
Qty’ 1 Total Item Discount! =i.”
Total Amount: 170.00
cecseeseneeeeen ROUNd Amt. 0.00
slags causes OO OD
TOTAL: 170.00
VISA CARD 170.00
2090000000043 18
Approval Code:123 I30 U0

XXX KK KKK KK4 318
Approval Code: 123 ey) U0

Goods Sold Are Not Returnable & Refundable
Thank You. Please Come Again.****
****Thank You. Please Come Again.****
@@ -1,7 +1,7 @@
PETRON BKT LANJAN SB
ALSERKAM ENTERPRISE
Tel: 03-6156 8757 Co No: 001083069-M
KM 458.4 BKT LANJAN UTARA,
KM 456.4 BKT LANJAN UTARA,
L/RAYA UTARA SELATAN,SG BULOH
47000 SUNGAI BUL

@@ -32,4 +32,4 @@ Use 3000 Petron Miles
points to pay for
RM45 Fuel

* Boo bia aes H
* F eee H