Demo video here: https://share.extend.ai/kRmSGKRF
When we started, we tried every file viewer and document component library we could find. Unfortunately, none of them had all the functionality (and polish) that we wanted, so we ended up building our own for https://extend.ai/. It was only ever meant to be internal, but enough customers kept asking for it that we decided to open source it.
It's useful for building document processing agents, real-time, user-facing document intake flows, and all kinds of internal tooling.
We naively thought this would be a solved problem. It turns out that making PDF/XLSX/DOCX viewers that work at scale is not trivial. We use and maintain it for Extend ourselves, so we've fixed a lot of edge cases that came up while running millions of pages per day through our own system. Our hope is that with our resources and community support, it'll keep getting better over time.

Discussion (16 Comments)
Thanks for releasing publicly.
[1] https://github.com/J-F-Liu/lopdf
[2] https://github.com/d0rianb/rtf-parser
[3] https://github.com/tafia/calamine
we hope this can be useful for people building in React though!
By a quirk of fate I've spent the past 2 days prototyping some stuff on pdfjs, just trying to figure out a game plan for handling bounding boxes in the face of page zooming, different resolutions, etc. I can't see it mentioned whether the components are virtualising pages (as in reusing DOM elements as document pages scroll by). I guess I just learned what I'll be exploring tomorrow then...
the zoom should work with the bounding box highlights, and we're working on adding rotation support
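For anyone working through the same zoom problem, one common approach (a minimal sketch, not necessarily how these components or pdfjs handle it) is to store each bounding box as fractions of the page size and convert to pixels only at render time, so the same box stays valid at any zoom level or resolution. The names `NormalizedBox`, `PixelRect`, and `toPixelRect` are hypothetical, and the sketch assumes a top-left origin and no page rotation.

```typescript
// Hypothetical types for illustration; not the library's actual API.
interface NormalizedBox {
  // Fractions of the page width/height in [0, 1], origin at the top-left.
  left: number;
  top: number;
  width: number;
  height: number;
}

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Convert a normalized box to CSS pixels for a page rendered at `zoom`.
// `pageWidth` and `pageHeight` are the page's unscaled size in CSS pixels.
// Assumption: the page is not rotated.
function toPixelRect(
  box: NormalizedBox,
  pageWidth: number,
  pageHeight: number,
  zoom: number
): PixelRect {
  const renderedWidth = pageWidth * zoom;
  const renderedHeight = pageHeight * zoom;
  return {
    left: box.left * renderedWidth,
    top: box.top * renderedHeight,
    width: box.width * renderedWidth,
    height: box.height * renderedHeight,
  };
}
```

Because the stored box never changes, a zoom change only requires recomputing the pixel rectangle for the pages currently on screen.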
i can't promise it's visually 1:1 with Word/Excel but it's pretty close on the corpus we tested with
could not have been easy
On mobile Safari…