File handling

A project log for PDF Merge

Overlay synchronized spreadsheet calculations on PDF forms

lion mclionhead • 10/17/2023 at 07:33 • 0 Comments

The immediate problem was a lot of goofy restrictions on local file access from inside the browser.  It's hard to believe a system with so many fabricated rules is the standard for applications.


PDF.js famously can't read PDF files from a hard-coded local filename.  There are some diabolical workarounds on the internets.

Another problem is full paths to local files aren't available in the browser.  There's no way a project file could reference a local PDF file with an absolute path or have a list of recently opened files.

The standard solution is to have the server maintain a virtual filesystem of its own.  POST a PDF file to the server & have PDF.js then read the copy on the server.  POST & GET project files on the server.  The catch is that PDFs & project files can no longer be stored in convenient locations on the hard drive.
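A minimal sketch of that POST & GET round trip, assuming a Node.js server (the log doesn't say what the server is written in). The in-memory `files` map, the `handle` function & the port number are all hypothetical.

```javascript
const http = require('http');

// In-memory stand-in for the server's virtual filesystem: URL path -> bytes.
const files = new Map();

function handle(req, res) {
  if (req.method === 'POST') {
    // Collect the uploaded body & store it under the request path.
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      files.set(req.url, Buffer.concat(chunks));
      res.writeHead(204);
      res.end();
    });
  } else if (req.method === 'GET' && files.has(req.url)) {
    // Serve the stored copy back, e.g. for PDF.js to read.
    res.writeHead(200, { 'Content-Type': 'application/octet-stream' });
    res.end(files.get(req.url));
  } else {
    res.writeHead(404);
    res.end();
  }
}

const server = http.createServer(handle);
// server.listen(8080);  // 8080 is an arbitrary example port
```

With the server listening, the browser could upload with `fetch('/LM124.pdf', { method: 'POST', body: file })` & PDF.js could then read `/LM124.pdf` as an ordinary URL.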


The quick & dirty way is passing a PDF file & project file to the server command line.  This requires running different servers on different ports to edit multiple PDF files.  Ideally 1 server would support multiple browsers, but it's not unthinkable to run a different server for every file & have the server automatically pick an unused port.  It would be minimal.  It would be a pain to have to remember which PDF file goes with each project though.
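Picking an unused port automatically is simple if the server happens to be Node.js (an assumption, as is the `startServer` name): listening on port 0 asks the operating system for any free port.

```javascript
const http = require('http');

// Hypothetical helper: start one server per file on an OS-assigned port.
function startServer(handler) {
  return new Promise((resolve, reject) => {
    const server = http.createServer(handler);
    server.once('error', reject);
    // Port 0 means "any unused port"; the real number is known after listen.
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}
```

After `startServer(handler)` resolves, `server.address().port` holds the port that still has to be given to the browser somehow.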

There's a way to get PDF.js to read a byte array.  ChatGPT says replace pdfjsLib.getDocument('LM124.pdf') with pdfjsLib.getDocument({ data: pdfData }).  If the javascript loaded the project file from the file dialog & then extracted the PDF from it, the problem would be restoring its state after the confirm dialog, given the different starting URLs.
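A sketch of the byte array route, assuming the project file is JSON with the PDF embedded as base64 in a hypothetical `pdf` field, & that `pdfjsLib` is the global provided by the PDF.js script. The helper names are made up.

```javascript
// Decode a base64 string into the Uint8Array that PDF.js accepts as `data`.
function base64ToBytes(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical: open the PDF embedded in a parsed project object.
// getDocument() returns a loading task; its .promise resolves to the document.
function openEmbeddedPdf(project) {
  return pdfjsLib.getDocument({ data: base64ToBytes(project.pdf) }).promise;
}
```

A file picked in the file dialog could skip the base64 step: `new Uint8Array(await file.arrayBuffer())` gives the same kind of array.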

More likely, the project file will be passed to the server command.  Then the browser would POST a PDF file to copy to the project file on the server & GET the text from the server.  The program state would be stored on the server.  Cookies would cause multiple browser windows with different projects to share the same state.  Confirmed it by loading ultramap in multiple browsers.  The problem is having to provide the right port number to the browser.

The project file would thus need the raw PDF, the text entries, the undo entries, & the current state of the browser.  The server would be all the storage while the browser would be the view controller.
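One possible shape for such a project file, sketched as JSON built on a Node.js server. Every field name here is an assumption; the log only lists what has to be stored.

```javascript
// Hypothetical project layout holding the 4 things listed above.
function newProject(pdfBytes) {
  return {
    pdf: Buffer.from(pdfBytes).toString('base64'), // raw PDF, base64 so it fits in JSON
    entries: [],                                   // text entries overlaid on the form
    undo: [],                                      // undo entries
    view: { page: 1, zoom: 1, scrollX: 0, scrollY: 0 } // current state of the browser
  };
}

// The server writes & reads the whole project as 1 JSON string.
function saveProject(project) {
  return JSON.stringify(project);
}

function loadProject(text) {
  return JSON.parse(text);
}
```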

Maintaining a 2nd copy of the PDF in a server project file ended up so dreadful that it ended up just requiring an absolute path in a textbox, which the project file could reference.  It might be better to make the path relative to the project file location or just store the PDF as a bunch of PNGs on the server.  This would create either a bunch of PNG files or a project file nightmare.  The server still needs to have the latest state of the project file for page reloads.  There can't be a manual save button.
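The relative path idea could look like this on a Node.js server (an assumption, as is the `resolvePdfPath` name). A relative path is resolved against the project file's directory, while an absolute path from the textbox passes through unchanged.

```javascript
const path = require('path');

// Hypothetical helper: find the PDF a project file refers to.
function resolvePdfPath(projectFile, pdfPath) {
  // path.resolve() ignores the base directory when pdfPath is already absolute.
  return path.resolve(path.dirname(projectFile), pdfPath);
}
```

This keeps a project & its PDF movable as a pair, as long as they stay in the same relative arrangement.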

