
How reliable are the processes which these things run?

I'm processing thousands of files using Copilot, and even at 20 at a time it usually skips a couple. Sometimes, when skipping, it merges the data from one file into the next without applying anything to the second file; other times it applies the data parsed from one file entirely to the second. That's not a big deal since I'm reviewing each operation manually, but the only reason the error rate is acceptable is that the files are so inconsistent that normal techniques weren't working.

Is there an equivalent to "double-keying" where two different LLMs process the same input and it only moves forward if both match perfectly?
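A rough sketch of what that double-keying could look like in Python, assuming each model can be wrapped in a function that takes one file name and returns the extracted ID as a string. The names `double_key`, `extract_a`, and `extract_b` are hypothetical placeholders, not part of any real Copilot or LLM API. Calling each extractor on one file at a time also sidesteps the skip-and-merge problem seen with batches of 20.

```python
from typing import Callable, Dict, List, Tuple


def normalize(value: str) -> str:
    """Strip all whitespace and uppercase so trivial formatting differences don't count as mismatches."""
    return "".join(value.split()).upper()


def double_key(
    files: List[str],
    extract_a: Callable[[str], str],  # hypothetical wrapper around the first model
    extract_b: Callable[[str], str],  # hypothetical wrapper around the second model
) -> Tuple[Dict[str, str], List[str]]:
    """Accept a value only when both extractors agree; queue everything else for manual review."""
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for name in files:
        # One file per call, so data from one file cannot leak into the next.
        a = normalize(extract_a(name))
        b = normalize(extract_b(name))
        if a and a == b:
            accepted[name] = a
        else:
            # Either the answers differ or both came back empty.
            needs_review.append(name)
    return accepted, needs_review
```

Agreement between two models lowers the error rate but does not prove correctness, since both can misread the same bad handwriting the same way, so a manual spot check of the accepted set is still worth keeping.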





You should probably ask the AI to write a script to do the task. Any procedure that needs to be perfect should be done by writing deterministic code.

The task is renaming scans of checks (with the barcode/bank account info obscured by a pen). The checks are of varying sizes, placement is not exact, the placement and formatting of the information on the check is essentially random, and many of them are (poorly) handwritten.

The LLM is working well enough for my needs (and I'm using a locked-down computer on which installing or running development environments and scripts is awkward). It's a marked improvement over the previous technique: opening 50 files at a time, noting the Invoice ID, closing the file, typing the Invoice ID as a name, then quitting Adobe Acrobat and re-launching it for the next 50 (if that was not done, Acrobat would eventually reach a state where it would close a file and, despite the name having been typed, not save it), then using a .bat file made using concatenation in an Excel column.

It would be nice if it were perfect, but each check has to be manually entered, and the filename updated to match the entry by hand.


Test how much data from a file they actually see; it's not unlimited. Build a deterministic output, e.g. "2 of 5 chunks received, send chunk 3".
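One possible reading of that suggestion, sketched in Python: split the input into fixed-size chunks and keep a ledger that only accepts chunks in order and always reports the same status line. `split_chunks`, `ChunkLedger`, and the chunk size are assumptions made for illustration; the comment above does not specify a protocol.

```python
from typing import List


def split_chunks(text: str, size: int) -> List[str]:
    """Cut text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


class ChunkLedger:
    """Tracks which chunk is expected next and produces a deterministic status line."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.received = 0

    def acknowledge(self, index: int) -> str:
        """Record chunk `index` (1-based); refuse anything out of order."""
        if index != self.received + 1:
            raise ValueError(f"expected chunk {self.received + 1}, got {index}")
        self.received = index
        return self.status()

    def status(self) -> str:
        if self.received == self.total:
            return f"{self.received} of {self.total} chunks received, done"
        return (
            f"{self.received} of {self.total} chunks received, "
            f"send chunk {self.received + 1}"
        )
```

Asking the model to echo this status line after each chunk makes a skipped or truncated chunk visible immediately, because the counts stop matching.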


