Kubro(TM) in Action: Document Explorer and Large PDF Files
Public disclosure documents such as annual reports from pension funds, SEC filings, or UK Companies House documents can be a source of valuable market intelligence.
But monitoring and collecting 200-page PDFs systematically and efficiently, filtering them for relevance, extracting the relevant information, and keeping the output friendly to a human analyst takes much more than a simple LLM call.
That is the type of orchestration we have been building at Robotic Online Intelligence (ROI) on the Kubro(TM) platform, in one of the modules, which we call Document Explorer.
When there is a signal that a new document is out, the system automatically collects the document, stores it, converts it to text/markdown, and passes it through a particular workflow.
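The collect, store, convert, and workflow sequence described above can be sketched roughly as follows. This is only an illustrative outline, not the Kubro(TM) implementation: the names `Document` and `process_new_document`, and the idea of passing each stage in as a callable, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Document:
    """Hypothetical record for one collected document."""
    url: str
    raw: bytes = b""
    text: str = ""
    steps: List[str] = field(default_factory=list)  # audit trail of completed stages


def process_new_document(
    url: str,
    fetch: Callable[[str], bytes],      # downloads the PDF (assumed helper)
    convert: Callable[[bytes], str],    # PDF bytes -> text/markdown (assumed helper)
    workflow: Callable[[str], Any],     # downstream analysis (assumed helper)
    store: Dict[str, bytes],            # stand-in for real document storage
) -> Tuple[Document, Any]:
    """Run the four stages in order and record each one as it completes."""
    doc = Document(url=url)

    doc.raw = fetch(url)
    doc.steps.append("collected")

    store[url] = doc.raw
    doc.steps.append("stored")

    doc.text = convert(doc.raw)
    doc.steps.append("converted")

    result = workflow(doc.text)
    doc.steps.append("processed")

    return doc, result
```

Passing the stages in as functions keeps the sketch independent of any particular PDF converter or LLM client, so each stage can be swapped or tested on its own.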
The screen recording below shows an example using CalPERS 2023-2024 and 2024-2025 annual documents.
In this case, we show a manual comparison of the disclosures for two consecutive years. The human analyst (this step can also be automated) checks how DigitalBridge, as an example, features in the CalPERS portfolio and how the related management and performance fees differ between 2024 and 2025, using LLMs with our RegexyRAG shortcut.
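The internals of RegexyRAG are not described here, so the following is only one plausible reading of the name: instead of embedding-based retrieval, a regular expression pulls out the lines that mention the entity of interest (plus a little surrounding context), and only those passages from each year are handed to the LLM. The function names `find_passages` and `build_prompt` and the `window` parameter are hypothetical.

```python
import re
from typing import Dict, List


def find_passages(text: str, pattern: str, window: int = 1) -> List[str]:
    """Return blocks of lines around every case-insensitive match of `pattern`.

    `window` is the number of context lines kept on each side of a match;
    overlapping or adjacent blocks are merged into one passage.
    """
    lines = text.splitlines()
    rx = re.compile(pattern, re.IGNORECASE)

    # Collect the indices of all lines worth keeping.
    keep = set()
    for i, line in enumerate(lines):
        if rx.search(line):
            keep.update(range(max(0, i - window), min(len(lines), i + window + 1)))

    # Group consecutive indices into passages.
    passages: List[str] = []
    current: List[str] = []
    prev = None
    for i in sorted(keep):
        if prev is not None and i != prev + 1:
            passages.append("\n".join(current))
            current = []
        current.append(lines[i])
        prev = i
    if current:
        passages.append("\n".join(current))
    return passages


def build_prompt(question: str, passages_by_year: Dict[str, List[str]]) -> str:
    """Assemble a compact comparison prompt from the per-year passages."""
    parts = [question]
    for year, passages in passages_by_year.items():
        parts.append(f"--- {year} ---")
        parts.extend(passages)
    return "\n".join(parts)
```

A pattern such as `r"digital\s*bridge"` would be applied to the converted text of each annual report, and the resulting prompt sent to whichever LLM is in use. The appeal of this kind of shortcut is that a 200-page document shrinks to a handful of lines before any model is called.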
Then comes human verification (where accuracy is not optional), tracing the key numbers back to the source documents.
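A simple aid for this tracing step, offered as a sketch rather than a description of how Kubro(TM) does it, is to flag every number in an LLM answer that does not appear in the source text. The helper names `normalize` and `trace_numbers` are hypothetical, and the check only points the reviewer at numbers that need a closer look; it does not replace the human verification.

```python
import re
from typing import Dict

# Integers or decimals, with optional thousands separators (e.g. 1,234.5).
NUMBER_RX = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def normalize(number: str) -> str:
    """Drop thousands separators so 1,234 and 1234 compare equal."""
    return number.replace(",", "")


def trace_numbers(answer: str, source: str) -> Dict[str, bool]:
    """Map each number found in `answer` to whether it occurs in `source`."""
    source_numbers = {normalize(n) for n in NUMBER_RX.findall(source)}
    return {n: normalize(n) in source_numbers for n in NUMBER_RX.findall(answer)}
```

Any number mapped to `False` is either a model error or a derived value (a difference or a percentage, for instance) and should be checked by hand against the original PDF.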
This particular example may be interesting in itself, as SoftBank announced on Dec 29 that it had entered into a definitive agreement to acquire DigitalBridge Group, Inc. for a total enterprise value of approximately USD 4 billion, another case in digital infrastructure investments.