
AI Engineer

Background
Mistakes in audits at public companies can cost them billions of dollars, which is why audits today can cost tens of millions of dollars and take hundreds of thousands of hours to complete.
We are building an AI junior auditor that can complete these audits in a fraction of the time and at a higher quality than human auditors achieve.
For multibillion-dollar incumbents like EY and KPMG, this is an obvious problem to solve, and they are investing hundreds of millions of dollars in it annually. Whoever wins this space can easily reach billions of dollars in revenue at 90%+ margins.
Currently, our lean team is outperforming the incumbents simply by being more talented and hardworking. When building reliable, scalable agentic systems, the quality of hires matters far more than the quantity.
That is where you come in. If you’re one of the most technical engineers you know, you thrive with autonomy over architectural and implementation decisions, and you like to be up to your neck in work, let’s talk. We have multiple Fortune 500 clients that can use your help.
Technical challenges

Agentic systems struggle to understand Excel workbooks with millions of cells and complicated layouts. How can we smartly represent multiple large data files so that agents can effectively query and traverse them to find information on the fly?
Given that there is minimal audit information in LLMs’ pre- and post-training data, responses vary widely from one LLM call to the next. What strategies can we employ to minimize this variability?
- Some ideas: regressing from LLM logits to historically correct answers, fine-tuning 500M-parameter models to act as verifiers, and sending 256 requests per step and smartly combining them into one answer.
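As a rough illustration of the last idea, the sketch below combines many sampled answers into one by plain majority vote and reports how strongly the samples agree. This is only one possible baseline, not a description of our system: the function names, the whitespace and case normalization, and the `min_agreement` threshold are all assumptions made for the example.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Canonicalize an answer so trivially different strings compare equal."""
    return " ".join(answer.strip().lower().split())


def combine_answers(answers: list[str], min_agreement: float = 0.5):
    """Majority vote over sampled answers (hypothetical helper).

    Returns (winner, agreement). winner is None when the top answer's share
    of the votes falls below min_agreement, signalling that the step should
    be escalated rather than trusted.
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    winner, votes = counts.most_common(1)[0]
    agreement = votes / len(answers)
    return (winner if agreement >= min_agreement else None), agreement
```

In practice the 256 responses for a step would be passed in as `answers`. A low agreement score is itself a useful signal that the step needs a verifier or a human look.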
For complicated audits, auditors have to try multiple ways of auditing information before they decide on the best approach. How can we mimic that in an agent? Can we beam search over multiple different approaches? How do we give agents visibility into the other approaches so that we can determine which approach is the “best”?
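To make the beam-search question concrete, here is a minimal sketch of beam search over candidate audit plans. Everything in it is an assumption made for illustration: a plan is modeled as a tuple of step names, `expand` is a hypothetical callback that proposes next steps (for example, an LLM call), and `score` is a hypothetical callback that rates a partial plan (for example, a verifier model).

```python
from typing import Callable, Sequence

Plan = tuple[str, ...]  # a plan is an ordered tuple of step names


def beam_search(
    expand: Callable[[Plan], Sequence[str]],
    score: Callable[[Plan], float],
    beam_width: int = 3,
    max_depth: int = 3,
) -> list[Plan]:
    """Keep the beam_width best plans at each depth, best first."""
    beam: list[Plan] = [()]
    for _ in range(max_depth):
        candidates: list[Plan] = []
        for plan in beam:
            steps = expand(plan)
            if steps:
                candidates.extend(plan + (step,) for step in steps)
            else:
                candidates.append(plan)  # finished plan: carry it forward
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return beam
```

Because the function returns every surviving plan and not just the top one, a downstream judge can see the competing approaches side by side, which is one simple way to give an agent visibility into the alternatives.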

Stellar applicants will

Have clear technical opinions on how to improve AI agent workflows based on literature and intuition.
Meticulously comb through data to isolate spots for model performance improvement.
Have the agency to build and deploy production-ready solutions to problems they notice, without oversight.
Work closely with the founding team, with the same level of hunger to win in an antiquated industry.

Tech Stack: React (TypeScript) | Python | Docker | AWS