
Senior Data Scientist, Model Engineering
TRM Labs
Job Description
As a Senior Data Scientist on the AI Engineering team, you will drive the development and deployment of sophisticated machine learning and LLM models that power TRM's AI Engineering platform. Working closely with engineers, data scientists, and research scientists, you will build cutting-edge AI systems that help analysts detect, prevent, and mitigate cryptocurrency fraud and financial crime. Your work will focus on creating intelligent agent systems and evaluation frameworks that enable seamless interaction with complex blockchain data. These tools will be consumed by prestigious entities such as former FBI, Secret Service, and other law enforcement agencies, providing them with unprecedented capabilities to investigate evolving criminal activities in the cryptocurrency space.

The impact you will have here:
- Design and deploy sophisticated ML/LLM models that address high-impact customer needs within TRM's AI Engineering platform
- Develop robust catalogs of evaluation sets and observability tools for ML/LLM models to ensure reliable performance in production environments
- Monitor for data drift, concept drift, and emerging failure modes in production AI systems serving critical law enforcement use cases
- Leverage embeddings, retrieval, and ranking methods to improve agent tool selection and enhance the product experience for analysts
- Build models and analyses that optimize orchestration of multi-agent workflows, enabling complex investigative tasks through natural language interactions
- Stay up to date on state-of-the-art model and system design, and readily implement modern architecture patterns to serve external users
- Translate vague infrastructure and product requirements into measurable, data-driven metrics and evaluation frameworks
- Collaborate with engineers and research scientists to design agentic systems that analyze complex blockchain data structures, cryptographic protocols, and transactional patterns
- Create evaluation frameworks that measure the effectiveness of AI agents in helping investigators navigate blockchain data and identify suspicious activities
- Establish best practices for LLM evaluation, prompt engineering, and AI system monitoring

What we're looking for:
- 4+ years of experience working in a data science role, with demonstrated expertise in machine learning and data science
- Strong fluency with Python and SQL, with proven experience developing code in team environments (git, notebooks, testing)
- Strong background in data science, experimental design, and hypothesis testing
- Experience evaluating and building LLMs, embeddings, knowledge graphs, and information retrieval systems
- Familiarity with ML/AI methods (NLP, RLHF, embeddings) to analyze and improve agent behaviors
- General familiarity with ML/AI Ops and production system monitoring
- Ability to analyze large, multi-modal datasets (text, logs, graphs, APIs) for actionable insights
- Strong ownership mentality and comfort with uncertainty, including the ability to translate ambiguous intelligence questions into measurable evaluation frameworks
- History of taking strong ownership over systems and delivering value at high velocity
- Curiosity and drive to get to the root cause of problems, with evidence of creativity and self-directed work in ambiguous spaces
- Building and delivering Agentic/LLM systems with active users would raise the bar (industry experience preferred, but hobby or open source contributions valued, given the emerging nature of the technology)

About the Team:
- The AI Engineering team operates with a high level of collaboration and interdependence, fostering an inclusive culture where connections extend beyond work. We value diverse personalities and seek someone who is both results-driven and sociable, with strong communication, leadership, and teamwork skills, along with a positive, solutions-oriented mindset.
- As a globally distributed team, members may work in different time zones. However, most of the team will overlap between the hours of 7am-12pm PST for meetings and collaboration.
- All team members, regardless of location, must have at least 4 hours of overlap with PST business hours.
- The team is involved in 2 different on-calls:
  - Analytics on-call: requires attention only during working hours (average of < 1 hour a day every month)
  - Data Science on-call: full week on-call to ensure Data Science pipelines are running smoothly (average of 1-2 hours a day every 3 months)

The following represents the expected range of compensation for this role:
- The estimated base salary range for this role is $170,000 - $195,000. Please note that the base range is applicable to the US and will be different depending on the country.
- Additionally, this role may be eligible to participate in TRM's equity plan.