Senior Data Scientist with LLM Application Experience
Job Description
About Catalytic Data Science (CDS):
Catalytic Data Science is a fast-growing SaaS company building cutting-edge, AI-driven solutions for regulatory affairs professionals shaping innovation in life sciences. Our engineering team leverages generative AI to extract insights from complex, unstructured data at scale. We believe in clean code, collaborative problem-solving, and a culture where engineers have a direct impact on meaningful products used by global life sciences organizations. Our customers are passionate about making the world a better place, and we are inspired by the opportunity to help them. If you are passionate about solving technical challenges that improve medical innovation and regulatory processes, you'll find your next home with us.
Who You Are:
You are an innovative data scientist and ML engineer eager to push the envelope in generative AI, NLP, and vector search. You're fluent in turning messy, real-world data into structured insights and production-grade models. You thrive in multidisciplinary teams, relish experimentation, and are driven to deliver solutions that have a meaningful impact on scientific and regulated domains.
What You Will Do:
- Develop, fine-tune, and evaluate LLM-based applications for information retrieval, Q&A, and summarization.
- Lead the experimentation and benchmarking of various LLMs (Llama, GPT, domain-adapted models).
- Design and evaluate Retrieval-Augmented Generation workflows using vector databases.
- Work closely with engineers to transition prototypes to production.
- Identify and mitigate bias, privacy, and compliance issues in AI outputs.
Qualifications:
- Ph.D. in systems biology, bioinformatics, computational biology, data science, ML, or an equivalent field.
- 5+ years in data science or NLP roles.
- Specialized expertise in LLMs, transfer learning, and vector-based retrieval methods.
- Strong Python and ML stack skills (PyTorch, Hugging Face, etc.).
- Experience developing solutions for regulated industries is a plus.
- Experience leveraging AI-powered coding assistants (e.g., GitHub Copilot, Copilot X, ChatGPT Code Interpreter, Amazon CodeWhisperer) to enhance productivity in day-to-day software development activities, including code generation, refactoring, and documentation.
- Familiarity with best practices for integrating AI coding assistants into team workflows while maintaining code quality, security, and regulatory compliance.
- Excellent communication and problem-solving skills.
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification document form upon hire.
Remote work