The Research Data Scientist participates in biomedical research projects, using programming, data mining, statistics, machine learning, and visualization techniques to develop, evaluate, and/or apply algorithms and software for data analysis.

Responsibilities include querying databases, processing data, applying supervised and unsupervised machine learning, deploying production models, and communicating scientific findings through peer-reviewed publications and scientific conferences. Writes clean, performant, reusable code managed on GitHub to perform repeatable analyses and to train and deploy models to multiple environments.

Primary Duties and Responsibilities:

•Assists with the development, evaluation, and/or application of computational and statistical methods including artificial intelligence and machine learning algorithms and software for the analysis of biomedical data.

•Assists with the presentation and communication of scientific results through laboratory meetings, scientific conferences, and peer-reviewed publications.

•Creates database-to-deployment pipelines for models using the necessary programming languages (primarily R, Python, SQL).

•Creates sustainable data science infrastructure and adheres to data analysis/machine learning best practices.

•Performs exploratory data analysis to gauge the need for or appropriateness of advanced analytical methods.

•Works with senior or lead data scientists and principal investigators to identify areas where data science can best be applied to answer biomedical research questions.

•Tests and validates code to ensure the robustness of data applications, using GitHub for version control.

•Performs all other duties as assigned.


Educational Requirements:

Bachelor's Degree in computer science, machine learning, applied mathematics, econometrics, statistics, engineering, physics, or related discipline required.

Master's Degree in computer science, machine learning, applied mathematics, econometrics, statistics, engineering, physics, or related discipline preferred.


2 years of professional experience in healthcare or pharmaceutical industries working with biomedical data.

2 years of experience programming at an intermediate skill level in a high-level programming language such as R or Python. College projects may be acceptable.

2 years of experience programming in SQL at a basic to intermediate proficiency level preferred.

Experience in biomedical machine learning is preferred.

Physical Demands:

Standing, Walking, Sitting, Lifting 50 lbs., Carrying 50 lbs., Pushing 50 lbs., Pulling 50 lbs., Reaching, Handling, Grasping, Feeling, Talking, Hearing, Repetitive Motions, Eye/Hand/Foot Coordination

