Cedars-Sinai Research Data Scientist - Gonzalez-Hernandez Lab - Computational Biomedicine in Los Angeles, California
Join us as we translate today's discoveries into tomorrow's medicine!
The Department of Computational Biomedicine (CBM) was established to develop, evaluate, and apply cutting-edge computational and statistical methods and software for the analysis of biomedical and clinical data across the Cedars-Sinai enterprise.
Are you ready to be a part of breakthrough research?
Dr. Graciela Gonzalez-Hernandez is a recognized expert and leader in natural language processing (NLP) applied to bioinformatics, medical/clinical informatics, and public-health informatics, with more than 100 publications in this area. The mission of the Gonzalez-Hernandez Lab is to improve healthcare delivery and outcomes, and public health monitoring and surveillance through innovations in automated language processing. The goals of the Gonzalez-Hernandez Lab include developing innovative NLP solutions for health-related language, creating data, resources, and tools for the global research community, and promoting research in health language processing through the organization of workshops and shared tasks, and through interdisciplinary collaborations.
The Research Data Scientist participates in biomedical research projects using programming, data mining, statistics, machine learning, natural language processing, and visualization techniques to develop, evaluate, and/or apply algorithms and software for health data analysis. Responsibilities include active communication with collaborators (clinicians, epidemiologists, and other domain experts) in the design and deployment of end-to-end scientific studies (qualitative or quantitative), including tasks such as querying data repositories, processing data, deploying supervised and unsupervised machine learning methods, and releasing and/or validating production models. The Research Data Scientist writes clean, performant, reusable code managed on GitHub to perform repeatable analyses and to train and deploy models to multiple environments, and communicates scientific findings via peer-reviewed publications and scientific conferences.
Primary Duties & Responsibilities:
Assists with the development, evaluation, and/or application of computational and statistical methods including artificial intelligence, machine learning and natural language processing algorithms and software for the analysis of biomedical data.
Assists with the presentation and communication of scientific results through laboratory meetings, scientific conferences, and peer-reviewed publications.
Creates database-to-deployment pipelines for models using the necessary programming languages (primarily R, Python, SQL).
Creates sustainable data science infrastructure and adheres to data analysis/machine learning best practices.
Performs exploratory data analysis to gauge the need for or appropriateness of advanced analytical methods.
Works with senior or lead data scientists and principal investigators to identify areas where data science can best be applied to answer biomedical research questions.
Tests and validates code to ensure robustness of data applications with version control through GitHub.
Performs all other duties as assigned.
Participates in the development of innovative algorithms and analytical methods.
Participates in the evaluation and interpretation of all analytical methods and results.
Participates in the communication of scientific results including publications.
Participates in analytical training activities for faculty, staff, and students.
Bachelor's degree in Computer Sciences, Machine Learning, Applied Mathematics, Econometrics, Statistics, Engineering, Physics, or related field, required. Master's degree, preferred.
Experience and Skills:
Two to five (2-5) years of professional experience in healthcare or pharmaceutical industries working with biomedical data.
Experience programming at an intermediate skill level with a high-level programming language such as Python. College projects may be acceptable.
Experience programming and deploying machine learning and natural language processing frameworks, such as those for regular expressions and classifiers, at a basic to intermediate proficiency level.
Experience writing scientific manuscripts. College projects may be acceptable.
Experience in biomedical machine learning is preferred.
Working knowledge of data privacy and security, including best practices for data containing protected health information (PHI) covered under HIPAA.
Strong interpersonal and communication skills, including scientific writing, and full command (verbal and written) of the English language.
Demonstrates commitment to customer service and an ability to meet the needs and expectations of patients and health care colleagues.
Demonstrated success working independently, forging relationships, and managing multiple tasks with minimal directions.
Ability to promote and foster participation/collaboration among individuals and groups.
Ability to handle multiple demands and/or manage complex and competing priorities.
Ability to analyze qualitative and quantitative information for research.
High level of proficiency using Microsoft Windows and Microsoft Office software: Excel, Outlook, PowerPoint, Word, etc.
Must be able to manage competing priorities, while being extremely adaptable and flexible and maintaining a positive work environment.
Working Title: Research Data Scientist - Gonzalez-Hernandez Lab - Computational Biomedicine
Department: Computational Biomedicine
Business Entity: Academic / Research
Job Category: Information Technology
Job Specialty: Business Intelligence/Reporting
Position Type: Full-time
Shift Length: 8 hour shift
Shift Type: Day
Cedars-Sinai is an EEO employer. Cedars-Sinai does not unlawfully discriminate on the basis of race, religion, color, national origin, citizenship, ancestry, physical or mental disability, legally protected medical condition (cancer-related or genetic characteristics or any genetic information), marital status, sex, gender, sexual orientation, gender identity, gender expression, pregnancy, age (40 or older), military and/or veteran status, or any other basis protected by federal or state law.