At Prolaio, our Software as a Medical Device (SaMD) relies on a robust data
pipeline to transform patient data into actionable insights. If you're a
passionate engineer with a keen interest in healthcare, you could play a
critical role in enhancing our system and working alongside our data
scientists.
Our current pipeline contains multiple steps:
* Parsing / cleaning
* Computing new metrics
* Reviewing metrics & QA
and leverages multiple technologies:
* Databases / Time Series / Data Lake
* SQL / Pandas / PySpark
* AI & ML
As part of the data-engineering team, you will be in charge of evolving our
current infrastructure while working closely with the data-science team, which
is in charge of the medical and research aspects.
### Key Responsibilities
**Optimize our Data/ML Pipeline**
* Collaborate closely with our product and data science teams to refine and expand our existing data processing system in AWS.
* Contribute to the different applications used for reviewing/enhancing our medical data (such as Label Studio).
* Design, implement, and maintain our ML platform using all the latest Machine Learning Operations (MLOps) tooling in AWS.
**Support our Data Science Team**
* Think of our data scientists as your customers, ensuring they get the best tools to do their jobs.
* Provide the engineering tools and processes they need to experiment and iterate on models efficiently.
* Help the data science team to push their work into production with performance analysis and code reviews.
**Maintain and Monitor the System**
* Ensure every addition to the data pipeline is backed by thorough automated testing.
* Design, implement, and maintain the CI of the Prolaio Platform (with GitHub Actions and AWS CodePipeline).
* Monitor system health and performance using DataDog.
### Required Qualifications
* Strong problem-solving skills, attention to detail, and an analytical mind.
* Proficient in Python / Pandas and ideally also PySpark.
* Solid understanding of Machine Learning and AI concepts.
* Hands-on experience with AWS, Docker, and associated technologies.
### Bonus points
* Prior collaboration with data scientists and researchers
* Experience with MLOps tools such as Metaflow and AWS SageMaker
* Experience with AWS CDK, AWS CodePipeline, and GitHub Actions
* Prior experience in the healthcare industry
Prolaio focuses on healthcare and healthcare information technology. The
company has offices in San Francisco, New York, and Boston, and a small team
of 11 to 50 employees.
You can view their website at <https://prolaio.com>