Overview
The Lead Engineer I, under the general direction of the Lead and/or Manager, Machine Learning Engineering, is responsible for technical and development support for our award-winning K-12 software. This role supports all AI and generative AI products across engineering, data, deployment, and infrastructure.
Responsibilities
Essential duties and responsibilities include the following. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
- Design and implement Machine Learning models and data ingestion pipelines
- Develop and support a platform that enables data scientists to rapidly develop, train, and experiment with machine learning models
- Expand and optimize data pipelines, data flow, and collection for cross-functional teams
- Create and maintain optimal data pipeline architecture by assembling large, complex data sets to meet functional and non-functional business requirements
- Identify and implement internal process improvements including automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability
- Support the building of machine learning platforms, data platforms, and infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
- Work with architecture, data, and design teams to assist with data-related technical issues and support data infrastructure needs
- Deploy ML models in AWS environments, specifically Amazon SageMaker
- Implement model monitoring, data quality checks, and data drift detection in inference pipelines
- Support ML teams in delivering continuous integration and continuous deployment by providing templates and patterns
- Perform root cause analysis for production issues where the root cause is in infrastructure, environment, configuration, or deployment routines; understand when to escalate to product development teams; remediate root causes and implement preventative actions
- Own the AWS stack which comprises all ML resources and collaborate on managing ML infrastructure costs
- Establish standards and practices around MLOps, including governance, compliance, and data security
- Use generative AI models, other LLMs, agents, RAG, and LangChain to build smart solutions
- Use the customer management system to provide status on open customer issues and properly verify when an issue can be closed
- Participate in after-hours maintenance when necessary, respond to emergencies, and participate in customer calls when called upon in support of initiatives and incident response
Qualifications
To be considered for and to perform this job successfully, an individual must be able to perform each essential duty and responsibility satisfactorily. The requirements listed below are representative of the knowledge, skill and/or ability required.
Qualifications include:
- 5+ years of experience within the full software development lifecycle from planning through deployment and maintenance
- Demonstrated ability to design, implement, and scale machine learning workflows (MLOps), including deployment and delivery of production-ready model APIs
- Demonstrated proficiency with version control systems and automated software testing and delivery
- Proficiency with at least one machine learning lifecycle platform (SageMaker, MLflow, TensorFlow, etc.), one orchestration platform (Airflow, Dagster, etc.), and one data platform such as Snowflake or Databricks
- 5+ years of experience with ML infrastructure and ML DevOps
- 5+ years of overall engineering experience in distributed systems and data infrastructure
- 3+ years of experience coding in Python (preferred) or other languages such as Java, C#, etc.
- Experience working with ML engineers to build tooling and automation to support the entire ML engineering lifecycle, from experimentation to production operations
- Experience with Kubernetes and ML CI/CD workflows
- 3+ years of experience with AWS or other public cloud platforms (GCP, Azure, etc.)
- Excellent verbal and written communication skills
- Experience with Infrastructure-as-Code tools and frameworks
- Bachelor's degree in computer science, data science, mathematics, or a related field. Master’s degree preferred
Compensation & Benefits
PowerSchool offers the following benefits:
- Comprehensive Insurance Coverage (including Medical, Dental, Vision, Pharmacy benefits, Life Insurance and AD&D)
- Flexible Spending Accounts and Health Savings Accounts
- Short-Term Disability and Long-Term Disability
- Comprehensive 401(k) plan
- Generous Parental Leave
- Unrestricted paid time off (known as Discretionary Time Off - DTO)
- Paid Community and Volunteer Time Off (VTO)
- Wellness Program, including ClassPass & Employee Assistance Program
- Tuition Reimbursement
- Optional Benefits: Pet Insurance, Identity Theft Protection, Student Debt Repayment Program and Prepaid Legal coverage
A reasonable estimate of the base compensation range for this position is $78,100 - $210,300. The compensation range is specific to the United States and incorporates many factors including but not limited to an applicant’s skills and prior relevant experience and training; licensures, degrees, and certifications; internal equity; internal pay ranges; and market data/range parameters.
EEO Commitment
PowerSchool is committed to a diverse and inclusive workplace. PowerSchool is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. Our inclusive culture empowers PowerSchoolers to deliver the best results for our customers. We not only celebrate the diversity of our workforce, we celebrate the diverse ways we work. If you have a disability and need an accommodation regarding our recruiting process, please let us know by emailing accommodations@powerschool.com.