Machine Learning Operations Engineer, Limassol
Our client, a cutting-edge Technology Services Provider located in Limassol, is looking to hire a Machine Learning Operations Engineer.
- Maintain AI infrastructure clusters
- Maintain model training infrastructure (GPU clusters)
- Deploy and maintain Kubeflow infrastructure
- Design and implement an alerting system for model quality and AI service availability
- Deploy and maintain hyperparameter tuning infrastructure
- Prioritize requests from the AI team fairly while demonstrating a sense of empathy
- Maintain and enhance our CI/CD pipelines for AI
- Collaborate with the data engineering team to support production-grade AI systems
- Develop automation flows that speed up delivery and replace manual operating procedures with self-service operations
- Drive analysis, design, and development of automation tools for deployment, development, and operational tasks
- Deploy & manage monitoring/observability infrastructure for staging & production
- Collaborate with the DevOps team to enhance common infrastructure
- Make sure new environments meet requirements and conform to best practices
- 2+ years of hands-on experience in DevOps/Cloud engineering
- Good knowledge of Python or Golang
- Experience with Kubernetes deployment patterns and tools such as Helm, Kustomize and Operators
- Experience with DevOps toolchains including Jenkins, Docker, SonarQube, GitHub
- Experience with observability tools such as Elasticsearch, Kibana, Grafana, Prometheus, and Jaeger
- Experience with SQL & NoSQL databases such as PostgreSQL and MongoDB
- Experience with event streaming tools (e.g. Apache Kafka) and architecture patterns
- Exposure to Agile environments (use of Jira/Confluence, sprints, etc.)
- Good understanding of the Machine Learning project life-cycle
- Great communication skills and team player mentality
- Experience with production-grade machine learning systems
- Advanced knowledge of Fairing frameworks or Kubeflow
- Experience with development of custom Kubernetes operators
- Experience with AutoML infrastructure
- Infrastructure as Code experience (Terraform, CloudFormation, etc.)
- Experience with the Azure public cloud is a plus
- Understanding of network engineering and security principles (e.g. protocols, routing, switching, filtering, firewall rules, etc.)
- Attractive remuneration package including a 13th salary & Medical Insurance
- Flexible working hours
- 100% Remote work is possible
- Professional growth opportunities
- Relocation Packages (where applicable)
- Many more!
If you wish to apply for this role, please send your updated CV to Stella Theodoulou at email@example.com, quoting ‘Machine Learning Operations Engineer’ in the email subject line. Please note that only shortlisted candidates will be contacted.
CareerEsti HR & Recruitment Services Limited has been a Private Employment Agency licensed by the Ministry of Labour, Welfare and Social Insurance (Cyprus) since 2019 (LN:370).