AlphaNeural AI is a pioneer in the Web3 and AI sectors, dedicated to building a decentralized infrastructure for AI assets, including models, agents, and datasets. Our platform features an expansive marketplace for the trading and licensing of these assets and supports their deployment through an advanced GPU compute layer. As a remote-first organization with a global presence, we are committed to pushing the boundaries of technology to create innovative solutions that empower our users and enhance the AI community.
At AlphaNeural AI, we embrace diversity and strive to create an inclusive environment where all team members can thrive. We are an international team, fostering a culture of collaboration and continuous improvement. We prioritize employee engagement and professional growth through frequent offsites, workshops, and team-building activities that reinforce our commitment to innovation and excellence.
As an MLOps/DevOps Engineer at AlphaNeural AI, you will be responsible for architecting, building, and maintaining the infrastructure that supports our platform. This includes hosting datasets, deploying AI models and agents, and keeping our decentralized marketplace running reliably. The role requires a deep understanding of both machine learning operations and cloud infrastructure.
Key Responsibilities:
1. Infrastructure Design & Management: Architect and manage scalable and resilient infrastructure for hosting datasets and deploying AI models and agents. This includes provisioning and managing cloud resources, configuring virtual machines, and ensuring optimal use of GPU instances.
2. Containerization & Orchestration: Develop and maintain Docker images for various services and applications. Manage Kubernetes clusters for container orchestration, ensuring high availability and scalability of deployed services.
3. CI/CD Pipeline Development: Design, implement, and maintain continuous integration and continuous deployment (CI/CD) pipelines using tools such as Jenkins, GitLab CI/CD, or GitHub Actions. Automate testing, deployment, and monitoring processes to ensure rapid and reliable delivery of new features and updates.
4. Networking & Security: Configure and manage networking components, including DNS, load balancers, and firewalls, to ensure secure and efficient access to deployed models. Implement security best practices to protect AI assets and infrastructure, including network segmentation, encryption, and access controls.
5. Monitoring & Optimization: Set up monitoring and logging solutions using tools like Prometheus, Grafana, and the ELK stack to track system performance, detect anomalies, and ensure the reliability of services. Optimize resource usage and performance to minimize costs and maximize efficiency.
6. Collaboration & Integration: Work closely with data scientists and software engineers to integrate new models and features into the platform. Provide technical guidance and support to ensure seamless integration and deployment.
7. Automation & Scripting: Automate repetitive tasks and workflows using scripting languages such as Python and Bash and automation tools such as Ansible. Develop tools and frameworks to improve operational efficiency and reduce manual intervention.
8. Troubleshooting & Support: Diagnose and resolve infrastructure-related issues promptly.
Technical Requirements:
1. Programming & Scripting: Proficiency in Python and Bash scripting.
2. Operating Systems: Strong experience with Linux-based systems.
3. Containerization: Hands-on experience with Docker or Podman, including creating and managing container images and containers.
4. Orchestration: Expertise in Kubernetes including cluster management, Helm charts, and custom resource definitions (CRDs).
5. Cloud Platforms: Experience with cloud providers (AWS, GCP, Azure) and managed Kubernetes services.
6. Infrastructure as Code (IaC): Proficiency with IaC tools such as Terraform or Ansible.
7. CI/CD Tools: Knowledge of CI/CD tools and practices including Jenkins, GitLab CI/CD, or GitHub Actions.
8. Networking: Strong understanding of networking concepts, including DNS, load balancing, VPNs, and TLS/SSL.
9. Monitoring & Logging: Familiarity with tools like Prometheus, Grafana, and ELK stack.
10. Security: Solid experience implementing security best practices, including network security, identity and access management (IAM), and data encryption.
Why Join Us?
AlphaNeural AI offers a unique opportunity to work at the forefront of AI and Web3 innovation. As part of a diverse and collaborative team, you’ll have the chance to shape a transformative platform while growing your career in a dynamic and supportive environment.
Seniority Level: Mid-Senior Level
Employment Type: Full-time
Job Function: Engineering and Information Technology
Industries: Data Infrastructure and Analytics