
Site Reliability Engineering (DevOps)

Montreal, Quebec - Permanent



Job Description

Our client is one of the first networks devoted to helping B2B enterprises. Their first-of-its-kind Sales Analytics platform combines a proprietary, self-learning network with applications that are ready to use, data-backed, and built on predictive analysis.

SRE ensures that our internally critical and externally visible systems have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance.

SRE is also a mindset and a set of engineering approaches to running better production systems—we build our own creative engineering solutions to operations problems. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running.

What your day to day will look like:

•Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation and refinement.
•Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
•Practice sustainable incident response and blameless postmortems.
•Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
•Support services before they go live through activities such as system design, developing software platforms and frameworks (related to grid computing), capacity planning and launch reviews.
•Coordinate implementation and execution of configuration management infrastructure.
•Make thoughtful choices about the adoption of new technologies based on your research and past experience.
•Bring a strong sense of ownership and a passion for engineering great products with stellar user experiences.
•Contribute to developing and implementing security policies.
•Join our on-call rotation as a first line of defense during production issues.


Must Have Skills:

•Degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience.
•Experience with operating systems and TCP/IP network fundamentals.
•Experience with algorithms, data structures, complexity analysis and software design.
•Experience coding in higher-level languages (e.g., Ruby, Go, Python, C++, or Java).
•Experience with Big Data, PaaS, and IaaS technologies.
•Experience in configuration and maintenance of applications such as web servers, load balancers, relational databases, storage systems and messaging systems.
•Experience learning software, frameworks and APIs.
•Knowledge of Linux/UNIX fundamentals.


Nice to Have Skills:

•BS or MS in Computer Science
•Expertise in designing, analyzing and troubleshooting large-scale distributed systems.
•Ability to debug and optimize code and automate routine tasks.
•Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
•Experience supporting services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
•You have used Puppet, Terraform, Ansible, Chef, Salt or another config management suite, know where it falls short, and are open to trying alternatives.
•Demonstrated knowledge and understanding of infrastructure as code (with code review, release engineering and more), immutable infrastructure, and/or software-defined networking/storage.
•Experience understanding complex existing software workloads, and the ability to define a technical migration roadmap to the cloud.
•Ability to quickly learn, understand, and work with new and emerging technologies, methodologies, and solutions in the cloud technology space.
•Knowledge of cloud infrastructure-related technologies (e.g., Containers, Kubernetes, VMs, etc.).
•Experience in, and understanding of, data and information management for Big Data and ML trends within businesses.
•Experience with at least two of the following Big Data technologies: Apache Hadoop, Spark, Kafka, Cassandra, MongoDB sharded clusters, Elasticsearch clusters.


Details:
Starting: ASAP
Travel: 10%