Data Engineer/Data Scientist

Toronto, Ontario - Permanent

Job Description

Do you have a passion for working with Big Data? Imagine working for an exciting entrepreneurial company whose employees are committed to making a real difference combined with stepping up to big challenges. Our client is a leading communications and media company where people come to do great work.
Our client is currently seeking a highly experienced Data Scientist to join their Big Data Innovation team. The successful candidate will be responsible for developing and maintaining statistical and machine learning models, data transformation flows, reports and visualizations supporting key initiatives across major business units, including Customer Experience, Brand, IT, Media, Network, and the Consumer and Business divisions.

Responsibilities:

- Define and shape the Analytics, Machine Learning and Predictive Modeling strategy and roadmap, leveraging industry trends, tools and best practices
- Design, build and deploy effective and robust descriptive and predictive Big Data statistical and machine learning models and applications
- Design and build the necessary ETL/ELT data transformations, reports and visualizations in support of statistical and machine learning models
- Design, build and test integration interfaces between Big Data platform and external systems (API-based and other) in support of deploying predictive models
- Leverage Hortonworks HDP Hadoop platform, RDBMS Databases, Predictive Modeling and BI tools, SaaS platforms and APIs
- Build and maintain an expanding set of predictive models supporting various business units
- Follow the best practices of the Predictive Modeling Lifecycle, monitor performance and make ongoing corrections by executing extensive model testing strategies
- Support PoC and lab technology initiatives
- Collaborate with all levels of business and IT stakeholders to understand complex data relations and business requirements and implement and test Big Data models, data transformation and visualization solutions
- Train and educate IT and business teams in the use of Big Data environment and predictive models to democratize Big Data capabilities
- Collaborate with IT Infrastructure teams in monitoring and capacity planning of the environment ensuring ongoing stability
- Prepare comprehensive model development and deployment documentation
- Apply concepts, industry research, and agile methodologies/tools (Jira, Confluence) to implement Big Data statistical and machine learning solutions
- Occasional production troubleshooting may be required during off-hours

Must-Have Skills:

- A degree in Machine Learning, Statistics, AI, Computer Science, Engineering, or Technology
- Hadoop Developer or Data Scientist certification (Hortonworks preferred)
- 5+ years of production programming experience with Java, Scala, Python or other languages
- 5+ years of experience with Unix/Linux shell scripting
- 3+ years of Applied Statistics, Data Mining, DB Marketing or Machine Learning experience
- 3+ years of production experience with SAS, R or equivalent statistical or ML framework
- Excellent understanding of the underlying Statistical, Machine Learning theory and Predictive Modeling Lifecycle
- Production experience implementing Lookalike, Clustering, and Decision Tree models
- Production experience with Text Mining and Recommender systems would be an asset
- Expert-level SQL querying skills
- In-depth understanding and production experience with the core Hadoop components, including HDFS, YARN, MapReduce, Tez, Hive, Pig, Spark
- Minimum 1 year of production Hadoop experience. Hortonworks HDP 2.2 and above experience is strongly preferred
- Minimum 1 year of production experience developing Hadoop Big Data applications for ETL/ELT, analytics, machine learning or reporting
- 1 year of production experience with Hadoop performance tuning, configuration, optimization, job processing and security
- Production experience with BI reporting and visualization tools such as Tableau, Oracle OBIEE or MicroStrategy
- Working knowledge of DBMS platforms, including Oracle (12c), Vertica or equivalent
- Working knowledge of NoSQL databases such as Cassandra, HBase or equivalent
- Proficiency with source control, continuous integration and testing frameworks
- Experience working in lean/agile environments is strongly preferred, including working knowledge of Jira/Confluence

Starting: ASAP
Travel: 0%
Dress Code: Business Casual