Machine Learning Engineering Senior Engineer
Position Title: Machine Learning Engineering Senior Engineer
Dearborn, MI
12 Months
Position Description:
ML Ops
• Build scalable and robust ML data pipelines in the cloud to process large volumes of connected vehicle data to support Ford's agentic initiatives.
• Optimize existing ML solutions for performance, security, and cost-effectiveness.
• Utilize continual learning methods to continuously improve model performance.
Other
• Develop exceptional analytical data products using both streaming and batch ingestion patterns on Google Cloud Platform with solid data warehouse principles.
• Build data pipelines to monitor the quality of data and the performance of analytical models and agentic solutions.
• Maintain the infrastructure of the data platform using Terraform and continuously develop, evaluate, and deliver code using CI/CD.
• Collaborate with data analytics stakeholders to streamline the data acquisition, processing, and presentation process.
• Implement an enterprise data governance model and actively promote the concepts of data protection, sharing, reuse, quality, and standards.
• Enhance and maintain the DevOps capabilities of the data platform.
• Continuously optimize and enhance existing data solutions (pipelines, products, infrastructure) for best performance, high security, low vulnerability, low costs, and high reliability.
• Work in an agile product team to deliver code frequently using Test Driven Development (TDD), continuous integration, and continuous deployment (CI/CD).
• Promptly address code quality issues using SonarQube, Checkmarx, Fossa, and Cycode throughout the development lifecycle.
• Perform any necessary data mapping and data lineage activities, and document information flows.
• Monitor the production pipelines and provide production support by addressing production issues as per SLAs.
• Provide analysis of connected vehicle data to support new product developments and production vehicle improvements.
• Provide visibility into data quality, vehicle, and feature issues, and work with the business owners to fix them.
• Demonstrate technical knowledge and communication skills with the ability to advocate for well-designed solutions.
• Continuously enhance your domain knowledge of connected vehicle data, connected services, and the algorithms/models/solutions developed by data scientists and AI engineers.
• Stay current on the latest data engineering practices and contribute to the technical direction of the company while keeping a customer-centric approach.
Skills Required:
Technical Communication, Communications, Google Cloud Platform, TensorFlow, Data Governance, Machine Learning, Python, Artificial Intelligence & Expert Systems, GitHub, Tekton, Docker, Jira, Microservices, Data Architecture, Agile Software Development, SQL, Java, Spark, Cloud Architecture, Apache Kafka, REST APIs
1. Technical Communication – This person will need to clearly describe the ML/AI Ops needs and strategy to colleagues, potentially up to executives, across a wide cross section of people ranging from very knowledgeable to not technically knowledgeable in this area.
2. Communications – In addition to the technical communication needed, this person will need to be a great communicator to work with stakeholders in other organizations; we need to work together without communication gaps.
3. Google Cloud Platform – Deep knowledge of how to implement ML/AI Ops specifically on GCP is required.
4. TensorFlow –
5. Data Governance – This role will need to implement an enterprise data governance model and actively promote the concepts of data protection, sharing, reuse, quality, and standards.
6. Machine Learning – We need an ML Ops expert.
7. Python – Some of the ML Ops pipeline will likely need to be set up using Python.
8. Artificial Intelligence & Expert Systems – The ML Ops pipeline needs to be set up with agentic AI solutions in mind as well.
9. GitHub – This is where our code will reside, so this is needed.
See items 10 to 21 under Additional Information.
Skills Preferred:
Telematics, Machine Learning, Data Modeling, Cloud Infrastructure, Data Mining, Database Design, Troubleshooting (Problem Solving), Labor Supervision
1. Telematics – Knowledge of this is nice, as some of our data will be Telematics data
2. Machine Learning –
3. Data Modeling – In order to understand how the data will interact with the ML Operations.
4. Cloud Infrastructure –
5. Data Mining –
6. Database Design –
7. Troubleshooting (Problem Solving) –
8. Labor Supervision – Will need to mentor and advise junior team members to spread ML Ops expertise across the organization
Experience Required:
• Master’s degree or foreign equivalent degree in Computer Science, Software Engineering, Information Systems, Data Engineering, or a related field, and 4 years of experience, OR an equivalent combination of education and experience (6+ years with a Bachelor's degree).
• 4 years of professional experience in:
  o Data engineering, data product development, and software product launches.
  o At least three of the following languages: Java, Python, Spark, Scala, SQL.
• 3 years of cloud data/software engineering experience building scalable, reliable, and cost-effective production batch and streaming data pipelines using:
  o Data warehouses like Amazon Redshift, Microsoft Azure Synapse Analytics, and Google BigQuery.
  o Workflow orchestration tools like Airflow.
  o Relational database management systems like MySQL, PostgreSQL, and SQL Server.
  o Real-time data streaming platforms like Apache Kafka and GCP Pub/Sub.
  o Microservices architecture to deliver large-scale real-time data processing applications.
  o REST APIs for compute, storage, operations, and security.
  o DevOps tools such as Tekton, GitHub Actions, Git, GitHub, Terraform, and Docker.
  o Project management tools like Atlassian JIRA.
Even better if you have...
Experience Preferred:
• Ph.D. or foreign equivalent degree in Computer Science, Software Engineering, Information Systems, Data Engineering, or a related field.
• 2 years of experience with ML model development and/or MLOps.
• Code contributions to open-source data/software engineering projects.
• Experience architecting cloud infrastructure and handling application migrations/upgrades.
• GCP Professional Certifications.
• Demonstrated passion for mining raw data and realizing its hidden value.
• Passion for experimenting with and implementing state-of-the-art data engineering methods and techniques.
• Experience working in an implementation team from concept to operations, providing deep technical subject matter expertise for successful deployment.
• Experience implementing methods for automation of all parts of the pipeline to minimize labor in development and production.
• Analytics skills to profile data and troubleshoot data pipeline/product issues.
• Ability to simplify and clearly communicate complex data/software ideas and problems, and to work independently with cross-functional teams and all levels of management.
• Ability to mentor and advise junior team members.
Education Required:
Bachelor's Degree
Education Preferred:
Master's Degree
Additional Safety Training/Licensing/Personal Protection Requirements:
Additional Information:
***HYBRID / 4 days per week in the office***
10. Tekton – Will likely be needed for our DevOps work.
11. Docker – Our vendor will be using Docker images, so we will need to know how to account for this.
12. Jira – Our projects are managed in Jira, so knowledge of Jira would be nice.
13. Microservices – Microservices architecture to deliver large-scale real-time data processing applications.
14. Data Architecture – Optimize existing ML solutions for performance, security, and cost-effectiveness.
15. Agile Software Development – Need to be able to work in an Agile environment; related to Jira and communication skills.
16. SQL – There will be SQL in the pipeline, so knowledge is important.
17. Java – May be in the pipeline.
18. Spark – May be in the pipeline.
19. Cloud Architecture – Knowledge to build scalable and robust ML data pipelines in the cloud to process large volumes of connected vehicle data to support Ford's agentic initiatives.
20. Apache Kafka – Knowledge of this for real-time data streaming in the pipeline is important.
21. REST APIs – REST APIs for compute, storage, operations, and security.