
Junior Data Scientist

Join the Soothsayer Analytics team as a Junior Data Scientist, working from our office in Detroit, MI. Your scope can range from ad-hoc analysis of customer data and building proofs of concept (POCs) to mission-critical, state-of-the-art deep learning AI products deployed in production. The Data Science team works on a variety of exciting analytics (ML, AI) projects.
  • Participate in requirements gathering, technical specification, design, and development of complex projects that operationalize machine learning.
  • Translate business requirements into plausible technical solutions and articulate them to other development team members.
  • Contribute to architecture design, the development of data or machine learning pipelines, and their integration into enterprise systems.
  • Build and configure multi-tenant machine learning environments on-premises, in the cloud, or in hybrid setups.
  • Build, test, and optimize ML models.
  • Interact with teams of engineers from multiple disciplines.
  • Identify and define the project scope.
  • Define the technical approach, data, and algorithms needed.
  • Build out the data product from POC to product.
  • Communicate the value, insights, possibilities, and limitations of Data Science products to customers and internal stakeholders.
  • Communicate results to scientific and business stakeholders.
Required Education:
  • A PhD in Computer Science, Physics, Statistics, Applied Mathematics, or another quantitative/computational discipline, or a Master's degree with over 2 years of relevant, hands-on industry experience as a Data Scientist, is a must.
Required experience: over 3 years
  • At least 3 years of industry experience as a Data Scientist is required
  • Basic knowledge of operationalizing analytics projects at scale
  • Advanced experience in Python and ML
  • Experience with at least one CI/CD tool, e.g. Jenkins, GitHub Actions, or cloud equivalents
  • Experience in application containerization and orchestration tools - Docker, Kubernetes, or cloud equivalents
  • Hands-on experience in a range of ML and AI techniques (e.g. supervised and unsupervised machine learning techniques, deep learning, graph data analytics, statistical analysis, time series, geospatial, NLP, sentiment analysis, pattern detection, etc.)
  • Experience with building Machine Learning models and pipelines using Python, R & Spark to extract insights from data
  • Proficiency in data processing using pandas and PySpark
  • Knowledge of SQL for accessing and processing data
  • Experience using the latest Data Science platforms (e.g. Databricks, AWS SageMaker) and frameworks (e.g. TensorFlow, scikit-learn)
  • Preferred: experience in building recommendation engines on large datasets with multiple formats
  • Knowledge in implementing models based on game theory/mechanism design, like Bayesian bargaining games and auctions
  • Knowledge of deep learning methods (autoencoders, embeddings, etc.) using PyTorch, TensorFlow, etc., and of semi-supervised learning
  • Knowledge in working with graph databases and algorithms
  • Knowledge in building Bayesian network-based models.
  • Software engineering practices (coding practices applied to DS, unit testing, version control, code review)
  • Hadoop (especially the Cloudera and Hortonworks distributions), and streaming technologies (Kafka, Spark Streaming)
  • Deep understanding of data manipulation/wrangling techniques
  • Delivering insights using visualization tools (such as Power BI, Qlik) or libraries
  • Experience building and deploying solutions to the cloud (such as AWS, Azure, GCP)
  • Experience with containerization and virtualization (e.g. Docker, Kubernetes, VMs, etc.), AWS SageMaker, Python, Scala
Please submit the following skills matrix with your application.
  • Data Analysis and Modeling using Python/R - 24 months = Essential
  • Time Series Analysis and Modeling - 6 months = Essential
  • Distributed data processing using PySpark - 12 months = Essential
  • Deep Learning on structured data - 12 months = Essential
  • Computer Vision/NLP - 6 months = Preferred
  • MLOps - 6 months = Preferred
  • Domain experience in solving ML problems in one or more of Manufacturing, Insurance, Procurement, Finance, Supply Chain and Logistics - 12 months = Essential
COVID-19 update: This position works from our office in Detroit, MI; remote work is not allowed.
About Soothsayer Analytics
Soothsayer Analytics is a Data Science company based in Livonia, MI, with offices in Columbus, OH and offshore in Hyderabad, India. We are an advanced analytics firm focused on pattern recognition and unstructured data. Our team actively works with, creates, and researches cutting-edge Data Science techniques. We approach each problem individually and architect custom solutions. Our strength lies in our ability to build and productize proprietary algorithms and analytical tools. We are advised by 10 industry experts whose domain knowledge we leverage to better architect industry-specific solutions. Our delivery partners include 20+ Data Scientists with a combined 75 patents and 300+ publications. We collaborate with this brain trust to stay abreast of state-of-the-art techniques and to help deliver world-class results. We use cutting-edge machine learning and statistical techniques to extract hidden insights and patterns from complex, high-dimensional, and unstructured data. Our major clients include: DOW Chemicals, Ford, Visteon, D&B, Timken, US steels, Steelcase, Abercrombie, Express, Stanley Steemer, Whirlpool, AEP, NiSource, GEA, etc. Soothsayer Analytics is an Equal Opportunity Employer and E-Verify company.