
Data Scientist (m/f/d)

  • Pune (IND)
  • IT
  • Fulltime
  • Published: 2024-02-27
Want to work in a culture built on mutual trust and respect? How about having the freedom to make work fit into your life (and not the other way round)? A career with Thinkproject could be just the opportunity you're looking for.

What do we do?
Thinkproject is a European market leader in digitalisation tools for construction companies. It sounds complex, but we'll explain further! Construction companies used to rely on manual administration and physical paperwork for projects (sometimes hundreds of thousands of documents for one project!). Using our construction intelligence solutions, businesses can go digital, which benefits everyone from the construction companies to the wider public.
Our mission is to deliver digitalisation to make a safer, healthier and more sustainable AECO (Architecture, Engineering, Construction, Operations) industry. This is a really exciting time to join our company. Since our founding in 2000 we have gone from strength to strength, and we have lots of exciting developments coming up soon that you could be a part of.

We are looking for a skilled and motivated Data Scientist (m/f/d) to join our team in India. As a Data Scientist, you will play a critical role in analyzing, interpreting, and deriving valuable insights from vast datasets. You'll work closely with cross-functional teams to develop data-driven solutions that drive innovation and optimization within our software solutions.

Location: Pune
Department: R&D
Contract: Permanent

What your day will look like

  • Collect, preprocess, and analyze structured and unstructured data sets related to the construction industry using statistical methods and machine learning techniques.
  • Develop predictive models, algorithms, and data-driven solutions to address business challenges and enhance decision-making processes.
  • Collaborate with software engineers, product managers, and domain experts to integrate analytical solutions into our cloud-based software platforms.
  • Design and implement experiments, tests, and data-driven initiatives to improve product functionalities and user experience.
  • Perform exploratory data analysis to identify trends, patterns, and correlations within construction-related datasets.
  • Communicate findings and insights to both technical and non-technical stakeholders through reports, visualizations, and presentations.
  • Stay updated with the latest advancements in data science, machine learning, and construction technology to drive innovation within the organization.

What you need to fulfill the role

  • Master's degree in Computer Science, Data Science, Statistics, or a related quantitative field.
  • 5+ years of experience as a Data Scientist or in a similar role, preferably within the software industry or construction domain.
  • Proficiency in programming languages like Python or R for data analysis, machine learning, and statistical modeling, with expertise in relevant libraries.
  • Strong understanding of machine learning techniques (supervised/unsupervised learning, regression, clustering, normalization, etc.), along with practical experience using libraries like scikit-learn, TensorFlow, PyTorch, etc.
  • Hands-on experience working with large datasets, utilizing data visualization tools (especially Power BI), and working with SQL/NoSQL databases.
  • Excellent problem-solving abilities and adeptness in translating business requirements into data-driven solutions.
  • Effective communication skills, capable of presenting complex findings in a clear and understandable manner.
  • Ability to contribute to the productionization of models.
  • Proficiency in creating models from scratch and fine-tuning existing models.
  • Good understanding of Spark SQL and PySpark.
  • Ability to contribute to the management of large models.
  • Ability to evaluate out-of-the-box supervised, unsupervised, and neural network models for their effectiveness on Thinkproject business challenges.
  • Experience across the entire ML application development lifecycle: data preparation, experiment tracking, reproducibility of model results, and deployment.
  • Experience working with both ML training and inference pipelines.
  • Experience using tools such as MLflow for ML development tracking and Apache Spark for deploying ML production applications.
  • Flexibility to work with both traditional and neural-network-based ML models for use cases spanning NLP, computer vision, and tabular structured data.

Bonus Points for:
  • Experience with Snowflake transformations and Snowpark.
  • Knowledge of Azure Data Factory or Kafka ingestion.
  • Understanding of Parquet file data handling.
  • Familiarity with MLflow or Kubeflow.

What we offer

Health Days | Lunch 'n' Learn Sessions | Women's Network | LGBTQIA+ Network | Demo Days | Coffee Chat Roulette | Ideas Portal | Free English Lessons | Thinkproject Academy | Social Events | Volunteering Activities | Open Forum with Leadership Team (Tp Café) | Hybrid working | Unlimited learning

We are a passionate bunch here. To join Thinkproject is to shape what our company becomes. We take feedback from our staff very seriously and give them the tools they need to help us create our fantastic culture of mutual respect. We believe that investing in our staff is crucial to the success of our business.

Your contact:

Vikas Gaikwad

Please submit your application, including salary expectations and potential date of entry, using the form on the next page.


Working at thinkproject.com - Make your intelligence an asset.