Senior Data Engineer (f/m/d)
- Pune (IND)
- Software Development
- Full-time
- Published: 2026-06-30
What do we do?
Introducing Thinkproject Platform
Thinkproject builds construction intelligence software for the firms that deliver Europe's largest infrastructure, energy, and real estate projects. Our platform manages the information flow across the full lifecycle of a built asset, from design and construction through operation and eventual decommissioning.
By combining deep domain knowledge of the building, infrastructure, and energy industries with a modern SaaS architecture, Thinkproject empowers customers to digitise, connect, and control their construction workflows across their entire asset lifecycle.
What your day will look like
About the Role
We are looking for a Data Integration Engineer to own the data pipelines and integration layer that power our AI Search Platform. You will design, build, and maintain the workflows that move data reliably from source systems into GCP services, including Vertex AI, and expose that capability through secure, well-designed APIs consumed by internal and external systems.
This is a hands-on engineering role. You will write production code, own the reliability of what you ship, and work closely with DevOps, Network, and Platform Engineering teams.
Tech you will work with daily:
Python | SQL | GCP (Cloud Run, Pub/Sub, Cloud Storage, Cloud Spanner, Vertex AI) | Terraform | PostgreSQL | Docker | Git | CI/CD
Key Responsibilities
Data Integration & Pipeline Development
- Design, implement, and optimise scalable data integration workflows supporting inference and data synchronisation across GCP services (Cloud Run, Pub/Sub, Cloud Storage, Cloud Spanner, Vertex AI)
- Build and maintain event-driven pipelines and ETL/ELT workflows that deliver clean, reliable data to the AI Search Platform
- Automate deployment, testing, and pipeline orchestration using Cloud Run, Pub/Sub triggers, and Terraform
API Development for AI Integration
- Build and maintain APIs that expose data integration and AI inference capabilities to internal and external systems
- Ensure secure, reliable, and performant access to the AI Search Platform, with correct authentication, rate limiting, and error handling in place by default
Permissions & Compliance Layer
- Integrate and enforce API and IAM policies for compliant access control across all AI Search Platform components
- Own and evolve the permissions API layer to meet growing scalability and security requirements
Data Quality & Reliability
- Ensure data integrity through monitoring, validation, and alerting across all integrated systems and services
- Continuously monitor workflows for latency, reliability, and cost efficiency, and implement improvements without waiting to be asked
Documentation & Standards
- Maintain architecture documentation and runbooks
- Contribute to best practices for data integration, reproducibility, scalability, and security
What you need to fulfill the role
Required Skills & Qualifications
You have 3+ years of hands-on experience in data engineering, cloud integrations, or backend development and have shipped production data pipelines on GCP. Specifically:
- 3+ years of professional experience in data engineering, cloud integrations, or backend development
- Strong proficiency in Python and SQL
- Production experience with Google Cloud Platform services: Cloud Run, Pub/Sub, Cloud Storage, Cloud Spanner, and Vertex AI
- Experience with event-driven architectures and cloud-based ETL/ELT workflows
- Experience with relational databases (PostgreSQL, Cloud Spanner) and exposure to NoSQL
- Proficient with Git and familiar with CI/CD workflows and containerisation (Docker)
- Experience with Terraform or equivalent Infrastructure-as-Code tooling
- Working knowledge of IAM, data governance, and access management principles
Nice-to-Have (Bonus Skills)
- Azure DevOps or cross-cloud integration experience
- API design experience (REST or gRPC)
- Experience with AI/ML inference pipelines or Vertex AI in production
- Prior work in construction, engineering, or real estate software domains
Soft Skills
- Engineering rigour — you care about pipeline reliability and data correctness, not just throughput
- Ownership mindset — you monitor what you build and fix it when it breaks
- Clear written communication: able to document integration contracts and architecture decisions for non-specialist readers
- Collaborative: comfortable working across DevOps, Network, and Platform teams without friction
- Comfortable with ambiguity — you can scope and deliver integration work from incomplete upstream specs
What success looks like
- Month 3: Core data pipelines understood; contributing to production; first reliability or latency improvement shipped
- Month 6: Owning at least one integration area end-to-end; permissions API layer extended with evidence-backed design decisions
- Month 12: Data integration reliability measurably improved; pipeline documentation and monitoring coverage complete; at least one material cost or latency inefficiency identified and closed
You're probably NOT a fit if
- Your data engineering experience is primarily batch ETL without event-driven or streaming context
- You are not comfortable working across cloud-native GCP services in production
- You treat IAM and access control as someone else's concern
- You need fully defined requirements before designing an integration
What we offer
Compensation (Pune, Mid–Senior)
- Competitive fixed salary — shared on request
- Variable performance bonus: 5% of fixed
- Continuous learning & certification budget
Learning programmes | Career growth | International exposure
Your contact:
Preethika Ramdass
Submit your application at careers.thinkproject.com, including:
- Salary expectations
- Potential start date
- A short write-up (max 300 words) on the most complex data integration or pipeline reliability challenge you have solved in production — what broke, what you built, and what you would do differently now
Working at thinkproject.com - think career. think ahead.
#LI-Hybrid
