Title: Data Integration Engineer
The Data Integration Engineer will be responsible for designing, developing and maintaining end-to-end data pipelines and system integrations in our Microsoft-based ecosystem. Working primarily in Python (PySpark, Pandas) and SQL, they will extend our internal ETL library, automate ingestion from APIs, SharePoint and other enterprise sources, and ensure data flows reliably through a Medallion architecture (Bronze → Silver → Gold). They will optimise performance and cost, embed robust logging, testing and CI/CD practices, and uphold the governance standards required in a regulated financial-services environment, all while collaborating closely with business stakeholders to turn complex requirements into secure, scalable and well-documented solutions.
Key Responsibilities
- Design, build and monitor end-to-end data pipelines in Microsoft Fabric, following a Medallion architecture (Bronze → Silver → Gold).
- Develop and maintain robust system-to-system integrations (e.g. external-system APIs, Azure Key Vault, SQL / Lakehouse) that feed our analytics and reporting platforms.
- Extend and refactor our internal libraries and systems, ensuring clean code, strong documentation, unit tests and CI/CD packaging.
- Optimise pipeline performance, security and cost (Spark pools, OneLake tables, data-partitioning, delta format, etc.).
- Contribute to data-modelling standards, naming conventions and governance artefacts; champion best practice in a regulated financial-services setting.
- Troubleshoot production issues, perform root-cause analysis and implement preventative controls and monitoring / alerting.
- Document solutions clearly and provide knowledge-sharing sessions to analysts, developers and business stakeholders.
- Work closely with multiple business units to translate requirements into incremental, testable deliverables on an agile backlog.
Personal Attributes
- Proactive problem-solver: a self-starter who can work independently, take ownership of deliverables and remain accurate under pressure.
- Collaborative communicator: able to explain technical concepts to non-technical colleagues and build strong working relationships across teams, clients and third-party vendors.
- Continuous improver: curious about new tooling and eager to automate, refactor and document for long-term maintainability.
- Detail-oriented pragmatist: balances perfection with shipping value quickly, using sound judgement to prioritise.
- Domain-aware: appreciates the controls, confidentiality and audit expectations of a regulated financial-services environment and designs solutions accordingly.
Preferred Qualifications
- 3–5 years’ experience in data engineering / software development, ideally within financial services or another regulated industry (but we care more about what you can do than how long you’ve done it).
- Strong Python skills (Pandas, PySpark, packaging, unit-testing, linting); comfortable building reusable libraries or SDKs.
- Hands-on experience with Microsoft Fabric (Lakehouse, Notebooks, Data Factory pipelines) or the broader Azure data stack (Azure Synapse, Data Factory, Databricks, ADLS Gen2).
- Solid grasp of ETL/ELT patterns, Medallion / Data Vault architectures, delta format, and schema-evolution strategies.
- Proficient in SQL (T-SQL, Spark SQL), including performance-tuning queries over large datasets.
- Familiarity with DevOps practices: Git branching, pull requests, automated testing, CI/CD pipelines (GitHub Actions / Azure DevOps).
- Experience integrating REST/Graph APIs, message queues or event-driven services; understanding of OAuth 2.0 and managed identities.
- Working knowledge of data-quality frameworks, observability, logging and alerting tools.
- Degree (or equivalent experience) in Computer Science, Data Engineering, Mathematics or related discipline; relevant cloud or data certifications are advantageous.