Thinkgrid Labs: Fabric Data Engineer (Remote)
Headquarters: Not Specified
Thinkgrid Labs is at the forefront of innovation and technology. Our expert team of software engineers, architects, and UI/UX designers specialises in crafting bespoke web, mobile, and cloud applications and data platforms, along with AI solutions and intelligent bots.
We are expanding our data practice to stand up Microsoft Fabric for a US health insurer. This role unlocks fast, reliable Bronze-layer ingestion. You’ll own Fabric Mirroring from SQL Server and other sources into OneLake, manage CDC and schema drift at scale, and design resilient, high-volume ingestion pipelines that teams can trust.
Job Title : Fabric Data Engineer
Location : Remote
Working Hours : 2 PM IST to 11 PM IST
Experience Required : 6-10 years in data engineering; 2+ years on Azure data platforms with hands-on Microsoft Fabric (Mirroring/Lakehouse/Warehouse)
Education : Bachelor’s or Master’s degree in Computer Science, Health Informatics, or Business
Preferred certifications : DP-600, DP-203, DP-500
Who are you?
- SQL & CDC Pro: Strong SQL Server/T-SQL; hands-on CDC or replication patterns for initial snapshot + incremental syncs, including delete handling.
- Fabric Mirroring Practitioner: You’ve set up and tuned Fabric Mirroring to land data into OneLake/Lakehouse; comfortable with OneLake shortcuts and workspace/domain organisation.
- Schema-Drift Aware: You detect, evolve, and communicate schema changes safely (contracts, tests, alerts), minimising downstream breakage.
- High-Volume Ingestion Mindset: You design for throughput, resiliency, and backpressure: retries, idempotency, partitioning, and efficient file sizing.
- Python/Scala/Spark Capable: You can build notebooks/ingestion frameworks for advanced scenarios and data quality checks.
- Operationally Excellent: You add observability (logging/metrics/alerts), document runbooks, and partner well with platform, security, and analytics teams.
- Data Security Conscious: You respect PII/PHI, apply least privilege, and align with RLS/CLS patterns and governance guardrails.
What will you be doing?
- Stand Up Mirroring: Configure Fabric Mirroring from SQL Server (and other relational sources) into OneLake; tune schedules, snapshots, retention, and throughput.
- Land to Bronze Cleanly: Define Lakehouse folder structures, naming/tagging conventions, and partitioning for fast, organised Bronze ingestion.
- Handle Change at Scale: Implement CDC, including soft/hard deletes, late-arriving data, and backfills, using reliable watermarking and reconciliation checks.
- Design Resilient Pipelines: Build ingestion with Fabric Data Factory and/or notebooks; add retries, dead-lettering, and circuit-breaker patterns for fault tolerance.
- Manage Schema Drift: Automate drift detection and schema evolution; publish change notes and guardrails so downstream consumers aren’t surprised.
- Performance & Cost Tuning: Optimise batch sizes, file counts, partitions, parallelism, and capacity usage to balance speed, reliability, and spend.
- Observability & Quality: Instrument lineage, logs, metrics, and DQ tests (nulls, ranges, uniqueness); set up alerts and simple SLOs for ingestion health.
- Collaboration & Documentation: Partner with the Fabric Platform Architect on domains, security, and workspaces; document pipelines, SLAs, and runbooks.
Must-have skills
- SQL Server, T-SQL; CDC/replication fundamentals
- Microsoft Fabric Mirroring; OneLake/Lakehouse; OneLake shortcuts
- Schema drift detection/management and data contracts
- Familiarity with large, complex relational databases
- Python/Scala/Spark for ingestion and validation
- Git-based workflow; basic CI/CD (Fabric deployment pipelines or Azure DevOps)
Benefits
- 5-day work week
- 100% remote setup with flexible work culture and international exposure
- Opportunity to work on mission-critical healthcare projects impacting providers and patients globally
To apply: https://weworkremotely.com/remote-jobs/thinkgrid-labs-fabric-data-engineer-remote