Data Engineering on Unstructured Dataset Using AWS for a US-Based Home Automation Company
Overview
Our client is a US-based OEM producing HVAC equipment, water heaters, and boilers for residential and commercial buildings. They needed a secure and economical solution for large data set analysis.
Challenges
Structuring the data from 120K+ live devices that together send 40GB of data per day, a volume that keeps growing
Designing a scalable, secure, and cost-effective solution with flexible architecture and
Solution
- Secure, scalable, and flexible architecture built on serverless computing and storage
- Data collection, Extract-Transform-Load (ETL), and data pipeline
- Data catalog and database management
- Project-based implementation with Infrastructure as Code (IaC)
- CloudFormation script for infrastructure management
- CI/CD pipeline using GitHub Actions
- Data collection script to transfer MongoDB data to S3
- ETL script to convert unstructured data to a structured format
- Dashboard development to display and monitor field devices’ data
- SSO authentication
- Data analysis and visualization
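
To illustrate the step that converts unstructured data to a structured format, here is a minimal Python sketch of flattening a nested device document into a single flat row. The case study does not describe the actual script, so the field names (`device_id`, `ts`, `telemetry`) and the `flatten` helper are hypothetical, and the choice to store lists as JSON strings is an assumption made only to keep each row a fixed set of scalar columns.

```python
import json
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing child keys with the parent key.
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            # Assumption: lists are kept as JSON strings so rows stay tabular.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


# Hypothetical device document; the field names are illustrative only.
raw = {
    "device_id": "dev-001",
    "ts": "2023-01-01T00:15:00Z",
    "telemetry": {"temp_c": 21.5, "mode": "heat", "errors": []},
}
row = flatten(raw)
```

Rows shaped like this can be written to a columnar or delimited file and registered in a data catalog; which output format the project used is not stated in the case study.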
 
Outcomes
- Enabled faster data transformation by splitting daily executions into hourly executions for time-series data
- Designed pipelines to process 1 year of data, continuously delivering meaningful insights to the client
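
The day-to-hour split in the first outcome can be sketched as follows. This is only an illustration under stated assumptions: the case study does not say how the split was implemented, so the `hourly_windows` helper, the use of UTC, and the half-open `[start, end)` windows are all hypothetical choices.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Iterator, Tuple


def hourly_windows(day: date) -> Iterator[Tuple[datetime, datetime]]:
    """Yield 24 half-open [start, end) one-hour windows covering one UTC day."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    for hour in range(24):
        begin = midnight + timedelta(hours=hour)
        yield begin, begin + timedelta(hours=1)


# Each window could be handed to a separate, smaller transformation run.
windows = list(hourly_windows(date(2023, 1, 1)))
```

Smaller windows keep each run's input bounded and allow a failed hour to be retried without reprocessing the whole day.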
 