Data Lake Implementation at High-end Systems Conglomerate
A conglomerate of five companies, an industry leader in the design, installation, and service of high-end systems (including fire alarm, burglar alarm, access control, and CCTV systems), wanted a solution to run aggregated reports across all five organizations.
The solutions & services provider had to fulfil the following criteria:
- Report generation to be done automatically
- The process to be error-free and independent of employees
- A scalable, flexible, centralized solution
- Solution independent of the volume of data and number of servers
- Provide optimum value while being cost efficient
- Keep the data as is, in its raw, open, structured or unstructured format, while keeping it easily accessible
The global data lakes market was valued at US$ 7.4 Billion in 2021 and is expected to reach US$ 30.2 Billion by 2027, growing at a CAGR of 26.4% during 2022-2027.
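The growth figures above can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes six compounding years, from the 2021 base value to 2027; that period count is an assumption and is not stated explicitly in the forecast. The function names are illustrative only.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value at a fixed annual growth rate for `years` years."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumption: 2021 is the base year, so 2027 is six compounding periods later.
YEARS = 6

# Values in US$ billion, taken from the figures quoted above.
projected_2027 = project_value(7.4, 0.264, YEARS)
rate = implied_cagr(7.4, 30.2, YEARS)

print(f"Projected 2027 value: {projected_2027:.1f} billion")
print(f"Implied CAGR: {rate:.1%}")
```

Under that six-year assumption, the quoted start value, end value, and growth rate agree with one another to within rounding.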
The data lakes market is expected to be driven by rising data volumes in many organizations and the opportunity to draw in-depth insights from those volumes. Increased implementation of the latest technology and the rising adoption of IoT devices are also expected to drive market growth. Other major factors driving the global data lakes market are the low cost of storage and the highly agile nature of data lakes, along with low labor cost, low maintenance cost, and low cost of raw materials. Additionally, government initiatives such as the development of smart cities and the implementation of intelligent utility meters are expected to provide opportunities for the market.
Solutions and Services
ResolveTech (RTS) implemented a data lake solution to store and analyze data because it allows management of multiple data types from a wide variety of sources and stores this data, structured and unstructured, in a centralized repository. The data lake democratizes data by giving people access to it as a single source of truth. This was done without putting any load on the back-end system.
The AWS Cloud provided many of the building blocks required to implement a secure, flexible, and cost-effective data lake. These include AWS managed services to ingest, store, find, process, and analyze data, both structured and unstructured. Data Lake on AWS configures the core AWS services needed to easily tag, search, share, transform, analyze, and govern specific subsets of data throughout the company.
The Data Lake on AWS uses the security, durability, and scalability of Amazon S3 to manage a persistent catalog of organizational datasets, and Amazon DynamoDB to manage the corresponding metadata. RTS used Power BI for visualization, while Amazon QuickSight helps users understand data through natural language querying, interactive dashboards, and pattern and outlier identification. RTS also used FTP or streaming data and Boomi integration platform services to get data into the lake, format it correctly, and then consume it inside the platform.
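The split between raw object storage and a separate metadata catalog can be illustrated with a minimal in-memory sketch. This is not the RTS implementation and does not call any AWS API: the `MetadataCatalog` class, the `DatasetRecord` fields, the company names, and the storage keys are all hypothetical stand-ins. In a real deployment the records would live in a metadata store such as Amazon DynamoDB and the storage keys would point at objects in Amazon S3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DatasetRecord:
    """Metadata about one raw dataset; the data itself stays in object storage."""
    dataset_id: str
    company: str
    storage_key: str  # hypothetical object key, e.g. a path inside a bucket
    data_format: str
    tags: Set[str] = field(default_factory=set)


class MetadataCatalog:
    """In-memory stand-in for a metadata table keyed by dataset_id."""

    def __init__(self) -> None:
        self._records: Dict[str, DatasetRecord] = {}

    def register(self, record: DatasetRecord) -> None:
        # Registering the same dataset_id again overwrites the old entry.
        self._records[record.dataset_id] = record

    def search(self, company: Optional[str] = None,
               tag: Optional[str] = None) -> List[DatasetRecord]:
        """Return records matching every filter that was supplied."""
        matches = []
        for record in self._records.values():
            if company is not None and record.company != company:
                continue
            if tag is not None and tag not in record.tags:
                continue
            matches.append(record)
        return sorted(matches, key=lambda r: r.dataset_id)


# Hypothetical example data for two of the five companies.
catalog = MetadataCatalog()
catalog.register(DatasetRecord("alarms-2021", "company-a",
                               "raw/company-a/alarms-2021.csv", "csv",
                               {"fire-alarm", "events"}))
catalog.register(DatasetRecord("cctv-sites", "company-b",
                               "raw/company-b/cctv-sites.json", "json",
                               {"cctv", "sites"}))
catalog.register(DatasetRecord("access-logs", "company-a",
                               "raw/company-a/access-logs.json", "json",
                               {"access-control", "events"}))

event_datasets = catalog.search(tag="events")
company_a_events = catalog.search(company="company-a", tag="events")
print([r.dataset_id for r in event_datasets])
```

Keeping metadata apart from the raw files is what lets users tag and search datasets without the files themselves being restructured.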
RTS delivered the following results to the client:
- Produced the demo report within three weeks of signing on and provided access to data in the lake to all users.
- Built a data lake that spares the client from integrating all of its tables, data, and schemas into enterprise software or building a customized integration layer.
- Integrated data from multiple sources, locations, companies, and systems.
- Enabled the data lake to extract data and import it in multiple ways.
- Provided fast data access and enabled machine learning and predictive analytics.
- Gave the client access to near-real-time reports (± 5 minutes) with updates from across its entire multi-hundred-site enterprise, including data about temperature issues, compressor issues, etc.
- Mapped and extracted the data from the old database and converted it into the new format so that it could be imported.
- Broke down data silos so that the client can access its data very quickly.
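The mapping of old database records into a new format, followed by a report aggregated across companies, might look roughly like the sketch below. Every field name, the `LEGACY_TO_NEW` mapping, and the sample rows are hypothetical; the actual legacy schema and target format are not described in this case study.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical mapping from legacy column names to the new field names.
LEGACY_TO_NEW = {
    "SITE_NO": "site_id",
    "CO_CODE": "company",
    "ISSUE_TYP": "issue_type",
}


def map_legacy_record(legacy: Dict[str, str],
                      field_map: Dict[str, str] = LEGACY_TO_NEW) -> Dict[str, str]:
    """Rename legacy fields to the new format, failing loudly on missing ones."""
    missing = [name for name in field_map if name not in legacy]
    if missing:
        raise KeyError(f"legacy record is missing fields: {missing}")
    return {new: legacy[old] for old, new in field_map.items()}


def issues_per_company(records: List[Dict[str, str]]) -> Dict[str, int]:
    """Count mapped records per company for a simple aggregated report."""
    return dict(Counter(record["company"] for record in records))


# Hypothetical rows as they might come out of an old database export.
legacy_rows = [
    {"SITE_NO": "101", "CO_CODE": "company-a", "ISSUE_TYP": "temperature"},
    {"SITE_NO": "102", "CO_CODE": "company-a", "ISSUE_TYP": "compressor"},
    {"SITE_NO": "201", "CO_CODE": "company-b", "ISSUE_TYP": "temperature"},
]

mapped_rows = [map_legacy_record(row) for row in legacy_rows]
report = issues_per_company(mapped_rows)
print(report)
```

Raising an error on a missing field, rather than silently skipping it, is one way to keep the import independent of manual checks; a production pipeline might instead route such records to a review queue.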