Book

The author uses the term “Modern Data Systems” to encompass the full range of data technologies in use today, as many related and competing technologies have emerged in the past few years.

“Fundamentals of Modern Data Systems” offers a comprehensive exploration of the evolving landscape of data technologies. The book begins by highlighting the transition from traditional data systems, such as data warehouses and ETL processes, to modern data systems, which are essential for handling the volume, variety, and velocity of big data. This shift is driven by the need for more robust technological solutions to manage the growing amount and dynamic nature of data, especially with the proliferation of IoT devices and the global expansion of internet connectivity. The book emphasizes how these transformations pose significant challenges to data infrastructure and utilization, necessitating advanced data handling and processing technologies.

The book serves as a textbook for students in the Management Analytics and Data Technologies Graduate Program at NIDA, focusing on the Modern Data Technologies course. It aims to provide students with the essential knowledge to navigate and manage contemporary data systems proficiently. The content is structured to cover various aspects of modern data systems, including data architecture, ingestion, storage, and processing. It also discusses the integration of new and old technologies in modern data systems, a critical aspect for companies striving to gain a competitive advantage in the data-driven technology era. The book is a resource not only for students but also for professionals, project managers, and engineers involved in data-related projects, offering insights into the end-to-end process of data-oriented projects.

We begin each chapter by discussing business cases related to the chapter’s content, as it is easy to become mesmerized by modern data technologies and forget that a project’s primary goal is a financial benefit for the organization. The book is composed of seven chapters:

Chapter 1: Introduces Modern Data Technology concepts, including business motivations and fundamentals of data terminology. We also compare traditional and modern data technologies from a business perspective.

Chapter 2: Presents the Modern Data Landscape and Architecture, discussing business impacts, data projects for business applications, and the stages of building a data project. We explore large-scale data characterization and architecture, modern data architecture types, and modern data pipelines. The chapter concludes with an overview of cloud computing service platforms for handling large-scale data.

Chapter 3: Discusses Modern Data Ingestion, beginning with business cases and the two main types of data ingestion (based on data structure and data velocity). It covers data integration and introduces modern data ingestion tools like Flume, Sqoop, and Kafka.

Chapter 4: Introduces Modern Data Storage, highlighting the importance of coupling data storage with its intended use. We discuss business cases, structured and unstructured data, and various data storage systems. The chapter concludes with an overview of large-scale data storage systems like HDFS, S3, BLOB, GCS, Delta Lake, and Lakehouse.

Chapter 5: Focuses on Modern Data Processing, starting with business impacts and covering large-scale processing algorithms, data preparation, and integration. We then discuss notable large-scale data processing tools like Spark, Hive, BigQuery, AWS Kinesis, and Snowflake.

Chapter 6: Explores recent developments in data management, architecture, and services, including Data Mesh, Data Fabric, Data Observability, and Data Discovery. We also discuss tools such as dbt, DataHub, and Terraform.


Chapter 7: Provides a conclusion from both business and technical perspectives, featuring case studies of successful technology companies: Netflix, Uber, Databricks, Discovery Heroes, and Snowflake.