This course explores introductory topics in data engineering, the field of developing data processing solutions, in the context of Big Data environments. Specifically, it covers concepts, techniques and technologies related to the processing and storage of Big Data datasets, including MapReduce and NoSQL. It highlights the unique challenges faced when processing and storing Big Data datasets. The MapReduce data processing engine, which is the de facto framework for batch processing of large amounts of data, is also explained in detail.
The following primary topics are covered:
– Big Data Engineering and Big Data Engineering Challenges
– Big Data Storage Terminologies (including sharding, replication, CAP theorem, ACID, BASE)
– Big Data Storage Requirements
– On-Disk Storage (including distributed file systems and databases)
– Introduction to NoSQL and NewSQL
– NoSQL Rationale and Characteristics
– NoSQL Database Types (including key-value, document, column-family and graph databases)
– Big Data Processing Requirements
– Big Data Processing (including batch mode and realtime mode)
– Introduction to MapReduce for Big Data Processing (batch mode)
– MapReduce Explained (including map, combine, partition, shuffle and sort, and reduce)
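As a rough illustration of the MapReduce stages named in the last topic (map, combine, partition, shuffle and sort, and reduce), the following minimal single-process Python sketch counts words. It is not taken from the course materials: the word-count task, the function names and the toy partitioning hash are all assumptions chosen for illustration, and a real MapReduce engine distributes these stages across many machines.

```python
from collections import defaultdict
from itertools import groupby


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combine: pre-aggregate one mapper's output locally before it is sent on."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Partition: pick the reducer for a key (deterministic toy hash, an assumption)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(mapper_outputs, num_reducers):
    """Shuffle and sort: route pairs to reducers, then sort and group them by key."""
    buckets = [[] for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)].append((key, value))
    grouped = []
    for bucket in buckets:
        bucket.sort(key=lambda kv: kv[0])
        grouped.append(
            [(key, [value for _, value in group])
             for key, group in groupby(bucket, key=lambda kv: kv[0])]
        )
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines, num_reducers=2):
    """Run every stage in order; each input line stands in for one mapper's split."""
    mapper_outputs = [combine(map_phase(line)) for line in lines]
    result = {}
    for reducer_input in shuffle_and_sort(mapper_outputs, num_reducers):
        for key, values in reducer_input:
            word, total = reduce_phase(key, values)
            result[word] = total
    return result
```

Calling word_count(["big data big", "data engineering"]) treats each string as one mapper's input and returns a dictionary of per-word totals. The combine step is optional in practice; it is included here only because it shrinks the data that the shuffle has to move.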
Duration: 1 Day
Taking the Course at a Workshop
This course can be taken as part of an instructor-led workshop taught by an Arcitura Certified Trainer. Workshops can be open for public registration or delivered privately for a specific organization. Certified Trainers can teach workshops in person at a specific location or virtually using a video-enabled remote system, such as WebEx. Visit the Workshop Calendar page to view the current calendar of public workshops, or visit the Private Training page to learn more about Arcitura’s worldwide private workshop delivery options.
Below are the base materials provided to public and private workshop participants.
Note that as a workshop participant, you may be eligible for discounts on the purchase of the Study Kit and Pearson VUE exam voucher for this course.
Taking the Course Using a Study Kit
This course can be completed via self-study by purchasing a Study Kit, which includes the base course materials as well as additional supplements and resources designed specifically for self-paced study and exam preparation.
Visit the BDSCP Module 7 Study Kit page for pricing information and details. Also visit the Study Kits Overview page for information regarding discounted Certification Study Kit Bundles for individual certification tracks.
The following materials are provided in the Study Kit for this course:
Note that this Study Kit can be purchased with or without a discounted Pearson VUE voucher for Exam B90.07.
Study Kits and Study Bundles can be purchased using the online store. By purchasing and registering this Study Kit, you may be eligible for discounts on the registration of this course as part of a public workshop.
About the Textbook
This BDSCP course module covers a range of in-depth topics that are described in the course booklet and covered in more detail in the associated textbook, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. This textbook is included in paperback format as part of the Study Kit and may be provided in paperback as part of workshops.
Vendor-Neutral Topic Overview
Note that all BDSCP course modules are focused on vendor-neutral topics and therefore do not provide detailed coverage of any vendor-specific platforms or technologies. BDSCP courses are intentionally authored this way so as to provide an unambiguous and objective understanding of practices and technology that can be further complemented with product-specific training.
Download a printable PDF document with information about this course module and its corresponding Study Kit.