Big Data Analytics

Instructor: Dr. Mohit Bhatnagar

Included with Coursera Plus

12 modules

Gain insight into a topic and learn the fundamentals.

Beginner level

Recommended experience

3 weeks to complete

under 10 hours per week

Flexible schedule

Learn at your own pace

Build toward a degree

What you'll learn

Gain a deep understanding of Hadoop and Spark ecosystems for managing big data. Become familiar with tools like Hive and Pig to query large datasets.

Skills you'll gain

PySpark
Cloud Applications
Big Data
Data Manipulation
Data Processing
Data Mining
Apache Spark
Real Time Data
Applied Machine Learning
Distributed Computing
Query Languages
Apache Hadoop
Databases
NoSQL
Scripting Languages
Machine Learning Methods
Analytics

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

June 2025

Assessments

16 assignments

Taught in English

There are 12 modules in this course

The Big Data Analytics course offers a deep dive into the technologies, tools, and techniques used to process and analyze large-scale data. Learners will explore the Hadoop and Spark ecosystems, gaining hands-on experience with essential components such as Hadoop Distributed File System (HDFS), MapReduce, Pig, and Hive. The course also covers both relational (SQL) and nonrelational (NoSQL) databases, helping learners understand the appropriate contexts for each type of data storage.

A significant focus is placed on Apache Spark, known for its high-speed, in-memory data processing capabilities, which is vital for handling big data applications. Learners will also work through real-world exercises, including implementing and deploying a machine learning application that processes streaming data on the cloud. Designed for professionals with a background in predictive analytics, basic SQL, and Python programming, this course equips learners with the practical skills to manage data characterized by high volume, velocity, and variety. By the end of the course, participants will be able to derive actionable insights from big data and apply them in business contexts, contributing to improved decision-making and competitive advantage in data-driven environments.

Welcome to the Big Data Analytics course! By the end of this course, you will understand the technologies associated with the Hadoop and Spark ecosystems. You will get hands-on experience with core Hadoop components such as MapReduce and the Hadoop Distributed File System (HDFS). You will learn to write Pig scripts and Hive queries to extract data stored across Hadoop clusters. You will also learn about relational (SQL) and nonrelational (NoSQL) databases and discuss scenarios in which one is preferred over the other for data storage. You will gain insight into the Spark ecosystem, which runs jobs across clusters very quickly and therefore has several emerging applications. You will also work through a hands-on example of implementing and deploying a machine learning application that handles streaming data on the cloud. This is an advanced-level course, intended for learners with a background in predictive tools and techniques, experience writing basic Structured Query Language (SQL) queries, and an understanding of Python programming. The knowledge you gain from this course will help you build a career as a business analyst. You will gain skills to draw insights from data characterized by high velocity, volume, and variety. Such data is called big data, and organizations increasingly use it for competitive advantage and decision-making.

What's included

1 video

In this module, you will learn about big data applications and the components of the Hadoop ecosystem. The module also discusses the MapReduce paradigm, which facilitates distributed processing of data. You will also gain insight into HDFS and use it for storing files. Hands-on examples use the Hortonworks Data Platform (HDP) Sandbox, which can be installed on a Windows or Mac computer with at least 8 GB of available RAM.
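
To make the MapReduce paradigm mentioned above concrete, here is a minimal single-process sketch of a word count in plain Python. It is not taken from the course materials and does not use Hadoop or the mrjob library; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only to mirror the three conceptual stages.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Each map_phase call depends on only one line and each reduce_phase call on only one key, which is what allows a real cluster to run both phases in parallel across machines.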

What's included

12 videos, 4 readings, 2 assignments, 1 discussion prompt

12 videos, 93 minutes total

Introduction to Big Data, 6 minutes (module preview)
Data Types and Applications, 3 minutes
The Need and Evolution of Hadoop, 4 minutes
The Hadoop Ecosystem, 6 minutes
Hortonworks Data Platform Sandbox Installation (Desktop/Laptop), 9 minutes
Hortonworks Data Platform Sandbox Installation (Google Cloud), 14 minutes
The HDFS File System, 6 minutes
Hands-On with HDFS on HDP Sandbox (Desktop/Laptop), 9 minutes
Hands-On with HDFS on HDP Sandbox (Google Cloud), 13 minutes
Distributed Computing Using YARN, 4 minutes
Introduction to MapReduce, 6 minutes
Hands-On with MapReduce Using Python, 6 minutes

4 readings, 180 minutes total

Essential Reading: Introduction to Big Data, 60 minutes
Recommended Reading: Introduction to Hadoop Ecosystem, 30 minutes
Essential Reading: Hands-On with Hadoop, 60 minutes
Recommended Reading: mrjob Python Library, 30 minutes

2 assignments, 39 minutes total

Introduction to Big Data and Hadoop Ecosystem, 24 minutes
Hands-On with Hadoop, 15 minutes

1 discussion prompt, 20 minutes total

Applications of Big Data Analytics, 20 minutes

This assessment is a graded quiz based on the module covered this week.

What's included

1 assignment

In this module, you will learn about the Hive scripting language and its use for mining data from Hadoop clusters. Hive provides an SQL dialect called Hive Query Language (abbreviated HiveQL or just HQL) for querying data stored in a Hadoop cluster. Hive is best suited to data warehouse applications, where relatively static data is analyzed and fast response times are not required. Compared with other Hadoop languages and tools, Hive makes it easier for developers to port SQL-based applications to Hadoop. Like all SQL dialects in widespread use, HiveQL does not fully conform to any particular revision of the ANSI SQL standard. It is perhaps closest to MySQL’s dialect, but with significant differences. Hive supports several sizes of integer and floating-point types, a boolean type, and character strings of arbitrary length. Lastly, you will take a real-world data set and load it in the Ambari environment for analysis using HDFS and HQL, going through the process of creating tables, loading data, and analyzing the data with HiveQL.
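
The create-table, load-data, analyze workflow described above can be rehearsed locally before touching a cluster. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module rather than Hive, and the ratings table, its columns, and the average_rating_by_genre function are hypothetical names invented for illustration. HiveQL syntax differs in places (Hive typically loads files from HDFS rather than inserting rows one by one), so treat this only as an illustration of the SQL-style steps.

```python
import sqlite3
from typing import Iterable, List, Tuple


def average_rating_by_genre(
    rows: Iterable[Tuple[str, str, float]]
) -> List[Tuple[str, float]]:
    """Create a table, load rows, and aggregate them with a GROUP BY query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Step 1: define the table (hypothetical schema).
        conn.execute("CREATE TABLE ratings (title TEXT, genre TEXT, rating REAL)")
        # Step 2: load the data.
        conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
        # Step 3: analyze with an aggregate query.
        cursor = conn.execute(
            "SELECT genre, AVG(rating) FROM ratings GROUP BY genre ORDER BY genre"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("A", "drama", 4.0), ("B", "drama", 2.0), ("C", "comedy", 5.0)]
    print(average_rating_by_genre(sample))
```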

What's included

9 videos, 2 readings, 2 assignments, 1 discussion prompt

9 videos, 66 minutes total

Recap of Basic Concepts, 6 minutes (module preview)
Introduction to Hive, 5 minutes
Hive Data Types, 5 minutes
HQL Commands and Uses, 6 minutes
HiveQL Data Definition and Manipulation, 6 minutes
Getting Started with Hive, 10 minutes
Using the Hive View on Ambari, 7 minutes
Practice Example on Hive, 8 minutes
Challenge: Hands-On, 9 minutes

2 readings, 105 minutes total

Essential Reading: Introduction to Hive, 15 minutes
Essential Reading: Hands-On with Hive, 90 minutes

2 assignments, 30 minutes total

Introduction to Hive, 18 minutes
Hands-On with Hive, 12 minutes

1 discussion prompt, 15 minutes total

Introduction to Hive, 15 minutes

This assessment is a graded quiz based on the modules covered this week.

What's included

1 assignment

In this module, you will learn about the Pig Latin scripting language and how to use it to query big data on Hadoop clusters. You will also learn about the data types and commands available in Pig Latin and how they can be used to define and manipulate data in the Hadoop ecosystem. Furthermore, you will work through a practical example, running Pig Latin scripts on a publicly available data set for data analysis.

What's included

7 videos, 2 readings, 2 assignments

7 videos, 57 minutes total

Introduction to Pig Latin, 7 minutes (module preview)
Pig Data Types, 7 minutes
Pig Latin Commands and Uses, 7 minutes
Pig Data Definition and Manipulation, 8 minutes
Running Pig View on Ambari, 6 minutes
Example on Pig View, 9 minutes
Practice Problem as a Challenge, 10 minutes

2 readings, 105 minutes total

Essential Reading: Introduction to Pig Language, 15 minutes
Recommended Reading: Hands-On with Pig, 90 minutes

2 assignments, 30 minutes total

Introduction to Pig Language, 24 minutes
Hands-On with Pig, 6 minutes

In this module, you will be introduced to the need for NoSQL databases and to HBase, a NoSQL database, and its role in the Hadoop ecosystem. You will learn about the CAP theorem and how it shapes the trade-offs among the NoSQL database options available on Hadoop. You will study the three CAP properties (consistency, availability, and partition tolerance) in detail and see how they affect the choice of technology for accessing and manipulating data on Hadoop. Lastly, you will get insights into other emerging cloud-based NoSQL solutions.
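
The CAP trade-off can be illustrated with a deliberately simplified toy model. The sketch below is not how HBase or any real database is implemented; the TwoReplicaStore class, its "CP" and "AP" modes, and the Unavailable exception are hypothetical names used only to show what a system gives up when the network between two replicas is partitioned.

```python
class Unavailable(Exception):
    """Raised when a replica refuses a request to protect consistency."""


class TwoReplicaStore:
    """Toy single-value store with two replicas, 'a' and 'b'."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.partitioned = False  # True means the replicas cannot communicate
        self.values = {"a": None, "b": None}

    def write(self, replica: str, value: object) -> None:
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            self.values = {name: value for name in self.values}
        elif self.mode == "CP":
            # Consistency first: refuse writes that cannot be replicated.
            raise Unavailable("cannot replicate during a partition")
        else:
            # Availability first: accept locally and let the replicas diverge.
            self.values[replica] = value

    def read(self, replica: str) -> object:
        return self.values[replica]


if __name__ == "__main__":
    ap = TwoReplicaStore("AP")
    ap.write("a", 1)
    ap.partitioned = True
    ap.write("a", 2)
    print(ap.read("a"), ap.read("b"))  # the two replicas now disagree
```

In this toy, the "CP" store stays consistent by rejecting writes during the partition, while the "AP" store keeps accepting writes at the cost of replicas returning different values.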

What's included

8 videos, 2 readings, 2 assignments, 1 discussion prompt

8 videos, 59 minutes total

Introduction to Data Warehouses, 7 minutes (module preview)
Need for NoSQL Databases, 7 minutes
CAP Theorem, 8 minutes
Making a Choice of a Database, 7 minutes
Introduction to HBase, 6 minutes
Architecture of HBase, 7 minutes
HBase Data Model, 6 minutes
Running and Setting Up HBase on Ambari and Hands-On with HBase, 7 minutes

2 readings, 135 minutes total

Essential Reading: Introduction to NoSQL Databases, 45 minutes
Recommended Reading: Hands-On with HBase, 90 minutes

2 assignments, 30 minutes total

Introduction to NoSQL Databases, 15 minutes
Hands-On with HBase, 15 minutes

1 discussion prompt, 15 minutes total

Architecture of HBase, 15 minutes

This assessment is a graded quiz based on the modules covered this week.

What's included

1 assignment

In this module, you will be introduced to the popular Apache Spark platform for big data processing. You will explore the key components of Apache Spark that provide significant benefits in distributed computing. You will also be introduced to Resilient Distributed Datasets (RDDs) and Spark DataFrames. Furthermore, you will be introduced to Spark SQL and Spark Streaming.
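
One idea behind RDDs is that transformations such as map and filter are lazy: they are recorded as a lineage and only executed when an action such as collect is called. The sketch below imitates that behavior in plain Python so it runs without a Spark installation. ToyRDD is a hypothetical class written for illustration; it is not the PySpark API and ignores partitioning and distribution entirely.

```python
from typing import Any, Callable, Iterable, List, Tuple


class ToyRDD:
    """Toy dataset that records transformations and runs them on collect()."""

    def __init__(
        self,
        source: Iterable[Any],
        lineage: Tuple[Tuple[str, Callable[[Any], Any]], ...] = (),
    ) -> None:
        self._source = list(source)
        self._lineage = tuple(lineage)  # ordered record of pending steps

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # Transformation: return a new dataset, compute nothing yet.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, predicate: Callable[[Any], bool]) -> "ToyRDD":
        # Transformation: also lazy.
        return ToyRDD(self._source, self._lineage + (("filter", predicate),))

    def collect(self) -> List[Any]:
        # Action: replay the recorded lineage over the source data.
        items = iter(self._source)
        for kind, fn in self._lineage:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


if __name__ == "__main__":
    squares = ToyRDD(range(5)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
    print(squares.collect())
```

Because each ToyRDD keeps its full lineage, a lost result could be rebuilt by replaying the same steps from the source, which is the intuition behind the "resilient" in the RDD name.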

What's included

11 videos, 4 readings, 2 assignments, 1 discussion prompt

11 videos, 69 minutes total

The Need for Spark, 5 minutes (module preview)
Spark Background and Applications, 5 minutes
The Resilient Distributed Dataset (RDD), 6 minutes
Hands-On with the PySpark Library in Python, 8 minutes
Working with Spark DataFrames and Spark SQL, 5 minutes
Hands-On with Structured Queries on Spark, 7 minutes
Need for Processing Streaming Data, 5 minutes
Introduction to Spark Streaming, 6 minutes
Hands-On with DStream API, 7 minutes
Structured Streaming, 6 minutes
Hands-On with Structured Streaming, 6 minutes

4 readings, 360 minutes total

Essential Reading: Introduction to Spark, 180 minutes
Recommended Reading: Quick Start on Spark, 60 minutes
Essential Reading: Introduction to Spark Streaming, 90 minutes
Recommended Reading: Spark Structured Streaming, 30 minutes

2 assignments, 30 minutes total

Introduction to the Building Blocks of Spark, 15 minutes
Introduction to Spark Streaming, 15 minutes

1 discussion prompt, 20 minutes total

Windowing in Structured Streaming, 20 minutes

This assessment is a graded quiz based on the module covered this week.

What's included

1 assignment

In this module, you will learn about MLlib, which is used for making predictions on large datasets that need distributed processing. You will work on regression and classification tasks for large datasets. Then you will implement a hands-on exercise with streaming data from the Twitter API: a predictive streaming application that shows an end-to-end big data scenario.

What's included

8 videos, 3 readings, 2 assignments

8 videos, 51 minutes total

Introduction to MLlib, 5 minutes (module preview)
Regression Algorithms in MLlib, 5 minutes
Solving Classification Problems with MLlib, 6 minutes
Hands-On with Sentiment Analysis, 7 minutes
Introduction to Google Cloud Dataproc, 5 minutes
Hands-On Setting Up a Cluster on Google Dataproc, 7 minutes
Streaming Data from Twitter API, 6 minutes
Hands-On with a Streaming Analytics Application, 6 minutes

3 readings, 150 minutes total

Essential Reading: Introduction to ML on Spark, 90 minutes
Recommended Reading: Dataproc Best Practices Guide, 30 minutes
Recommended Reading: Twitter API v2, 30 minutes

2 assignments, 27 minutes total

Machine Learning on Spark, 15 minutes
Running Hadoop and Spark on Cloud, 12 minutes

This module describes the learning objectives, assignment brief, review criteria, grading criteria, and submission instructions for the Staff Graded Team Assignment for the course.

What's included

1 video

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Build toward a degree

This course is part of the following degree program(s) offered by O.P. Jindal Global University. If you are admitted and enroll, your completed coursework may count toward your degree, and your progress can transfer with you.

Instructor

Dr. Mohit Bhatnagar

O.P. Jindal Global University

4 courses, 124 learners

Offered by

O.P. Jindal Global University

Frequently asked questions

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I purchase the Certificate?

When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

You will be eligible for a full refund until two weeks after your payment date, or (for courses that have just launched) until two weeks after the first session of the course begins, whichever is later. You cannot receive a refund once you’ve earned a Course Certificate, even if you complete the course within the two-week refund period. See our full refund policy.

More questions

Visit the learner help center.

Financial aid available.

The 12 modules in this course

Welcome to Big Data Analytics
Introduction to Big Data and Hadoop
Weekly Summative Assessment: Introduction to Big Data and Hadoop
Introduction to Data Mining with Hive
Weekly Summative Assessment: Introduction to Data Mining with Hive
The Pig Scripting Languages
NoSQL Databases and the CAP Theorem
Weekly Summative Assessment: NoSQL Databases and the CAP Theorem
Introduction to Spark
Weekly Summative Assessment: Introduction to Spark
Introduction to Machine Learning on Spark
Term-End Staff Graded Assignment

Degree program: MBA in Business Analytics
