Install Apache Hadoop on Ubuntu 17.10!

28/12/2020

Apache Hadoop is a big data solution for storing and analyzing large amounts of data. In this article we will walk through the setup steps for Apache Hadoop so that you can get started with it on Ubuntu as quickly as possible. We will install Apache Hadoop on an Ubuntu 17.10 machine.

Ubuntu Version

For this guide, we will use Ubuntu version 17.10 (GNU/Linux 4.13.0-38-generic x86_64).

Updating existing packages

To start the installation of Hadoop, we first need to update our machine with the latest available software packages. We can do this with:

sudo apt-get update && sudo apt-get -y dist-upgrade

As Hadoop is based on Java, we need to install Java on our machine. Hadoop 3.x requires at least Java 8, so that is the version we will install:

sudo apt-get -y install openjdk-8-jdk-headless
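After the install, it can help to confirm which Java release is actually on the PATH. The following is a minimal sketch, not part of the original guide: java_major is a hypothetical helper name, and the parsing assumes that java -version prints the version string in double quotes.

```shell
# java_major VERSION: print the major release for a Java version string.
# Handles the legacy "1.8.0_162" scheme and the newer "11.0.2" scheme.
java_major() {
  v=${1#1.}              # drop a leading "1." (legacy scheme: 1.8 means Java 8)
  echo "${v%%[._-]*}"    # keep everything before the first ".", "_" or "-"
}

# java -version writes to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$installed")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $installed is new enough for Hadoop 3.x"
  else
    echo "Java $installed is too old for Hadoop 3.x" >&2
  fi
fi
```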

Downloading Hadoop files

All the necessary packages now exist on our machine. We’re ready to download the required Hadoop TAR files so that we can start setting them up and run a sample program with Hadoop as well.

In this guide, we will be installing Hadoop v3.0.1. Download the corresponding files with this command:

wget http://mirror.cc.columbia.edu/pub/software/apache/hadoop/common/hadoop-3.0.1/hadoop-3.0.1.tar.gz

Depending on the network speed, the download can take a few minutes because the archive is large.
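Before extracting, it can be worth checking the archive against a published checksum. This is a minimal sketch under the assumption that a SHA-512 digest for the release is available from the Apache download site. verify_sha512 is a hypothetical helper name, and EXPECTED is a placeholder that you must fill in yourself, not a real digest.

```shell
# verify_sha512 FILE EXPECTED_HEX: succeed only if FILE's SHA-512 digest matches.
verify_sha512() {
  actual=$(sha512sum "$1" 2>/dev/null | awk '{print $1}')
  [ -n "$actual" ] || return 1    # file missing or unreadable
  # Compare case-insensitively, since published digests are sometimes upper-case.
  [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$2" | tr 'A-F' 'a-f')" ]
}

# Usage (EXPECTED is a placeholder for the digest published with the release):
# verify_sha512 hadoop-3.0.1.tar.gz "$EXPECTED" && echo "checksum OK"
```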

The latest Hadoop binaries are listed on the Apache Hadoop releases page. Now that we have the TAR file downloaded, we can extract it in the current directory:

tar xvzf hadoop-3.0.1.tar.gz

Extraction takes a few seconds because the archive is large.

Adding a new Hadoop User Group

Hadoop stores its data in HDFS and runs its own processes, so it is good practice to keep its files and permissions separate from the rest of the Ubuntu machine. To avoid any collision, we will create a completely separate user group and assign it to Hadoop so that it has its own permissions. We can add a new user group with this command:

addgroup hadoop

The command prints a confirmation that the hadoop group was added.

We are ready to add a new user to this group:

useradd -G hadoop hadoopuser

Please note that all the commands in this guide are run as the root user. The command above adds a new user, hadoopuser, to the group we created.
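To confirm the membership, the group list reported by id can be checked. This is a small sketch, with list_contains and user_in_group as hypothetical helper names that are not part of the original guide:

```shell
# list_contains WORD ITEM...: succeed if WORD equals one of the remaining arguments.
list_contains() {
  needle=$1
  shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# user_in_group USER GROUP: 0 if USER is in GROUP, 1 if not, 2 if USER is unknown.
user_in_group() {
  # id -nG prints the names of all groups the user belongs to.
  group_names=$(id -nG "$1" 2>/dev/null) || return 2
  # Intentionally unquoted so each group name becomes its own argument.
  list_contains "$2" $group_names
}

if user_in_group hadoopuser hadoop; then
  echo "hadoopuser is a member of hadoop"
else
  echo "hadoopuser is missing or not in the hadoop group" >&2
fi
```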

To allow the Hadoop user to perform administrative operations, we need to grant it sudo privileges as well. Open the /etc/sudoers file with this command:

sudo visudo

Add the following line to the end of the file:

hadoopuser ALL=(ALL) ALL

This was the main setup needed to give Hadoop a platform to perform actions. We are now ready to set up a single-node Hadoop cluster.

Hadoop Single Node Setup: Standalone Mode

To use the real power of Hadoop, it is usually set up across multiple servers so that it can scale over the large datasets stored in the Hadoop Distributed File System (HDFS). A single-node setup, by contrast, is fine for development and debugging but is not used in production. To keep the process simple, we will explain how to do a single-node setup for Hadoop here.

Once we’re done installing Hadoop, we will also run a sample application on it. At the moment, the Hadoop directory is named hadoop-3.0.1. Let’s rename it to hadoop for simpler usage:

mv hadoop-3.0.1 hadoop

Time to make use of the hadoop user we created earlier and assign ownership of this directory to that user:

chown -R hadoopuser:hadoop /root/hadoop

A better location for Hadoop will be the /usr/local/ directory, so let’s move it there:

mv hadoop /usr/local/
cd /usr/local/

Adding Hadoop to Path

To execute Hadoop scripts from any directory, we will add them to the path now. To do this, open the .bashrc file:

vi ~/.bashrc

Add these lines to the end of the .bashrc file so that path can contain the Hadoop executable file path:

# Configure Hadoop and Java Home
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

export PATH=$PATH:$HADOOP_HOME/bin

After saving the file, reload it with source ~/.bashrc (or open a new shell) so that the new variables take effect.
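If you prefer to script this step instead of editing the file by hand, the lines can be appended only when they are not already present, so that running the setup twice does not duplicate them. This is a minimal sketch, and append_once is a hypothetical helper name:

```shell
# append_once LINE FILE: add LINE to FILE unless an identical line already exists.
append_once() {
  # -q quiet, -x match the whole line, -F treat LINE as a fixed string
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage (single quotes keep $PATH and $HADOOP_HOME unexpanded in the file):
# append_once 'export HADOOP_HOME=/usr/local/hadoop' ~/.bashrc
# append_once 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' ~/.bashrc
# append_once 'export PATH=$PATH:$HADOOP_HOME/bin' ~/.bashrc
```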

As Hadoop makes use of Java, we need to tell the Hadoop environment file, hadoop-env.sh, where Java is located. The location of this file can vary between Hadoop versions. To find it, run the following command from the directory that contains the hadoop directory (here, /usr/local/):

find hadoop/ -name hadoop-env.sh

The output is the location of the file. For Hadoop 3.0.1 it is hadoop/etc/hadoop/hadoop-env.sh.

Let’s edit this file to tell Hadoop where the JDK is located. Add the following as the last line of the file and save it:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
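The JAVA_HOME path above is specific to the amd64 OpenJDK 8 package. On other machines, one way to derive it is to resolve the javac symlink. This is a sketch that assumes javac is on the PATH and lives in the JDK's bin directory. jdk_home_from_javac is a hypothetical helper name.

```shell
# jdk_home_from_javac PATH: strip the trailing /bin/javac to get the JDK root.
jdk_home_from_javac() {
  echo "${1%/bin/javac}"
}

if command -v javac >/dev/null 2>&1; then
  # readlink -f follows the chain of symlinks from /usr/bin/javac to the real binary.
  jdk_home_from_javac "$(readlink -f "$(command -v javac)")"
fi
```

If the printed directory differs from the one used in this guide, use the printed value in both .bashrc and hadoop-env.sh.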

The Hadoop installation and setup is now complete. We are ready to run our sample application now. But wait, we never made a sample application!

Running a Sample Application with Hadoop

Actually, the Hadoop installation comes with a built-in sample application which is ready to run once we are done installing Hadoop. Sounds good, right?

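Run the following command to run the JAR example. Because we moved Hadoop to /usr/local/, the paths point there, and the output directory (/root/Output here) must not exist before the job runs:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.1.jar wordcount /usr/local/hadoop/README.txt /root/Output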

When the job finishes, Hadoop prints its job counters, such as the number of records and bytes it read and wrote.

Once the command above has finished, the output directory contains the file part-r-00000. Go ahead and look at the content of the output:

cat /root/Output/part-r-00000

Each line of the output holds a word from README.txt followed by the number of times it occurs.
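As a sanity check, a similar count can be produced locally with standard tools and compared against Hadoop's result. This is a rough sketch, not the example's actual implementation: wordcount_local is a hypothetical helper, and it assumes the Hadoop example splits words on whitespace and sorts them in byte order, so small differences are possible.

```shell
# wordcount_local: read text on stdin, print "word<TAB>count" sorted by word.
# The body runs in a subshell so the locale change does not leak to the caller.
wordcount_local() (
  LC_ALL=C
  export LC_ALL
  tr -s '[:space:]' '\n' |        # one word per line
    grep -v '^$' |                # drop empty lines
    sort |                        # byte-order sort under the C locale
    uniq -c |                     # count adjacent duplicates
    awk '{print $2 "\t" $1}'      # reorder to word, tab, count
)

# Usage (paths follow this guide; adjust them if yours differ):
# wordcount_local < "$HADOOP_HOME/README.txt" | diff - /root/Output/part-r-00000
```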

Conclusion

In this lesson, we looked at how to install and start using Apache Hadoop on an Ubuntu 17.10 machine. Hadoop is great for storing and analyzing vast amounts of data, and I hope this article will help you get started using it on Ubuntu quickly.
