Integrating LVM with Hadoop and providing Elasticity to Datanode Storage

GANESH KUMAR
4 min read · Nov 1, 2020


Objective → In our Hadoop cluster, each datanode will contribute 2GB of storage. Using the concept of LVM, we can make the storage contributed by the datanodes elastic, and later increase each datanode's contribution from 2GB to 3GB. This ability to grow storage on demand is known as elasticity of storage.

Implementation → First, we check the drives attached to the OS using fdisk.
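A minimal sketch of this check (requires root; the device names that appear, such as /dev/xvdf and /dev/xvdg, are the EBS volumes attached in this setup and may differ on your machine):

```shell
# List all attached disks and their partitions
fdisk -l
```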

Then we create physical volumes from the EBS volumes /dev/xvdf and /dev/xvdg using pvcreate.

To check whether they were created or not, we can use pvdisplay.
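The two commands above can be sketched as follows (requires root; /dev/xvdf and /dev/xvdg are the device names from this walkthrough):

```shell
# Initialize both EBS volumes as LVM physical volumes
pvcreate /dev/xvdf /dev/xvdg

# Verify the physical volumes were created
pvdisplay /dev/xvdf /dev/xvdg
```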

At this point, our drives are not yet allocated. To allocate them, we create a volume group containing both physical volumes; we also have to give that group a specific name, and we can check whether the group was created using vgdisplay.

Here we see that volume group dn1vg is created.
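A sketch of the volume-group step (requires root; dn1vg is the group name chosen in this walkthrough):

```shell
# Create a volume group named dn1vg from both physical volumes
vgcreate dn1vg /dev/xvdf /dev/xvdg

# Confirm the group exists and contains both PVs
vgdisplay dn1vg
```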

Now if we run pvdisplay again, the Allocatable field shows yes. This means our drives have been added to the group and are available for allocation.

Now we create a logical volume from this volume group using lvcreate. Here we have to specify how much of the total space we want and give the volume its own name; in my case I named it dn1lv.

And now we can check our logical volume by using command

> lvdisplay <vg_name>/<lv_name>

We can see our logical volume (LV) dn1lv has been created inside the volume group (VG) dn1vg, with the 2GB size we requested.
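The create-and-verify step can be sketched as (requires root; size and names are the ones used in this walkthrough):

```shell
# Carve a 2GB logical volume named dn1lv out of the dn1vg volume group
lvcreate --size 2G --name dn1lv dn1vg

# Inspect the new logical volume
lvdisplay dn1vg/dn1lv
```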

If we check fdisk again, one more drive now appears there: our logical volume dn1lv.

Now we format the logical volume and mount it on a directory. In my case I mount it on the dn1 directory.

Now if we check with the command

> df -h

then it shows the mounted directory with 2GB of space.
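The format-and-mount step can be sketched as follows (requires root; I'm assuming an ext4 filesystem and a mount point of /dn1, matching the dn1 directory mentioned above — adjust both to your environment):

```shell
# Format the logical volume with ext4
mkfs.ext4 /dev/dn1vg/dn1lv

# Mount it on the datanode storage directory
mkdir -p /dn1
mount /dev/dn1vg/dn1lv /dn1

# Confirm the mount shows roughly 2GB
df -h /dn1
```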

We repeat all the same steps on datanode2 and datanode3.

Now the report command shows that 3 datanodes are connected, each sharing 2GB of storage.
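The report referred to above comes from the Hadoop admin tool, run from the namenode (this assumes each datanode's storage directory in hdfs-site.xml points at its LVM mount):

```shell
# List connected datanodes and their configured capacity
hadoop dfsadmin -report
```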

If we want to increase the size of our logical volume, we can do it using the Logical Volume Management (LVM) concept.

The lvextend command increases the size of a logical volume. Previously, each datanode in my Hadoop cluster shared 2GB, but we want every datanode to share 3GB, so we extend the logical volume.

But if we check with the df -h command, the mounted directory still shows 2GB of space.

That is because lvextend only grows the volume; the filesystem on it must also be extended over the space we added.

If we reformatted the complete logical volume, we would lose the previous data, so instead we resize the filesystem in place, which stretches it over the added space without touching existing data. For that we use the command

> resize2fs /dev/<volumegroup_name>/<logicalvolume_name>

After resizing the filesystem, we can see that the mounted directory now has a size of 3GB.
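The extend-and-resize step can be sketched as follows (requires root; growing 2GB to 3GB means adding 1GB, and I'm again assuming the /dn1 mount point from earlier):

```shell
# Grow the logical volume by 1GB (2GB -> 3GB); works while mounted
lvextend --size +1G /dev/dn1vg/dn1lv

# Grow the ext4 filesystem to fill the extended volume; existing data is kept
resize2fs /dev/dn1vg/dn1lv

# Verify the mount now shows roughly 3GB
df -h /dn1
```

Note that `lvextend -r` would run the filesystem resize in the same step, but doing it separately makes the two stages of the operation explicit.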

Now by using the command

> hadoop dfsadmin -report

we can see that all datanodes are now sharing 3GB of storage.

Conclusion —

Hadoop has no built-in elasticity for datanode storage, but using the Logical Volume Management concept we can elastically grow the storage shared by the datanodes.

