Coding With Fun

Apache Pig Union operator


May 26, 2021 Apache Pig




Pig Latin's UNION operator merges the contents of two relations. To perform a UNION on two relations, their columns and domains (data types) must be the same.

Syntax

The syntax of the UNION operator is given below.

grunt> Relation_name3 = UNION Relation_name1, Relation_name2;
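When two relations have the same named fields but in a different order, Pig's UNION also accepts the ONSCHEMA keyword, which matches fields by name rather than by position. The following is a minimal sketch, not part of the original example: the files a.txt and b.txt and the relation names a, b, and merged are hypothetical.

```pig
-- Hypothetical inputs: same fields, different column order.
a = LOAD 'a.txt' USING PigStorage(',') AS (id:int, city:chararray);
b = LOAD 'b.txt' USING PigStorage(',') AS (city:chararray, id:int);

-- ONSCHEMA aligns the fields by name; every input must have a declared schema.
merged = UNION ONSCHEMA a, b;

-- Print the schema of the merged relation to confirm how the fields were aligned.
DESCRIBE merged;
```

These statements can be typed one by one at the grunt> prompt or saved in a script file.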

Example

Suppose you have two files in the /pig_data/ directory of HDFS, student_data1.txt and student_data2.txt, as shown below.

student_data1.txt

001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai

student_data2.txt

7,Komal,Nayak,9848022334,trivendram
8,Bharathi,Nambiayar,9848022333,Chennai

Load the two files into Pig as the relations student1 and student2, as shown below.

grunt> student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',') 
   as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray); 
 
grunt> student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',') 
   as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Now, let's use the UNION operator to merge the contents of the two relations, as shown below.

grunt> student = UNION student1, student2;
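Because the example relies on both inputs having the same columns and types, it can be useful to confirm the schemas before and after the merge. A small sketch using Pig's DESCRIBE operator, assuming the same file paths and relation names as above:

```pig
-- Load both files with identical schemas (paths as in the example above).
student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

-- Print each input schema; the two should list the same fields and types.
DESCRIBE student1;
DESCRIBE student2;

-- Merge the relations and print the schema of the result.
student = UNION student1, student2;
DESCRIBE student;
```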

Verify

Verify the relation student using the DUMP operator, as shown below.

grunt> Dump student; 

Output

It displays the following output, showing the contents of the relation student.

(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune) 
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai) 
(7,Komal,Nayak,9848022334,trivendram) 
(8,Bharathi,Nambiayar,9848022333,Chennai)
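Note that UNION does not preserve the order of tuples and does not remove duplicates, so the listing on a real cluster may appear in a different order. The sketch below shows one way to get a sorted, de-duplicated result. It assumes the same files as above; the names student_sorted and student_unique are introduced here for illustration only.

```pig
-- Load and merge the two files (paths as in the example above).
student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
student = UNION student1, student2;

-- Sort by id so the displayed order is predictable.
student_sorted = ORDER student BY id ASC;
DUMP student_sorted;

-- Drop exact duplicate tuples, in case a record appears in both files.
student_unique = DISTINCT student;
DUMP student_unique;
```

With the sample data shown here no record appears in both files, so DISTINCT would leave the eight tuples unchanged.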