Apache Pig is a tool in the Hadoop ecosystem for running MapReduce operations without writing MapReduce code by hand. To use it, we don't need to know a general-purpose programming language such as Java or Python. We do still need to write a script, and that script is written in Pig Latin. Pig Latin is a high-level, SQL-like language whose statements run as MapReduce jobs in the background. So, if we know Pig Latin, we don't need another programming language to work with MapReduce operations.
Data types present in Pig Latin
Data type | Description | Example |
int | Signed 32-bit integer | 30 |
long | Signed 64-bit integer | 10L |
float | 32-bit floating point | 30.5F |
double | 64-bit floating point | 10.5 |
chararray | Character array (string) in Unicode UTF-8 format | 'US Press' |
bytearray | Byte array | |
boolean | Boolean value | true or false |
datetime | Date-time value | 1970-01-01T00:00:00.000+00:00 |
biginteger | Java BigInteger | 50708090709 |
bigdecimal | Java BigDecimal | 180.97376256272893883 |
Complex Types
Tuple: An ordered set of fields is called a tuple in Pig Latin.
For example: (sam,30)
Bag: A collection of tuples is called a bag in Pig Latin.
For example: {(sam,30),(dev,28)}
Map: A set of key-value pairs is called a map in Pig Latin.
For example: ['name'#'sam', 'age'#30]
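For readers who know Python, the three complex types can be compared loosely to built-in Python containers. The sketch below is only an analogy, not Pig code: the variable names are made up for illustration, and a Pig bag or map is not literally a Python list or dict.

```python
# Rough Python analogues of Pig Latin's complex types (illustration only).

# Tuple: an ordered set of fields -> Python tuple
person = ("sam", 30)

# Bag: a collection of tuples -> list of tuples
people = [("sam", 30), ("dev", 28)]

# Map: a set of key-value pairs -> dict
record = {"name": "sam", "age": 30}

print(person[1])       # second field of the tuple
print(len(people))     # number of tuples in the bag
print(record["name"])  # value looked up by key
```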
Null Values
We have now seen the different Pig Latin data types. A value of any of these types can be null.
A null represents an unknown or missing value. Null values can occur naturally in the data or as the result of an operation.
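One way a null can result from an operation is arithmetic on a null operand: in Pig, such an expression evaluates to null. The Python sketch below mimics that behaviour with `None`; the helper name `pig_add` is hypothetical and exists only for this illustration.

```python
from typing import Optional


def pig_add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Hypothetical helper: add two values, returning None if either is None.

    This imitates how a null operand makes the whole arithmetic result null.
    """
    if a is None or b is None:
        return None
    return a + b


print(pig_add(10, 20))    # both operands present
print(pig_add(10, None))  # one operand is null, so the result is null
```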
Pig Latin Operators
Here is the list of Pig Latin operators.
To understand how the operators work, we will take an example for each one. Let's say we have two variables A and B, where A = 10 and B = 20.
Arithmetic operators
Operator | Description |
ADDITION (+) | Adds two values. The output for A+B will be 30 |
SUBTRACTION (-) | Subtracts the second value from the first. The output for A-B will be -10 |
MULTIPLICATION (*) | Multiplies two values. The output for A*B will be 200 |
DIVISION (/) | Divides the first value by the second. The output for B/A will be 2 |
MODULUS (%) | Divides the first value by the second and returns the remainder. The output for B%A will be 0 |
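The results in the table can be checked with a few lines of Python. This is a sketch that assumes A and B are integers, so division is written as integer division (`//`) to give the whole-number result shown in the table; it is not Pig code.

```python
A = 10
B = 20

# Each entry pairs the expression from the table with its Python equivalent.
results = {
    "A + B": A + B,
    "A - B": A - B,
    "A * B": A * B,
    "B / A": B // A,  # integer division, assuming both operands are ints
    "B % A": B % A,
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```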
Bincond operators
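The bincond operator in Pig Latin is a conditional operator, written (condition ? value_if_true : value_if_false). It returns the first value when the condition is true and the second value otherwise, so the result can depend on the values of columns in a data set. With A = 10 and B = 20, (A > B ? A : B) returns 20.
The condition is usually built from the comparison operators described next.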
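Pig Latin writes a bincond as (condition ? value_if_true : value_if_false). Python's conditional expression has the same meaning with a different word order, so it makes a convenient sketch of the idea, using A = 10 and B = 20 from earlier. The variable name `larger` is chosen only for this example.

```python
A = 10
B = 20

# Pig Latin form: (A > B ? A : B)
# Python form:    A if A > B else B
larger = A if A > B else B

print(larger)
```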
Comparison Operators
Here is the list of comparison operators in Pig Latin.
Let's take two variables A and B as an example. Here, A is 20 and B is 40.
Operator | Description | Example |
Equal (==) | Checks whether the values of two variables are equal. Returns true if they are, otherwise false. | A == B will return false |
Not Equal (!=) | Checks whether the values of two variables are different. Returns true if they are not equal, otherwise false. | A != B will return true |
Greater Than (>) | Checks whether the left value is greater than the right value. Returns true if it is, otherwise false. | A > B will return false |
Greater Than or Equal To (>=) | Checks whether the left value is greater than or equal to the right value. Returns true if it is, otherwise false. | A >= B will return false |
Less Than (<) | Checks whether the left value is less than the right value. Returns true if it is, otherwise false. | A < B will return true |
Less Than or Equal To (<=) | Checks whether the left value is less than or equal to the right value. Returns true if it is, otherwise false. | A <= B will return true |
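The same comparisons can be tried in Python, which uses the same operator symbols. This sketch only confirms the true/false values in the table for A = 20 and B = 40; it is not Pig code.

```python
A = 20
B = 40

# Each entry pairs the expression from the table with its result.
comparisons = {
    "A == B": A == B,
    "A != B": A != B,
    "A > B": A > B,
    "A >= B": A >= B,
    "A < B": A < B,
    "A <= B": A <= B,
}

for expression, result in comparisons.items():
    print(f"{expression} -> {result}")
```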
matches
Pattern matching: it checks whether the string on the left-hand side matches the regular expression pattern on the right-hand side.
str = 'Pig Latin'
For example: str matches '.*Latin.*'
It will return true if the string matches the pattern and false otherwise.
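The pattern has to match the whole string, which is why the example wraps Latin in `.*` on both sides. The Python sketch below imitates that with `re.fullmatch`. The helper name `pig_matches` is hypothetical, and Pig uses Java regular expressions, so more advanced patterns may behave differently from Python's.

```python
import re


def pig_matches(value: str, pattern: str) -> bool:
    """Hypothetical helper: True only if the whole string matches the pattern."""
    return re.fullmatch(pattern, value) is not None


text = "Pig Latin"

print(pig_matches(text, ".*Latin.*"))  # pattern covers the whole string
print(pig_matches(text, "Latin"))      # pattern covers only part of the string
```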
Type Construction Operators
Here are the available type construction operators:
- Tuple constructor operator (): used to construct a tuple. For example: (sam,30)
- Bag constructor operator {}: used to construct a bag. For example: {(sam,30),(Jerry,28)}
- Map constructor operator []: used to construct a map. For example: [name#sam, age#30]
Conclusion
In this article we went through the Pig Latin language and covered its data types and operators in detail, with examples. We hope it helps you understand how to write and use Pig scripts. When working with big data, Pig Latin lets us perform MapReduce operations efficiently from a script.