In this tutorial, we will learn how to run single or multiple operations on a dataframe quickly. Along the way, we will cover lambda functions and how to use them with the map, reduce, and filter functions, which can shorten code and, in many cases, reduce execution time.
Lambda functions are also called anonymous functions, i.e. you can create and call them without a def statement or a name. Most importantly, they are single-expression functions written on one line.
Let us write a simple lambda function to get the square of a number.
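As a minimal sketch of the idea (the names square, numbers, squares, evens, and total are made up for illustration and are not from the original tutorial), a squaring lambda and its use with map, filter, and reduce might look like this:

```python
from functools import reduce

# A lambda that squares its argument; it is bound to a name here only for illustration.
square = lambda x: x ** 2
print(square(5))  # 25

numbers = [1, 2, 3, 4]  # hypothetical sample data

# map applies the lambda to every element.
squares = list(map(lambda x: x ** 2, numbers))

# filter keeps only the elements for which the lambda returns True.
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce folds the list into a single value, here a running sum.
total = reduce(lambda a, b: a + b, numbers)

print(squares, evens, total)
```

In Python 3, reduce lives in the functools module, while map and filter are built in and return lazy iterators, which is why they are wrapped in list here.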
JSON stands for JavaScript Object Notation. It is one of the most popular text formats for data transfer and is widely used across databases and APIs. The best part of JSON is that it is both human-readable and easy for machines to generate. Working with JSON involves two tasks, serialization and deserialization, and this tutorial covers both.
The process of converting native Python objects into JSON-formatted data is called serialization. This procedure encodes the native Python data into JSON format…
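A small sketch of both directions, using the standard-library json module (the record dictionary is hypothetical sample data, not taken from the tutorial):

```python
import json

# Hypothetical native Python data to serialize.
record = {"name": "Asha", "age": 30, "skills": ["python", "sql"], "active": True}

# Serialization: Python object -> JSON-formatted string.
json_text = json.dumps(record)
print(json_text)

# Deserialization: JSON-formatted string -> Python object.
restored = json.loads(json_text)
print(restored == record)
```

Note that Python's True becomes true in the JSON text and is converted back on loading. json.dump and json.load do the same thing against file objects instead of strings.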
In this tutorial, we will discuss reading multiple files into dataframes and appending them to form a single dataframe.
To read files in a specific folder or location, we use the os package, which lets us set the working directory to the target folder.
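One possible way to do this is sketched below. The folder, file names, and columns are invented so the example runs anywhere, and pd.concat is an assumption about how the frames are combined; the original tutorial may append them differently.

```python
import os
import tempfile

import pandas as pd

# Build a throwaway folder with two small CSV files (hypothetical data).
folder = tempfile.mkdtemp()
pd.DataFrame({"id": [1, 2], "city": ["Pune", "Delhi"]}).to_csv(
    os.path.join(folder, "part1.csv"), index=False
)
pd.DataFrame({"id": [3], "city": ["Chennai"]}).to_csv(
    os.path.join(folder, "part2.csv"), index=False
)

original_dir = os.getcwd()
os.chdir(folder)  # set the working directory to the target folder
try:
    # List the CSV files in the folder and read each one into a dataframe.
    csv_files = sorted(f for f in os.listdir() if f.endswith(".csv"))
    frames = [pd.read_csv(f) for f in csv_files]
finally:
    os.chdir(original_dir)  # restore the previous working directory

# Stack all frames into a single dataframe with a fresh index.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Reading the files into a list and concatenating once is generally cheaper than growing a dataframe inside the loop.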
In this tutorial, we will learn some interesting topics: iterating over the rows and columns of a dataframe, retrieving the index of the maximum and minimum in every column, retrieving the n largest and smallest values of selected columns, and finally dealing with null values in a dataframe.
Iterating over rows and columns is one of the most common requirements, and the pandas library provides methods to handle it efficiently.
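The sketch below illustrates these operations on a small made-up dataframe (not the one used in the tutorial). The methods shown, such as iterrows, items, idxmax, nlargest, and fillna, are standard pandas calls, but which ones the tutorial itself uses is an assumption.

```python
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame(
    {"math": [70, 85, 60], "science": [88.0, None, 75.0]},
    index=["a", "b", "c"],
)

# Iterate over rows: each step yields the index label and the row as a Series.
for label, row in df.iterrows():
    print(label, row["math"], row["science"])

# Iterate over columns: each step yields the column name and the column as a Series.
for name, column in df.items():
    print(name, column.max())

# Index label of the maximum and minimum in every column (NaN is skipped).
max_labels = df.idxmax()
min_labels = df.idxmin()

# The two rows with the largest values in the "math" column.
top_two = df.nlargest(2, "math")

# Count nulls per column, then replace them with 0.
null_counts = df.isnull().sum()
filled = df.fillna(0)
```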
The head of the dataframe used in this tutorial is as below.
This tutorial covers the basics of dataframes and is an extension of my previous tutorial, dataframe basics. In this tutorial, we will learn some interesting topics such as inserting a new column, dropping columns, renaming columns, and finally setting the index.
To insert a column into a dataframe, we use the insert method as below.
loc → zero-based position at which to insert the column
column → name of the column
value → value to be inserted; it can be a single value or a list. …
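A minimal sketch of the three parameters in use, on a made-up dataframe (the column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical starting dataframe with two columns.
df = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 82]})

# Insert a column at position 1; a single value is repeated for every row.
df.insert(loc=1, column="country", value="IN")

# Insert a column at the end; a list must have one entry per row.
df.insert(loc=3, column="rank", value=[1, 2])

print(df.columns.tolist())
```

insert modifies the dataframe in place and raises an error if the column name already exists, unless allow_duplicates=True is passed.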
In this tutorial, we will discuss the Dataframe, a powerful pandas object that gives you immense power to handle complex data. What is a Dataframe? A Dataframe is like a table or an Excel sheet with rows and columns. It is a 2D, size-mutable structure capable of holding heterogeneous data.
The Dataframe is the primary pandas object for any data analyst or scientist who works with data. You can read a CSV, Excel, or other formatted file into a dataframe and start accessing or modifying the data as you need. …
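To make "2D, size-mutable, heterogeneous" concrete, here is a small sketch with invented data; it builds the dataframe from a dictionary rather than a file so that it runs without any input files:

```python
import pandas as pd

# Hypothetical table: each column can hold a different data type.
df = pd.DataFrame(
    {
        "product": ["pen", "book"],
        "price": [1.5, 12.0],
        "in_stock": [True, False],
    }
)
print(df.shape)  # rows and columns of the 2D structure

# Size-mutable: a new column can be added after creation.
df["quantity"] = [100, 20]

print(df.dtypes)  # one data type per column
```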
Continuing my previous blogs on the read_csv method in pandas, this is the last blog in the series, covering some interesting topics that help you read CSV files in a better way.
The topics covered in this tutorial are:
1) Handling date columns
2) Bad input lines
3) Dealing with Missing values
4) Reading compressed files
read_csv has a special parameter called parse_dates that parses a column into a date datatype. Generally, when a date-based column is read, it is treated as a string object, but with the help…
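The difference can be seen in a short sketch. The CSV content and the column name order_date are invented for illustration, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV content with one date column.
csv_text = "order_id,order_date\n1,2021-01-15\n2,2021-02-20\n"

# Without parse_dates, the date column is read as plain strings.
as_text = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, the named column is converted to a datetime type.
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["order_date"])

print(as_text["order_date"].dtype)
print(as_dates["order_date"].dtype)

# Datetime columns expose date parts through the .dt accessor.
print(as_dates["order_date"].dt.month.tolist())
```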
In this tutorial, we will learn some tricks to handle data better while reading files in Python using the read_csv method. This is a continuation of my prior tutorials on read_csv; you can view those tutorials for some foundation on reading files and other tips and tricks.
In this tutorial, we will learn some advanced features of pandas.
The following topics are covered in this blog.
1. Data type specification at the time of importing data
4. Chunk Size
The read_csv method of pandas provides a parameter to specify the data type of…
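As a rough sketch of two of the listed topics, the example below uses the dtype and chunksize parameters of read_csv. The CSV content and column names are made up, and io.StringIO stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV: zip codes with leading zeros that must stay as text.
csv_text = "zip_code,amount\n01234,10.5\n98765,20.0\n00501,7.25\n"

# dtype fixes the type of each column at import time.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"zip_code": str, "amount": "float64"},
)
print(df["zip_code"].tolist())

# chunksize returns an iterator of dataframes instead of one large dataframe.
chunks = pd.read_csv(io.StringIO(csv_text), dtype={"zip_code": str}, chunksize=2)
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)
```

Without the dtype mapping, zip_code would be inferred as an integer column and the leading zeros would be lost.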
In this tutorial, we will go through various parameters of the read_csv method that expose the advanced capabilities of the pandas library and let you handle most routine tasks when working with files in Python. Please refer to my previous blog, Learning for Beginners in Pandas, for how to start reading files in Python. We discuss the items below in this learning section.
First of all…
Reading files is a fundamental task for any data analyst. Pandas is one library that helps us do this efficiently. It has rich features that make an analyst's life easier. In this blog, I would like to show some tips for reading files using the pandas library. I will also discuss some common errors and how to avoid them. These are the most frequent challenges encountered by early pandas programmers.
The first task is to install and import the pandas library, which is very simple.
If you haven’t installed the pandas library earlier…
Data Science and machine learning enthusiast