Learn Python – Reading CSV Files in Python: Basic and Advanced

CSV File

CSV stands for “comma-separated values”. It is a simple file format that uses a fixed structure to arrange tabular data. It stores tabular information, such as a spreadsheet or a database table, as plain text and is a common format for data interchange. A CSV file can be opened in a spreadsheet program such as Excel, where each line of the file appears as a row and the delimiter separates the columns.

Python CSV Module Functions

The csv module is used to read and write CSV files and to get data from specific columns. It provides the following functions and constants:

csv.field_size_limit – It returns the current maximum field size allowed by the parser.

csv.get_dialect – It returns the dialect associated with a name.

csv.list_dialects – It returns the names of all registered dialects.

csv.reader – It reads the data from a CSV file.

csv.register_dialect – It associates a dialect with a name. The name must be a string.

csv.writer – It writes the data to a CSV file.

csv.unregister_dialect – It deletes the dialect associated with the name from the dialect registry. If the name is not a registered dialect name, an error is raised.

csv.QUOTE_ALL – It instructs the writer objects to quote all fields.

csv.QUOTE_MINIMAL – It instructs the writer objects to quote only those fields which contain special characters such as quotechar, delimiter, etc.

csv.QUOTE_NONNUMERIC – It instructs the writer objects to quote all the non-numeric fields.

csv.QUOTE_NONE – It instructs the writer object never to quote the fields.
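The dialect functions and quoting constants listed above can be tried out without touching the file system. The following is a minimal sketch, assuming an in-memory io.StringIO buffer as the output target. The dialect name "pipes", the helper write_row() and the sample row are made up for illustration and are not part of the csv module.

```python
import csv
import io

# Register a hypothetical dialect named "pipes" that uses "|" as the delimiter.
csv.register_dialect("pipes", delimiter="|")
dialect_registered = "pipes" in csv.list_dialects()


def write_row(row, **fmt):
    """Write one row to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **fmt).writerow(row)
    return buffer.getvalue()


# Illustrative row: the second field contains the default delimiter (a comma).
row = ["Parker", "Accounting, EU", 42]

minimal = write_row(row, quoting=csv.QUOTE_MINIMAL)        # quote only where needed
quote_all = write_row(row, quoting=csv.QUOTE_ALL)          # quote every field
nonnumeric = write_row(row, quoting=csv.QUOTE_NONNUMERIC)  # quote non-numeric fields
piped = write_row(row, dialect="pipes")                    # use the registered dialect

print(minimal, quote_all, nonnumeric, piped, sep="")

# Remove the dialect again; unregistering an unknown name raises csv.Error.
csv.unregister_dialect("pipes")
```

With QUOTE_MINIMAL only the field containing the comma is quoted, while the "pipes" dialect needs no quotes at all because the comma is no longer the delimiter.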

Reading CSV files

Python provides several ways to read a CSV file. A few of them are described below.

Using csv.reader() function

In Python, the csv.reader() function is used to read a CSV file. It takes each row of the file and returns it as a list of column values.

We have taken a file named python.csv that uses the default delimiter, a comma (,), with the following data:

name,department,birthday month    
Parker,Accounting,November    
Smith,IT,October    

Example

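import csv
with open('python.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names.
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            # Every other row is a list: [name, department, birthday month].
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')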

Output:

Column names are name, department, birthday month
  Parker works in the Accounting department, and was born in November.
  Smith works in the IT department, and was born in October.
Processed 3 lines.

In the above code, we opened ‘python.csv’ using the open() function. We used the csv.reader() function to read the file, which returns an iterable reader object. The reader object yields the data one row at a time, and we iterated over it with a for loop to print the content of each row.
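Because the reader object is an iterator, the header row can also be taken off with next() before the loop, which removes the need for a line counter. The following is a minimal sketch, assuming the same sample data held in an in-memory string; the io.StringIO object stands in for an opened file and is not part of the original example.

```python
import csv
import io

# Assumption: the sample data is kept in memory instead of in a file on disk.
sample = io.StringIO(
    "name,department,birthday month\n"
    "Parker,Accounting,November\n"
    "Smith,IT,October\n"
)

reader = csv.reader(sample)
header = next(reader)  # first row: the column names
rows = list(reader)    # remaining rows, each one a list of strings

print(f'Column names are {", ".join(header)}')
for name, department, month in rows:
    print(f"\t{name} works in the {department} department, and was born in {month}.")
print(f"Processed {len(rows) + 1} lines.")
```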

Read a CSV into a Dictionary

We can also use DictReader() to read the CSV file directly into dictionaries rather than dealing with a list of individual string elements. Each row becomes a dictionary whose keys are the column names taken from the first line of the file.

Our input file, here saved as python.txt, contains the same data:

name,department,birthday month    
Parker,Accounting,November    
Smith,IT,October    

Example

import csv
with open('python.txt', mode='r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # DictReader consumes the header itself, so the first row seen here
            # is already data; its keys are the column names.
            print(f'The Column names are as follows {", ".join(row)}')
            line_count += 1  # count the header line
        print(f'\t{row["name"]} works in the {row["department"]} department, and was born in {row["birthday month"]}.')
        line_count += 1
    print(f'Processed {line_count} lines.')

Output:

The Column names are as follows name, department, birthday month
   Parker works in the Accounting department, and was born in November.
   Smith works in the IT department, and was born in October.
Processed 3 lines.
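Since every row from DictReader is a dictionary, the rows can be collected and looked up by column name. The following is a minimal sketch, assuming the same sample data held in an in-memory string rather than in python.txt; the variable names records and by_name are illustrative only.

```python
import csv
import io

# Assumption: the sample data is kept in memory instead of in python.txt.
sample = io.StringIO(
    "name,department,birthday month\n"
    "Parker,Accounting,November\n"
    "Smith,IT,October\n"
)

reader = csv.DictReader(sample)
columns = reader.fieldnames  # column names, read from the header line
records = list(reader)       # one dictionary per data row

# Build a simple lookup table from employee name to department.
by_name = {record["name"]: record["department"] for record in records}

print(columns)
print(by_name["Smith"])
```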

Reading csv files with Pandas

Pandas is an open-source library built on top of the NumPy library. It provides fast data analysis, data cleaning, and data preparation for the user.

Reading a CSV file into a pandas DataFrame is quick and straightforward. We don’t need to write many lines of code to open, parse, and read the CSV file in pandas, and it stores the data in a DataFrame.

Here, we take a slightly more complicated file to read, called hrdata.csv, which contains data about a company’s employees.

Name,Hire Date,Salary,Leaves Remaining    
John Idle,08/15/14,50000.00,10    
Smith Gilliam,04/07/15,65000.00,8    
Parker Chapman,02/21/14,45000.00,10    
Jones Palin,10/14/13,70000.00,3    
Terry Gilliam,07/22/14,48000.00,7    
Michael Palin,06/28/13,66000.00,8    

Example

import pandas    
df = pandas.read_csv('hrdata.csv')    
print(df)    

In the above code, three lines are enough to read the file, and only one of them does the actual work, i.e., pandas.read_csv().

Output:

         Name                Hire Date     Salary      Leaves Remaining
0     John Idle              08/15/14      50000.0       10
1     Smith Gilliam          04/07/15      65000.0       8
2     Parker Chapman         02/21/14      45000.0       10
3     Jones Palin            10/14/13      70000.0       3
4     Terry Gilliam          07/22/14      48000.0       7
5     Michael Palin          06/28/13      66000.0       8
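Once the data is in a DataFrame, the columns can be converted and summarised directly. The following is a minimal sketch, assuming the hrdata.csv content is held in an in-memory string and that the Hire Date values are in month/day/two-digit-year form; the source does not state the date format, so that reading is an assumption.

```python
import io

import pandas as pd

# Assumption: the file content is kept in memory instead of in hrdata.csv.
hrdata = io.StringIO(
    "Name,Hire Date,Salary,Leaves Remaining\n"
    "John Idle,08/15/14,50000.00,10\n"
    "Smith Gilliam,04/07/15,65000.00,8\n"
    "Parker Chapman,02/21/14,45000.00,10\n"
    "Jones Palin,10/14/13,70000.00,3\n"
    "Terry Gilliam,07/22/14,48000.00,7\n"
    "Michael Palin,06/28/13,66000.00,8\n"
)

df = pd.read_csv(hrdata)

# read_csv leaves the dates as text; convert them with an explicit format
# (assumed here to be month/day/two-digit year).
df["Hire Date"] = pd.to_datetime(df["Hire Date"], format="%m/%d/%y")

average_salary = df["Salary"].mean()
earliest_hire = df.loc[df["Hire Date"].idxmin(), "Name"]

print(df.dtypes)
print(average_salary, earliest_hire)
```

Passing an explicit format to to_datetime() avoids any guessing about whether the first number is the month or the day.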