Python Read Excel File

Excel is a spreadsheet application developed by Microsoft. It is a widely available tool for organizing, analyzing, and storing data in tables. Professionals ranging from analysts to CEOs use it for both quick statistics and heavy data crunching.


Excel Documents

An Excel spreadsheet document is called a workbook and is saved in a file with the .xlsx extension. Each workbook can contain multiple sheets, which are also called worksheets. A sheet is a grid of cells: the box at a particular column and row is called a cell, and each cell can hold a number or a text value. The first row of a sheet is typically reserved for the header, while the first column usually identifies the sampling unit.

The active sheet is the sheet that the user is currently viewing, or the one that was last viewed before Excel was closed.
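The workbook, sheet, and cell terms above can be illustrated with a short sketch. It assumes the openpyxl package (installed with pip install openpyxl, as covered later in this article); the sheet names and cell values are made up for illustration.

```python
from openpyxl import Workbook

# Build a small in-memory workbook; names and values are illustrative only.
wb = Workbook()                 # a new workbook starts with a single sheet
scores = wb.active              # that first sheet is the active sheet
scores.title = "Scores"
wb.create_sheet("Notes")        # a workbook can hold several worksheets

# The first row holds the header; each cell is addressed by column and row.
scores["A1"] = "NAME"
scores["B1"] = "SCORE"
scores["A2"] = "Asha"
scores["B2"] = 91

sheet_names = wb.sheetnames
header = [cell.value for cell in scores[1]]      # values in row 1
b2_value = scores.cell(row=2, column=2).value    # same cell as "B2"

print(sheet_names)
print(header)
print(b2_value)
```

Note that openpyxl numbers rows and columns from 1, whereas xlrd (used in the next section) numbers them from 0.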

Reading from an Excel file

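First, install the xlrd module with pip. Note that xlrd version 2.0 and later reads only the legacy .xls format; for .xlsx files, use pandas or openpyxl, which are described in the later sections.

    pip install xlrd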

Creating a Workbook

A workbook object contains all the data in the Excel file. In general, you can create a new workbook from scratch or load one from an Excel file that already exists. The xlrd module only reads files, so the example here opens an existing file.

Input File

[Image: the sample Excel sheet used as input; its first cell (row 0, column 0) holds the header NAME.]

Code

 
    # Import the xlrd module
    import xlrd

    # Define the location of the file
    loc = "path of file"

    # Open the workbook and select its first sheet
    wb = xlrd.open_workbook(loc)
    sheet = wb.sheet_by_index(0)

    # Read the value of the cell at row 0 and column 0
    sheet.cell_value(0, 0)

Output:

'NAME'

Explanation: In the above example, we first import the xlrd module and define the location of the file. We then open the workbook from the existing Excel file, select its first sheet, and read the value of the cell at row 0 and column 0.

Reading with pandas

pandas is an open-source library built on top of NumPy. It provides fast data analysis, cleaning, and preparation, and it can read both .xls and .xlsx files from a local path or from a URL.

It is a Python package that provides a useful data structure called a DataFrame.

Code

 
    import pandas as pd

    # Read the first sheet of the Excel file (.xlsx files need openpyxl installed)
    data = pd.read_excel("path of file")

    # Output the number of rows
    print("Total rows: {0}".format(len(data)))

    # See which headers are available
    print(list(data))
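As a self-contained sketch of reading an Excel file with pandas, the following example first writes a small sample workbook to a temporary directory and then reads it back with pd.read_excel. The sample data, the file name sample.xlsx, and the sheet name Scores are assumptions made for illustration, and the example assumes openpyxl is installed, since pandas uses it for .xlsx files.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for a real spreadsheet.
sample = pd.DataFrame({"NAME": ["Asha", "Ben"], "SCORE": [91, 78]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xlsx")

    # Write the sample data to an .xlsx file without the index column.
    sample.to_excel(path, sheet_name="Scores", index=False)

    # Read one named sheet back into a DataFrame.
    data = pd.read_excel(path, sheet_name="Scores")

total_rows = len(data)
headers = list(data.columns)

print("Total rows: {0}".format(total_rows))
print(headers)
```

Passing sheet_name selects a single sheet by name; if it is omitted, pd.read_excel reads the first sheet.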

Reading with openpyxl

First, install openpyxl using pip from the command line.

    pip install openpyxl

After that, import the module in your script.

With openpyxl you can read data from an existing spreadsheet. It also lets you perform calculations and add content that was not part of the original dataset.

Code

 
    import openpyxl

    # Create a new, empty workbook in memory
    my_wb = openpyxl.Workbook()

    # Get the active sheet and print its title
    my_sheet = my_wb.active
    my_sheet_title = my_sheet.title
    print("My sheet title: " + my_sheet_title)

Output:

My sheet title: Sheet
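Reading an existing spreadsheet, performing a calculation, and adding new content might look like the following minimal sketch. Because no input file is available here, the sketch first creates a stand-in file in a temporary directory; the file name scores.xlsx and its contents are assumptions made for illustration.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.xlsx")

    # Create a stand-in "existing" file with a header row and two data rows.
    wb = Workbook()
    ws = wb.active
    ws.append(["NAME", "SCORE"])
    ws.append(["Asha", 91])
    ws.append(["Ben", 78])
    wb.save(path)

    # Load the existing file and read the data rows as plain tuples.
    loaded = load_workbook(path)
    sheet = loaded.active
    rows = list(sheet.iter_rows(min_row=2, values_only=True))

    # Perform a calculation and add a row that was not in the original data.
    total = sum(score for _, score in rows)
    sheet.append(["TOTAL", total])
    loaded.save(path)

    # Reopen the file to confirm that the new row was stored.
    last_row = [cell.value for cell in load_workbook(path).active[4]]

print(rows)
print(last_row)
```

Here the total is computed in Python and stored as a plain number. Writing an Excel formula string into the cell is another option, but openpyxl does not evaluate formulas itself.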