5 Best Open-Source Python Libraries for Excel in 2023
Python has emerged as one of the most popular programming languages in recent years. It has a simple and easy-to-read syntax, which makes it an ideal language for beginners. Additionally, Python offers many libraries that make it suitable for data analysis and machine learning tasks. In this blog post, we will discuss some of the best open-source Python libraries for Excel.
XlsxWriter
The XlsxWriter module for Python writes files in the Excel 2007+ XLSX format. Because the output targets the format Excel itself uses, formatting and embedded content such as charts and images open in Excel as intended.
XlsxWriter supports a wide range of the file format's features, including formatting, merged cells, defined names, charts, autofilters, data validation and drop-down lists, conditional formatting, worksheet images, textboxes, and macros.
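As a rough illustration of two of the features listed above, the sketch below adds a drop-down list through data validation and highlights values with conditional formatting. The file name, the sample records, and the threshold of 50 are made up for this example; the worksheet methods data_validation() and conditional_format() are XlsxWriter's own.

```python
import os
import tempfile

import xlsxwriter

# Hypothetical output location for this sketch.
output_path = os.path.join(tempfile.mkdtemp(), "features_demo.xlsx")

workbook = xlsxwriter.Workbook(output_path)
worksheet = workbook.add_worksheet()

# Header row plus a few made-up records.
worksheet.write_row("A1", ["Task", "Status", "Score"])
rows = [("Import", "Done", 72), ("Clean", "Open", 41), ("Report", "Open", 58)]
for row_index, record in enumerate(rows, start=1):
    worksheet.write_row(row_index, 0, record)

# Data validation: restrict the Status column to a drop-down list.
worksheet.data_validation(
    "B2:B4", {"validate": "list", "source": ["Open", "Done"]}
)

# Conditional formatting: bold green text for scores of 50 or more.
highlight = workbook.add_format({"bold": True, "font_color": "#006100"})
worksheet.conditional_format(
    "C2:C4",
    {"type": "cell", "criteria": ">=", "value": 50, "format": highlight},
)

# The file is only written to disk when the workbook is closed.
workbook.close()
```

Opening the resulting file in Excel should show a drop-down arrow on the Status cells and highlighted scores in column C.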
import xlsxwriter

# Create a new Excel file and add a worksheet.
workbook = xlsxwriter.Workbook('demo.xlsx')
worksheet = workbook.add_worksheet()

# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)

# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})

# Write some simple text.
worksheet.write('A1', 'Hello')

# Text with formatting.
worksheet.write('A2', 'World', bold)

# Write some numbers, with row/column notation.
worksheet.write(2, 0, 123)
worksheet.write(3, 0, 123.456)

# Insert an image.
worksheet.insert_image('B5', 'logo.png')

workbook.close()
Additionally, XlsxWriter is optimized for writing large files, and its memory-saving mode (the constant_memory workbook option) reduces memory use further.
With charts (including sparklines), data validation, autofilters, and drop-down lists, the module covers what you need to create professional-quality Excel files with Python.
It supports Python 3.4 or later as well as PyPy3, and it uses only the standard library.
xlwings
xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It comes pre-installed with Anaconda and WinPython, and works on Windows and macOS. xlwings is open source and free, and it lets you automate Excel via Python scripts or Jupyter notebooks. You can also call Python from Excel via macros, and write user-defined functions (UDFs).
import numpy as np
import pandas as pd
import xlwings as xw


@xw.sub
def get_workbook_name():
    """Writes the name of the Workbook into Range("D3") of Sheet 1"""
    wb = xw.Book.caller()
    wb.sheets["Sheet1"].range("D3").value = wb.name


@xw.func
def double_sum(x, y):
    """Returns twice the sum of the two arguments"""
    return 2 * (x + y)


@xw.func
@xw.arg("data", ndim=2)
def add_one(data):
    """Adds 1 to every cell in Range"""
    return [[cell + 1 for cell in row] for row in data]


@xw.func
@xw.arg("x", np.array, ndim=2)
@xw.arg("y", np.array, ndim=2)
def matrix_mult(x, y):
    """Alternative implementation of Excel's MMULT, requires NumPy"""
    return x.dot(y)


@xw.func
@xw.arg("x", pd.DataFrame, index=False, header=False)
@xw.ret(index=False, header=False)
def CORREL2(x):
    """Like CORREL, but as array formula for more than 2 data sets"""
    return x.corr()


if __name__ == "__main__":
    # To run this with the debug server,
    # set UDF_DEBUG_SERVER = True in the xlwings VBA module
    xw.serve()
xlwings enables you to use Python in Excel in basically two ways: either you script/automate Excel from Python or you write User Defined Functions (UDFs) in Python that work in Excel.
NumPy arrays and pandas Series/DataFrames are fully supported. xlwings-powered workbooks are easy to distribute and work on Windows and macOS.
This makes xlwings a good fit for data analysis and visualization work that needs to stay inside Excel.
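To illustrate the first of the two approaches, scripting Excel from Python, here is a minimal sketch. The function names and the sample data are hypothetical. Because xlwings drives a real Excel installation, the import is deferred into the function that needs it, so the data-preparation helper can be used on its own. The sketch also assumes a recent xlwings release in which xw.App can be used as a context manager.

```python
def build_sales_table():
    """Return a small, made-up table as a list of rows (header first)."""
    header = ["Region", "Units", "Unit price", "Revenue"]
    records = [("North", 12, 2.5), ("South", 7, 4.0)]
    body = [
        [region, units, price, units * price]
        for region, units, price in records
    ]
    return [header] + body


def write_table_to_excel(rows, path):
    """Write `rows` to a new workbook via a hidden Excel instance and save it.

    Needs a local Excel installation, so xlwings is imported here rather
    than at module level.
    """
    import xlwings as xw

    with xw.App(visible=False) as app:
        book = app.books.add()
        sheet = book.sheets[0]
        # Assigning a nested list to the top-left cell fills the whole block.
        sheet.range("A1").value = rows
        # Read the block back by expanding from A1 to the edge of the data.
        round_trip = sheet.range("A1").expand().value
        book.save(path)
    return round_trip
```

On a machine with Excel installed, calling write_table_to_excel(build_sales_table(), "sales.xlsx") should create the workbook and return the values as Excel stored them.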
xlrd
The xlrd library extracts data and formatting information from Excel files. However, it has a number of limitations that users should be aware of. First, since version 2.0 the library reads only the legacy .xls format and cannot open any other file type, including .xlsx. Additionally, charts, macros, and pictures are not supported; xlrd ignores them when it reads a workbook.
import xlrd

book = xlrd.open_workbook("myfile.xls")
print("The number of worksheets is {0}".format(book.nsheets))
print("Worksheet name(s): {0}".format(book.sheet_names()))

sh = book.sheet_by_index(0)
print("{0} {1} {2}".format(sh.name, sh.nrows, sh.ncols))
print("Cell D30 is {0}".format(sh.cell_value(rowx=29, colx=3)))

for rx in range(sh.nrows):
    print(sh.row(rx))
Password-protected files cannot be read by the library. Despite these limitations, the xlrd library remains a valuable tool for anyone looking to extract data from Excel files.
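Given the .xls-only limitation, it can help to check the file extension before handing a path to xlrd. The following is a minimal sketch of that idea. The helper names require_xls and read_first_sheet are hypothetical and not part of xlrd; the xlrd import is deferred so the check itself works without the library installed.

```python
from pathlib import Path


def require_xls(path):
    """Return `path` unchanged if it has a .xls extension, else raise ValueError."""
    if Path(path).suffix.lower() != ".xls":
        raise ValueError(f"xlrd reads only .xls workbooks; got {path!r}")
    return path


def read_first_sheet(path):
    """Read every row of the first worksheet into a list of lists."""
    import xlrd  # deferred so require_xls() can be used without xlrd installed

    book = xlrd.open_workbook(require_xls(path))
    sheet = book.sheet_by_index(0)
    return [sheet.row_values(rx) for rx in range(sheet.nrows)]
```

The extension check is only a heuristic: a renamed or password-protected file will still fail inside xlrd.open_workbook().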
pyexcel
Python is a versatile language that has many uses, one of which is data processing. Excel is a popular format for storing data, and Pyexcel is a Python library that makes it easy to read, manipulate, and write Excel files.
import pyexcel as p

# Reading .xls files requires the pyexcel-xls plugin to be installed.
records = p.get_records(file_name="your_file.xls")
for row in records:
    print(f"{row['Representative Composers']} are from {row['Name']} period ({row['Period']})")
With just a few lines of code, you can convert data from Excel to an array or dictionary, and vice versa, which keeps routine data processing short and readable. Fonts, colors, and charts are outside pyexcel's focus, but the library nonetheless provides a powerful and user-friendly way to work with Excel data.
PyExcelerate
PyExcelerate is a library that writes Excel-compatible XLSX spreadsheets. It focuses on speed. PyExcelerate allows you to write data to ranges directly instead of writing cell-by-cell, which makes the writing process faster.
from pyexcelerate import Workbook, Style

wb = Workbook()
ws = wb.new_sheet("Sheet1")

# Style the second column; according to the PyExcelerate README,
# a size of 0 hides the column.
ws.set_col_style(2, Style(size=0))

wb.save("file.xlsx")
The library has full support for Unicode characters, date formatting, and number formatting.