Python CSV Combiner - Combine CSV files in a dataframe and resave as a combined CSV file

The Python CSV Combiner script merges multiple CSV files into one. Each file is read into a pandas dataframe and the dataframes are then concatenated, so the header row appears only once in the output and there is no need to strip repeated headers by hand. It is a straightforward option when a dataset is split across many CSV files.

The script scans the current working directory, reads each CSV file, and merges them into a single file ready for analysis. If a file cannot be read, the script prints the file path and the error, then carries on with the remaining files. Just remember that the script works best if the header rows of all the CSV files match: pandas aligns columns by name, so a file with different headers adds extra columns and leaves empty cells in the combined output. I have been using it to speed up log file analysis, especially when the access logs are split into multiple CSV files.
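
Since matching headers matter, an optional pre-check can flag the odd files before you combine anything. The sketch below is not part of the original script: the function name `find_header_mismatches` is hypothetical, it uses only the standard `csv` module, and it assumes the files are UTF-8 text (with or without a byte order mark) with the header on the first row.

```python
import csv
import os


def find_header_mismatches(directory):
    """Return {filename: header} for CSV files whose header differs from the first file's."""
    csv_files = sorted(
        name for name in os.listdir(directory) if name.lower().endswith(".csv")
    )
    reference = None  # header row of the first CSV file, used as the baseline
    mismatches = {}
    for name in csv_files:
        path = os.path.join(directory, name)
        # Assumption: UTF-8 input; "utf-8-sig" also strips a leading BOM if present
        with open(path, newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])  # [] when the file is empty
        if reference is None:
            reference = header
        elif header != reference:
            mismatches[name] = header
    return mismatches
```

Calling `find_header_mismatches(os.getcwd())` returns an empty dictionary when every header matches the first file's. Note that a previously generated combined file sitting in the same directory would be checked as well.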

Code:
import os
import pandas as pd

# Set the directory containing the CSV files
directory = os.getcwd()
print(f"Current working directory: {directory}")

# List to store dataframes
dfs = []

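# Loop through the files in the directory (sorted so the row order is predictable)
for filename in sorted(os.listdir(directory)):
    # Only process .csv files, and skip the output file left by a previous run
    if filename.lower().endswith(".csv") and filename != "combined_log_records.csv":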
        file_path = os.path.join(directory, filename)
        print(f"Processing file: {file_path}")
        try:
            # Read each CSV file into a dataframe
            df = pd.read_csv(file_path)
            dfs.append(df)
        except Exception as e:
            print(f"Failed to read {file_path}: {e}")

# Check if there are any CSV files found
if dfs:
    # Combine all dataframes into one; columns are aligned by name, so the header is written once
    combined_df = pd.concat(dfs, ignore_index=True)

    # Save the combined dataframe to a new CSV file
    output_file = os.path.join(directory, "combined_log_records.csv")
    combined_df.to_csv(output_file, index=False)

    print(f"Combined CSV file created: {output_file}")
else:
    print("No CSV files found or successfully processed in the directory.")
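
To see why matching headers matter for the `pd.concat` step, here is a minimal sketch with two made-up dataframes standing in for CSV files. The column names `ip`, `status` and `code` are assumptions chosen for illustration, not taken from any real log.

```python
import pandas as pd

# Two tiny stand-in frames; the second uses "code" where the first uses "status"
first = pd.DataFrame({"ip": ["10.0.0.1"], "status": [200]})
second = pd.DataFrame({"ip": ["10.0.0.2"], "code": [404]})

combined = pd.concat([first, second], ignore_index=True)

# The result carries the union of the columns, in order of first appearance
print(list(combined.columns))  # ['ip', 'status', 'code']

# Cells that a file did not supply are left as NaN
print(combined.isna().sum())
```

Each row ends up with a missing value in the column its source file lacked, which is what a mismatched header produces in the combined CSV.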

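Steps to Run:

  1. Save the Script: Save the script in the same directory as your CSV files.
  2. Run the Script: Open a terminal in that directory and run the script with the python command. The script scans the current working directory, so it needs to be started from the folder that holds the CSV files.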

What the Script Does:

  • Checks for .csv Files: It only processes files with the .csv extension.
  • Attempts to Read Files: If it encounters an issue reading any file, it will print an error message with the file name.
  • Combines Files: If at least one file is read successfully, it combines the CSVs into one file, combined_log_records.csv, and saves it in the same directory.
 