CSV file breakdown

In a recent project I had a massive CSV file, and by massive I mean 2.8 million rows long. I needed the file broken down into multiple smaller files for processing, so I turned to the masterful Google for help. Here is how I found to break the file down on Mac and Windows machines.

Mac File Breakdown

On a Mac the breakdown is super easy. Put the large CSV file in a folder and navigate to that folder in Terminal. Then run the following command, replacing {LinesCount} with the number of lines you want in each file and {FileName} with the name of your file.

split -l {LinesCount} {FileName}.csv
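To make the result of that command concrete, here is a small sketch that builds a throwaway CSV and splits it into chunks of 4 lines. The file name sample.csv, the part_ prefix, and the data are all made up for this example. When no prefix is given, split names its output xaa, xab, and so on, with no .csv extension. Note that the header row ends up only in the first chunk.

```shell
# Build a 10-line sample file: 1 header row + 9 data rows (hypothetical data).
printf 'id,name\n' > sample.csv
for i in 1 2 3 4 5 6 7 8 9; do
  printf '%s,row%s\n' "$i" "$i" >> sample.csv
done

# Split into chunks of 4 lines; "part_" is an output prefix chosen for this example.
split -l 4 sample.csv part_

# Count the lines in each chunk: part_aa and part_ab hold 4 lines, part_ac the remaining 2.
wc -l part_*
```

If the pieces need to open as CSV files by double-click, they can be renamed afterwards to add the .csv extension.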

Windows File Breakdown

On the Windows side it is a little harder, but I found a good source that does the job. The idea is the same as above: put the CSV and a new bat file in the same folder, replace {LinesCount} and {FileName} in the code below, then run the bat file.

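@echo off
setLocal EnableDelayedExpansion

rem Maximum number of lines per output file, and the file to split.
set lineLimit={LinesCount}
set "fileSrc={FileName}.csv"

set lineCounter=1
set filenameCounter=1

rem Pull the base name and the extension out of the source file name.
set name=
set extension=
for %%a in ("%fileSrc%") do (
    set "name=%%~na"
    set "extension=%%~xa"
)

rem Note: for /f skips blank lines, and delayed expansion drops "!" characters from the data.
for /f "usebackq tokens=*" %%a in ("%fileSrc%") do (
    rem Once the current part is full, move on to the next part.
    if !lineCounter! gtr !lineLimit! (
        echo Created !splitFile!.
        set /a filenameCounter=!filenameCounter! + 1
        set lineCounter=1
    )
    set "splitFile=!name!-part!filenameCounter!!extension!"
    echo %%a>> "!splitFile!"
    set /a lineCounter=!lineCounter! + 1
)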
