TAGS :Viewed: 18 - Published at: a few seconds ago

[ Smallest learning curve language to work with CSV files ]

VBA is not cutting it for me anymore. I have lots of huge Excel files to which I need to make lots of calculations and break them down into other Excel/CSV files.

I need a language that I can pick up within the next couple of days to do what I need, because it is kind of an emergency. I have been suggested python, but I would like to check with you if there is anything else that does CSV file handling quickly and easily.

Answer 1

Python is an excellent choice. The csv module makes reading and writing CSV files easy (even Microsoft's, uh, "idiosyncratic" version) and Python syntax is a breeze to pick up.

I'd actually recommend against Perl, if you're coming to it fresh. While Perl is certainly powerful and fast, it's often cryptic to the point of incomprehensible to the uninitiated.

Answer 2

What kind of calculation you have to do? Maybe R would be an alternative?

EDIT: just to give a few basic examples

# Basic usage
data <- read.csv("myfile.csv")

# Pipe-separated values
data <- read.csv("myfile.csv", sep="|")

# File with header (columns will be named as header) 
data <- read.csv("myfile.csv", header=TRUE)

# Skip the first 5 lines of the file
data <- read.csv("myfile.csv", skip=5)

# Read only 100 lines
data <- read.csv("myfile.csv", nrows=100)

Answer 3

Python definitely has a small learning curve, and works with csv files well

Answer 4

There are many tools for the job, but yes, Python is perhaps the best these days. There is a special module for dealing with csv files. Check the official docs.

Answer 5

You know VBA? Why not Visual Basic 2008 / 2010, or perhaps C#? I'm sure languages like python and ruby would be relatively easier for the job, but you're already accustomed to the ".NET way" of doing things, so it makes sense to keep working with them instead of learning a whole new thing just for this job.

Using C#:

var csvlines = File.ReadAllLines("file.csv");

var query = from csvline in csvlines
            let data = csvline.Split(',')
            select new
                ID = data[0],
                FirstName = data[1],
                LastName = data[2],
                Email = data[3]
  1. .NET: Linq to CSV library.
  2. .NET: Read CSV with LINQ
  3. Python: Read CSV file

Answer 6

You say you have "excel files to which i need to make lots of calculations and break them down into other excel/csv files" but all the answers so far talk about csv only ...

Python has a csv read/write module as others have mentioned. There are also 3rd party modules xlrd (reads) and xlwt (writes) modules for XLS files. See the tutorial on this site.

Answer 7

Perl is surprisingly efficient for a scripting language for text. cpan.org has a tremendous number of modules for dealing with CSV data. I've also both written and wrote data in XLS format with another Perl module. If you were able to use VBA, you can certainly learn Perl (the basics of Perl are easy, though it's just as easy for you or others to write terse yet cryptic code).

Answer 8

I'd give awk a try. If you're running windows, you can get awk via the cygwin utilities.

Answer 9

This may not be anybody's popular language du-jour, but since CSV files are line-oriented and split into fields, dealing with them is just about the perfect application for awk. It was built for processing line oriented text data that can be split into fields.

Most of the other languages folks are going to reccomend will be much more general-purpose, so there's going to be a lot more in them that isn't nessecarily applicable to processing line-oriented text data.

Answer 10

That depends on what you want to do with the files.

Python's learning curve is less steep than R's. However, R has a bunch of built-in functions that make it very well suited for manipulating .csv files easily, particularly for statistical purposes.

Edit: I'd recommend R over Python for this purpose alone, if only because the basic operations (reading files, dropping rows, dropping columns, etc.) are slightly faster to write in R than in Python.

Answer 11

PowerShell has CSV import built in.

The syntax is ugly as death, but it's designed to be useful for administrators more than for programmers -- so who knows, you might like it.

It's supposed to be a quick get-up-and-go language, for better and worse.