New liberal_parsing option for parsing bad CSV data

By Ershad Kunnakkadan on November 22, 2016

This blog is part of our Ruby 2.4 series.

Comma-Separated Values (CSV) is a widely used data format, and almost every language has a module to parse it. In Ruby, the CSV class in the standard library does that.

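According to RFC 4180, a double quote may appear only inside a field that is itself enclosed in double quotes, and there it must be escaped by doubling it. A stray, unescaped double quote therefore makes the input invalid, and a strict parser cannot parse it.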

Ruby raises CSV::MalformedCSVError when the CSV data does not conform to RFC 4180.
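To make the rule concrete, here is a minimal sketch using Ruby's csv library. The sample strings are made up for illustration. It shows a quote that is escaped the RFC 4180 way, the same escaping applied by CSV.generate_line, and a stray quote being rejected by the default strict parser.

```ruby
require "csv"

# RFC 4180 style: the field is wrapped in quotes and the inner quote is doubled.
valid_row = CSV.parse_line('one,"two""",three,four')

# CSV.generate_line applies the same escaping when writing data.
generated = CSV.generate_line(["one", 'two"', "three", "four"])

# A stray quote in an unquoted field is rejected by the default (strict) parser.
error_raised =
  begin
    CSV.parse_line('one,two",three,four')
    false
  rescue CSV::MalformedCSVError
    true
  end
```

Here valid_row contains the field two" with a single literal quote, while the unescaped variant never gets past the parser.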

Ruby 2.4 adds a liberal_parsing option for parsing such bad data. When it is set to true, Ruby tries to parse the data even when it does not conform to RFC 4180.

# Before Ruby 2.4

> CSV.parse_line('one,two",three,four')
CSV::MalformedCSVError: Illegal quoting in line 1.

# With Ruby 2.4

> CSV.parse_line('one,two",three,four', liberal_parsing: true)
=> ["one", "two\"", "three", "four"]
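One way to use the option in practice is to keep strict parsing as the default and fall back to liberal parsing only when a line is rejected. The sketch below assumes Ruby 2.4 or later. The helper name parse_line_leniently is hypothetical and not part of the CSV library.

```ruby
require "csv"

# Hypothetical helper (not part of the CSV library): try strict parsing first,
# and retry with liberal_parsing only if the line is rejected as malformed.
def parse_line_leniently(line)
  CSV.parse_line(line)
rescue CSV::MalformedCSVError
  CSV.parse_line(line, liberal_parsing: true)
end

clean = parse_line_leniently("one,two,three,four")
messy = parse_line_leniently('one,two",three,four')
```

With this approach, well-formed lines are still validated strictly, and only the malformed ones take the lenient path. Liberal parsing is a best-effort mode, so the fallback call may still raise for input it cannot make sense of.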
