Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
By Willeus Acuña, February 11, 2004
Sometimes, you find the need to parse CSV files dynamically into a DataTable. CSV files can be simple or complex.
The simple CSV file would just have the fields separated by commas.
Complexity comes in when the data itself has
embedded commas, quotes or line breaks. This solution, written in C#, and using the .NET
Framework's Regular Expression object can handle such complex data. This has
been tested to work with CSV formatted files saved from Microsoft Excel.
To begin with, make sure that you have the following namespaces:
using System.Data;
using System.IO;
using System.Text.RegularExpressions;
Next, we build a function that parses any CSV input string into a DataTable:
public DataTable ParseCSV(string inputString) {
DataTable dt=new DataTable();
Regex re=new Regex("((?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")+)\")(,|(?<rowbreak>\\r\\n|\\n|$))");
ArrayList colArray=new ArrayList();
ArrayList rowArray=new ArrayList();
int colCount=0;
int maxColCount=0;
string rowbreak="";
string field="";
MatchCollection mc=re.Matches(inputString);
foreach(Match m in mc) {
field=m.Result("${field}").Replace("\"\"","\"");
rowbreak=m.Result("${rowbreak}");
if (field.Length > 0) {
colArray.Add(field);
colCount++;
}
if (rowbreak.Length > 0) {
rowArray.Add(colArray.ToArray());
colArray=new ArrayList();
if (colCount > maxColCount)
maxColCount=colCount;
colCount=0;
}
}
if (rowbreak.Length == 0) {
rowArray.Add(colArray.ToArray());
if (colCount > maxColCount)
maxColCount=colCount;
}
for(int i=0; i < maxColCount; i++)
dt.Columns.Add(String.Format("col{0:000}",i));
Array ra=rowArray.ToArray();
for(int i=0; i < ra.Length; i++) {
DataRow dr=dt.NewRow();
Array ca=(Array)(ra.GetValue(i));
for(int j=0; j < ca.Length; j++)
dr[j]=ca.GetValue(j);
dt.Rows.Add(dr);
}
if (dt.Columns.Count == 0)
dt.Columns.Add("NoData");
return dt;
}
Now that we have a parser for converting a string into a DataTable,
all we need now is a function that will read the content from a CSV file
and pass it to our ParseCSV function:
public DataTable ParseCSVFile(string path) {
string inputString="";
if (File.Exists(path)) {
StreamReader sr = new StreamReader(path);
inputString = sr.ReadToEnd();
sr.Close();
}
return ParseCSV(inputString);
}
And now you can easily fill a DataGrid with data coming off the CSV file:
protected System.Web.UI.WebControls.DataGrid DataGrid1;
private void Page_Load(object sender, System.EventArgs e) {
DataTable dt=ParseCSVFile(Server.MapPath("./demo.csv"));
DataGrid1.DataSource=dt;
DataGrid1.DataBind();
}
Congratulations! You are now able to parse CSV into a DataTable.
Good luck with your programming.