Reading and Writing CSV Files in C#
Download Test Project
Introduction
A common requirement is to have applications share data with other programs. Although there are interfaces available to work with, for example, Microsoft Excel data files, this approach is generally complex, involves a fair amount of overhead, and requires that support libraries accompany your application.
Note that the code below is a complete rewrite of the code presented when this article was first published. I decided to make the code more robust and add a number of new features, including support for multi-line values and the ability to change the characters used for delimiters and quotes. I also added several options to control how the CSV reader class handles empty lines.
Comma-Separated Values (CSV) Files
A much simpler way to have your application share data is by reading and writing Comma-Separated Values (CSV) files. CSV files can easily be read and written by many programs, including Microsoft Excel.
For the most part, reading and writing CSV files is trivial. As the name suggests, a CSV file is just a plain text file that contains one or more values per line, separated by commas. Each value is a field (or column in a spreadsheet), and each line is a record (or row in a spreadsheet).
However, there is slightly more work involved. Double quotes are used to wrap values that contain special characters such as commas, double quotes, and new lines. This is required to prevent those special characters from being interpreted as value delimiters. In addition, double quotes that exist within a double-quoted value must appear as two double quote characters together to distinguish them from the double quote character at the end of the value.
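The quoting rule described above can be sketched in a few lines. This is a minimal illustration rather than the code from Listing 1: the class name CsvQuotingSketch and the helper QuoteField are hypothetical names chosen for this example, and the comma and double quote are assumed to be the delimiter and quote characters.

```csharp
using System;

public static class CsvQuotingSketch
{
    // Characters that force a value to be wrapped in quotes
    // (assumed here: comma, double quote, carriage return, line feed).
    private static readonly char[] SpecialChars = { ',', '"', '\r', '\n' };

    // Hypothetical helper: returns the value as it would appear in a CSV line.
    public static string QuoteField(string value)
    {
        // Values without special characters are written unchanged
        if (value.IndexOfAny(SpecialChars) == -1)
            return value;

        // Otherwise double any embedded quotes and wrap the whole value in quotes
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Prints a few sample values in their CSV form
    public static void Demo()
    {
        Console.WriteLine(QuoteField("plain"));           // plain
        Console.WriteLine(QuoteField("Smith, John"));     // "Smith, John"
        Console.WriteLine(QuoteField("She said \"hi\"")); // "She said ""hi"""
    }
}
```

A reader has to apply the same rule in reverse: inside a quoted value, two consecutive quotes stand for one literal quote, and a single quote ends the value.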
So this seems like a perfect job for a handy little C# class. Listing 1 shows my CsvFileReader and CsvFileWriter classes.
Listing 1: CsvFileReader and CsvFileWriter Classes
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text;

/// <summary>
/// Determines how empty lines are interpreted when reading CSV files.
/// These values do not affect empty lines that occur within quoted fields
/// or empty lines that appear at the end of the input file.
/// </summary>
public enum EmptyLineBehavior
{
    /// <summary>
    /// Empty lines are interpreted as a line with zero columns.
    /// </summary>
    NoColumns,
    /// <summary>
    /// Empty lines are interpreted as a line with a single empty column.
    /// </summary>
    EmptyColumn,
    /// <summary>
    /// Empty lines are skipped over as though they did not exist.
    /// </summary>
    Ignore,
    /// <summary>
    /// An empty line is interpreted as the end of the input file.
    /// </summary>
    EndOfFile,
}

/// <summary>
/// Common base class for CSV reader and writer classes.
/// </summary>
public abstract class CsvFileCommon
{
    /// <summary>
    /// These are special characters in CSV files. If a column contains any
    /// of these characters, the entire column is wrapped in double quotes.
    /// </summary>
    protected char[] SpecialChars = new char[] { ',', '"', '\r', '\n' };

    // Indexes into SpecialChars for characters with specific meaning
    private const int DelimiterIndex = 0;
    private const int QuoteIndex = 1;

    /// <summary>
    /// Gets/sets the character used for column delimiters.
    /// </summary>
    public char Delimiter
    {
        get { return SpecialChars[DelimiterIndex]; }
        set { SpecialChars[DelimiterIndex] = value; }
    }

    /// <summary>
    /// Gets/sets the character used for column quotes.
    /// </summary>
    public char Quote
    {
        get { return SpecialChars[QuoteIndex]; }
        set { SpecialChars[QuoteIndex] = value; }
    }
}

/// <summary>
/// Class for reading from comma-separated-value (CSV) files.
/// </summary>
public class CsvFileReader : CsvFileCommon, IDisposable
{
    // Private members
    private StreamReader Reader;
    private string CurrLine;
    private int CurrPos;
    private EmptyLineBehavior EmptyLineBehavior;

    /// <summary>
    /// Initializes a new instance of the CsvFileReader class for the
    /// specified stream.
    /// </summary>
    /// <param name="stream">The stream to read from</param>
    /// <param name="emptyLineBehavior">Determines how empty lines are handled</param>
    public CsvFileReader(Stream stream,
        EmptyLineBehavior emptyLineBehavior = EmptyLineBehavior.NoColumns)
    {
        Reader = new StreamReader(stream);
        EmptyLineBehavior = emptyLineBehavior;
    }

    /// <summary>
    /// Initializes a new instance of the CsvFileReader class for the
    /// specified file path.
    /// </summary>
    /// <param name="path">The name of the CSV file to read from</param>
    /// <param name="emptyLineBehavior">Determines how empty lines are handled</param>
    public CsvFileReader(string path,
        EmptyLineBehavior emptyLineBehavior = EmptyLineBehavior.NoColumns)
    {
        Reader = new StreamReader(path);
        EmptyLineBehavior = emptyLineBehavior;
    }

    /// <summary>
    /// Reads a row of columns from the current CSV file. Returns false if no
    /// more data could be read because the end of the file was reached.
    /// </summary>
    /// <param name="columns">Collection to hold the columns read</param>
    public bool ReadRow(List<string> columns)
    {
        // Verify required argument
        if (columns == null)
            throw new ArgumentNullException("columns");

    ReadNextLine:
        // Read next line from the file
        CurrLine = Reader.ReadLine();
        CurrPos = 0;
        // Test for end of file
        if (CurrLine == null)
            return false;
        // Test for empty line
        if (CurrLine.Length == 0)
        {
            switch (EmptyLineBehavior)
            {
                case EmptyLineBehavior.NoColumns:
                    columns.Clear();
                    return true;
                case EmptyLineBehavior.Ignore:
                    goto ReadNextLine;
                case EmptyLineBehavior.EndOfFile:
                    return false;
                // EmptyColumn: fall out of the switch and parse the empty
                // line normally, which yields a single empty column
            }
        }

        // Parse line
        string column;
        int numColumns = 0;
        while (true)
        {
            // Read next column
            if (CurrPos < CurrLine.Length && CurrLine[CurrPos] == Quote)
                column = ReadQuotedColumn();
            else
                column = ReadUnquotedColumn();
            // Add column to list, reusing existing slots where possible
            if (numColumns < columns.Count)
                columns[numColumns] = column;
            else
                columns.Add(column);
            numColumns++;
            // Break if we reached the end of the line
            if (CurrLine == null || CurrPos == CurrLine.Length)
                break;
            // Otherwise skip delimiter
            Debug.Assert(CurrLine[CurrPos] == Delimiter);
            CurrPos++;
        }
        // Remove any unused columns from collection
        if (numColumns < columns.Count)
            columns.RemoveRange(numColumns, columns.Count - numColumns);
        // Indicate success
        return true;
    }

    /// <summary>
    /// Reads a quoted column by reading from the current line until a
    /// closing quote is found or the end of the file is reached. On return,
    /// the current position points to the delimiter or the end of the last
    /// line in the file. Note: CurrLine may be set to null on return.
    /// </summary>
    private string ReadQuotedColumn()
    {
        // Skip opening quote character
        Debug.Assert(CurrPos < CurrLine.Length && CurrLine[CurrPos] == Quote);
        CurrPos++;

        // Parse column
        StringBuilder builder = new StringBuilder();
        while (true)
        {
            while (CurrPos == CurrLine.Length)
            {
                // End of line so attempt to read the next line
                CurrLine = Reader.ReadLine();
                CurrPos = 0;
                // Done if we reached the end of the file
                if (CurrLine == null)
                    return builder.ToString();
                // Otherwise, treat as a multi-line field
                builder.Append(Environment.NewLine);
            }

            // Test for quote character
            if (CurrLine[CurrPos] == Quote)
            {
                // If two quotes, skip first and treat second as literal
                int nextPos = (CurrPos + 1);
                if (nextPos < CurrLine.Length && CurrLine[nextPos] == Quote)
                    CurrPos++;
                else
                    break;  // Single quote ends quoted sequence
            }
            // Add current character to the column
            builder.Append(CurrLine[CurrPos++]);
        }

        if (CurrPos < CurrLine.Length)
        {
            // Consume closing quote
            Debug.Assert(CurrLine[CurrPos] == Quote);
            CurrPos++;
            // Append any additional characters appearing before next delimiter
            builder.Append(ReadUnquotedColumn());
        }
        // Return column value
        return builder.ToString();
    }

    /// <summary>
    /// Reads an unquoted column by reading from the current line until a
    /// delimiter is found or the end of the line is reached. On return, the
    /// current position points to the delimiter or the end of the current
    /// line.
    /// </summary>
    private string ReadUnquotedColumn()
    {
        int startPos = CurrPos;
        CurrPos = CurrLine.IndexOf(Delimiter, CurrPos);
        if (CurrPos == -1)
            CurrPos = CurrLine.Length;
        if (CurrPos > startPos)
            return CurrLine.Substring(startPos, CurrPos - startPos);
        return String.Empty;
    }

    // Propagate Dispose to StreamReader
    public void Dispose()
    {
        Reader.Dispose();
    }
}

/// <summary>
/// Class for writing to comma-separated-value (CSV) files.
/// </summary>
public class CsvFileWriter : CsvFileCommon, IDisposable
{
    // Private members
    private StreamWriter Writer;
    private string OneQuote = null;
    private string TwoQuotes = null;
    private string QuotedFormat = null;

    /// <summary>
    /// Initializes a new instance of the CsvFileWriter class for the
    /// specified stream.
    /// </summary>
    /// <param name="stream">The stream to write to</param>
    public CsvFileWriter(Stream stream)
    {
        Writer = new StreamWriter(stream);
    }

    /// <summary>
    /// Initializes a new instance of the CsvFileWriter class for the
    /// specified file path.
    /// </summary>
    /// <param name="path">The name of the CSV file to write to</param>
    public CsvFileWriter(string path)
    {
        Writer = new StreamWriter(path);
    }

    /// <summary>
    /// Writes a row of columns to the current CSV file.
    /// </summary>
    /// <param name="columns">The list of columns to write</param>
    public void WriteRow(List<string> columns)
    {
        // Verify required argument
        if (columns == null)
            throw new ArgumentNullException("columns");

        // Ensure we're using current quote character
        if (OneQuote == null || OneQuote[0] != Quote)
        {
            OneQuote = String.Format("{0}", Quote);
            TwoQuotes = String.Format("{0}{0}", Quote);
            QuotedFormat = String.Format("{0}{{0}}{0}", Quote);
        }

        // Write each column
        for (int i = 0; i < columns.Count; i++)
        {
            // Add delimiter if this isn't the first column
            if (i > 0)
                Writer.Write(Delimiter);
            // Write this column, quoting it if it contains special characters
            if (columns[i].IndexOfAny(SpecialChars) == -1)
                Writer.Write(columns[i]);
            else
                Writer.Write(QuotedFormat, columns[i].Replace(OneQuote, TwoQuotes));
        }
        Writer.WriteLine();
    }

    // Propagate Dispose to StreamWriter
    public void Dispose()
    {
        Writer.Dispose();
    }
}
Because the .NET stream classes generally seem to be split into reading and writing, I decided to follow that same pattern with my CSV code and split it into CsvFileReader and CsvFileWriter. This also simplifies the code because neither class needs to worry about which mode the file is in or protect against the user switching modes.
Starting at the top of the code is the EmptyLineBehavior enum. After careful review, I realized there were a few valid ways to interpret an empty line within a CSV file. So the CsvFileReader class's constructor takes an argument of this type to specify how empty lines should be handled. Note that this does not affect empty lines within a quoted value, or an empty line at the end of the input file.
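To make the four options concrete, the sketch below shows how each one treats an empty line in the middle of the input. It is a simplified stand-in, not the reader from Listing 1: ParseLines and EmptyLineSketch are hypothetical names, the enum is redeclared inside the class so the block stands alone, and values are split on commas with no quote handling.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyLineSketch
{
    // Redeclared here only so the sketch compiles on its own
    public enum EmptyLineBehavior { NoColumns, EmptyColumn, Ignore, EndOfFile }

    // Hypothetical, simplified parser: comma splitting only, no quoted values
    public static List<List<string>> ParseLines(
        IEnumerable<string> lines, EmptyLineBehavior behavior)
    {
        var rows = new List<List<string>>();
        foreach (string line in lines)
        {
            if (line.Length == 0)
            {
                if (behavior == EmptyLineBehavior.Ignore)
                    continue;                       // pretend the line is not there
                if (behavior == EmptyLineBehavior.EndOfFile)
                    break;                          // stop reading altogether
                if (behavior == EmptyLineBehavior.NoColumns)
                {
                    rows.Add(new List<string>());   // a row with zero columns
                    continue;
                }
                // EmptyColumn: fall through and parse the empty line normally
            }
            // Splitting an empty string yields one empty value
            rows.Add(new List<string>(line.Split(',')));
        }
        return rows;
    }
}
```

For the input lines "a,b", "" and "c", this sketch returns three rows with an empty middle row for NoColumns, three rows with a single empty value in the middle row for EmptyColumn, two rows for Ignore, and one row for EndOfFile.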
Next is my CsvFileCommon class. There are a few settings common to both the reader and writer classes, and so I use this abstract base class to track the special characters within a CSV value that require the value to be enclosed in quotes. It also provides a way to change the characters used as delimiters and quotes.
The CsvFileReader class comes after that. This is the class that reads data from a CSV file and is the most complex class presented here. I modeled its behavior after how Microsoft Excel interprets CSV files. There are two constructors: one that accepts the name of the input file and another that accepts an input Stream. As mentioned previously, both constructors also accept an EmptyLineBehavior argument to control how empty lines are handled. The ReadRow() method is used to read a single row from the input file and populate a List<string> collection with the values read. For each value, it dispatches the appropriate parsing routine based on whether or not the first character is a quote character.
Finally, the CsvFileWriter class writes data to a CSV file. As with the CsvFileReader class, this class has two constructors. Call the WriteRow() method to write a single row to the target file using a collection of values. Each time WriteRow() is called, it checks to see if the current quote character has changed. If so, it updates the strings used to correctly format quoted output.
Both the reader and writer classes implement IDisposable, which is delegated to the StreamReader or StreamWriter class. This allows you to enclose your use of either class within a using statement to ensure the file is closed in a timely manner.
Using the Code
The code was designed to be as easy as possible to use. When you call CsvFileWriter.WriteRow(), you supply a collection of the values to be written to the file. And when you call CsvFileReader.ReadRow(), the collection argument is populated with the values read in.
Listing 2 demonstrates using the classes.
Listing 2: Sample Code to Write and Read CSV Files
private void WriteValues()
{
    using (var writer = new CsvFileWriter("WriteTest.csv"))
    {
        // Write each row of data
        for (int row = 0; row < 100; row++)
        {
            // TODO: Populate column values for this row
            List<string> columns = new List<string>();
            writer.WriteRow(columns);
        }
    }
}

private void ReadValues()
{
    // The same list is reused for every row; ReadRow() overwrites its contents
    List<string> columns = new List<string>();
    using (var reader = new CsvFileReader("ReadTest.csv"))
    {
        while (reader.ReadRow(columns))
        {
            // TODO: Do something with columns' values
        }
    }
}
Using goto Statements
One other thing that I should probably comment on is my use of the goto keyword. In general, using the goto keyword makes the flow of execution harder to see.
For this reason, the use of goto is generally avoided and rarely seen in C# code. In fact, beginning developers are generally told to avoid using goto altogether in an attempt to get them thinking about better ways to structure code.
However, some developers seem to take this too far and treat it almost as a religion. Developing software is not a religion. It's about producing the best and most readable code. After trying several different types of loops in the code in question, I concluded that the goto statement was the most efficient and easiest to read. After all, that code is not really a loop. It's just one of many cases where it needs to go back and try something again.
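For readers who want to compare, here is roughly what the "go back and read another line" step looks like as a loop. This is only a sketch under simplifying assumptions, not one of the alternatives actually tried for the article: ReadNonEmptyLine and SkipEmptyLinesSketch are hypothetical names, and the helper covers only the case where empty lines are ignored, whereas the switch in ReadRow() also has to return early for the other behaviors.

```csharp
using System;
using System.IO;

public static class SkipEmptyLinesSketch
{
    // Hypothetical helper: returns the next non-empty line,
    // or null when the end of the input is reached.
    public static string ReadNonEmptyLine(TextReader reader)
    {
        string line;
        do
        {
            line = reader.ReadLine();
        } while (line != null && line.Length == 0);
        return line;
    }
}
```

Folding the remaining behaviors into such a loop means mixing return statements into the loop body, which is the kind of structure the goto version avoids.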
So I may have raised a few eyebrows with my use of the goto keyword, but I stand behind the decision.
Conclusion
That's about all there is to it. The attached project includes the code above in a test project that loads CSV files and displays them in a grid. The user can edit the grid and then save the results back to a file.
This code should be helpful for anyone wanting an easy way to share information with Microsoft Excel or any other program that can read or write CSV files.
Update History
12/16/2010: Someone pointed out there were a couple of problems with the code. This update fixes them.
6/26/2011: I received a few questions about how to use the code, so I've added a couple of simple examples.
1/15/2012: Corrected problem with WriteRow() where the first comma was not written if the first column was empty.
10/7/2012: Completely reworked the code to be more robust and flexible, and added several new features.
End-User License
Use of this article and any related source code or other files is governed by the terms and conditions of The Code Project Open License.
Author Information
Source: http://www.blackbeltcoder.com/Articles/files/reading-and-writing-csv-files-in-c