CS 112 Resources: Cygwin, Eclipse, Windows, Linux, and End-of-Line Characters


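Windows and Unix systems (Linux + MacOS X) mark the end of a line differently: Windows uses two characters, a carriage return followed by a line feed, while Unix systems use a line feed alone.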

This difference can become an issue when a C++ program uses the getline() method from the <iostream> library.

In particular, if the C++ compiler is configured to process the Unix end-of-line character, and getline() is used to read a text file that uses Windows end-of-line characters, the getline() method will not work as expected. To see why, suppose that a text file contains the lines:

   Hello
   Goodbye
   
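If this file was created on a Windows system, then an ifstream to it will see the following, where \015 is the octal code for carriage return (CR) and \012 is the octal code for line feed (LF):
   Hello\015\012Goodbye\015\012

However, if it was created on a Unix system, then an ifstream to it will see:
   Hello\012Goodbye\012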
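
To check which line endings a string actually contains, it can help to make the invisible characters visible. The following is a minimal sketch, not part of the course code; the name show_line_endings() is a hypothetical helper chosen for illustration. It spells out each CR and LF as an octal escape.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): returns a copy of
// `raw` in which each CR and LF is spelled out as a visible octal escape.
std::string show_line_endings(const std::string& raw) {
    std::string visible;
    for (char c : raw) {
        if (c == '\r') {
            visible += "\\015";   // carriage return (CR)
        } else if (c == '\n') {
            visible += "\\012";   // line feed (LF)
        } else {
            visible += c;         // every other character is copied as-is
        }
    }
    return visible;
}
```

To inspect a real file this way, one option is to read it through an ifstream opened with std::ios::binary, so that no end-of-line translation is applied, and pass the contents to the helper.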
   
So long as your C++ compiler is configured consistently with your system, this makes no difference.

That is, if getline() is used to read the first line in the file, the C++ compiler is configured for Windows end-of-line characters, and we read the file created on Windows, then getline() will read until it has read the CR and LF characters, and then return the string "Hello". Likewise, if the C++ compiler is configured for Unix end-of-line characters and we read the file created on a Unix system, then getline() will read until it has read the LF character, and then return the string "Hello".

The problem comes when you mix and match. For example, if the C++ compiler is configured for Unix end-of-line characters and we read the file created on a Windows system, then getline() will read until it has read the LF character, and then return the string "Hello\015". If you have a unit test that compares whether this string and "Hello" are equal, that test will fail.
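
One possible workaround is to remove a trailing CR after each call to getline(). The sketch below is an illustration under stated assumptions, not the course's prescribed fix: getline_portable(), first_line_raw(), and first_line_portable() are hypothetical names, and a string stream stands in for the file because it performs no end-of-line translation, which mimics the Unix-configured case.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not a standard library function: reads one line with
// std::getline() and removes a single trailing CR if one is present.
std::istream& getline_portable(std::istream& in, std::string& line) {
    if (std::getline(in, line) && !line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return in;
}

// Reads the first line of `contents` with plain std::getline(). A string
// stream does no end-of-line translation, so only the LF is consumed.
std::string first_line_raw(const std::string& contents) {
    std::istringstream in(contents);
    std::string line;
    std::getline(in, line);
    return line;
}

// Same as first_line_raw(), but uses the CR-stripping helper.
std::string first_line_portable(const std::string& contents) {
    std::istringstream in(contents);
    std::string line;
    getline_portable(in, line);
    return line;
}
```

Given the Windows-style contents "Hello\015\012Goodbye\015\012", first_line_raw() returns "Hello\015", while first_line_portable() returns "Hello". For Unix-style contents both return "Hello", so the helper is safe to use on either kind of file.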


This page maintained by Joel Adams.