Old Apr 13, 2011, 10:11 AM   #1
farmerdoug
macrumors 6502a
 
Join Date: Sep 2008
reading strings from a file.

simple file

UT_DATE 2011_04_13 s UT date at start of night
TSCOPE 200"_HALE s 200" Hale Telescope at Mt Palomar

simple code


Code:
#include <stdio.h>
#include <stdlib.h> 
#include "/Users/p1640/1640/C/FITS/include/fitsio.h"

int main (int argc, const char * argv[]) // argv[1] = fits file. argv[2] = header file

{
	fitsfile   *fptr;   
    FILE *header;
	char *line;
	int status;
	
	line = (char *)calloc(60, sizeof(char));
	if ((header = fopen(argv[2],"r")) ==NULL)
		printf("couldn't open header file\n");
	//if (fits_open_file(&fptr, argv[1], READWRITE, &status)) {
   //     printf("load_simple_fits_float_data: fits_open_file: ", status);
   //     return (1);
	//}
	
	while (fgets(line,60,header))
		
		//while (fscanf(header,"%s",line) != EOF)
		{
			   
			printf("%s\n",line); 
			//fscanf(header,"%s",line);
			//printf("%s",line); 
			//fscanf(header,"%s",line);
			//printf("%s",line); 
			//fgets(line,60,header);
			//printf("%s\n",line);

		   
		}
    
	
	fclose(header);
	return 0;
}
not so simple results

When I run the code using the commented-out fscanf, I get a list of all the strings in the file, but when I run the code using fgets I get


TSCOPE 200"___04_13 s UT date at start of night
HALE s 200" Hale Telescope at Mt Palomar
Old Apr 13, 2011, 10:59 AM   #2
chown33
macrumors 603
 
Join Date: Aug 2009
Describe what you expected to happen.
Post an example of the output you want.

I can't tell from anything you posted, what it is you want or expect.


Refer to the man pages for fgets() and fscanf().
fgets() stops at a newline, or when its max-count is exhausted.
fscanf() with %s does not do that: %s stops at whitespace.

Since whitespace is a larger class of characters than newline (e.g. whitespace includes spaces and tabs), I would expect %s to return "words" as delimited by whitespace, while fgets() will return "lines" as delimited by newlines. If you have some other expectation, please explain what it is.


Also, fgets() respects a max count, while fscanf() will not (at least not as coded). This means a word of 60 or more characters will overflow the 60-byte buffer with fscanf(), but not with fgets().
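A minimal sketch of that difference, assuming the same 60-byte buffer as in the first post. The names BUF_LEN, read_word and read_line are made up for illustration. The width in "%59s" is one less than the buffer size so the terminating '\0' still fits.

```c
#include <stdio.h>

#define BUF_LEN 60   /* same buffer size as in the first post (assumption) */

/* Reads one whitespace-delimited word, never storing more than 59
   characters plus the terminating '\0'.
   Returns 1 if a word was stored, 0 otherwise. */
int read_word(FILE *fp, char buf[BUF_LEN])
{
    return fscanf(fp, "%59s", buf) == 1;
}

/* Reads up to one line, including its '\n' if it fits in the buffer.
   Returns 1 if something was read, 0 on end-of-file or error. */
int read_line(FILE *fp, char buf[BUF_LEN])
{
    return fgets(buf, BUF_LEN, fp) != NULL;
}
```

On the sample file, read_word would hand back "UT_DATE" first, while read_line would hand back the start of the whole first line.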
Old Apr 13, 2011, 11:01 AM   #3
farmerdoug
Thread Starter
macrumors 6502a
 
Join Date: Sep 2008
I need the output to mirror the input one line at a time for further parsing.
Old Apr 13, 2011, 11:05 AM   #4
chown33
macrumors 603
 
Join Date: Aug 2009
Quote:
Originally Posted by farmerdoug View Post
I need the output to mirror the input one line at a time for further parsing.
Then use fgets().

Be sure to check the buffer for a newline in the last position. If it's not a newline, you exhausted the count, i.e. your buffer length was less than the line length.
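That check could be sketched as follows. The function name got_whole_line is hypothetical; it only inspects a buffer that fgets() has already filled.

```c
#include <string.h>

/* Returns 1 if buf (as filled by fgets) ends with '\n', i.e. it holds
   a complete line. Returns 0 if the line was cut off by the buffer
   size, or if it is the last line of a file with no trailing newline. */
int got_whole_line(const char *buf)
{
    size_t len = strlen(buf);

    return len > 0 && buf[len - 1] == '\n';
}
```

After fgets(line, 60, header), a zero result combined with !feof(header) would suggest the line was longer than 59 characters.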
Old Apr 13, 2011, 11:16 AM   #5
farmerdoug
Thread Starter
macrumors 6502a
 
Join Date: Sep 2008
The file displays correctly in TextEdit. Doesn't that imply the existence of newline characters?
Old Apr 13, 2011, 11:36 AM   #6
Bill McEnaney
macrumors 6502
 
Join Date: Apr 2010
Quote:
Originally Posted by chown33 View Post
Then use fgets().

Be sure to check the buffer for a newline in the last position. If it's not a newline, you exhausted the count, i.e. your buffer length was less than the line length.
The strrchr function treats the terminating '\0' as the last character of a null-terminated string when it searches for the last instance of the character you give it.
Old Apr 13, 2011, 11:42 AM   #7
farmerdoug
Thread Starter
macrumors 6502a
 
Join Date: Sep 2008
You are suggesting that I use strrchr to check what the last character is?
Old Apr 13, 2011, 11:47 AM   #8
subsonix
macrumors 68030
 
Join Date: Feb 2008
Use something more generous than 60 characters, is my suggestion. I usually go for BUFSIZ, a constant defined in <stdio.h> (1024 on Mac OS X; the value is implementation-defined). You also print a '\n', but if fgets reads a line shorter than the buffer, the line's own newline is already part of the string. That might mess up your output by double-spacing it.
Old Apr 13, 2011, 11:48 AM   #9
chown33
macrumors 603
 
Join Date: Aug 2009
Quote:
Originally Posted by farmerdoug View Post
The file displays correctly in text edit; Doesn't that imply the existence of new line characters?
A. Not necessarily.
B. Apropos of what?


A. TextEdit.app will display lines that are terminated with CRs alone. It will also display CR-LF terminated lines. fgets() doesn't necessarily recognize a CR as a line ending. It does recognize LF (i.e. the classic Unix newline character).
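If the line endings cannot be fixed at the source, one alternative to fgets() is a small reader that accepts all three conventions. This is only a sketch and not something proposed in the thread: read_any_line is a made-up name, and it assumes the file was opened with fopen(path, "rb") so that no translation happens underneath.

```c
#include <stdio.h>

/* Reads one line into buf (at most size-1 characters plus '\0'),
   treating LF, CR, or CR LF as a single line ending. The ending
   itself is not stored.
   Returns 1 if a line (possibly empty) was read, 0 at end-of-file. */
int read_any_line(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            break;
        if (c == '\r') {
            /* Swallow the LF of a CR LF pair; push back anything else. */
            int next = getc(fp);
            if (next != '\n' && next != EOF)
                ungetc(next, fp);
            break;
        }
        if (n + 1 < size)
            buf[n++] = (char)c;   /* characters beyond the buffer are dropped */
    }
    buf[n] = '\0';
    return c != EOF || n > 0;
}
```

Unlike fgets(), this sketch silently drops the part of a line that does not fit, so the buffer still has to be sized for the longest expected line.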

If you don't know how your lines are terminated, you need to look at the binary data, not the text interpretation that TextEdit.app shows you. There can be several possible interpretations for some given data, and if TextEdit is set to automatically choose one, then what it shows you may not be the exact same as what's in the file.

Google Hex Fiend and download it. Use it to see what's in your file. Or read the man page for the hexdump command.
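The same question can also be answered from C. The sketch below counts each kind of line ending in a stream; struct eol_counts and count_line_endings are illustrative names, and the stream is assumed to have been opened in binary mode ("rb").

```c
#include <stdio.h>

struct eol_counts {
    long lf;     /* bare LF  (Unix, Mac OS X) */
    long cr;     /* bare CR  (classic Mac OS) */
    long crlf;   /* CR LF    (Windows)        */
};

/* Scans the stream from its current position to end-of-file and
   counts each kind of line ending. */
struct eol_counts count_line_endings(FILE *fp)
{
    struct eol_counts n = {0, 0, 0};
    int c, prev = 0;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n') {
            if (prev == '\r') {
                n.cr--;      /* the CR counted last time was half of a CR LF */
                n.crlf++;
            } else {
                n.lf++;
            }
        } else if (c == '\r') {
            n.cr++;
        }
        prev = c;
    }
    return n;
}
```

A file with a nonzero cr count and a zero lf count is one that fgets() will treat as a single long line.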

Even if TextEdit shows lines correctly, and lines are terminated by newlines, this is no guarantee that every line is less than some arbitrary number like 60. In short, if you don't sanitize your input data, your parser might misinterpret the data.


B. What is the relevance of this question to your previous posts? You hadn't previously mentioned a problem with detecting line-endings. In fact, you haven't really described what the problem is at all. Basically all you've said is that using fscanf() with %s doesn't produce the same output as fgets(), to which I have basically answered "No, they stop on different things, so the output won't be the same".

So please take a little time and describe exactly what you're trying to accomplish, post the code you expect to accomplish this with, then describe what the code produces that fails to meet your expectation.
1. Post your code and your actual data.
2. Describe what you expected to happen.
3. Describe what actually happened.

Post a zip file containing the actual data. If it contains CRs or CRLFs, then pasting it into a post will translate line endings. We need to see the actual data being read and parsed.
Old Apr 13, 2011, 11:49 AM   #10
balamw
Moderator
 
 
Join Date: Aug 2005
Location: New England, USA
Quote:
Originally Posted by subsonix View Post
Use something more generous than 60 characters, is my suggestion. I usually go for BUFSIZ, a constant defined in <stdio.h> (1024 on Mac OS X; the value is implementation-defined).
This.

Plus, if you are using fgets, you should pair it with fputs instead of printf("%s\n") to avoid the same kind of termination issues chown33 is referring to: fgets keeps the newline and fputs does not add one (puts would).

As it stands your code is a poor man's clone of "cat" that will only work properly if each line of the input file is guaranteed to be 60 characters long or less.

B
__________________
MBA (13" 1.7 GHz 128GB), UMBP (15" SD 2.8 GHz), UMB (13" 2.4 GHz), iMac (17" Yonah), 32GB iPad 3 WiFi+LTE, 64 GB iPad WiFi, 32 GB iPhone 5, Airport Extreme
Old Apr 13, 2011, 12:00 PM   #11
Bill McEnaney
macrumors 6502
 
Join Date: Apr 2010
Quote:
Originally Posted by farmerdoug View Post
You are suggesting that I use strrchr to check what the last character is?
I'd use it or the rindex function. My point is that in a null-terminated string, you need to check the character that's to the immediate left of the null character if there's any character there to check. In any null-terminated string, the physically last character is the null character, the '\0'.
Bill McEnaney is offline   0 Reply With Quote
Old Apr 13, 2011, 01:04 PM   #12
farmerdoug
Thread Starter
macrumors 6502a
 
Join Date: Sep 2008
I recreated the file without a carriage return and then put it back. The file looks OK. Increasing the buffer size did not help; in fact, it made things worse. strrchr told me that there was at least one "\0" in the file.
Old Apr 13, 2011, 02:02 PM   #13
subsonix
macrumors 68030
 
Join Date: Feb 2008
Quote:
Originally Posted by farmerdoug View Post
I recreated the file without a carriage return and then put it back. The file looks OK. Increasing the buffer size did not help; in fact, it made things worse. strrchr told me that there was at least one "\0" in the file.
Well, the point of it is that fgets() reads until it has stored a '\n', reached end-of-file, or filled the buffer (size - 1 characters), and then appends the '\0'. Meaning, if a line is longer than 59 characters, it gets split across several fgets calls and the pieces will not line up with your lines. Having the buffer "large enough" means that you will have one line per fgets call.

If fgets reads a line that is shorter than the buffer and ends in a newline, the newline will be contained in the string. I usually create a strip_newline function to deal with that.
Old Apr 13, 2011, 02:08 PM   #14
balamw
Moderator
 
 
Join Date: Aug 2005
Location: New England, USA
Quote:
Originally Posted by subsonix View Post
Well, the point of it is that fgets() reads until it has stored a '\n', reached end-of-file, or filled the buffer (size - 1 characters), and then appends the '\0'. Meaning, if a line is longer than 59 characters, it gets split across several fgets calls and the pieces will not line up with your lines. Having the buffer "large enough" means that you will have one line per fgets call.

If fgets reads a line that is shorter than the buffer and ends in a newline, the newline will be contained in the string. I usually create a strip_newline function to deal with that.
Explicitly: the string read in by fgets will include both the \n and the \0 when a complete line has been read. Since you reuse the buffer, reading in a shorter line will leave the tail of the previous line, including its \n and \0, in the buffer after the new \0. The first \0 tells you where the last read ended. Increasing the buffer size should just mean potentially fewer reads. If there is no \n just before the first \0, the line was longer than the buffer size (or it was the last line of a file that lacks a final newline).

So, you either want to strip off the \n as subsonix says, or adapt your code to handle the fact that \n is included. e.g. by using fputs instead of printf("%s\n");

The code below is basically "cat".

Code:
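#include <stdio.h>
#include <stdlib.h> 

int main (int argc, const char * argv[]) 
{
        FILE *header;
        char *line;
        
        line = (char *)calloc(60, sizeof(char));
        if (line == NULL)
                return 1;
        if ((header = fopen("testfile.txt","r")) == NULL) {
                printf("couldn't open header file\n");
                free(line);
                return 1;       /* don't fall through to fgets/fclose on a NULL stream */
        }
        
        /* fgets keeps the '\n', so fputs (which adds none) echoes each line as-is */
        while (fgets(line,60,header))
                {
                        fputs(line,stdout); 
                }
        
        fclose(header);
        free(line);
        return 0;
}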
B
Old Apr 13, 2011, 02:30 PM   #15
farmerdoug
Thread Starter
macrumors 6502a
 
Join Date: Sep 2008
According to LabView, which made the file, EOF on a Windows machine is CR/LF while it is just LF on a Mac. It seems that fgets in Xcode looks for CR/LF and therefore isn't any good unless you specifically tell LabView how to terminate a line.
Old Apr 13, 2011, 02:34 PM   #16
balamw
Moderator
 
 
Join Date: Aug 2005
Location: New England, USA
Quote:
Originally Posted by farmerdoug View Post
According to LabView, which made the file, EOF on a Windows machine is CR/LF while it is just LF on a Mac. It seems that fgets in Xcode looks for CR/LF and therefore isn't any good unless you specifically tell LabView how to terminate a line.
So strip one or both terminators off and, if needed, replace them with the one you want.

The easiest way to do this is to find the first occurrence of \n or \r and replace it with \0. Basically what subsonix was suggesting with strip_newline. (example http://www.cprogramming.com/tutorial/c/lesson9.html)

The only challenge to this is if you get \n\r instead of \r\n.

B
Old Apr 13, 2011, 02:37 PM   #17
subsonix
macrumors 68030
 
Join Date: Feb 2008
But EOF doesn't mean end of line but end of file, and EOF in this case is only the terminating condition of the loop. It doesn't affect what fgets does, only when to stop calling it. That is, keep calling fgets until the entire file is read.

Code:
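#include <stdio.h>
#include <string.h>

/* Remove a single trailing '\n', if present. */
void strip_newline(char *str) {
    size_t len = strlen(str);
    if( len > 0 && str[len - 1] == '\n' )   /* len check guards the empty string */
        str[len - 1] = '\0';
}

int main(void)
{
    char buf[BUFSIZ] = {0};

    /* puts() appends its own '\n', so strip the one fgets() kept */
    while( fgets(buf, BUFSIZ, stdin) ) {
        strip_newline(buf);
        puts(buf);
    }

    return 0;
}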
Old Apr 13, 2011, 02:43 PM   #18
balamw
Moderator
 
 
Join Date: Aug 2005
Location: New England, USA
Code:
void strip_newline(char *str) {
    size_t len = strlen(str);
    if( len > 0 && str[len - 1] == '\n' )
        str[len - 1] = '\0';
}
Would have to be extended to support \r\n in this case.
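One possible extension, as a sketch only. The name strip_line_ending is made up here, and the function assumes the terminator, if any, sits at the very end of the string, as it does after fgets().

```c
#include <string.h>

/* Removes a trailing "\n", "\r\n", or "\r" from str, in place. */
void strip_line_ending(char *str)
{
    size_t len = strlen(str);

    if (len > 0 && str[len - 1] == '\n')
        str[--len] = '\0';
    if (len > 0 && str[len - 1] == '\r')
        str[--len] = '\0';
}
```

Checking '\n' before '\r' matters: after the LF is removed, the CR of a CR LF pair becomes the new last character.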

B
Old Apr 13, 2011, 03:26 PM   #19
Bill McEnaney
macrumors 6502
 
Join Date: Apr 2010
Quote:
Originally Posted by balamw View Post
Code:
void strip_newline(char *str) {
    size_t len = strlen(str);
    if( len > 0 && str[len - 1] == '\n' )
        str[len - 1] = '\0';
}
Would have to be extended to support \r\n in this case.

B
How about this if we can assume that either '\n' or '\r' will always be one place to the left of '\0' when either '\n' or '\r' occurs in str? If you need to know what character you've stripped, you can return it.
Code:
#include <string.h>

/* If you find a '\n' or a '\r', replace it with a '\0'. */

void strip_line_terminator(char *str)
{
  char *place = strpbrk(str, "\n\r");

  if (place != NULL)
    *place = '\0';
}

Last edited by Bill McEnaney; Apr 13, 2011 at 07:36 PM. Reason: To remove an extra ')'.
Old Apr 13, 2011, 03:30 PM   #20
subsonix
macrumors 68030
 
Join Date: Feb 2008
But that will only take care of either '\n' or '\r', unless you call it twice or put the strpbrk call in a while loop.
Old Apr 13, 2011, 03:46 PM   #21
chown33
macrumors 603
 
Join Date: Aug 2009
Quote:
Originally Posted by subsonix View Post
But that will only take care of either '\n' or '\r', unless you call it twice or put the strpbrk call in a while loop.
It's not necessary to find more than one line terminator. The first one found terminates the string. Any remainder of the original string is ignored/discarded, regardless of what it contains.
Old Apr 13, 2011, 03:52 PM   #22
subsonix
macrumors 68030
 
Join Date: Feb 2008
Yes, good point.
Old Apr 13, 2011, 03:55 PM   #23
Bill McEnaney
macrumors 6502
 
Join Date: Apr 2010
Quote:
Originally Posted by subsonix View Post
But that will only take care of either '\n' or '\r', unless you call it twice or put the strpbrk call in a while loop.
Good point.

Code:
void strip_line_enders(char *str)
{
  char *place;

  while ((place = strpbrk(str, "\n\r")) != NULL)
    *place = '\0';
}

Last edited by Bill McEnaney; Apr 13, 2011 at 04:00 PM.
Old Apr 13, 2011, 04:04 PM   #24
Bill McEnaney
macrumors 6502
 
Join Date: Apr 2010
Quote:
Originally Posted by chown33 View Post
It's not necessary to find more than one line terminator. The first one found terminates the string. Any remainder of the original string is ignored/discarded, regardless of what it contains.
Oh goodie, that means I don't need my while-loop. I love to decrease overhead.
Old Apr 13, 2011, 04:09 PM   #25
balamw
Moderator
 
 
Join Date: Aug 2005
Location: New England, USA
Quote:
Originally Posted by balamw View Post
The easiest way to do this is to find the first occurrence of \n or \r and replace it with \0.
Quote:
Originally Posted by chown33 View Post
It's not necessary to find more than one line terminator. The first one found terminates the string.
Isn't that what I said?

Since fgets will terminate on \n, the risk you run is that the foreign code generating the file puts out \n\r instead of \r\n. That would give you a \r at the beginning of the second and every later line, and stripping at the first \r would give you zero-length strings.

You might want to check if the first character of the string is \r and there are other characters before the \n\0.
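A sketch of that check, under the assumption that at most one stray \r can precede the text. The name clean_line is hypothetical; strcspn() is the standard function that returns the length of the leading run containing none of the listed characters.

```c
#include <string.h>

/* Returns a pointer to the cleaned-up line inside buf:
   skips one leading '\r' (left over from a "\n\r" ending on the
   previous line) and cuts the string at the first remaining
   '\n' or '\r'. */
char *clean_line(char *buf)
{
    if (buf[0] == '\r')
        buf++;
    buf[strcspn(buf, "\r\n")] = '\0';
    return buf;
}
```

Because the result may point one character past the start of the buffer, the caller should keep using the original pointer for the next fgets() call and for free().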

If you control the LabView code and can make sure it uses CR/LF this is a non-issue.

B
