Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

dougphd

macrumors member
Original poster
Apr 28, 2016
70
0
Got this error after running through most of the files. Then I found what I thought was a fairly egress error; fixed it; and now the code runs through less than half the files.
Code:
#include <stdio.h>

#include "ID.h"


int main(int argc, const char * argv[]) {

   

   

    int i;

    stock_index *indx;

   

    indx = (stock_index *)calloc(num_of_indices, sizeof(stock_index));

    listp = open_symbol_list();

    for (i = 0; i < num_of_indices; i++)

    {

        datap = get_index_and_open(listp);

        read_data_to_struct(datap, i, &indx[i]);

    }

}

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include "math.h"



#define list_of_indices "/Users/douglasbrenner/ANDY/DATA/data_files/files.txt"

#define home  "/Users/douglasbrenner/ANDY/DATA/data_files/"

#define results "/Users/douglasbrenner/ANDY/arisk/results/"

#define days_of_data 60000


#define num_of_indices 1466

#define last_date "20161007"

#define first_date "20061006"

#define true  1

#define false 0


#define PER_SHARE_FEE 0.0

#define PER_TRADE_FEE 0

#define initial_investment 500000

#define nextday 1

#define days_of_risk 0


FILE *listp, *datap;



typedef struct equity

{

    char name[20][days_of_data];

    char date[14][days_of_data];

    float close[days_of_data];

    int num[days_of_data];

} stock_index;



typedef struct hi_low_vol

    {

   

    char name[20];

    float high;

    float low;

    float val;

} stock_index_paras;



void read_data_to_struct(FILE *datap, int i, stock_index indx[i] )

{

    int j;

    char *line, *junk;

    line = (char*)calloc(75,sizeof(char));

    junk = (char*)calloc(10,sizeof(char));

   


    j = 0;

   // fscanf(datap,"%s",line);

    //fgets(line,75,datap);  //read first line

        do

    {

        fscanf(datap,"%s",line);

      // printf("%s\n",line);

        strcpy(indx[i].name[j],strtok(line,","));//symbol

        strcpy(junk,strtok(NULL,","));

        strcpy(indx[i].date[j],strtok(NULL,","));   //FAILS HERE

      //  printf("%s %s \n",indx[i].name[j],indx[i].date[j] );

            }

    while( (strcmp(first_date,indx[i].date[j++]))!= 0);

    printf(" %s %s\n",indx[i].name[j-1],indx[i].date[j-1] );


    fclose(datap);

    free(line);

    free(junk);

   

    return;

}


FILE * open_symbol_list()

{

   

    if ((listp = fopen(list_of_indices,"r")) == NULL)

    {

        printf("Couldn't open list of indices\n" );

        exit(0);

    }

    else

        printf("Opened list of indices\n");

   

    return(listp);

}



FILE * get_index_and_open(FILE * listp)

{

   

    char *filename,*path;

    FILE *datap;

   

    filename = (char *) calloc(75,sizeof(char));

    path = (char *) calloc(75,sizeof(char));

    if ( (fscanf(listp,"%s",filename)) != 0)

        {

          strcpy(path,home);

          strcat(path,filename);

           

          if ((datap = fopen(path,"r")) == NULL)

          {

              printf("Couldn't open index file\n" );

              exit(0);

          }

          else

          { printf("opened data file %s\n", filename);

            fscanf(datap,"%s",filename); //read header line

              //printf("%s\n",filename);

          }

      }

    free (filename);

    free (path);

    return (datap);

}
/[code]
 
Congratulations, you have once again made me say "WTF" while reading your code.

Suggestions for future posts:

Use a separate CODE tag for each file (.c, .h, etc) that you post, and give the file names
Before you post the code run it through 'ident' to fix the formatting
Include the data files (files.tx in this case)
Include the line number along with the error you are getting
Why are you deciding on num_of_indices at compile time? Why not just read/process the file until eof?
Buy a book on C and learn how to use pointers (link)
Don't put your code in a HEADER file
Run your code through 'splint' to find possible errors
Learn Python or Ruby it will save you the frustration

(Install Homebrew on your Mac to install splint or indent)
 
Homebrew didn't load on to my laptop.
I never said I was a good programmer but it is time to improve.
Python and Ruby are too slow.
 
I don't know why exactly this fails, but for the failing line:

strcpy(indx.date[j],strtok(NULL,",")); //FAILS HERE

-> strtok() can return NULL if there are no more tokens, and strcpy(destination, NULL) is undefined and may crash, normally with SIGSEGV (Segmentation Fault) or possibly EXEC_BAD_ACCESS. As such, this is one thing to check.

In general in terms of this code, I would suspect bad input. For example, the do {...}
while( (strcmp(first_date,indx.date[j++]))!= 0); loop will presumably not stop until you match first_date. So if you have an input line with no matching date, are you sure the loop will terminate before running out of tokens and having strtok() returning NULL? In this case you'd be better with error checking, for example:

char *next_token;
...
do {
.....
next_token = strtok(NULL, ",");
if (next_token != NULL) {
strcpy(indx.date[j], next_token);
} else {
/* print error */
}
} while (next_token && (strcmp(first_date,indx.date[j++]) != 0);

The other main comment is that you have lots of fixed-length buffers (20, 75, etc.) which means if you have input file contents that are longer than this you'll get buffer overruns and random hard-to-diagnose crashes. You really should look at checking the length of input lines and other strings, and use the length-limited functions such as strlcpy() or strncpy() rather than strcpy(). For example from the Linux strcpy man page:

BUGS
If the destination string of a strcpy() is not large enough, then anything might happen. Overflowing fixed-length
string buffers is a favorite cracker technique for taking complete control of the machine. Any time a program reads or
copies data into a buffer, the program first needs to check that there's enough space. This may be unnecessary if
you can show that overflow is impossible, but be careful: programs can get changed over time, in ways that may
make the impossible possible.

Unless this is a one-off program you really should do length checking other wise you're liable to have random buffer overrun crashes with different data in future. Or use an interpreted language that guarantees safe string handling such as perl, python, etc.
 
Baloney. Plus, you'll save time programming by not fighting all these rinky dink errors

I have to agree with Robvas. If this is one-off program then you will get a working program much faster with a scripting language, and this will overwhelm any run time savings in C. And if this is something performance critical you're going to run hundreds of times then you really need to do a lot more error checking or you will have endless crashes every time you have an input file that overflows one of your buffers or breaks one of your other assumptions. Plus, the scripting languages don't have the same risk of random memory corruption bugs which may silently product the wrong results as well as crashing.

I write all my stuff in scripting languages nowadays (mostly Perl), even scripts that run on very large datasets.
 
Definitely agree too. Learning C doesn't necessarily make you a better programmer.
A good progammer has to recognize the necessary tools and use them accordingly.

Writing a rock solid program in pure C isn't easy and is almost never the best solution (for a hobbyist).
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.