
gregor.hoch

macrumors member
Original poster
Apr 5, 2011
51
6
Hi,

I am getting an EXC_BAD_ACCESS error when my C++ program (I am using Xcode) runs on certain files. The program extracts the annotations and highlights from a PDF file and is based on the poppler library. The error only occurs with files created by a certain PDF viewer (files from other viewers work fine). Below is the (reduced) main function. The error occurs when I use the variable 'text' in the line:
Code:
outputString=std::string(fileName->getCString())+opS+text+"\n";
When I exclude the variable, everything works. Breakpoints suggest that the variable has been optimized away by the compiler even though it is used in the line above.

Code:
GooString *getContents() const { return contents; }
char *getCString() const { return s; }

Code:
int main(int argc, char *argv[]) {
    PDFDoc *doc;
    GooString *fileName;
    GBool outputToFile= gFalse;
    std::ofstream outputFile;
    std::string outputString("");
    std::string opS(" ; ");
    Object info;
    char *text;
    
    // read config file
    globalParams = new GlobalParams();

    // get filename
    fileName = new GooString(argv[1]);

    // get pdf document
    doc = PDFDocFactory().createPDFDoc(*fileName, NULL, NULL);

    // iterate through pages
    for (int page = 1; page <= doc->getNumPages(); ++page) {
                
        // get current page
        Page *currentPage= doc->getPage(page);

        // get annotations
        Annots *annots=currentPage->getAnnots(doc->getCatalog());              
                
        // number of annotations
        int n_annots=annots->getNumAnnots();   
                    
        // iterate through annotations
        for (int a = 0; a < n_annots; ++a) {            
            
            // get annotation
            Annot *annot=annots->getAnnot(a);
                        
            // get content of annotation
            text=annot->getContents()->getCString();                 

            // create output line
            outputString=std::string(fileName->getCString())+opS+text+"\n";          

            // output annotation information    
            printf("%s",  outputString.c_str());
            
        }        
    }
 
    return 0;
}
 
I can only offer a guess. The compiler might not be executing this line until "text" is needed:

Code:
text=annot->getContents()->getCString();

and further, the following line may not be executed until needed in the above line:
Code:
 Annot *annot=annots->getAnnot(a);

So, if you removed the "text" variable from the output, none of that ever gets called.

What is the value of "a" when you get the EXC_BAD_ACCESS? How many annotations are in the PDF?

I don't know anything about the internal structure of PDFs or the library you are using, but my guess is that for some files there are no annotations, yet you are being returned a non-zero count.
 
You seem to assume that every annotation is a text annotation. Yet not all annotation types have contents. For those that don't, Annot::initialize(XRef*, Dict*, Catalog*) sets Annot::contents to NULL. So Annot::getContents() can return NULL, and your code doesn't check for that before calling getCString() on the result.
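
To illustrate the check, here is a minimal sketch. It does not use the real poppler classes: FakeGooString and FakeAnnot are hypothetical stand-ins that model only the two accessors quoted in the question, and contentsOrEmpty is a helper name made up for this example.

Code:
```cpp
#include <string>

// Hypothetical stand-in for poppler's GooString; only getCString() is modelled.
class FakeGooString {
public:
    explicit FakeGooString(const char *str) : s(str) {}
    const char *getCString() const { return s.c_str(); }
private:
    std::string s;
};

// Hypothetical stand-in for poppler's Annot; contents may be a null pointer,
// as described above for annotations that have no contents.
class FakeAnnot {
public:
    explicit FakeAnnot(const FakeGooString *c) : contents(c) {}
    const FakeGooString *getContents() const { return contents; }
private:
    const FakeGooString *contents;
};

// Returns the annotation text, or an empty string when getContents() is null.
// The null check happens before getCString() is ever called.
std::string contentsOrEmpty(const FakeAnnot &annot) {
    const FakeGooString *contents = annot.getContents();
    if (contents == nullptr) {
        return std::string();
    }
    return std::string(contents->getCString());
}
```

In the loop from the question, the same idea would mean testing the pointer returned by annot->getContents() first and only then calling getCString() on it, falling back to an empty string otherwise.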
 

HA, now it works! Thanks so much!
I changed the line to
Code:
if(annot->getContents()!=0) text=annot->getContents()->getCString();
and set text=""; before that.

I only use 'Text' and 'Highlight' annotations, so I thought they always have a defined contents field, even if it is just "". I left that filtering and the extraction of the highlighted text out of the posted code to keep it short.

Also thanks to wlh99!
 