
nashyo

macrumors 6502
Original poster
Oct 1, 2010
299
0
Bristol
I need to learn how to interact with an HTML webpage via Objective-C, and render that webpage so that it's suitable for the iPhone's limited screen size.

I realise UIWebView would be a great choice, but I only want specific information to show, such as the title, body text, and some images.

I have done a lot of reading around the subject and I believe I need to locate tags within the source HTML code, and assign their contents to objects within my Objective-C code. After acquiring the objects I can then assign them to custom UIViews however I like.

Am I on the right track here? Can anyone guide me in the right direction? I'm not looking for any code from you, just some direction.

Thanks
Rob
 
What are you trying to do exactly? There are better mechanisms out there than parsing HTML, unless you have no control over the source. If you have control over the source, look into JSON or valid XML. There are libraries and frameworks out there for both, and they are easier to deal with than HTML when extracting information.

If you want to parse any webpage and offer a "simplified" view, then I'm sorry to say that's going to be quite a bit of work, especially considering that HTML on the Web is quite often very malformed.
 
I'm building an RSS feed reader app that presents a table view as the initial view controller. When a feed is selected, I want the webpage to load in a new view and I want the webpage to appear re-rendered, similar to how it re-renders in Safari when the 'Reader' option is chosen by the user.

The webpage I'm interested in is not compatible with the 'Reader' feature of Safari.

Any ideas would be appreciated. Thanks.

----------

There is an app called Medical News, which is an RSS feed reader. The image below shows what happens when a feed is selected.

[Image: 24oqzx2.png]

It looks cool, and it's what I want to study next. How does the webpage change like this, so that it looks cool on an iPhone screen?
 
I don't know how that app works, but you should probably look at the RSS feed. It might be that the app generates HTML based on the info from the RSS feed and displays that in the web view.
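
As a rough illustration of that idea (not how the Medical News app is known to work), here is a minimal sketch that wraps a feed item's title and body text in a small HTML page sized for the phone. The helper names EscapeHTML and HTMLForFeedItem are made up for this example, and it assumes the body is plain text rather than markup.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes characters that would otherwise be read as
// markup when plain text is dropped into an HTML page.
static NSString *EscapeHTML(NSString *text) {
    NSString *s = [text stringByReplacingOccurrencesOfString:@"&" withString:@"&amp;"];
    s = [s stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
    s = [s stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];
    return s;
}

// Hypothetical helper: builds a simple page from a feed item's title and
// plain-text body. The viewport tag asks the web view to lay the page out
// at the device width instead of a desktop width.
static NSString *HTMLForFeedItem(NSString *title, NSString *body) {
    return [NSString stringWithFormat:
        @"<html><head>"
        @"<meta name=\"viewport\" content=\"width=device-width\">"
        @"<style>body { font-family: Helvetica; font-size: 17px; margin: 12px; }</style>"
        @"</head><body><h1>%@</h1><p>%@</p></body></html>",
        EscapeHTML(title), EscapeHTML(body)];
}
```

The resulting string can be handed to a UIWebView with loadHTMLString:baseURL:. If the feed's description already contains HTML, the escaping step would need to be skipped for the body.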
 
If I'm correct in what I think you're trying to do, you may find some help by looking at this link...

http://stackoverflow.com/questions/3541615/whats-the-best-approach-for-parsing-xml-screen-scraping-in-ios-uiwebview-or

Though personally, I'd probably re-format the feed using my own web server and then connect the iPhone to that instead of the initial feed. This has two advantages:

a) It saves processing power on the iPhone
b) It's a little more forgiving if the HTML page changes: you don't have to release a new app, you can fix it on your server end
 
Thanks for this. Very useful.

A website that has a PDF file attached has source code that looks like this:
Code:
<span class="AdxAttachment"><a href="/Default.aspx?DN=0cd4cc21-7d85-4f4b-8a0b-638600859952" target="_blank" class="title">View a PDF of this factsheet</a><div class="summary"></div></span>

I want to extract this: /Default.aspx?DN=0cd4cc21-7d85-4f4b-8a0b-638600859952

Is there a way of doing it with minimal parsing of HTML?
 
Break it down and try the simple way first:
1. Find the substring "href=" (use method rangeOfString:)
2. Look at the next character.
3. If it's double-quote or single-quote, find the next instance of that character.
4. Extract the part between the quotes.

You should know how NSRange works when applied to NSString.

Oh, and if the parsing algorithm fails, return the URL of some built-in error page, so you know it failed and can go look at the HTML that caused the failure and improve your algorithm.
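
The four steps above, plus the error-page fallback, could be sketched roughly like this. The function name FirstHrefInHTML and the error page name "error.html" are placeholders made up for this example, and the sketch only looks at the first href= it finds.

```objc
#import <Foundation/Foundation.h>

// Hypothetical fallback value, returned whenever the parse fails.
static NSString * const kParseErrorPage = @"error.html";

// Returns the value of the first href attribute in html,
// or kParseErrorPage if the expected pattern is not found.
static NSString *FirstHrefInHTML(NSString *html) {
    // 1. Find the substring "href=".
    NSRange attr = [html rangeOfString:@"href="];
    if (attr.location == NSNotFound) return kParseErrorPage;

    // 2. Look at the next character.
    NSUInteger quoteIndex = NSMaxRange(attr);
    if (quoteIndex >= html.length) return kParseErrorPage;
    unichar quote = [html characterAtIndex:quoteIndex];
    if (quote != '"' && quote != '\'') return kParseErrorPage;

    // 3. Find the next instance of that same quote character.
    NSUInteger start = quoteIndex + 1;
    NSString *quoteString = [NSString stringWithCharacters:&quote length:1];
    NSRange closing = [html rangeOfString:quoteString
                                  options:0
                                    range:NSMakeRange(start, html.length - start)];
    if (closing.location == NSNotFound) return kParseErrorPage;

    // 4. Extract the part between the quotes.
    return [html substringWithRange:NSMakeRange(start, closing.location - start)];
}
```

On a full page there will usually be other links before the one you want, so it would make sense to first find the range of "AdxAttachment" and start the href= search from there, using the same rangeOfString:options:range: call.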
 