[OGo-Developer] parsing HTML

Wolfgang Sourdeau developer@opengroupware.org
Wed, 14 Feb 2007 15:37:22 -0500


> It's better to use
> 
> -createXMLReaderForMimeType:@"text/html"
> 
> although the result is the same in this case (unless you provide an
> alternative driver which is also capable of parsing HTML). It should
> also be noted that libxml is particularly good at reading b0rked HTML
> and producing a reasonable representation of it.

Thanks.


I am currently testing it, but it looks like SaxObjC chokes on
character entities ("&eacute;", ...). Is there a reason why that
would be the case?


Wolfgang