I am trying to scrape text values from a website. I have been able to parse the URL, but I am new to XPath in R, so I am not sure how to pull out all the text values that have a tag like
<p class="MsoNormal" align="justify"> text </p>
How do I specify the path to this specific tag and get the text value? This is what I am trying right now:
pizzaraw<-xpathSApply(pizzadoc, "//p[@class='MsoNormal']", xmlValue)
Is this the right approach? R does not seem to respond to the code.
It's difficult to know what is wrong given that your example is not self-contained, but here is a self-contained one that works:
Lines <- '<html>
<p class="MsoNormal" align="justify"> text </p>
</html>
'
library(XML)
root <- htmlTreeParse(Lines, asText = TRUE, useInternalNodes = TRUE)
doc <- xmlRoot(root)
xpathSApply(doc, '//p[@class="MsoNormal"]', xmlValue, trim = TRUE)
## [1] "text"
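If the page has several such paragraphs mixed in with others, the same call returns one value per match. Here is a minimal sketch of that case; the sample HTML is made up for illustration, since the real page from the question is not shown. It also matches on both attributes of the tag from the question. Note that the XPath functions need a document parsed with internal nodes (htmlParse does this by default), which may be worth checking for pizzadoc.

```r
library(XML)

# Hypothetical sample page; stands in for the real page, which is not shown
html <- '<html><body>
<p class="MsoNormal" align="justify"> first </p>
<p class="Other"> skip me </p>
<p class="MsoNormal" align="justify"> second </p>
</body></html>'

# htmlParse returns an internal document, which xpathSApply requires
doc <- htmlParse(html, asText = TRUE)

# Select only <p> nodes carrying both attributes, then extract trimmed text
vals <- xpathSApply(doc,
                    "//p[@class='MsoNormal' and @align='justify']",
                    xmlValue, trim = TRUE)
print(vals)
```

The result is a character vector with one element per matching paragraph; the paragraph with a different class is left out.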