Exercise: web page analysis

Write a PHP script called wwc ...
  • performs a similar task to wc, but on web pages
  • takes a single argument, which is a URL
  • discards everything outside <body>...</body>
  • removes all HTML tags
  • compresses runs of spaces into a single space
  • counts words and lines for whatever remains
  • word = sequence of non-spaces surrounded by spaces
  • line = sequence of non-'\n' chars, terminated by '\n'
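
One possible shape for a solution, as a rough sketch rather than a model answer. The helper names (body_of, clean_text, wwc_counts) are made up for illustration, the page is assumed to be fetched with file_get_contents (which needs allow_url_fopen enabled), and a "space" is assumed to mean a space or tab when compressing runs, with any whitespace separating words.

```php
<?php
// wwc: a rough "wc for web pages" (illustrative sketch only)

// Keep only what lies between <body ...> and </body>.
// Assumption: fall back to the whole document if no body element is found.
function body_of(string $html): string
{
    if (preg_match('/<body\b[^>]*>(.*?)<\/body\s*>/is', $html, $m)) {
        return $m[1];
    }
    return $html;
}

// Remove HTML tags, then squeeze each run of spaces/tabs into one space.
// Newlines are left alone so that lines can still be counted afterwards.
function clean_text(string $html): string
{
    $text = strip_tags(body_of($html));
    return preg_replace('/[ \t]+/', ' ', $text);
}

// Return [words, lines]:
//   word = maximal run of non-whitespace characters
//   line = anything terminated by "\n", so we count the "\n" characters
function wwc_counts(string $text): array
{
    $words = preg_match_all('/\S+/', $text);
    $lines = substr_count($text, "\n");
    return [$words, $lines];
}

// Command-line entry point: php wwc.php URL
// Assumption: with no URL argument the file only defines the functions.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $html = @file_get_contents($argv[1]);
    if ($html === false) {
        fwrite(STDERR, "wwc: cannot fetch {$argv[1]}\n");
        exit(1);
    }
    [$words, $lines] = wwc_counts(clean_text($html));
    echo "$words words, $lines lines\n";
}
```

Note that strip_tags removes only the tags themselves, so any text inside <script> or <style> elements within the body would still be counted. Whether that is acceptable depends on how strictly "removes all HTML tags" is read.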