Exercise: web page analysis
Write a PHP script called wwc that:
- performs a similar task to wc, but on web pages
- takes a single argument, which is a URL
- discards everything outside <body>..</body>
- removes all HTML tags
- compresses runs of spaces into a single space
- counts words and lines for whatever remains
- word = sequence of non-spaces surrounded by spaces
- line = sequence of non-'\n' chars, terminated by '\n'
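
One possible shape for a solution, as a minimal sketch rather than a model answer. The function names wwc_clean and wwc_count are made up for illustration. It assumes that allow_url_fopen is enabled so file_get_contents can fetch a URL, that a page with no <body> element is counted in full, and that tabs and newlines separate words as well as spaces (as wc does).

```php
<?php
// wwc sketch: count lines and words in the <body> text of a web page.

// Reduce raw HTML to the text that gets counted.
function wwc_clean(string $html): string
{
    // Keep only what lies between <body ...> and </body>.
    // Assumption: if no body element is found, keep the whole document.
    if (preg_match('/<body\b[^>]*>(.*?)<\/body>/is', $html, $m)) {
        $html = $m[1];
    }
    // Remove all HTML tags (the text between the tags is kept).
    $text = strip_tags($html);
    // Compress each run of spaces into a single space.
    return preg_replace('/ +/', ' ', $text);
}

// Return [lines, words] for already-cleaned text.
function wwc_count(string $text): array
{
    // Line: terminated by "\n", so count the newline characters.
    $lines = substr_count($text, "\n");
    // Word: maximal run of non-whitespace characters (assumption:
    // tabs and newlines also act as separators, not only spaces).
    $words = preg_match_all('/\S+/', $text);
    return [$lines, $words];
}

// Command-line entry point: takes a single argument, the URL.
$args = $argv ?? [];
if (PHP_SAPI === 'cli' && count($args) === 2) {
    $html = @file_get_contents($args[1]);
    if ($html === false) {
        fwrite(STDERR, "wwc: cannot fetch {$args[1]}\n");
    } else {
        [$lines, $words] = wwc_count(wwc_clean($html));
        printf("%d %d %s\n", $lines, $words, $args[1]);
    }
} elseif (PHP_SAPI === 'cli') {
    fwrite(STDERR, "Usage: wwc URL\n");
}
```

If saved as wwc.php, it could be run as: php wwc.php URL. Note that strip_tags keeps the contents of <script> and <style> elements, so any such text inside the body is counted too.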