The program consumes the name n of a file, reads the file, removes the articles, and writes
the result out to a file whose name is the result of concatenating "no-articles-" with n.
For this exercise, an article is one of the following three words: "a", "an", and "the".
Use read-words/line so that the transformation retains the organization of the
original text into lines and words. When the program is designed, run it on the
Piet Hein poem.
; these requires let us print out debugging info
; and read/write the files to disk
(require racket/base)
(require 2htdp/batch-io)
; the list of articles we are going to strip
(define ARTICLES (list "a" "the" "an"))
; List-of-Strings String -> Boolean
; Returns #true if a word is permitted (i.e. it is not in
; our list of ARTICLES) and #false otherwise
; articles is a List-of-Strings
; word is a String
(define (permitted? articles word)
  (cond ((empty? articles) #true)
        ((string=? (first articles) word) #false)
        ((cons? articles) (permitted? (rest articles) word))))
; test our permitted? function is working as expected
(check-expect (permitted? ARTICLES "bob") #true)
(check-expect (permitted? ARTICLES "the") #false)
Racket includes a filter function, which does about 90% of this exercise, but we don't want to use it as then we're not learning anything! So . . . let's roll our own.
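For comparison, here is a minimal sketch of what the built-in filter version could look like. It assumes full Racket (racket/base) rather than a teaching language, and strip-articles is a hypothetical name, not something from the exercise.

```racket
#lang racket/base

; the same article list as in the post
(define ARTICLES (list "a" "the" "an"))

; List-of-Strings -> List-of-Strings
; keeps only the words that are not in ARTICLES, using the built-in filter
; (strip-articles is a hypothetical name chosen for this sketch)
(define (strip-articles los)
  (filter (lambda (word) (not (member word ARTICLES))) los))

(strip-articles (list "a" "quick" "brown" "fox" "is" "an" "elephant"))
; => '("quick" "brown" "fox" "is" "elephant")
```

The hand-rolled version that follows does the same job with explicit recursion.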
; List-of-Strings -> List-of-Strings
;
; Returns a list of strings with the ARTICLES filtered out
; We need to call this function my_filter (or something other
; than filter) as filter is defined by racket/base
;
; A List-of-Strings is one of:
; – empty
; – (cons String List-of-Strings)
(define (my_filter los)
  (cond ((empty? los) empty)
        ((cons? los)
         (fprintf (current-output-port) "filter: ~a permit: ~a\n"
                  (first los)
                  (permitted? ARTICLES (first los)))
         (cond ((permitted? ARTICLES (first los))
                (cons (first los)
                      (my_filter (rest los))))
               (else (my_filter (rest los)))))))
I thought there was a string->word-list type function that I could use here, but apparently not. I probably should have written a quick one, but writing the list out by hand will do for now.
; Test we are correctly stripping out the non-permitted words
; from our list
(check-expect (my_filter (list "a" "quick" "brown" "fox"
                               "is" "an" "elephant"))
              (list "quick" "brown" "fox" "is" "elephant"))
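On the string->word-list point: full Racket does ship string-split in racket/string, although it may not be available in the teaching languages. A small sketch, assuming racket/base plus racket/string, with string->word-list as a hypothetical wrapper name:

```racket
#lang racket/base
(require racket/string)

; String -> List-of-Strings
; splits a string into words on whitespace
; (string->word-list is a hypothetical name for this sketch)
(define (string->word-list s)
  (string-split s))

(string->word-list "a quick brown fox is an elephant")
; => '("a" "quick" "brown" "fox" "is" "an" "elephant")
```

With something like this, the test input above could be built from a single string instead of a hand-written list.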
; processes a list of List-of-Strings, filtering each list in turn
; returns a List-of-Lists-of-Strings
(define (process llos)
  (cond ((empty? llos) empty)
        ((cons? llos) (cons (my_filter (first llos))
                            (process (rest llos))))))
; test our process function is removing the items from our list as
; we'd expect
(check-expect (process (list (list "a" "cat")
                             (list "a" "quick" "brown")
                             (list "fox" "is" "an" "elephant")))
              (list (list "cat")
                    (list "quick" "brown")
                    (list "fox" "is" "elephant")))
And now we're pretty much finished - we can actually run the program that the exercise asks for. We still have to convert the List of Lists of Strings that process returns into a String that write-file can use, but luckily I already wrote a couple of functions that do exactly that.
; wrap this round a read/write block and add in our new file name
(define (process-file filename)
  (write-file (string-append "no-articles-" filename)
              (collapse-lines (process (read-words/line filename)))))
; These last two functions are taken from my answer to
; exercise 146
; List-of-List-of-Strings -> String
; collapses a list of lists of strings into a single string.
(define (collapse-lines llos)
  (cond ((empty? llos) "")
        ((cons? llos) (string-append
                       (collapse-los (first llos)) "\n"
                       (collapse-lines (rest llos))))))
; List-of-Strings -> String
; collapses a 1 dimensional list of tokens into a string
; (i.e. NOT a list of lists of strings, just a list of strings)
(define (collapse-los los)
  (cond ((empty? los) "")
        ((= 1 (length los)) (first los))
        ((cons? los) (string-append
                      (first los)
                      " "
                      (collapse-los (rest los))))))
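As an aside, the two collapse functions can be written more compactly outside the teaching languages. This is only a sketch, assuming racket/base plus string-join from racket/string; collapse-lines/join is a hypothetical name.

```racket
#lang racket/base
(require racket/string)

; List-of-List-of-Strings -> String
; joins the words of each line with a single space and ends
; every line, including the last, with a newline
; (collapse-lines/join is a hypothetical name for this sketch)
(define (collapse-lines/join llos)
  (apply string-append
         (map (lambda (los) (string-append (string-join los " ") "\n"))
              llos)))

(collapse-lines/join (list (list "Put" "up" "in" "place")
                           (list)
                           (list "T.T.T.")))
; => "Put up in place\n\nT.T.T.\n"
```

Like collapse-lines, it turns an empty line into a bare newline, so blank lines in the input survive.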
> (process-file "ttt.txt")
"no-articles-ttt.txt"
dgs@dgs-netbook:~/code/htdp$ cat ttt.txt
TTT
Put up in a place
where it's easy to see
the cryptic admonishment
T.T.T.
When you feel how depressingly
slowly you climb,
it's well to remember that
Things Take Time.
Piet Hein
dgs@dgs-netbook:~/code/htdp$ cat no-articles-ttt.txt
TTT
Put up in place
where it's easy to see
cryptic admonishment
T.T.T.
When you feel how depressingly
slowly you climb,
it's well to remember that
Things Take Time.
Piet Hein
Success!