Analyzing the Results

Now you’re ready to write some code to analyze the results generated by test-classifier. Recall that test-classifier returns the list returned by test-from-corpus in which each element is a plist representing the result of classifying one file. This plist contains the name of the file, the actual type of the file, the classification, and the score returned by classify. The first bit of analytical code you should write is a function that returns a symbol indicating whether a given result was correct, a false positive, a false negative, a missed ham, or a missed spam. You can use DESTRUCTURING-BIND to pull out the :type and :classification elements of an individual result list (using &allow-other-keys to tell DESTRUCTURING-BIND to ignore any other key/value pairs it sees) and then use nested ECASE to translate the different pairings into a single symbol.

(defun result-type (result)
  (destructuring-bind (&key type classification &allow-other-keys) result
    (ecase type
      (ham
       (ecase classification
         (ham 'correct)
         (spam 'false-positive)
         (unsure 'missed-ham)))
      (spam
       (ecase classification
         (ham 'false-negative)
         (spam 'correct)
         (unsure 'missed-spam))))))

You can test out this function at the REPL.

SPAM> (result-type '(:FILE #p"foo" :type ham :classification ham :score 0))
CORRECT
SPAM> (result-type '(:FILE #p"foo" :type spam :classification spam :score 0))
CORRECT
SPAM> (result-type '(:FILE #p"foo" :type ham :classification spam :score 0))
FALSE-POSITIVE
SPAM> (result-type '(:FILE #p"foo" :type spam :classification ham :score 0))
FALSE-NEGATIVE
SPAM> (result-type '(:FILE #p"foo" :type ham :classification unsure :score 0))
MISSED-HAM
SPAM> (result-type '(:FILE #p"foo" :type spam :classification unsure :score 0))
MISSED-SPAM

Having this function makes it easy to slice and dice the results of test-classifier in a variety of ways. For instance, you can start by defining predicate functions for each type of result.

(defun false-positive-p (result)
  (eql (result-type result) 'false-positive))

(defun false-negative-p (result)
  (eql (result-type result) 'false-negative))

(defun missed-ham-p (result)
  (eql (result-type result) 'missed-ham))

(defun missed-spam-p (result)
  (eql (result-type result) 'missed-spam))

(defun correct-p (result)
  (eql (result-type result) 'correct))

With those functions, you can easily use the list and sequence manipulation functions I discussed in Chapter 11 to extract and count particular kinds of results.

SPAM> (count-if #'false-positive-p *results*)
6

SPAM> (remove-if-not #'false-positive-p *results*)
((:FILE #p"ham/5349" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.9999983107355541d0)
 (:FILE #p"ham/2746" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.6286468956619795d0)
 (:FILE #p"ham/3427" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.9833753501352983d0)
 (:FILE #p"ham/7785" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.9542788587998488d0)
 (:FILE #p"ham/1728" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.684339162891261d0)
 (:FILE #p"ham/10581" :TYPE HAM :CLASSIFICATION SPAM :SCORE 0.9999924537959615d0))

You can also use the symbols returned by result-type as keys into a hash table or an alist. For instance, you can write a function to print a summary of the counts and percentages of each type of result using an alist that maps each type plus the extra symbol total to a count.

(defun analyze-results (results)
  (let* ((keys '(total correct false-positive
                 false-negative missed-ham missed-spam))
         (counts (loop for x in keys collect (cons x 0))))
    (dolist (item results)
      (incf (cdr (assoc 'total counts)))
      (incf (cdr (assoc (result-type item) counts))))
    (loop with total = (cdr (assoc 'total counts))
          for (label . count) in counts
          do (format t "~&~@(~a~):~20t~5d~,5t: ~6,2f%~%"
                     label count (* 100 (/ count total))))))

This function will give output like this when passed a list of results generated by test-classifier:

SPAM> (analyze-results *results*)
Total:               3761 : 100.00%
Correct:             3689 :  98.09%
False-positive:         4 :   0.11%
False-negative:         9 :   0.24%
Missed-ham:            19 :   0.51%
Missed-spam:           40 :   1.06%
NIL

And as a last bit of analysis you might want to look at why an individual message was classified the way it was. The following functions will show you:

(defun explain-classification (file)
  (let* ((text (start-of-file file *max-chars*))
         (features (extract-features text))
         (score (score features))
         (classification (classification score)))
    (show-summary file text classification score)
    (dolist (feature (sorted-interesting features))
      (show-feature feature))))

(defun show-summary (file text classification score)
  (format t "~&~a" file)
  (format t "~2%~a~2%" text)
  (format t "Classified as ~a with score of ~,5f~%" classification score))

(defun show-feature (feature)
  (with-slots (word ham-count spam-count) feature
    (format
     t "~&~2t~a~30thams: ~5d; spams: ~5d;~,10tprob: ~,f~%"
     word ham-count spam-count (bayesian-spam-probability feature))))

(defun sorted-interesting (features)
  (sort (remove-if #'untrained-p features) #'< :key #'bayesian-spam-probability))
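
The text mentions that the symbols returned by result-type can serve as keys into a hash table as well as an alist, but only the alist version is shown. The following is a minimal sketch of the hash table variant. The name count-result-types is hypothetical and not part of the chapter's code, and result-type is repeated here only so the sketch can be loaded on its own.

```lisp
;; result-type as defined in the text, repeated so this sketch is self-contained.
(defun result-type (result)
  (destructuring-bind (&key type classification &allow-other-keys) result
    (ecase type
      (ham
       (ecase classification
         (ham 'correct)
         (spam 'false-positive)
         (unsure 'missed-ham)))
      (spam
       (ecase classification
         (ham 'false-negative)
         (spam 'correct)
         (unsure 'missed-spam))))))

;; Hypothetical helper (an assumption, not from the chapter): tally the
;; results by type in a hash table keyed on the symbols result-type returns.
(defun count-result-types (results)
  (let ((counts (make-hash-table :test #'eq)))
    ;; DOLIST returns COUNTS once every result has been tallied.
    (dolist (item results counts)
      ;; The default of 0 lets INCF create a missing entry on first use.
      (incf (gethash (result-type item) counts 0)))))
```

With this in place, an expression such as (gethash 'false-positive (count-result-types *results*) 0) would look up a single count. Unlike the alist in analyze-results, the table holds entries only for result types that actually occurred, so lookups should supply a default of 0.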