Reading Comprehension Skills and Dictionary-Based Home Testing — Samuel Johnson Style
- By Robert Oliphant Columnist EdNews.org
- Published 02/12/2007





Robert Oliphant’s best known book is “A Piano for Mrs. Cimino” (Prentice Hall), which was made into an award-winning EMI film (Monte Carlo, US Directors) starring Bette Davis. His best known work for musical theater (music, lyrics, and libretto) is “Oscar Wilde’s Earnest: A Chamber Opera for Eight Voices and Chorus.” He has a PhD from Stanford, where he studied medieval lexicography under Herbert Dean Meritt, and taught there as a visiting professor of English and Linguistics. He currently serves as executive director of The Alliance for High Speed Recreational Reading, and formerly served as executive director of Californians for Community College Equity. A resident of
Official and quasi-official tests of reading comprehension are increasingly important these days, as indicated by Question 54 from a recent Florida Comprehensive Assessment Test booklet. . . . “In his response to Abigail Adams' letter [it precedes this question] of March 31, 1776, John Adams wrote the following. . . . Your letter was the first intimation that another tribe, more numerous and powerful than all the rest, were grown discontented. . . . Based on information in Abigail Adams' letter, what is the "tribe" to which John Adams is referring? . . . F. ancestors. . . . G. husbands. . . . H. ladies. . . . I. tyrants.”
For seventh graders and their parents, multiple meaning questions like these are cognitive nightmares, even for those who can sniff out John’s covert sexism and come up with (H) as the officially correct answer. Even for more literal-minded youngsters, to whom a tribe is a tribe is a tribe, the equation of “ladies” with “tribe” may make perfect sense on the surface. But who of us could explain to a literal-minded seventh-grader (many of them are still just that) exactly why this is so? And how on earth could we use a current dictionary to get our point across? Or in more focused, practical terms, what can parents and teachers do that will help literal-minded young Americans to recognize and comprehend what a multiple-meaning word means in a specific meaning-in-context test question — demonstrably so?
Dictionaries as test question sources. . . . To most Americans, tests in general are quite mysterious. Who designs them? Where do the questions come from? Who decides what the correct answers are? Where do we go to complain? Questions like these are simply out of bounds for our current alphabet soup of high-stakes tests: FCAT, ACT, SAT, GMAT, GRE, LSAT, MCAT, etc. In contrast, the drivers tests we take in most states are crystal clear, largely because they are explicitly based upon a printed study handbook which in turn is based upon the state vehicle code. Consequently, since test takers know exactly where the questions come from, they know what to study and how to measure their progress by testing themselves, often with the help of a friend or family member.
Our drivers test analogy invites the construction of FCAT-style meaning-in-context questions using an American college-size dictionary, especially one that follows Dr. Johnson’s practice of presenting short phrases and sentences as illustrative examples of specific meaning-in-context definitions. As we shall see with respect to multi-definition word entries, roughly a third of all definitions are accompanied by such phrases or sentences, enough to warrant more emphasis upon the American dictionary as both a test-construction tool and a learning tool.
An FCAT-dictionary question comparison. . . . The comparison presented in Appendix One presents a very strong case for tests, paralleling our FCAT example, that use a specific dictionary’s phrases or sentences as the basis for meaning-in-context questions. Appendix Two takes this case even further by applying this FCAT-based construction technique to a randomly chosen group of words, namely, those in the first verse of the Star Spangled Banner (LIGHT, PROUD, GLEAM, etc.).
Appendix Two, simply put, uses meaning-in-context phrases and sentences from the dictionary itself, along with dictionary definitions, in questions like the following: “Which of the following dictionary definitions for LIGHT best fits its use in the sentence, “This table lamp won't light”: A. to guide or conduct with a light; B. to become illuminated when switched on; C. to set burning.”
The potential thrift and practicality of this dictionary-based method, incidentally, is indicated by the fact that the average American college-size dictionary (1500 pp., 70,000 headwords, 200,000 definitions) contains at least 80,000 meaning-in-context illustrative phrases or sentences suitable for use by do-it-yourself reading comprehension test designers.
A learner-helper dialogue. . . . Appendix Three moves our case for dictionary-based learning and test construction into a more informal setting, namely, a dialogue between two participants, LEARNER and HELPER, the second of whom uses the family dictionary to select reading comprehension questions at random and to score LEARNER’s correct answers on a vocabulary-achievement scale. The process has four steps: (A) pick an entry word from the family dictionary with a meaning-in-context passage in its entry, (B) present the passage accompanied by its definition, (C) present at least one other definition, and (D) offer LEARNER a choice between those two or more alternatives. That’s all there is to this simple do-it-yourself testing process.
The simplicity of this process here should not obscure its importance. Test questions like these strike at the very heart of reading and listening comprehension, which is our ability to go beyond the words themselves to comprehend precisely what’s in a specific phrase or sentence. Socially considered, it’s an ability which varies greatly; not all second graders can explain what’s funny about a riddle like “Why do contented cows need cow bells? . . . [answer] Because their horns don’t work.”
Nor can all American senior citizens produce satisfactory answers to senile dementia diagnostic questions (CAT scans come later) like “What did Benjamin Franklin mean when he told his friends that they must all hang together if they didn’t want to hang separately”? Second grader or seventy-year-old — it’s our ability to jump past literal-mindedness that holds our civilization together, especially its laws, its science, its literature, and our national awareness.
The Jon Twing challenge. . . . What’s here is not intended as a condemnation of the FCAT or other tests of multiple-meaning awareness (word analogies, synonym-antonym relationships, etc.). Far from it. Those tests and their designers have recognized the multiple-meaning challenge and made the first steps. So I’m fairly sure that what’s in these three appendices will be taken as a logical extension of current psychometric practice, and a helpful one.
By way of illustration: Jon Twing, Executive Vice President of Pearson Educational Measurement, has recently called for the adoption of a national standard of educational measurement that will be “transparent, verifiable, and not too complex,” thereby tacitly admitting that what we today call “standardized testing” does not itself have a “standard” for Americans to use in deciding, for example, which spelling bee question is “more difficult” than another, or which crossword puzzle is “more difficult” — New York Times or Los Angeles Times?
What these three appendices do is to answer Jon Twing’s call in very explicit terms. The first proposes the American dictionary as our central authority (they’re all transparently the same as far as the most frequently used 20,000 words go). The second demonstrates that the American dictionary’s resources can be used, verifiably so, as a standard for measuring the relative difficulty of vocabulary questions. The third demonstrates that an American dictionary can be used, simply and effectively, by anyone in his or her home learning program.
English as a foreign language — for everybody! But do Americans actually need to study and test their reading comprehension? The answer can be summed up in one short sentence: ENGLISH IS A FOREIGN LANGUAGE FOR EVERYBODY. Anyone who checks a few dictionary pages at random will quickly discover that only about 10% of the words listed have Old English sources, while 80% of them (largely technical terms) are explicitly identified as coming from Graeco-Latinate sources, many of them coined in connection with post-1600 scientific progress.
This means that what we can legitimately call “Internationalist Latinate English” is equally difficult (or easy) for everybody on the planet: Americans, Australians, Africans, Arabs, Chinese, East Indians, Pakistanis, etc. It also means that our global economy now requires us to compete linguistically against East Indians and Arabs (immigrants or off shore) who may be far more fluent in ILE than many Americans born and schooled in the USA.
TO CONCLUDE. . . . Dr. Samuel Johnson’s definition of a Lexicographer as “a harmless drudge” has for many years been quoted with amusement. But I feel he was wrong, even back in 1755. If civilization is a Big Vocabulary, a civilized nation most certainly needs authoritative dictionaries as learning tools to guide its speakers, writers, teachers, and test designers. As I see it, Jon Twing of Pearson has clearly stated our need, and I hope what’s here represents a worthwhile response to his challenge.
*
APPENDIX ONE. . . . Comparison of FCAT “tribe” question format with a dictionary-based “tribe” question format
FCAT question. . . . Target word: TRIBE. . . . Reason for selection by FCAT test designers: Not available
>Meaning-in-context passage: “Your letter was the first intimation that another tribe, more numerous and powerful than all the rest, were grown discontented.” . . . Passage source: Letter of John Adams to Abigail Adams, March 31, 1776
>Definition alternatives presented: F. ancestors. . . . G. husbands. . . . H. ladies. . . . I. tyrants. . . . Source used by FCAT test designers to select definition alternatives: Not available
>Correct answer: H. ladies. . . . Reason why FCAT test designers selected H. as the correct answer, as opposed to I. “tyrants,” which a tendentious 7th grader familiar with the anti-feminist tradition in New England (“scolds,” witches, etc.) might select: Not available.
COMMENT. . . . This question presents the test taker with two challenges. The first of these is the short-term memory challenge of recalling what was in the letter by Abigail that John Adams is referring to. The second is that of deciding what John means by “tribe.” It’s worth noting here that many reading-comprehension questions in the National Assessment of Educational Progress (NAEP) also call for reading a passage, remembering what it says, and then answering questions about what it means. For some test takers and their parents, the exclusion of neutral alternatives like “people” might raise difficulties, paralleling the use of SUFFRAGIST for both men and women supporters of women’s rights.
Dictionary-based question. . . . Target word: TRIBE. . . . Reason for selection: The target word is a high-frequency word with a high probability of multiple meanings and illustrative meaning-in-context phrases and sentences.
>Meaning-in-context passage: “an outburst against the whole tribe of theoreticians” . . . Passage source: The New Oxford American Dictionary. This was the only college-size dictionary that listed an illustrative passage for TRIBE, as opposed to American Heritage, Merriam-Webster, Random House, and Webster’s New World.
>Definition alternatives presented in sequence: A. a social class in a traditional society of families or communities; B. (derogatory) a group or class of people or things; C. (in ancient Rome) each of several political divisions; D. (informal) family. . . . Basis for sequence of alternatives: alphabetical.
>Source of definition alternatives: New Oxford American Dictionary, under TRIBE, definitions 1, 2, 3, 5 out of seven. . . .
>Correct answer: B (definition 5). “(derogatory) a group or class of people or things.” . . . Reason for characterizing this answer as “correct.” . . . Answer B is the definition under which the dictionary actually cites the passage (definition 5 for TRIBE). Since this citation can subsequently be checked by test takers, this kind of test question has a far higher degree of verifiable accuracy and authority than questions whose correct answers, rightly or wrongly, come across as products of designer whim.
>Test question scoring: The question’s achievement-scale rating can be stated as 1.4. . . . Here’s how the rating system works. . . . (a) Number of letters (we know TRIBE is more familiar than SERVANT because it has only 5 letters, as opposed to 7). . . . (b) Total number of definitions (we know TRIBE is more familiar than SERVANT because it has a total of 7 definitions, as opposed to just 3). . . . (c) Definition number (we know the 5th definition in an entry is more familiar than the 7th, though less familiar than the 1st). . . . (d) Formula: we add a and c and divide the result by b to produce our rating. For TRIBE our formula, (5+5)/7, therefore produces a rating of 1.4.
For SERVANT, assuming the correct answer was definition 3 with the example “public servant,” our formula, (7+3)/3, would produce a rating of 3.3, a higher level of unfamiliarity and hence a higher rating on our achievement scale. . . . Based on this scale, a correct answer to the SERVANT question would earn a higher score than a correct answer to our TRIBE question.
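The rating arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `achievement_rating` and its parameter names are labels invented here, not part of any published scale, and SERVANT is entered with the seven letters counted in item (a), which gives a rating of about 3.3.

```python
def achievement_rating(word, definition_number, total_definitions):
    """Return (letters + definition number) / total definitions.

    Hypothetical helper: the name and signature are assumptions made
    for illustration; only the arithmetic comes from the text.
    """
    if total_definitions <= 0:
        raise ValueError("total_definitions must be positive")
    return (len(word) + definition_number) / total_definitions


# TRIBE: 5 letters, passage cited under definition 5 of 7.
tribe = achievement_rating("tribe", 5, 7)
# SERVANT: 7 letters, passage assumed to sit under definition 3 of 3.
servant = achievement_rating("servant", 3, 3)

print(f"TRIBE   {tribe:.1f}")
print(f"SERVANT {servant:.1f}")
```

Because the word's letters are counted by `len`, the helper cannot drift out of step with item (a) of the rating system.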
COMMENT. . . . Every lexicographer, from Samuel Johnson on, is an open target for critics, and so are the dictionaries that lexicographers produce, including the New Oxford American Dictionary. But they are, after all, publicly available works whose entries can be quickly checked, so that a test designer who uses a current dictionary as the source of meaning-in-context passages, definition alternatives, and correct answers alike is far less vulnerable to criticism and charges of bad faith than test designers who say in effect, “We’re the experts, so trust us and, even more important, trust our statistics.”
A NOTE ON TEST-CONSTRUCTION COST. . . .
Practically considered, 50,000 of the entries (MYOCARDIUM, etc.) have only one or two definitions listed for them. The other 20,000 entries for our more frequently used words handle the remaining 130,000 definitions, averaging out to about seven definitions per entry, with roughly a third of these containing one or more illustrative meaning-in-context phrases or sentences.
A dictionary-based test designer doesn’t have to invent alternative answers, as in the FCAT question-construction approach. . . . Nor does the designer have to do much typing, since the phrase or sentence, along with the definitions, can be highlighted, copied, and pasted via a CD ROM version. Best of all, home learners themselves can use the listed word-entries as study targets before showing up to take a test, including practice tests administered by friends or family members.
To put it more dramatically: The FCAT construction cost per question, including statistical norming, can be fairly stated as at least $100 per item, as opposed to a cost of two dollars per item for dictionary-based items, or absolutely nothing if the learner elects to do his or her own list compiling and dictionary checking.
TO SUM UP. . . . The merits of our dictionary-based learning system (DBL), for both home learners and professional educators, can be described with three key adjectives.
DBL is practical. . . . Anyone can use a desk dictionary as a meaning-in-context learning tool. Given a list of target words (there are many available online), anyone can locate their entry form, as with TRIBE, their numbered list of definitions, and a specific definition with a meaning-in-context passage to serve as target. This means any parent or friend can serve as test administrator and record keeper for a learner at any level of difficulty.
DBL is productive. . . . DBL may be a small stream on the home learner level, but it flows directly into the very, very large river of high stakes language skills testing, most of which focuses upon vocabulary power and reading comprehension — including eight hours of the 16 hours devoted to our four major pre-professional tests: GRE, GMAT, LSAT, and MCAT. Important though high school and college grades are, it’s test performance that matters more and more in a career-mobile society, as indicated by the rising status of crossword puzzle literacy.
DBL is public. . . . A college-level American dictionary is a public document that changes very little from decade to decade in its basic 20,000-word multi-definition vocabulary. But our so-called “standardized tests” are creations of the private sector — “standardized tests without standards,” they might be called. It’s inevitable that dictionaries will more and more function as learning-measurement tools, as indicated by the way in which our vocabulary achievement scale can be used to rate and rank each of our potential 80,000 meaning-in-context reading comprehension questions.
>A dictionary, a learner, and some personal-best energy — if the combination worked for Abraham Lincoln, why shouldn’t it work today for Americans — all of us!
*
APPENDIX TWO. . . . How to prepare study lists and meaning-in-context test questions for use in learner-helper partnerships, based on the Random House Webster’s Unabridged College Dictionary.
A2a. . . . A dictionary-based reading comprehension study list using the dictionary-entry (“headword”) form of words that appear in the Star Spangled Banner (light, proud, gleam, broad, stripe, fight, watch, gallant, stream, and burst)
PRELIMINARY NOTE. . . . The accompanying dictionary headwords are presented in order of their probable familiarity level. Each one is followed by four numbers. The first number represents its number of letters (5 for LIGHT); the second represents the number of definitions listed in sequence for it (37 for LIGHT); the third number represents its word frequency (“familiarity”) level via its number of definitions divided by its number of letters (37/5=7.4 for LIGHT); the fourth number represents the number of illustrative meaning-in-context passages cited for it in the dictionary entry for that headword (6 passages listed for LIGHT).
Any learner can compile a study list like this on his or her own, and then hand it over to a friend or family member for use in creating meaning-in-context test questions.
word      letters   definitions   familiarity   passages
light     5         37            7.4           6
watch     5         22            4.4           6
fight     5         19            3.8           3
broad     5         16            3.2           5
burst     5         14            2.8           7
stream    6         14            2.3           5
proud     5         10            2             4
gallant   7         10            1.4           3
gleam     5         5             1             2
stripe    6         6             1             1
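A learner comfortable with a little Python could compile and sort such a list automatically. The sketch below is only an illustration: the definition and passage counts are copied from the list above, while the names `STUDY_LIST` and `familiarity` are hypothetical labels chosen here.

```python
# (headword, number of definitions, number of illustrative passages),
# copied from the study list above.
STUDY_LIST = [
    ("gleam", 5, 2), ("stripe", 6, 1), ("light", 37, 6), ("proud", 10, 4),
    ("broad", 16, 5), ("fight", 19, 3), ("watch", 22, 6),
    ("gallant", 10, 3), ("stream", 14, 5), ("burst", 14, 7),
]


def familiarity(word, definitions):
    """Definitions divided by letters: a higher value means more familiar."""
    return definitions / len(word)


# Sort from most to least familiar; ties keep their original order.
rows = sorted(STUDY_LIST, key=lambda r: familiarity(r[0], r[1]), reverse=True)

for word, definitions, passages in rows:
    print(f"{word:<8}{len(word):>3}{definitions:>4}"
          f"{familiarity(word, definitions):>6.1f}{passages:>3}")
```

The four printed numbers follow the same order as the preliminary note: letters, definitions, familiarity level, and passages.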
STUDYING FOR A MEANING-IN-CONTEXT READING COMPREHENSION TEST. . . . Following the Queen of Hearts principle (“Test first, then the studying!”), it’s important for home learners to have a clear picture of how the tests they’ll be facing are put together — enough so to justify making up their own practice questions in advance.
Question construction. . . . A meaning-in-context test question first links a headword to one of its definitions by way of an illustrative passage, as in linking LIGHT to its 31st definition through “This table lamp won't light” (def. 31). It then lists this definition in random-alphabetical sequence with one or more other definitions, as in (a) “to guide or conduct with a light” (def. 28); (b) “to become illuminated when switched on” (def. 31); (c) “to set burning” (def. 23).
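The construction steps just described might be sketched as follows. Everything named here (`build_question`, its parameters, the `LIGHT` mapping) is hypothetical; the text specifies only the procedure. The sketch also orders the alternatives strictly alphabetically, so its answer letter need not match the lettering in the example above.

```python
import string


def build_question(headword, passage, definitions, correct_number,
                   distractor_numbers):
    """Assemble one meaning-in-context question.

    `definitions` maps definition numbers to definition text.
    Returns the question text and the letter of the correct answer.
    """
    numbers = [correct_number, *distractor_numbers]
    # List the alternatives alphabetically so position gives no clue.
    ordered = sorted(numbers, key=lambda n: definitions[n].lower())
    choices = list(zip(string.ascii_uppercase, ordered))
    answer = next(letter for letter, n in choices if n == correct_number)
    lines = [f'{headword.upper()}: "{passage}"']
    lines += [f"{letter}. {definitions[n]}" for letter, n in choices]
    return "\n".join(lines), answer


# The three LIGHT definitions quoted in the text, keyed by number.
LIGHT = {
    23: "to set burning",
    28: "to guide or conduct with a light",
    31: "to become illuminated when switched on",
}

text, answer = build_question(
    "light", "This table lamp won't light", LIGHT, 31, [28, 23])
print(text)
print("Answer:", answer)
```

Keeping the definition numbers alongside the text means the helper can still look up the number of the correct definition when it is time to score the answer.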
The relative difficulty of questions like these can be determined by a vocabulary-achievement scale rating formula in which a target word’s number of letters is added to the sequence number of the passage’s definition, and the sum is then divided by the word’s total number of definitions. LIGHT has 5 letters and its passage appears in definition 31. Since its total number of definitions is 37, (5+31)/37 gives us a difficulty rating of about .9 (more exactly, .97), substantially higher than if our target passage had been “viewing the portrait in dim light” (def. 8), in which event its formula figures would have been (5+8)/37, and its rating would have been .35.
Answer scoring. . . . An informal learner-helper partnership opens the door to negotiation and point scoring alternatives. If the learner chooses a higher difficulty level, he or she is entitled to more points. On the other hand, if the learner chooses to have only two answer alternatives, as opposed to three or four, he or she can expect to earn proportionately fewer points. The partnership can achieve these goals by multiplying the difficulty rating by the number of answer alternatives times ten. A .9 question would therefore earn 18 points for a 2-alternative correct answer, as opposed to 27 points for a 3-alternative answer.
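For partners who would rather not do the multiplication by hand, the point rule can be sketched in Python. The function names `difficulty_rating` and `points` are assumptions made for this illustration; the rule itself (rating times number of alternatives times ten) is the one stated above.

```python
def difficulty_rating(letters, definition_number, total_definitions):
    """(letters + definition number) / total definitions."""
    return (letters + definition_number) / total_definitions


def points(rating, alternatives):
    """Difficulty rating times number of answer alternatives times ten."""
    if alternatives < 2:
        raise ValueError("a question needs at least two alternatives")
    return rating * alternatives * 10


# The text's one-decimal .9 rating for LIGHT, definition 31 of 37.
print(round(points(0.9, 2)))  # two alternatives
print(round(points(0.9, 3)))  # three alternatives

# The same question scored from the unrounded rating.
exact = difficulty_rating(5, 31, 37)
print(f"{exact:.2f} -> {points(exact, 3):.1f} points with three alternatives")
```

Whether to score from the one-decimal rating or the unrounded one is a house rule the two partners would need to settle in advance, since the totals differ slightly.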
As will be apparent to many, these scoring features are similar to those of Jeopardy and many card games. They introduce higher levels of personal choice and rule complexity into the study-testing process, along with higher levels of concentration — all in the interests of improving meaning-in-context reading comprehension.
Learner strategies. . . . Broadly considered, seven correct answers out of ten questions represents a satisfactory goal, especially if the study time is limited to half an hour for ten words (153 definitions in this instance). Since learning styles vary, individual learners will have to decide for themselves how much attention to give etymologies, part-of-speech labels, the entry as a whole, the illustrative passages, and, most important, the “semantic logic” behind the sequencing of the numbered definitions. Each word is different, after all, and so are the roles which the speech community over time finds for it to play.
TO SUM UP. . . . For home learners, the primary requirement is a study list of multiple-meaning words with at least one illustrative meaning-in-context phrase or sentence. Practically considered, most 5-, 6-, and 7-letter words with 5 or more definitions will meet this requirement. If desired, a learner can start with any list (spelling, SAT, Dolch, etc.) and simply guess which ones meet this “familiarity” requirement as indicated by their word-frequency standings via our definitions-divided-by-letters formula.
Will LEARN meet this requirement? . . . With a score of 7/5, it certainly does. . . . How about FORMULA? . . . Surprisingly so (to me, at least), it does too, and with a score of 7/7! A college-size dictionary is like our society itself: It makes sense as a mainstream document, but it’s also filled with surprises, enough so that it’s worth our time to become familiar with how both of them actually work, ideally through hands-on experience.
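The guessing rule of thumb (5 to 7 letters, 5 or more definitions) is easy to express as a quick check. The sketch below is illustrative only: `is_likely_candidate` is a hypothetical name, and the definition count given for MYOCARDIUM is an assumption, since the text says only that such entries have one or two definitions.

```python
def is_likely_candidate(word, definitions):
    """Rule of thumb: a 5-, 6-, or 7-letter word with 5+ definitions."""
    return 5 <= len(word) <= 7 and definitions >= 5


candidates = [
    ("learn", 7),       # 7 definitions over 5 letters, as in the text
    ("formula", 7),     # 7 definitions over 7 letters, as in the text
    ("myocardium", 2),  # assumed count, for contrast
]

for word, definitions in candidates:
    verdict = "study" if is_likely_candidate(word, definitions) else "skip"
    print(f"{word.upper():<11}{definitions}/{len(word):<3}{verdict}")
```

A word that passes this check still has to be looked up, because only the dictionary entry itself shows whether an illustrative passage is actually there.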
*
APPENDIX THREE. . . . An Illustrative Learner-Helper Reading Comprehension Dialogue
Scene. . . . Any comfortable setting, even a park bench, with two friends and a college-size dictionary (American Heritage, Merriam-Webster, Random House, Webster’s New World, etc.). One friend plays the role of LEARNER. . . . The other plays the role of HELPER, which means using the dictionary — Random House (RH) in this instance — and asking the questions.
HELPER. . . . Let’s start by picking a dictionary page. What are the first four digits (month and day) of your birthday?
LEARNER. . . . October 25th means 1025, I guess.
HELPER. . . . That means we’ll start on page 1025 of RH and pick the first word meeting our three test-question requirements. Do you remember what they are?
LEARNER. . . . (1) 5, 6, or 7 letters, (2) at least five definitions, and (3) at least one illustrative meaning-in-context phrase or passage. . . . usually in italics.
HELPER. . . . That’s right, and that means our target word jumps right out at us as READY. It has twelve definitions and is located near the bottom of the first column of page 1025. Our selected target passage is “ready to forgive,” which is cited for one of the following definitions. . . . A. inclined; disposed; apt. . . . B. not hesitant; willing. . . . Before you make an a-or-b choice of which definition you feel is actually listed for our target phrase, would you like to hear your target again?
LEARNER. . . . Yes.
HELPER. . . . Here it is. . . .READY. . . . ready to forgive. (a) inclined; disposed; apt. . . .(b) not hesitant; willing. . . . What’s the letter of your choice — A or B?
LEARNER. . . . Just guessing, to tell the truth, but I choose B.
HELPER. . . . That’s correct! . . . And that means you’ve just earned about 12 points. Do you remember how the formula works?
LEARNER. . . . Number of letters, 5, plus number of the correct definition. . . . What was it?
HELPER. . . . Two.
LEARNER. . . . Divided by the total number of definitions, which you said was 12. . . . (5+2)/12. . . . That gives me .58, which multiplied by twenty gives me about 12 points. Is that right?
HELPER. . . . That’s right. . . . Do you want to try another round?
LEARNER. . . . No. I’d rather switch roles.
HELPER. . . . That’s fine with me.
NEW HELPER. . . . What’s your birthday?
NEW LEARNER. . . . April 19.
NEW HELPER. . . . That puts us on page 419. . . . And that gives ERROR as a target with 8 definitions, one of which is illustrated by the passage “I was in error about the date.” Do you want two definitions to choose from, or three?
NEW LEARNER. . . . I’ll take three.
NEW HELPER. . . . Here they are: I was in error about the date. . . . A. a deviation from accuracy or correctness; mistake. . . . B. the condition of believing what is not true. . . . C. the holding of mistaken opinions. . . . Are you ready?
NEW LEARNER. . . . Yes. . . . My choice is C.
NEW HELPER. . . . You’re wrong. The correct answer is B.
NEW LEARNER. . . . That doesn’t seem right to me. . . . What’s the difference between an opinion and a belief?
NEW HELPER. . . . I don’t know. . . . But this is a right-or-wrong dictionary game, isn’t it?
NEW LEARNER. . . . Yes.
NEW HELPER. . . . So that means your answer is wrong, according to the dictionary. Do you want to play another round?
NEW LEARNER. . . . No.
NEW HELPER. . . . Do you want to switch roles?
NEW LEARNER. . . . No. . . . I don’t want to play anymore.
NEW HELPER. . . . Would you rather start with a list and study it first?
NEW LEARNER. . . . Yes.
NEW HELPER. . . . Me too.
COMMENT. . . . RH’s entry for BELIEF indicates that it’s a stronger word and has more to do with factual matters than OPINION, which has more to do with judgments and estimates. Since the judicial profession constantly mandates definitional distinctions like this, this kind of two-player game might be excellent preparation for aspiring debaters, lawyers, and judges. . . . But it might also be good preliminary training for third graders, and fun too!
This dialogue format is intended to emphasize the one-on-one practicality of multiple-meaning practice testing. But it should be obvious, I feel, that the multiple-choice feature makes it practical for use with large groups and even online. Even more important, the number of well-formatted questions that will be available to a program is staggering, since at least a third of the 200,000 definitions in a college-size dictionary are accompanied by illustrative meaning-in-context phrases or sentences.
The more reading-comprehension questions, the more progress and the more growth in confidence and self esteem — isn’t that a sound policy for any educational program?
***
Published February 13, 2007

