ARC | Teaching By Science

American Reading Company

The American Reading Company offers several online reading programs for grades K-12. The core program could arguably be described as a balanced literacy program; however, most of the program focuses on writing and comprehension, even in kindergarten. The curriculum listed on their website includes phonemic awareness, but only in kindergarten and only in the first unit. I could not find any mention of phonics or morphology. For this reason, I would say that the program appears to be more whole language than balanced literacy. The program emphasizes the use of leveled readers and corresponding writing instruction.

Ideally, we would use a peer-reviewed meta-analysis to evaluate the efficacy of a reading program. However, no such meta-analysis exists in this case. I was able to find two studies on the topic.

The first study was an RCT conducted by Gray et al. in 2020 with 1st and 2nd grade students. This study compared the use of the program with a business-as-usual approach. The authors did not report any findings for decoding and writing, despite testing in these areas, and only stated that the results were equivocal. For this reason, I coded these effect sizes as 0. The study included 1589 students, and I found a mean effect size of .15.

The second study was a quasi-experimental study conducted by DuCette et al. in 1999 in grades 1-3. The study included 3110 students. The study also had students attempt to read 100 books in a grade. This study showed a mean result of .63. It included effect sizes for the overall sample and for grade 3, but not for grades 1 and 2.

It is important to note that the above balanced literacy guidelines make no mention of teaching phonics with a scope and sequence, or of decodables. I have seen many interpret this paper as suggesting that phonics should be taught within a balanced literacy context, as needed by the individual student. Therefore, the definition of balanced literacy provided by (Pressley, 2001) does not meet the (NRP, 2000) definition of a systematic phonics program, but rather that of a Whole Language program. In my opinion, the theoretical difference between Balanced Literacy and Whole Language is practically meaningless.

In my personal experience examining balanced literacy programs, these approaches typically emphasize cueing instruction over decoding instruction, rely on leveled texts over decodable texts, and teach phonics embedded within fluency instruction, as opposed to explicitly in isolation. For example, a systematic phonics approach might teach the <ph> grapheme and its associated /f/ sound, followed by related fluency practice, whereas a balanced literacy or whole language approach might wait for a student to struggle with a word containing the <ph> grapheme and then teach it.

Therefore, in order for me to classify the ARC program as a systematic phonics program and improve their grade, I would want to see the program teach phonics based on an explicit scope and sequence, as well as include decodable texts. In an attempt to accurately answer this question, I interviewed members of the ARC team on multiple occasions, for hours at a time; reviewed their updated materials; and interviewed teachers who had used the program.

Teacher Feedback

I interviewed several anonymous teachers who had used the ARC program. Truthfully, these teachers were very angry. They felt the program lacked sufficient decoding instruction, encouraged “three cueing”, did not use decodable texts, and encouraged word guessing. I spoke about these concerns with the ARC leadership team. They acknowledged that in the past the ARC program encouraged students to look at the picture to identify unknown words. However, they also claimed adamantly that this was no longer the case. The teachers I interviewed had used the program recently; however, they were not able to confirm for me whether or not their resources were the updated ones. ARC shared with me testimonials from educators who had used their program. Some of these educators shared that they liked that the program had a strong focus on writing, knowledge building, and authentic texts, and that it included systematic phonics instruction.

Scope and Sequence

Within their updated program, over 60 individual graphemes are explicitly taught, as well as additional blends and analytic word families. Similar to the UFLI scope and sequence, the sequence spirals so that concepts are taught multiple times. The phonics scope and sequence also comes with a phonemic awareness sequence that explicitly teaches multiple phonemic awareness concepts. I was also pleased to see that their scope and sequence for phonemic awareness focuses mostly on segmenting, blending, and isolating. This factor is important to note, as many other programs focus on onset-rime, manipulation, and deletion, which have less research to support their efficacy.

Text Selection

With regard to text selection, the ARC program now makes use of decodable texts, leveled texts, and predictable texts. While using a combination of texts might be an unpopular choice, (Shanahan, 2019) [a member of the National Reading Panel] points out that the research on this topic is limited. He and many other scholars have suggested that using a combination of text types may present the most benefits for early readers. However, he also notes that highly predictable texts encourage guessing and may be less beneficial for students. In my interview with members of the ARC leadership team, they indicated that some predictable texts were used in the ARC program, not with the goal of increasing word recognition, but to build students' background knowledge and vocabulary.

That said, their decodable texts are likely not what many mean when they use the term decodable. Typically, the term decodable text refers to a book that is targeted towards a specific lesson or weekly unit within a phonics scope and sequence. For example, in the Pedagogy Non Grata reading program, the first week's lessons target the letters “s, a, t, p, i, n” and the corresponding decodables only use these letters. That said, while this is colloquially what is meant by the term decodable, I am not yet aware of any scientific research showing this is necessary for teaching students how to read. Moreover, as I have seen Dr. Nell Duke point out on social media, any word is decodable if a student can decode it. With the ARC decodables, the words are not scaffolded by each lesson or week, but rather by larger unit structures. For example, the first unit of the ARC kindergarten phonics program includes 25 letters, graphemes, or blends. While this is an unusual choice that might lead some to believe the ARC decodables are not decodable, I am not aware of any substantial research that can be pointed to in order to discredit the practice.

Phonics Lessons

Whole-class phonics instruction was less explicit than I would have liked. Rather than explicitly explaining phoneme-grapheme correspondences, phonics lessons featured short texts with a number of examples of the grapheme being explored. Teachers were then encouraged to explore the relationship between the grapheme and the sound. However, small-group phonics lessons, designed to support students with specific decoding needs, were far more explicit and, in my opinion, much better. The program does include some instruction on analytic phonics, blends, and high-frequency words. All of these choices are less common in modern structured literacy programs. However, to the best of my knowledge, there is no research that can be pointed to that suggests the ARC creators are “wrong” for including these elements. That said, these design choices may lead to some resistance from structured literacy advocates. For the purposes of transparency, I have attached examples of ARC phonics lessons in the references section, with permission from the American Reading Company.

What Does the ARC Research Show?

Evidence for ESSA:

Johns Hopkins University's Evidence for ESSA reviewed the research on ARC. They evaluated one study, with a sample of 792 students. They rated the study's rigor as strong and found a mean effect size of .14, which according to Cohen’s guide is negligible. That said, there is debate over how low an effect size has to be to be truly negligible. (Kraft, 2020) suggested that effect sizes above .05 from rigorous RCTs should be considered moderate. However, we would disagree with that finding at Pedagogy Non Grata. Our review of studies that were reviewed by Evidence for ESSA, rated strong, and had large sample sizes found a mean effect size of .17 (k=33) [.13, .21]. None of these studies were negative, and only one showed an effect size below .05.

Our Evaluation:

To evaluate the efficacy of ARC, we conducted a review of ARC studies. We searched their company website, Education Source and the ERIC database. On the company website we found 15 studies. However, only 2 of these studies used an experimental model to examine the efficacy of the program. The other 13 studies were single group design and thus excluded. On the Education Source database, we located 3 articles, of which none were experimental. On the ERIC database, we found 22 studies, with the search terms “American Reading Company”. However, none of these studies were specific to the American Reading Company program.

In order to examine the efficacy of the ARC program, we examined the two experimental studies conducted. The first study was conducted by Abigail Gray, Philip Sirinides, Ryan Fink, and Brooks Bowden. Their study used an RCT design to evaluate the effects of ARC for kindergarten students. “Data were collected from 71 classrooms (treatment and control) in 21 schools, encompassing 1,589 students in two kindergarten cohorts.” Each cohort received the program for one school year. Treatment group instruction was compared with business-as-usual instruction and examined across multiple standardized tests, including the WRMT, KRMS, AIMSweb, and KTEA. The study authors calculated their own effect sizes using the Glass's delta formula. “Glass's delta, which uses only the standard deviation of the control group, is an alternative measure if each group has a different standard deviation.” (Social Science Statistics, n.d.). The authors calculated effect sizes both for the students intended to be treated (ITT) and the students actually treated (TOT). We tabulated the TOT effect sizes in the chart below to model the findings of the Gray 2021 study.
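To make the Glass's delta definition quoted above concrete, here is a minimal Python sketch. The function name and the scores are hypothetical and invented purely for illustration; they are not data from the Gray study, and this is not the study authors' code.

```python
from statistics import mean, stdev


def glass_delta(treatment, control):
    """Glass's delta: the mean difference divided by the control group's SD only."""
    return (mean(treatment) - mean(control)) / stdev(control)


# Hypothetical test scores, invented for illustration (not study data).
treatment_scores = [12, 15, 14, 16, 13]
control_scores = [11, 13, 12, 14, 10]

delta = glass_delta(treatment_scores, control_scores)
print(f"Glass's delta: {delta:.2f}")
```

Because only the control group's standard deviation appears in the denominator, the result does not change if the treatment group happens to be more (or less) spread out than the control group.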

The second study was conducted by Dr. Joseph DuCette for Temple University. This study used a quasi-experimental design to examine the efficacy of using the ARC program in conjunction with the 100 book challenge. In the 100 book challenge, the students are given a library of ARC books and reading logs. The students are then encouraged to read at least 100 books each. The study included 3317 students in grades 1-3 from 12 schools. Results were measured with the Stanford-9 standardized test. We calculated effect sizes for this study using Cohen’s d: the mean difference between the treatment group and the control group, divided by the pooled standard deviation, where SDpooled = √((SD₁² + SD₂²) ⁄ 2). The effect size was calculated by the first review author and then replicated by the second review author to ensure validity. The study showed an effect size of .54 for reading comprehension, and .80 for general reading achievement. Students in the treatment group were also 16.59% more likely to be reading at grade level than students in the control group (p = .024).
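As a rough illustration of the Cohen's d calculation described above, the following Python sketch uses the same pooled standard deviation formula. The function names and the input values are hypothetical, chosen only to show the arithmetic; they are not figures from the DuCette study.

```python
import math


def pooled_sd(sd1, sd2):
    """Pooled SD as given in the text: the square root of the mean of the two variances."""
    return math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)


def cohens_d(mean_treatment, mean_control, sd_treatment, sd_control):
    """Cohen's d: the mean difference divided by the pooled standard deviation."""
    return (mean_treatment - mean_control) / pooled_sd(sd_treatment, sd_control)


# Hypothetical summary statistics, invented for illustration (not study data).
d = cohens_d(mean_treatment=55.0, mean_control=50.0, sd_treatment=10.0, sd_control=10.0)
print(f"Cohen's d: {d:.2f}")
```

This version of the pooled standard deviation weights both groups equally, which is a simplification that fits best when the two groups are of similar size.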

What Do These Results Mean?

To the best of my knowledge, there have been two experimental or quasi-experimental studies on ARC. However, they show very different results. The (DuCette, 1999) study showed a mean effect size of .67, suggesting a large result, and the (Gray, 2021) study showed a much lower effect size of .10. There are three possible ways to interpret this difference:

The studies show different results because they are looking at different grades. ARC is not effective for kindergarten, but it is effective for grades 1-3.
The (Gray, 2021) study is an RCT, more recent, and thus more rigorous. Therefore, the (Gray, 2021) results are more reliable.
The DuCette study was examining the impact of the 100 book challenge together with ARC, so it is impossible to tell whether the (DuCette, 1999) outcomes should be attributed to ARC or to the 100 book challenge. Therefore, the (Gray, 2021) study is more reliable.
There is no way to know which study is more reflective of ARC, without more research. Therefore, the least problematic interpretation should be based on a mean result, between the two studies. Likely surprising to none of my readers, I lean towards the third option.

In order to evaluate the efficacy of the ARC program, we took an unweighted mean effect size for each assessment outcome and charted it in the graph below. For context, effect sizes below .20 are considered negligible, between .20 and .39 small, between .40 and .79 moderate, and .80 or above large. However, the specific interpretations of effect sizes can be subjective; see (Kraft, 2020) and (Hansford, 2023). The following results suggest that the ARC program has a moderate impact on reading outcomes.
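The averaging and labelling steps can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the .20 to .39 band is assumed here to mean "small", and negative effects are assumed to be labelled by their magnitude. The inputs are the effect sizes quoted earlier in this review.

```python
from statistics import mean


def classify_effect_size(d):
    """Label an effect size using the bands described in this review."""
    size = abs(d)  # assumption: negative effects are labelled by magnitude
    if size < 0.20:
        return "negligible"
    if size < 0.40:
        return "small"  # assumption: the .20 to .39 band is read as "small"
    if size < 0.80:
        return "moderate"
    return "large"


# DuCette (1999) outcomes quoted above: comprehension and general reading.
ducette_outcomes = [0.54, 0.80]
ducette_mean = mean(ducette_outcomes)  # unweighted mean, approximately .67

for label, value in [("DuCette mean", ducette_mean), ("Gray", 0.10)]:
    print(f"{label}: {value:.2f} -> {classify_effect_size(value)}")
```

An unweighted mean treats every outcome equally regardless of sample size, which keeps the calculation simple but lets a single small study carry as much weight as a large one.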

Discussion:

The mean results of these studies are moderate. That said, the results of the (Gray, 2021) study, which was the more rigorous of the two studies evaluated, are negligible (according to Cohen’s guide). However, the (Gray, 2021) study was a large-scale RCT with a standardized assessment. Research on the Johns Hopkins study database by LXD Research and Pedagogy Non Grata shows that education studies with this design typically show smaller effect sizes, on average .17 (Hansford & Schechter, 2023). If we base our interpretations on the (Hansford & Schechter, 2023) findings, these effect sizes could alternatively be interpreted as small but significant for the (Gray, 2021) study, high moderate for the (DuCette, 1999) study, and well above average overall. That said, with only two experimental studies showing very different results, it is difficult to draw conclusive trends from this research. However, given that ARC has made substantial changes to its programming overall, it is, in our opinion, better to evaluate the program based on the qualitative aspects of the curriculum and not its research findings.

While critics of the ARC program do suggest that the program still has room to improve, specifically around its phonics programming and its decodable texts, overall the changes made have been far more substantial than those made by other (previously balanced literacy) companies. Moreover, the changes made do qualify the program for the systematic phonics classification, and the program is now inarguably research-based. I think it is important to acknowledge these changes, to better incentivize companies to continue to positively develop their products.

Final Grade: B+

Two studies showed a mean effect size of .40 on standardized tests, and the program’s updated principles are evidence-based. These are the criteria of Pedagogy Non Grata for an A- grade. However, given the controversy surrounding the program, and the fact that the high effect size results came from a study that looked at the impact of both ARC and the 100 book challenge at the same time, I am uncomfortable giving the program the A grade. That said, I look forward to seeing more research on this topic and hope that future studies show a large magnitude of effect for the program.

Qualitative Grade: 9/10

The ARC program includes the following evidence-based types of instruction: explicit, phonemic awareness, systematic phonics, morphology, vocabulary, spelling, fluency, and comprehension.

Want to know more about our grading system? Click here: Grading System

Review Limitations:

This review was conducted for the Pedagogy Non Grata blog and does not constitute peer-reviewed research. People using this review to make purchasing decisions should consult multiple sources of information, including What Works Clearinghouse and Evidence for ESSA.

Written by Nathaniel Hansford

Reviewed by Elizabeth Reenstra

Last edited 2024/02/21

References:

Gray, A., Sirinides, P., Fink, R., & Bowden, B. (2020). Zoology One Efficacy Evaluation. Consortium for Policy Research in Education.

DuCette, J. (1999). An Evaluation of the “100 Book Challenge Program”. Temple University.

Kraft, M. A. (2020). Interpreting Effect Sizes of Education Interventions. Educational Researcher, 49(4), 241-253. https://doi.org/10.3102/0013189X20912798

Shanahan, T. (2019). How Decodable Do Decodable Texts Need to Be?: What We Teach When We Teach Phonics. Shanahan on Literacy. https://www.shanahanonliteracy.com/blog/how-decodable-do-decodable-texts-need-to-be-what-we-teach-when-we-teach-phonics#:~:text=We%20want%20beginning%20reading%20texts,words%20or%20that%20are%20so

Social Science Statistics. (n.d.). Effect Size Calculator. https://www.socscistatistics.com/effectsize/default3.aspx

Hansford, N., Schechter, R., Reenstra, E., & Aitchison, P. (2023). What is the Best Language Program? Teaching by Science. https://www.teachingbyscience.com/what-is-the-best-language-program

Lesson Samples:

American Reading Company