ITC 2016 Conference

SYMPOSIUM: Exploring New Possibilities for Assessment In and Out of Schools
Juliette Lyons-Thomas, Danielle Herro, Cassie Quigley, Jessica Andrews, Girlie Delacruz, Jeffrey B Holmes, Yoav Bergner, Samuel Abramovich, Ilya Goldin, Eric Tucker, Kenneth Wright, Seth Jones

Building: Pinnacle
Room: 3F-Port of San Francisco
Date: 2016-07-03 03:30 PM – 05:00 PM
Last modified: 2016-06-30

Abstract


Introduction

For assessments to be truly useful in the 21st century, they need to reach far beyond the traditional role of summative, paper-and-pencil standardized tests. This symposium will explore the various activities of a group of early-career scholars brought together to examine the future of assessment in the service of education.

Contributions

There are four contributions to this symposium. The first paper, Innovations in Generating Evidence of Learning, provides the context of the symposium by reporting a detailed overview of the fellowship program that brought together the contributors of the symposium. The second paper examines an innovative measurement tool used to assess, and ultimately promote, collaboration in Science, Technology, Engineering, Art and Mathematics. The third paper investigates innovative school design and the various attitudes toward and purposes of assessment, focusing on the use of assessment to promote student learning rather than collect summative student results. Finally, the fourth paper of the symposium explores the fluidity of assessment and its purpose in the context of makerspaces.

Conclusions

The information presented in this symposium represents a diverse set of assessment strategies and viewpoints. Together, they offer promising possibilities in using assessment to contribute to, rather than simply quantify, learning.

*****

Paper 1: Innovations in Generating Evidence of Learning
Juliette Lyons-Thomas, Regents Research Fund

Introduction

In early 2013, a call for applications was released that encouraged emerging scholars to apply for a mentoring fellowship in the area of educational assessment and learning. The fellowship was named in honor of the seminal and innovative assessment work of Edmund W. Gordon, and the recipients of the award were brought on as Fellows for the next three years.

Objectives

One of the main goals of the Fellows was to continue, from an interdisciplinary perspective, the work of the Gordon Commission on the Future of Assessment (2012). The Commission sought to study the current uses of educational assessment, to consider how assessment would need to change in order to continue to serve students and educators in the next century, and to generate recommendations on future models of assessment. The Fellows' work toward these goals was facilitated by three mentors with diverse expertise in educational assessment.

Methodology

The 12 recipients of the fellowship were varied in their backgrounds, research interests, home institutions, and views on the future of assessment. Over the past three years, the Fellows and their three mentors met biannually to discuss new directions and challenges of educational assessment in the 21st century. Meetings regularly included invited guest speakers and the planning of collaborative projects focused on the impact of new technologies on learning and assessment, advancements in the learning sciences, and the reconceptualization of the purposes of assessment.

Results and Conclusions

This paper provides an overview of the events of the past two years, as well as future directions of the Fellows and recommendations concerning the investigation of assessment for the purposes of learning. The other papers included in this symposium focus specifically on the collaborative projects of the fellows.


Paper 2: Co-Measure: Developing a Rubric to Assess Student Collaboration in STEAM
Danielle Herro, Clemson University; Cassie Quigley, Clemson University; Jessica Andrews, ETS; Girlie Delacruz, LRNG by Collective Shift; Jeffrey Holmes, Arizona State University

Introduction

Increasingly, educational efforts are aimed at enhancing students’ collaboration skills, recognizing that collaborative skills are crucial for digital age learners. Although many initiatives are aimed at designing and implementing curricula to promote collaboration, little effort has been devoted to assessment tools utilizing new advances in measurement practices particular to complex skills (Mislevy et al., 2013). Efforts to measure collaborative problem solving (CPS) typically occur in contexts such as healthcare, business, or higher education (Thomson, Perry & Miller, 2007), with little research in K-12 schooling. This work addresses ways to define and assess CPS in K-12 schools.

Objectives

We detail the development and iterative refinement of a measurement tool, Co-Measure, created to assess CPS among middle school students working in STEAM (Science, Technology, Engineering, Art and Mathematics) learning activities. Similar to problem-solving in organizations, in STEAM activities students work collaboratively to effectively divide labor, incorporate information from multiple sources, perspectives, and experiences, and enhance the quality and creativity of solutions to problems (OECD, 2015).

Design

Following the steps of evidence-centered design (Mislevy, 2011), we reviewed literature on collaboration to define the domain, and collected and analyzed classroom video of 6th grade students collaborating during STEAM units to identify observable behaviors indicative of effective collaboration in STEAM contexts. A STEAM task analysis was completed to determine the types of situations necessary to obtain sufficient evidence of collaboration.

Results/Conclusion

The project has yielded a measurement tool, Co-Measure, that teachers can use to assess student CPS in STEAM activities. Indicators of student collaboration in STEAM activities include joint problem solving through inquiry-rich approaches, types of peer interaction, authentic approaches, transdisciplinary thinking, and positive communication. The team is currently validating and refining the tool with teachers in middle school classrooms to gauge its effectiveness in assessing students' collaboration skills; current results will be reported.

 

Paper 3: Innovative School Design and Assessment FOR Learning: Personalized Math as a Case Study
Ilya Goldin, 2U; Eric Tucker, Brooklyn Laboratory Charter School; Juliette Lyons-Thomas, Regents Research Fund; Kenneth Wright, North Carolina State University

Introduction

In contrast to popular implementations of assessment as infrequent, standardized, and high-stakes, we examine a school where students engage in assessment practices daily. The goal is to support school staff with accurate, timely and multidimensional information on each student, and to empower students to engage with assessment as a tool of argument and self-improvement.

Objectives

Given the key role of assessment in education, it is important to understand how changes in assessment practices influence the experiences of students and staff.

Design

Researchers with expertise in psychometrics, mathematics education, personalized and digital learning, and ethnography conducted applied research at the school. The researchers collected and synthesized data by observing multiple instructional and assessment interactions within the school, and interviewing students and staff.

Results

We identified a number of innovative assessment practices, including a centralized digital learning platform whereby teachers could view and update the profile of a student, thus sharing their perspectives and assessments of the students in a dynamic and transparent manner. This information was available to the students, enabling them to understand, engage with and even dispute their assessment records.

We also identified obstacles on the path to comprehensive assessment FOR learning. For instance, teachers' skill and comfort with digital tools span a spectrum from novice to expert. Furthermore, current assessments cover only a small portion of the competencies, skills, dispositions, and abilities that are most critical to future learning.

Conclusions

The notion of assessment FOR learning presents a compelling vision for the future of education. A step-wise, deliberate process of designing and operationalizing systems of assessment consistent with this notion will be necessary to achieve its potential. There are numerous challenges to designing a comprehensive approach to education through assessment FOR learning, and likewise numerous practical obstacles to implementation.

Paper 4: Assessment in the Making: Envisioning Assessment of Learning in Makerspaces and FabLabs
Yoav Bergner, ETS; Samuel Abramovich, University of Buffalo; Marcelo Worsley, University of Southern California; Girlie Delacruz, LRNG by Collective Shift

Introduction and Objectives

Makerspaces and digital fabrication labs (FabLabs) are drawing attention as both formal and informal spaces for interdisciplinary, collaborative, and self-directed learning (Educause, 2013) and as incubators of engineering and entrepreneurship (Gershenfeld, 2005). The role of digital fabrication in learning settings is rooted in theories of experiential education (Dewey, 1938), constructionism (Harel & Papert, 1991; Kafai & Resnick, 1996), learning by design (Kolodner, 1995), and project-based learning (Thomas, 2000). However, to date, there has been very little discussion about assessment of (and for) the kind of learning that takes place in educational makerspaces. The purpose of this work is to investigate the nature of learning that takes place in makerspaces from an assessment perspective with the eventual goal of creating a framework that allows educators and assessment experts to design valid assessments for makerspaces.

Design/Methodology

The purpose of assessment in makerspaces can be manifold, comprising formative assessment to provide feedback for learners; program evaluation, perhaps for the purpose of experimental summary or financial justification; and summative assessment of individual student work and growth. As early-stage work on evidence-centered assessment design (Mislevy, Steinberg, & Almond, 2003), we have conducted interviews with instructors and designers of school-based and extracurricular makerspaces and FabLabs. The goal of these interviews has been to identify the claims of interest to various stakeholders, the current approaches to evidence identification and accumulation, and any perceived gaps between the need for and availability of appropriate assessment.

Results and Conclusions

We report the results of this work-in-progress, showing the breadth of constructs that stakeholders identify in educational makerspace settings. We also discuss how existing paradigms of measurement, from extremely fine-grained log-file analysis to portfolio assessment, may come to play a role in assessments that support learning in these innovative environments.

 

Paper 5: Teachers’ Assessment Literacy and Conceptions of Assessment in a Performative Culture
Kim Koh, University of Calgary; Gavin Brown, University of Auckland; Eugene Kowch, University of Calgary; Olive Chapman, University of Calgary; Bryan Szumlas, Calgary Catholic School District

The implementation of assessment for learning practices in teachers' daily classroom instruction cannot be realized if teachers do not perceive that improved teaching and student learning are the true functions of assessment. Teachers' conceptions of assessment tend to be ecologically rational; that is, aligned with the social, cultural, and policy priorities of their school context. Therefore, when performativity on high-stakes, standardized testing is a priority of both school and system, assessment is likely to be perceived as mainly serving accountability functions. Assessment for learning practices can also be impeded if teachers have a low level of assessment literacy.

The purpose of this study is to examine preservice and inservice teachers' assessment literacy and conceptions of assessment in a highly performative culture. Approximately 436 Canadian teacher candidates and teachers were administered the Canadian Assessment Literacy Inventory (CALI) and Brown's abridged Teachers' Conceptions of Assessment inventory (TCoA-IIIA). The CALI was adapted by the first author for use in the Canadian context; it consists of 25 multiple-choice items that measure preservice and inservice teachers' levels of assessment literacy. The TCoA-IIIA consists of 27 six-point Likert items, which measure four major conceptions of assessment: assessment improves teaching and learning, assessment makes students accountable, assessment makes teachers and schools accountable, and assessment is irrelevant.

Confirmatory factor analysis is used to validate the factor structures of the two sets of data. In addition, the relationship between teachers' assessment literacy and conceptions of assessment will be examined using structural equation modeling. The full data analysis will be completed for our presentation at the ITC conference in July 2016. The study's findings will inform the planning and review of assessment curricula in teacher education and professional development programs in Canada and in other highly performative countries that share similar reform visions in assessment.

 

 

Paper 6: Evaluating the Validity of Student Grades based on Instructor-Constructed Multiple-Choice Items
Gavin Brown, University of Auckland; Hasan Abdulnabi, AMA International University

Introduction

Multiple-choice questions (MCQs) are commonly used in higher education assessment tasks for their ease and efficiency in covering instructional content in a short time. In contrast to most testing programs, however, statistical analysis of items and IRT-based score determination are rare in higher education. Studies evaluating MCQs used in higher education assessments have found many flaws in the writing of items relative to conventional best practice, resulting in misleading insights about student performance and contaminating grading decisions.

Objectives

To evaluate instructor-written MCQs used operationally to award grades and determine (a) item quality, (b) impact on test scores, and (c) impact on grades.

Methodology

A secondary analysis was conducted of 100 instructor-written MCQs used in an undergraduate midterm test (50 items) and final exam (50 items), which together made up 50% of the course grade. Data were obtained from 380 students enrolled in one first-year undergraduate general education course. Item difficulty, discrimination, and chance properties were determined using four statistical models (i.e., CTT, 1PL, 2PL, and 3PL) and contrasted with the conventional raw-score approach. Software used for the analyses included SPSS version 21 for CTT, SPSS v21 with the Rasch model extension for 1PL, and Hanson's ICL software for 2PL and 3PL.
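For context, a standard formulation of the three-parameter logistic (3PL) model, which subsumes the 1PL and 2PL models referenced above (the exact parameterization used in these analyses is not specified here), gives the probability of a correct response to item $i$ by an examinee with ability $\theta$ as

$$P(X_i = 1 \mid \theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}},$$

where $a_i$, $b_i$, and $c_i$ are the item's discrimination, difficulty, and pseudo-chance (guessing) parameters. The 2PL model fixes $c_i = 0$, and the 1PL (Rasch) model additionally constrains the discrimination to be equal across items.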

Results

For each model and test, the effect on individual student scores and on course grades was evaluated. The midterm test had significantly more problematic items than the final exam. The 3PL model kept most of the items, produced more reliable test scores, and resulted in more students passing the course than the three alternative models and the raw-score approach.

Conclusions

The analyses show that many MCQs were flawed, despite judgement-based quality assurance procedures. Consequently, many students received an indefensible score and course grade. Higher education institutions need to integrate item analysis and standard-setting processes before score and grade determination is conducted.

 

