Journal of Research in Education Sciences (Dec 2018)
108 課綱第四學習階段國語文閱讀素養 Development and Validation of an Online
Abstract
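Although listening, speaking, reading, and writing are all important elements of the new Chinese Language and Literature curriculum guidelines scheduled for implementation in the 108 academic year (beginning in 2019), the guidelines' basic rationale indicates that reading ability is central to the new curriculum. To develop the online assessment needed for reading literacy at the fourth learning stage of the new guidelines, this study aimed to establish a development model for literacy-oriented assessment in Chinese Language and Literature at that stage and to examine its effectiveness. The model was examined in three respects: construction of the assessment framework, principles for selecting or writing texts, and item-writing principles. The psychometric properties of the literacy-oriented items were also analyzed to support the validity of the development model. Using triangulation, and through three expert meetings on the assessment framework and 20 item-writing workshops, the study obtained a three-dimensional assessment framework with 16 assessment indicators, established five principles for selecting or writing texts and 10 item-writing principles for literacy-oriented items, and gathered validity evidence for 201 reading literacy-oriented items based on 23 texts. The items were pilot-tested on an online platform developed by the i-Assessment research team. Across three rounds of pilot testing, the participants were 1,685 seventh-, eighth-, and ninth-grade junior high school students from 58 schools across Taiwan. Psychometric properties were reported as item difficulty, item discrimination, and Rasch model fit; the analyses were run in R with the CTT and TAM packages. For 17 texts (73.91%), the average item discrimination index was .30 or above. The average pass rate of the multiple-choice items ranged from .45 to .74, covering a wide range of difficulty. Half of the constructed-response items were higher-order evaluation and reflection items, with an average discrimination index of .37 and an average pass rate of .24; these were highly discriminating, highly difficult items, as expected. The items' sound psychometric properties, together with the validity evidence affirmed by experts during development, jointly support the validity of the literacy-oriented assessment development model. Finally, sample items are provided as a practical item-writing reference for classroom teachers.

Reading is one of the key elements of the 12-year basic education curriculum guidelines for elementary schools, junior high schools, and general senior high schools: Language and Literature Category - Chinese Language and Literature. This study developed an online reading assessment module in line with the new curriculum guidelines. A literacy-based assessment module was established to construct an assessment framework, text selection or writing principles, item-writing principles, and reading literacy-oriented items. The psychometric characteristics of the items were also investigated to support the validity of the module. The principles and items were validated through triangulation. Through assessment framework meetings and item-writing workshops, a 3-dimensional assessment framework, 17 indicators, 5 text writing principles, and 10 item-writing principles were obtained. The Online Assessment for Science Literacy platform was adopted to administer the pilot test, which comprised 23 testlets with 201 items. The participants were 1,685 junior high school students from 58 schools in Taiwan. Item difficulty, discrimination, and infit mean square statistics were analyzed using the CTT and TAM packages in R. The average discrimination of 17 testlets (73.91%) was .30 or higher. The average p-value of the multiple-choice items ranged from .45 to .74.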
For the half of the constructed-response items that targeted higher-order evaluation and reflection, the average discrimination was .37 and the average p-value was .24, indicating high discrimination and high difficulty. Sample items are also provided as a reference for schools preparing assessment designs in the future.
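For readers less familiar with the classical item statistics reported above, the following is a minimal sketch of how an item p-value (pass rate) and a discrimination index might be computed from dichotomously scored responses. It is not the authors' analysis, which used the CTT and TAM packages in R. The function names, the toy response matrix, and the choice of a corrected item-rest correlation as the discrimination index are all assumptions made for illustration; the abstract does not state which discrimination formula was used.

```python
from math import sqrt


def pass_rate(item_scores):
    """Item p-value: proportion of examinees who answered correctly (scores are 0/1)."""
    return sum(item_scores) / len(item_scores)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def item_rest_discrimination(responses, item):
    """Correlation between one item's scores and the total score on all other items.

    Assumption: this corrected item-rest correlation stands in for the
    discrimination index; the study may have used a different definition.
    """
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical toy data: 6 examinees (rows) by 4 items (columns), scored 0/1.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

for j in range(len(responses[0])):
    p = pass_rate([row[j] for row in responses])
    d = item_rest_discrimination(responses, j)
    print(f"item {j}: p-value = {p:.2f}, discrimination = {d:.2f}")
```

Under this reading, an item with a low p-value and a high discrimination value is hard overall but still separates stronger from weaker examinees, which is the pattern the abstract describes for the higher-order constructed-response items.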
Keywords