The Pedagogical Competence in Developing Assessment Instruments of High School Physics Teachers in Merauke District

This study aims to map the ability of physics teachers to develop assessment instruments, especially in Merauke district. The method used was descriptive quantitative. The respondents were 14 physics teachers in Merauke. The data collection instrument was a questionnaire containing 25 statements. The data were analyzed using the ideal standard deviation. Overall, physics teachers' competence in constructing and analyzing items is in the "good" category, with an average of 90.5. However, some assessment indicators still need to be improved, such as reviewing the objectives of constructing the test instrument, content validation, revising based on the validators' suggestions, and analyzing the items.

Article History: Received 19-01-2021; Revised 23-02-2021; Accepted 20-04-2021; Published 07-06-2021.


Introduction
The teacher serves as a facilitator in learning. Therefore, teachers must prepare all learning equipment before starting learning activities (Dinata et al., 2017; Dinata, Sakman, et al., 2020). One of the essential pieces of learning equipment for teachers to prepare is the assessment instrument. These instruments can be used to assess and measure learners' abilities in the cognitive, affective, and psychomotor domains (Nyoman & Putu, 2014). Learning assessment must be based on assessment techniques that correspond to the characteristics of the learning material (Mukarramah et al., 2015). Assessment is carried out to map the achievements of a learning process (Ali et al., 2018) and can be used to make future learning decisions (Walvoord, 2010). Through assessment, teachers can control and monitor the development of students' abilities in the cognitive, affective, and psychomotor domains (Ralmugiz et al., 2020). Assessment can also map students' skills in the classroom (Riadi, 2017).
It has been said that the scores a teacher gives can determine a learner's future. A score on its own is a number without meaning, but the score a teacher assigns describes the extent of a student's ability (Yuniati & Prayoga, 2019). Therefore, the assessment instruments used by teachers must meet predetermined standards: they must be valid, reliable, and have appropriate item characteristics (Aiken, 1985; Mardapi, 2012; Retnawati, 2014).
Validity is divided into three types: content, empirical, and construct validity. Content validity is established by experts who judge the developed items on aspects such as material, content, and language. Empirical validity is obtained from the results of trialing the assessment instrument with respondents who match the objectives of the assessment (Mardapi, 2012).
Whereas empirical validity analyzes the suitability of the assessment instrument against a predetermined standard, construct validity analyzes each item's fit with the ability that is the goal of the measurement (Retnawati, 2014).
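The expert judgments used in content validation can be summarized quantitatively, for example with Aiken's V index from the Aiken (1985) work cited above. The sketch below is illustrative, not the procedure the surveyed teachers used; the rating scale is an assumption.

```python
def aikens_v(ratings, lo=1, hi=5):
    """Aiken's V content-validity index for one item.

    ratings: the relevance score each expert gives the item on a lo..hi scale.
    V = sum(s) / (n * (c - 1)), where s is each rating's distance from the
    lowest category, n is the number of raters, and c the number of categories.
    V ranges from 0 to 1; values near 1 indicate strong agreement that the
    item is relevant to the content it is meant to measure.
    """
    n = len(ratings)
    c = hi - lo + 1                    # number of rating categories
    s = sum(r - lo for r in ratings)   # total distance from the lowest rating
    return s / (n * (c - 1))

# Hypothetical example: three validators rate one item 4, 5, 4 on a 1-5 scale.
v = aikens_v([4, 5, 4])  # (3 + 4 + 3) / (3 * 4) ≈ 0.83
```

An agreed threshold (often around 0.8 for a small rater panel) would then decide whether the item is retained, revised, or dropped.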
Item characteristics consist of several aspects: difficulty level, discrimination power, distractor effectiveness, and the guessing factor (Santoso et al., 2019). Item analysis can be carried out with a variety of calculations and specialized software. Various software packages can help teachers analyze the assessment instruments they develop, such as Iteman, Quest, Winstep, Bilog MG, Multilog, and Parscale. Each of these packages applies certain conditions in determining whether the developed instrument meets the required standards (D. K. Sari, 2020).
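Two of the item characteristics named above, difficulty level and discrimination power, can also be computed by hand for dichotomous (right/wrong) items under classical test theory. The sketch below uses invented illustrative data and a common upper/lower-27% grouping; it is not tied to any of the software packages listed.

```python
def difficulty_index(scores):
    """Proportion of examinees answering the item correctly.

    scores: list of 0/1 responses for one item.
    Values near 0 mean a hard item, near 1 an easy item.
    """
    return sum(scores) / len(scores)

def discrimination_index(scores, totals, top=0.27):
    """Upper-lower discrimination index for one item.

    Difficulty in the top group minus difficulty in the bottom group,
    with groups formed from the top/bottom `top` fraction of examinees
    ranked by total test score. Values near 1 mean strong discrimination.
    """
    k = max(1, round(top * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: totals[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(scores[i] for i in upper) / k
    p_lower = sum(scores[i] for i in lower) / k
    return p_upper - p_lower

# Illustrative data: one item's 0/1 responses and each examinee's total score.
item = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
totals = [9, 8, 3, 7, 2, 4, 10, 6, 1, 5]
p = difficulty_index(item)                 # 0.6 -> medium difficulty
d = discrimination_index(item, totals)     # high: top scorers got it right
```

This is exactly the kind of "manual calculation" the Finding section reports teachers performing; software automates the same computations across all items at once.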
Physics is a subject that trains specific skills. Various skills can be measured in physics learning, such as problem-solving skills (D. K. Simbolon et al., 2019), critical thinking (Dinata, Sari, et al., 2020), scientific argumentation, symbolic language, causal reasoning, and others. These skills cannot all be measured with the same assessment instrument. For example, in studying Newton's laws, students' abilities can be measured both in arguing scientifically and in solving problems, but these two skills cannot be measured with the same question item, because each item has its own measurement purpose. A good item measures only one skill. Thus, a physics teacher must develop assessment instruments that align with the objectives of the assessment.
The ability to develop assessment instruments is an indicator of the pedagogical competence that every teacher must possess. A teacher can build pedagogical competence through the Teacher Professional Education Program (PPG Program). Physics teachers in Merauke district have relatively low pedagogical competence (Bahri et al., 2020; Reski & Sari, 2020; Supriadi et al., 2018), which should be improved through the PPG program. However, in Merauke district there is only one university implementing the PPG program, and it has not yet provided quotas for physics subject teachers. This problem is an essential concern for the government and education practitioners in Merauke district.
One effort that can be made is conducting training on the competencies that teachers must have. Before implementing such training, it is necessary to map teacher competency levels in Merauke district; this mapping can serve as a reference for planning the training. The focus of this research is teachers' pedagogical ability to develop assessment instruments. Therefore, to obtain valid data on this competence, a survey must be conducted first. The goal is to provide a basis for future researchers carrying out research related to the pedagogical abilities of physics teachers in Merauke district.

Research Method
This research was descriptive research using a survey method as the data collection technique. The respondents were high school physics teachers who taught in Merauke. A questionnaire consisting of 25 statements was used to collect the data. The measured aspects are constructing and analyzing question items. These two aspects are described by nine indicators: (1) determining the objectives of constructing the test instrument, (2) searching for the appropriate theory, (3) arranging the indicators of the test instrument items, (4) setting the test instrument items, (5) reviewing the instrument's content validation, (6) revising the instrument based on the validators' suggestions, (7) testing the items, (8) analyzing the items, and (9) assembling the test instrument. The data analysis technique was descriptive quantitative. A Likert scale was used to measure teacher competence, and the scores were then converted using ideal standard deviations to determine teachers' competence levels. The interpretation criteria for competence (Widoyoko, 2009) are presented in Table 1. Table 1 is the standard criterion used to determine physics teachers' competence level in constructing and analyzing question items. The analysis results for each indicator are then described to discuss the strengths and weaknesses of physics teachers in developing question items.
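The ideal-standard-deviation conversion can be sketched as follows. For a questionnaire of n five-point statements, the ideal mean is Mi = (max + min) / 2 and the ideal standard deviation is SDi = (max - min) / 6. The five-category cut-offs below (Mi ± 0.6·SDi and Mi ± 1.8·SDi) follow a common Widoyoko-style conversion; they are an assumption, since the paper's interval tables are not reproduced here.

```python
def ideal_stats(n_items, lo=1, hi=5):
    """Ideal mean (Mi) and ideal standard deviation (SDi) for a Likert scale."""
    x_max, x_min = n_items * hi, n_items * lo
    mi = (x_max + x_min) / 2   # midpoint of the possible score range
    sdi = (x_max - x_min) / 6  # one sixth of the possible score range
    return mi, sdi

def categorize(score, mi, sdi):
    """Five-level conversion of a raw score (assumed Widoyoko-style cut-offs)."""
    if score > mi + 1.8 * sdi:
        return "excellent"
    if score > mi + 0.6 * sdi:
        return "good"
    if score > mi - 0.6 * sdi:
        return "sufficient"
    if score > mi - 1.8 * sdi:
        return "bad"
    return "poor"

# For the 25-statement questionnaire: Mi = 75 and SDi = 16.67, as reported.
mi, sdi = ideal_stats(25)
category = categorize(90.5, mi, sdi)  # the reported average of 90.5 -> "good"
```

Under these cut-offs, the "good" band for the whole questionnaire runs from 85 to 105, which is consistent with the reported overall average of 90.5 being classified as "good".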

Finding and Discussion
The data obtained were analyzed using the ideal standard deviation. Overall, for a questionnaire containing 25 statements, the ideal standard deviation is 16.67 and the ideal average score is 75. The average score of all respondents was 90.5. According to the competency criteria for constructing and analyzing question items in Table 2, teachers' competence is in the "good" category. It means that, on average, teachers carry out the activities of constructing and analyzing questions well. A more in-depth analysis was then conducted to determine the weaknesses that physics teachers still have in constructing and analyzing question items. Based on the results of the data analysis, the first indicator, "determining objectives of constructing the test instrument", has an average value of 12, in the "sufficient" category. This result follows the criteria presented in Table 3.
Based on these criteria, only a small number of teachers develop their own questions and exercises. Most teachers still take questions from textbooks and other resources to train and assess learners. Using valid questions from a textbook is good in itself, but relying on existing questions leaves the assessment objectives unclear. Moreover, not all of the questions available in textbooks can measure higher-level abilities, such as the ability to create (Anggraeni, 2016). In physics especially, the direction of the assessment becomes unclear, so the cognitive score obtained may only reflect conceptual understanding or problem-solving. This needs to be an essential concern for teachers.
In addition, teachers still develop questions only to find out the achievement of learning outcomes. This means teachers have not yet set questions for learning-improvement needs or as a measuring tool to determine students' ability levels (Kurniawan et al., 2017). Furthermore, the questions teachers develop do not fully measure Higher Order Thinking Skills (HOTS), even though the current curriculum requires students to solve problems that demand HOTS. This indicator shows that teachers overall assess learners' ability only on the cognitive aspect. However, the grade given should also draw on psychomotor and affective elements; it is a positive practice for assessment to come from measurements of the cognitive, psychomotor, and affective aspects (Asyhari & Hartati, 2015).

Table 3. Competency Level Criteria of the First and Fourth Indicators (intervals mapping scores to Excellent, Good, Sufficient, Bad, and Poor)
The second indicator, "searching the appropriate theory", has an average of 13. This result is "excellent" according to the criteria in Table 4. However, the questionnaire results revealed that physics teachers in Merauke have not used the latest sourcebooks in developing their questions; they still use sourcebooks that are more than ten years old. This affects how up to date the context and examples are in the assessment instruments developed for physics subjects. On the other hand, when giving an illustration in a question, teachers do provide contextual examples. This is a positive point, because contextual examples help learners understand the questions better (Fayakun & Joko, 2015; Palittin & Hallatu, 2019; Supriyadi et al., 2020).

Table 4. Competency Level Criteria of the Second and Third Indicators (intervals mapping scores to Excellent, Good, Sufficient, Bad, and Poor)
The third indicator, "arranging the indicators of the test instrument items", produces an average of 13, in the "excellent" category according to the criteria in Table 4. This finding is supported by the information that all subjects followed the proper stages when making question indicators: first they adjust the learning indicators, then create a grid of questions, and then write the question indicators. Thus, physics teachers in Merauke can be said to understand the sequence for making question indicators according to existing procedures.
The fourth indicator, "setting the test instrument items", obtained an average of 17, in the "excellent" category according to the criteria in Table 3. Overall, teachers create question instruments according to existing rules, namely following the guidelines for writing multiple-choice and essay questions, preparing questions according to the previously developed grids, and creating rubrics suited to the question format. Furthermore, the fifth indicator, "reviewing the content validation", has an average of 3, in the "sufficient" category according to the criteria in Table 5. This must be an essential concern, because most physics teachers have not engaged other physics teachers to validate the questions they develop. Questions are made only for training and examination purposes without their validity being known. This is a weakness, because a question, as a measuring instrument, needs to be validated to fit the objectives of the assessment (D. K. Sari et al., 2019).

Table 5. Competency Level Criteria of the Fifth, Sixth, Seventh, and Ninth Indicators (intervals mapping scores to Excellent, Good, Sufficient, Bad, and Poor)
The indicator "revising based on validators' suggestions" is related to the previous indicator: if most teachers do not validate their questions with peers, they cannot revise according to the input. This sixth indicator has an average of 2.5, which falls in the "bad" category according to the criteria in Table 5.

Furthermore, the seventh indicator, "testing the items", has an average of 4, in the "good" category according to the criteria in Table 5. Some teachers trial the questions they develop, while others directly use the questions to measure learners' ability (Reski & Sari, 2020; Supriyadi et al., 2018).

The eighth indicator, "analyzing the items", has an average of 23, in the "sufficient" category according to the criteria in Table 6. Physics teachers who trial their questions perform the analysis only with manual calculations. They also analyze the difficulty level of the items but still rely on estimates. This is noteworthy, because estimates cannot be used as standards for producing quality assessment instruments (D. K. Sari et al., 2019). In addition, most physics teachers do not know the software for analyzing questions with classical test theory or item response theory; they need to be introduced to such software to make detailed item analysis easier.

Table 6. Competency Level Criteria of the Eighth Indicator (intervals mapping scores to Excellent, Good, Sufficient, Bad, and Poor)

The ninth indicator, "assembling the test instrument", has an average of 4, in the "good" category according to Table 5. This means teachers are good at arranging question items into assessment instruments as question packages. Based on the elaboration of all the measured indicators, most teachers have not determined the objectives of their assessment instruments appropriately, and physics teachers still use the questions in old textbooks.
Also, only a small number of teachers trial and analyze the questions they develop. This is a crucial concern for improving teachers' ability to construct and analyze question items.

Conclusion
The conclusion of this research is that physics teachers' competence in constructing and analyzing questions is in the "good" category, with an average of 90.5. However, several assessment indicators are still in the "sufficient" category, namely determining the objectives of constructing test instruments, content validation, and analyzing the items. Furthermore, one indicator is in the "bad" category, namely revising based on the validators' suggestions. Although teachers' overall competence in constructing and analyzing questions is fair, competence on specific indicators is still low. Thus, it is necessary to implement activities that can help teachers improve their competence in constructing and analyzing question items.

Suggestion
Based on the research results, it is necessary to conduct training on constructing and analyzing questions using applications that make the work easier for teachers. The training can take the form of item-development activities organized by the Papua Province education office. In addition, a dedicated Teacher Professional Education program for physics subject teachers should be provided. It is also necessary to provide a question bank, as a standard evaluation tool that has been declared valid and reliable for measuring learners' ability, to facilitate teachers in learning activities.