A NEW PASSENGER-ORIENTED PERFORMANCE MEASUREMENT FRAMEWORK FOR PUBLIC RAIL TRANSPORTATION SYSTEMS

Customer perception of the quality of service provided by the operator, together with the level of satisfaction, is one of the key parameters for monitoring performance. This paper presents a practical approach for monitoring public transportation system performance by focusing on the passengers' evaluations. The paper first outlines the development of a systematic framework for an objective and participatory monitoring of transportation systems performance. A Passenger-Oriented Performance IndeX (POPIX) has been developed by using 22 indicators grouped under 6 measures: time, cost, accessibility and transfer, comfort, safety and security, and quality of service. The proposed framework allows the investigation of the performance changes of a particular transportation system and enables the performance comparison of different systems directly from the customer point of view. The POPIX methodology is presented and an example application of the suggested method is provided for the selected Railway Systems. The Shifted POPIX concept has also been developed for more reliable trend analyses. The case study highlights that the measures of cost, accessibility and transfer, and comfort have lower performance scores and that the Metro System performs better than the Tram and Light Rail Systems.


INTRODUCTION
The performance of transportation systems affects not only the traveller's preferences on mode of transportation, but also the choice of route or departure time of the trip. The quality of service reflects the passenger's perception of the system performance and therefore, this perception should be one of the prominent factors in systems management. The service quality has a positive and significant effect on customer satisfaction [1]. Satisfaction is a consumer's response to the evaluation of the perceived quality against pre-purchase expectations (or some norm of performance) and the actual performance of the product as perceived after its consumption [2].
Satisfaction can also be defined as a function of perceived performance, expectations and prior satisfaction [3]. Passenger satisfaction is directly related to the expectations of service quality and the actual level of service. Therefore, measuring the satisfaction and the importance of measures and combining them is essential for monitoring the performance of transportation systems [4]. Furthermore, customers who are satisfied with the performance of a particular service are expected to purchase that service again when they need it. The same certainly holds for the service provided by public transportation operators.
A performance measurement system is required to monitor the efficiency and effectiveness of the system and evaluate the impacts of the service provided. A comprehensive performance measurement program should set guidelines and measures, detect problems, monitor the process for improvement, and document the accomplishments [6]. Furthermore, such programs help operators to better understand passenger demands and to modify the transportation service accordingly [7].
In European Standard EN 13816, the quality loop approach has been accepted as the standardized measurement procedure for public transportation quality [8]. The concept of the service quality loop distinguishes between the passenger view and the operator view: the passenger side includes perceived and expected quality, while the operator side focuses on the targeted and delivered quality. Some studies have focused on service quality monitoring in public transportation and rail systems; for example, a multi-criteria approach was applied to estimate an overall performance index for the quality of passenger services in the Hellenic Railways, and a customer satisfaction index was developed to evaluate the bus transport service quality in Cosenza, Italy [9,10].

Performance measurement and classification
Numerous studies have addressed performance measurement and the classification of measures. Performance measures should be identified according to the goals and objectives of the operators. Selecting appropriate measures within a clearly determined structure leads the system performance monitoring process to success. Otherwise, operators will spend their resources on investigating ambiguous measures which may have a limited effect on overall performance.
Many transportation agencies in the USA measured their system performance, and later reported that some (or most) of the measured data did not reflect the achievement of objectives [5]. However, there is no consensus in theory or practice on either the selection of appropriate measures or the classification of measures for a specific objective. Various views exist on the classification of performance measures. A group of studies proposed different approaches for classifying performance measures corresponding to different system attributes and priorities [8, 11-16]. Transportation system attributes are mainly classified under main measures and components, and the main service aspects are characterized as efficiency, effectiveness and impacts [11,12]. A different classification proposed system performance, level of service, descriptors of systems, impacts, costs and income, trip-making behaviour, and cost effectiveness and efficiency as the main measures of a system [15]. A comprehensive guidebook on developing a performance measurement system, TCRP Report 88, was published by TRB in 2003 and takes a distinctive classification direction. In this report the system attributes are sorted as primary and secondary measures [16].
Performance measurement methods and measures are defined by various institutions. CER (Community of European Railway and Infrastructure Companies), UIC (International Union of Railways), CIT (International Rail Transport Committee) and the member countries of the European Union follow performance and quality monitoring standards for public transportation. CER sets the measures to be monitored, including the actual and perceived quality, punctuality and reliability, safety and security, travel comfort, train cleanliness, on-board staff, customer information during the journey, as well as cleanliness, staff and customer information in stations. The main emphasis is given to journey speed, reliability and information, which have become the parameters of highest importance for the assessment of passengers' expectations [17]. In the European Standard EN 13816, eight measures (availability, accessibility, information, time, customer care, comfort, security, and environmental impact) are defined for benchmarking purposes, such that an organization is required to identify service quality targets from a range of criteria listed in the standard [8]. Thus the targets and objectives vary among organizations.
Since the 1970s extensive consideration has been given to performance measurement studies, which are mostly carried out by either operators or transport authorities. In each study, the measures are specified according to the course of the study, addressing different objectives. Performance measurement programs also address different target groups for different purposes. For instance, the performance measurement program carried out in Michigan, USA focuses on passenger satisfaction and examines the service attributes of the transportation system mostly related to comfort and convenience. In another, operator-oriented study in Sydney, Australia, the efficiency measures are taken into consideration, while in San Diego, USA the impacts of the transportation system on society are investigated through mobility, accessibility, reliability, equity, livability and sustainability measures [16].

Passenger-oriented performance monitoring in public transportation systems
Numerous notable customer satisfaction surveys and satisfaction benchmarking studies have been conducted in several European countries during the last two decades [18-20]. Since 2005 Passenger Focus has been conducting National Passenger Surveys (NPS) in order to monitor the performance of the Train Operating Companies (TOC). System attributes are regularly monitored and stored in an overall passenger carrier database. In the NPS, the criteria, indicators, and threshold values for TOCs are established by regulators. If the overall performance of the operating company declines in comparison to previous years, the relevant TOC is requested to decrease its fares, as compensation to the passengers [18].
In the United States, with the passing of the Government Performance and Results Act of 1993, performance measurement programs gained a legislative and regulatory base [19]. Currently, 35-40 states practice performance measurement in transportation services [20]. The United States Department of Transportation (USDOT) outlined the draft strategic plan for the years 2010 to 2015, entitled Transportation for a New Generation. Within the plan, USDOT has identified the performance measures related to achieving the strategic goals of safety, state of good repair, economic competitiveness, livable communities, environmental sustainability, and organizational excellence [19].
In the process of selection and adoption of performance measures, the local geography, demographics, and policy objectives of the system play an important role. The selection of the appropriate measures and indicators also depends on the availability of the data. Hard, or quantitative, measures are fact-based and can be measured directly. Soft, or qualitative, measures are intangible and must be measured indirectly. Soft measures are also less accurate than hard measures, which makes it difficult for agencies or operators to analyze the performance of their system appropriately.
The first objective of this research is to propose a systematic classification of qualitative performance indicators and measures, and then to develop an alternative tool to measure public transportation system performance by using customer satisfaction and customers' judgment on the importance of system attributes. The methodology used in this research aims to establish an integrated and applicable index that reflects the public railway system performance perceived by passengers by considering not only the satisfaction levels but also the importance levels of the attributes. The developed customer-oriented performance measurement framework is applied by using customer satisfaction survey data to determine the performance of the public rail systems. Furthermore, the annual performance changes of public rail systems are compared with each other and performance trends are analyzed.

CUSTOMER SATISFACTION SURVEYS AT ISTANBUL RAILWAY SYSTEMS
The Istanbul Transportation Corporation (Istanbul Ulasim AS, IU) is an enterprise of the Istanbul Metropolitan Municipality and was established in 1988 for operating the public rail transportation systems and managing all other public transportation activities within the city [25]. To assess the user satisfaction level, IU has been conducting annual passenger satisfaction surveys since 2005. Passengers are asked to rate 14 to 22 different attributes, which change throughout the years [23-28]. Generally, the questionnaires are designed to find out two aspects of the system, which are satisfaction and importance. The common public transportation system criteria in the surveys are fares, reliability, cleanliness, comfort, and safety and security [29].
In Istanbul, a 6-point Likert scale is used. The response options include "certainly dissatisfied", "mostly dissatisfied", "somewhat dissatisfied", "somewhat satisfied", "mostly satisfied", and "certainly satisfied", while the importance level classes mirror those of satisfaction.
Passengers who have just passed through the turnstiles are asked to answer the questions at stations.

The existing light rail system connecting the major districts of Aksaray and Yenibosna is 18 km long with 15 stations. The first phase of the line was opened in 1989 and subsequent extensions have been made. The journey takes 26 minutes and the trains depart every 4 minutes. The maximum hourly passenger capacity is 24,000 for each direction. The average number of passengers riding daily between Aksaray and Yenibosna is 170,000. There are 450 journeys taking place between 06:00 a.m. and midnight.
The tram line Eminönü - Zeytinburnu was 11.2 km long and had 20 stations until 2010. The headway is 2.5 minutes, yielding 450 departures daily. The journey between the two terminal stations takes 40 minutes. The tram runs between 05:30 a.m. and midnight. The daily number of passengers carried is approximately 150,000 [30].

PERFORMANCE MEASUREMENT FRAMEWORK
The passenger-oriented performance index (POPIX) can be determined through the Passenger Satisfaction Surveys and the same numerical 6-point Likert scale adopted for this purpose. The proposed stepwise methodology developed for the valuation of the railway lines' performance is given in Figure 2. The steps can be defined as: the classification of measures and indicators from the questionnaire, determination of the satisfaction and importance scores of each measure and indicator, calculation of each indicator performance index, calculation of each measure performance index, and calculation of POPIX and %POPIX. For the calculation of Shifted POPIX and %Shifted POPIX an additional step is introduced: grouping the common indicators and subsequently calculating the indicator performance index. The following steps of Shifted POPIX are similar to the POPIX methodology.
The questions asked in the Passenger Satisfaction Surveys varied over the years; some indicators were dropped while others were added to the questionnaire. Therefore, if the overall satisfaction were determined as the average of the indicators' satisfaction scores, the comparison of satisfaction levels across years would be biased for two reasons: 1) when indicators presumed to have higher satisfaction scores are added to the questionnaire, the overall satisfaction score is expected to be greater than in previous years; 2) as the type of indicators changes through the years, evaluating the overall satisfaction trend is not meaningful because of the change in composition.
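The composition effect described above can be illustrated with a small sketch. The indicator names and scores below are invented purely for illustration (they are not survey results); the only point is that adding a well-rated indicator raises a plain average even when no existing score changes.

```python
from statistics import mean

# Hypothetical mean satisfaction scores on the 6-point scale (illustrative only).
year_1 = {"fare": 3.0, "waiting time": 4.0, "occupancy rate": 2.5}

# The same scores one year later, plus one newly added, well-rated indicator.
year_2 = {**year_1, "signboards": 5.5}

overall_1 = mean(year_1.values())
overall_2 = mean(year_2.values())

# The plain average rises although none of the original indicators improved.
print(f"overall satisfaction, year 1: {overall_1:.2f}")
print(f"overall satisfaction, year 2: {overall_2:.2f}")
```

Grouping indicators under fixed measures, as the framework does, limits how far a single added indicator can move the overall score.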
For the reasons mentioned above, the indicators are first classified with respect to their relevance to a measure, which gives the POPIX methodology a more systematic foundation. The classification makes the framework robust against the possible impacts of variation of indicators over the years by limiting the weight of an indicator within its predefined measure. Furthermore, satisfaction levels alone are not decisive when the importance of each attribute is considered; hence the importance levels should be incorporated as the weight of an attribute for monitoring the system performance from a passenger point of view. Therefore, in POPIX, the satisfaction levels and importance levels are utilized equally and without any refinement, so that the pure passenger judgments are considered and the orderly process is not disrupted.
To overcome the fuzziness, the measures are constituted by examining numerous reports and research papers and by considering the relevance of indicators to the corresponding measure. Six measures are thereby formed from a different number of indicators for each year. The number of indicators in each measure is not constant, because the survey questions were changed each year by IU (Table 2). The "Time" measure is formed with three indicators: waiting time, commuting time, and reliability, all of which are evaluated by passengers. The "Cost" measure has a single indicator, the fare, which has existed in all the years. The "Accessibility and Transfer" measure was constituted with three indicators: accessibility to stations, transfer distance, and ease of making transfers.

The next step of the framework is to determine the indicator and, subsequently, the measure performance indices. The index is based on the passenger satisfaction with each attribute and the importance of the attribute as perceived by the passenger. The importance scores are regarded as weights, and the performance index of each indicator is calculated as:

$$IPI_j = \frac{\sum_{i=1}^{m} I_i S_i}{m}$$

where: $IPI_j$ - Indicator Performance Index of the j-th indicator; $m$ - number of respondents; $I_i$ - importance score of the i-th respondent; $S_i$ - satisfaction score of the i-th respondent. Following the calculation of IPI, the Measure Performance Index (MPI) is calculated by taking the arithmetic mean of the IPI scores:

$$MPI_k = \frac{\sum_{j=1}^{n} IPI_j}{n}$$

where: $MPI_k$ - Measure Performance Index of the k-th measure; $IPI_j$ - Indicator Performance Index of the j-th indicator; $n$ - number of IPIs of the k-th measure. As a 6-point Likert scale was used in the inquiry form for satisfaction and importance scores by IU, both the IPI and MPI scores can take values between 1 and 36. The IPI scores are calculated as the sum of the multiplied importance and satisfaction rates divided by the number of respondents, and the MPI is determined as the sum of IPI scores divided by the number of indicators. Although the mathematical basis is simple, the scores reflect the actual perceived performance individually, by taking into account the satisfaction and importance rates of each passenger on any attribute concurrently. The POPIX score of any system can be calculated similarly to the IPI and MPI as:

$$POPIX = \frac{\sum_{k=1}^{o} MPI_k}{o}$$

where: $POPIX$ - Passenger-Oriented Performance Index; $MPI_k$ - Measure Performance Index of the k-th measure; $o$ - number of MPIs. As with IPI and MPI, the maximum value of POPIX is 36. An example calculation of IPI and MPI for the Metro line can be seen in Table 3.
To make the index easier to interpret, it can be normalized to a 100-point scale simply by multiplying the POPIX value by 100/36. Not only POPIX but also IPI and MPI values can be expressed on the 100-point scale, as given in Table 4.
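The chain from respondent scores to IPI, MPI, POPIX and %POPIX can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the nested-dictionary layout and all response values are hypothetical, and only two measures are shown for brevity.

```python
from statistics import mean


def indicator_performance_index(importance, satisfaction):
    """IPI: mean over respondents of importance * satisfaction (1 to 36)."""
    if len(importance) != len(satisfaction):
        raise ValueError("each respondent must supply both scores")
    return sum(i * s for i, s in zip(importance, satisfaction)) / len(importance)


def measure_performance_index(ipis):
    """MPI: arithmetic mean of the IPIs grouped under one measure."""
    return mean(ipis)


def popix(mpis):
    """POPIX: arithmetic mean of the MPIs."""
    return mean(mpis)


def to_percent(score, max_score=36.0):
    """Normalize a score on the 36-point scale to the 100-point scale."""
    return score * 100.0 / max_score


# Hypothetical responses: measure -> indicator -> (importance, satisfaction),
# one entry per respondent on the 6-point scale.
responses = {
    "Time": {
        "waiting time": ([6, 5, 6], [4, 5, 3]),
        "reliability": ([6, 6, 5], [5, 4, 4]),
    },
    "Cost": {
        "fare": ([6, 6, 6], [2, 3, 1]),
    },
}

mpi = {
    measure: measure_performance_index(
        [indicator_performance_index(imp, sat) for imp, sat in indicators.values()]
    )
    for measure, indicators in responses.items()
}
score = popix(list(mpi.values()))

for measure, value in mpi.items():
    print(f"MPI[{measure}] = {value:.2f}")
print(f"POPIX = {score:.2f}, %POPIX = {to_percent(score):.1f}")
```

Because each measure contributes one MPI regardless of how many indicators it contains, the single "fare" indicator carries as much weight in POPIX as the whole "Time" measure in this toy data.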

COMPARISON OF ANNUAL PERFORMANCE CHANGE
Comparing the public railway performance over the years calls for an elaboration of the proposed method. Although the %POPIX scores may provide meaningful information about the performance, as mentioned previously, it would be highly subjective to compare performances that do not share the same composition of indicators. In addition, a structural defect in the POPIX methodology can be encountered when interpreting performances in successive years: the importance rate of an indicator may increase relatively more than the satisfaction rate between years.
For instance, if the average importance rate of the "fare" is 4.2/6 (0.70) and the satisfaction rate is 5.5/6 (0.92) for the first year, and these become 5.6/6 (0.93) and 4.5/6 (0.75), respectively, the IPI would be 23.1/36 (0.64) for the first year and 25.2/36 (0.70) for the latter. Despite the increase in IPI, there is a distinct decrease in the satisfaction level of around 17 percentage points (from 0.92 to 0.75). Such a change in importance and satisfaction rates is not observed in this data set. Nevertheless, such variations are worth interpreting: when the importance level gains enough weight to raise the overall performance, the relative contribution of the attribute deserves investigation. The "security" indicator had relatively low importance in 2006 for the Metro line; however, after a failed terrorist bombing attempt around the Mecidiyekoy Metro Station in 2007, the perceived importance of "security" rose from around 92% to around 98%. According to the customer satisfaction surveys, although the satisfaction level remained constant at around 75%, the IPI score increased due to the increase in importance. The contribution of a single indicator to the overall system performance is nevertheless limited, because the model comprises several indicators and because of its structural design.
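The arithmetic of the hypothetical "fare" example can be written out as below, under the simplifying assumption (made by the example itself) that the IPI is approximated by the product of the average importance and average satisfaction rates:

```latex
\begin{align*}
IPI_{\text{year 1}} &\approx 4.2 \times 5.5 = 23.1, & \frac{23.1}{36} &\approx 0.64,\\
IPI_{\text{year 2}} &\approx 5.6 \times 4.5 = 25.2, & \frac{25.2}{36} &= 0.70.
\end{align*}
```

The index rises by 2.1 points even though the average satisfaction falls by one full point on the 6-point scale, because the importance weight grows by 1.4 points.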
To solve the benchmarking problem, another index, the Shifted POPIX, is defined. It enables an annual comparison by combining each year's importance rates with each year's satisfaction rates in a matrix form. In this way, the effects of the weights on overall performance can be interpreted. Common indicators are grouped under the same measures as before. Twelve common indicators, marked in Table 2, are available in each survey year and are used to calculate the Shifted POPIX.
The calculation steps for the Shifted POPIX scores are essentially the same; the difference lies in the determination of the IPIs, where the average rates of satisfaction and importance are multiplied because the number of respondents differs from year to year. The IPI is calculated as:

$$IPI_{ab} = \left(\frac{\sum_{i=1}^{M} S_{ia}}{M}\right)\left(\frac{\sum_{j=1}^{N} I_{jb}}{N}\right)$$

where: $IPI_{ab}$ - Indicator Performance Index based on the a-th year satisfaction and the b-th year importance rate; $S_{ia}$ - satisfaction score of the i-th respondent in the a-th year; $I_{jb}$ - importance score of the j-th respondent in the b-th year; $M$ - number of respondents in the a-th year; $N$ - number of respondents in the b-th year. As the number of respondents is not the same for all the years, IPIs are calculated by using the arithmetic averages of the satisfaction and importance scores. Moreover, even if the number of respondents were equal, it would not be logical to multiply one respondent's importance rate by another respondent's satisfaction rate to calculate the IPI.
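A minimal sketch of the year-by-year matrix for a single indicator is given below. The function name, the dictionary layout and the scores are assumptions made for illustration only; the calculation follows the description above, multiplying the mean satisfaction of one year by the mean importance of another.

```python
from statistics import mean


def shifted_ipi_matrix(satisfaction_by_year, importance_by_year):
    """Return {(a, b): mean satisfaction of year a * mean importance of year b}."""
    mean_sat = {year: mean(scores) for year, scores in satisfaction_by_year.items()}
    mean_imp = {year: mean(scores) for year, scores in importance_by_year.items()}
    return {(a, b): mean_sat[a] * mean_imp[b] for a in mean_sat for b in mean_imp}


# Hypothetical scores for one indicator; the sample sizes differ between years.
satisfaction = {2005: [4, 5, 3, 4], 2006: [5, 5, 4]}
importance = {2005: [6, 5, 6, 5], 2006: [6, 6, 6]}

matrix = shifted_ipi_matrix(satisfaction, importance)
for (a, b), value in sorted(matrix.items()):
    print(f"satisfaction {a} x importance {b}: IPI = {value:.2f}")
```

Holding the importance year fixed and varying the satisfaction year isolates the change in satisfaction, while the diagonal entries pair each year's satisfaction with its own importance.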
The same procedure as the conversion from POPIX to %POPIX is followed for the %Shifted POPIX calculations. The %Shifted POPIX for the Metro is given in Table 5, where highlighted scores indicate the %POPIX with the common indicators. The results for the railway systems of Istanbul, obtained by implementing the methodology described above, show that the Metro line has better %Shifted POPIX scores than the other railway systems for all the years, although the performances are close for the last two years analyzed (Table 6). Common IPI100 scores for the Metro Line for the years 2005 to 2007 are given in Figure 2, and Figure 3 shows the common IPI100 scores of 2007 for all the railway systems. Surprisingly, when %Shifted POPIX is examined, the Tram has the highest score for 2006 and 2007, even though the differences are very small. The satisfaction levels for the Light Rail and Tram Lines slightly decreased from 2005 to 2006 and increased in 2007 to a level higher than in 2005. However, the %POPIX and Shifted POPIX results do not follow the same pattern as satisfaction for these systems; instead, an increasing trend is observed due to the changes in importance levels of measures and indicators of the system. After 2005, the POPIX scores significantly decrease for the Metro Line due to an increase in demand. However, the index is higher for 2007 than for 2006 for the Metro Line. For the Light Rail and the Tram Lines, it is fair to say that the performance was generally improving and reached its highest scores in 2007. The lowest performance scores are attained mostly in the "cost" and "comfort" measures, and the lowest-scoring indicator is the "occupancy rate" for all the years and lines analyzed. The results are compatible with the actual occupancy rate of the rail lines and the general public opinion on fares of the systems.
The results indicate that the lowest performance is observed in "Occupancy Rate" for all the systems analyzed. Although the capacities of the systems are fixed, the systems may be operated more efficiently with better scheduling. With respect to the low performance of "Fare", distance-based or congestion-based pricing should be assessed as an alternative to the flat fare policy, with social equity taken into account. High performance of "Travel Time" is highly important in boosting new railway projects. One of the reasons for high performance in travel time could be the highly congested traffic on roads and the low frequency of other modes. However, the average travel time of the Tram line is increasing due to the high demand. The tram operates in mixed traffic between Sirkeci and Sultanahmet stations, where a rearrangement to a separate right of way could be considered. The "Air-conditioning inside the Trains" has one of the lowest performances in all rail systems. This condition is directly related to the "Occupancy Rate", although some practical solutions can be considered, such as placing rotating fans on the ceiling of cars as in most Japanese public rail cars. With regard to the questionnaire design, the questions related to "Transfer Distance" and "Ease of Making Transfers", which were omitted after obtaining low performance, should be included again in order to trace the trend. Moreover, the monitored system attributes should not be changed over the years, and the measurement of relative weights of indicators and measures should also be considered. The relative weight of an indicator can be specified by simply asking the respondents to assign a number to each indicator out of 100. Another method would be to utilize choice modelling in estimating the weights by conducting a stated preference survey and asking respondents to rate each indicator from a scalar list.

CONCLUSION
The objective of this paper was to develop a framework for passenger-oriented performance monitoring by using satisfaction and importance ratings obtained from passenger satisfaction surveys. In order to overcome the variations in the number of system attributes, the indicators are first classified and measures are defined accordingly. A Shifted POPIX method is developed to solve the benchmarking problem, permitting a trend analysis by considering each year's importance rates with each year's satisfaction rates in matrix form.
The framework is easy to implement and effective in determining the performance of each indicator and/or measure. Such an approach can provide early warning to management with respect to the predefined thresholds for each indicator/measure.
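One way such an early-warning check could look is sketched below. The threshold values, the default threshold, the function name and the scores are all hypothetical; the paper does not specify thresholds, so this only illustrates comparing 100-point indicator scores against predefined limits.

```python
def flag_low_scores(scores_100, thresholds, default_threshold=50.0):
    """Return, sorted by name, the indicators scoring below their threshold."""
    return sorted(
        name
        for name, score in scores_100.items()
        if score < thresholds.get(name, default_threshold)
    )


# Hypothetical 100-point indicator scores and management thresholds.
ipi100 = {"fare": 41.0, "occupancy rate": 38.5, "travel time": 72.0}
thresholds = {"fare": 45.0, "travel time": 60.0}

warnings = flag_low_scores(ipi100, thresholds)
print(warnings)
```

Indicators without an explicit threshold fall back to the default, so a newly added survey question is still monitored from its first year.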
According to the analyses, Istanbul Ulasim AS (IU) has been operating the rail systems successfully, achieving high performance in travel time and service despite relatively low performance in fare, accessibility and safety-security. In particular, this framework can be easily implemented in monitoring the rail transportation system performance and submitted to IU as a research proposal for consideration.
When the importance scores are excluded from the passenger surveys, the results reflect only the satisfaction and not the overall performance. Satisfaction is critical for understanding public transport from the customer perspective. Passenger evaluations are subjective, however, and a high level of satisfaction does not necessarily indicate a superior system, or vice versa. Given that satisfaction is a relative concept, satisfaction scores should be evaluated within their own context [31]. For the overall performance evaluation of a public transportation system from the user perspective, a single score is crucial. Averaging the satisfaction levels while the number of indicators increases over the years may result in a misleading conclusion. Furthermore, the frequency and consistency of surveys are critically important to determine the presence of low-score measures [32]. Even though the measures are weighted through the passenger perceptions, it is also important to incorporate expert opinions when evaluating the system performance of any public transport mode.
The integrated passenger-oriented performance measurement framework for public rail systems developed in this study provides a theoretical account and an empirical basis for evaluating the operational services, and the framework can also be used by independent organizations for regulation purposes. In addition, benchmarking the performance of IU against the rail system performances of other mega-cities would help to identify the key attributes for improving the performance.

Figure 1 - Istanbul Railway Systems in 2005-2007

All three indicators of the "Accessibility and Transfer" measure were available for the years 2005 and 2006, whereas accessibility to stations was the only indicator for the year 2007. The indicators of the "Comfort" measure vary across all the years. For the year 2005 the indicators were occupancy rate, cleanliness of vehicle and temperature inside vehicle; for the year 2006, vibration level and noise level were added to these. Finally, in 2007, noise level was dropped from the indicators of 2006 and the rest remained the same. The safety indicator was added to station security for the "Safety and Security" measure after the year 2005 and persisted until the end of the analysed period. The most prominent change in terms of measure structure occurred in the "Service" measure, whose indicators changed every year. Politeness and helpfulness of the station staff and information announcements were added in 2006 to the indicators of 2005, and the courtesy and helpfulness of station staff in waiting areas, escalators, automatic vending machines, announcements in trains, announcements at platforms, and signboards and instructions were included as indicators for the year 2007.

Table 1 - Descriptive statistics of customer satisfaction surveys

The questionnaire forms are filled out by the survey personnel. The sample size is 2,006, which is considerable, and some of the demographic and socio-economic characteristics are coherent with previous studies. On average for all the rail systems analyzed, the sample is distributed as follows: 76% of the interviewed passengers are male; 69% of the passengers are employed and 19% are students (Table 1); 59% of the employed are workers and 22% are businessmen. About 61% of the passengers are married, and 61% do not own a car to commute. Nearly all passengers (99%) have attended school; 44% are high school graduates and 25% are university graduates. About 76% of passengers are younger than 35, which is compatible with the Turkish Statistical Institute's labour force participation rate data. Most of the interviewed passengers, around 66%, belong to the low-middle income class, where the household income is less than 1,500 Turkish Lira (TL) (1 TL was 0.743 US$ in 2005, 0.700 US$ in 2006, and 0.769 US$ in 2007).

The metro line began operation in April 2000. The line was operated through 6 stations from Taksim to 4. Levent until 2009. The travel time between these stations is 12 minutes, with a 5-minute headway. Hourly passenger capacity is 70,000 for each direction. The line was recently extended in both directions and now operates between Sishane and Haciosman, with a total length of 16.5 km and 12 stations.

Table 2 - Classifications of measures and indicators (* indicates a common indicator for all years)

Table 3 - MPIs and IPIs of Metro Line

A.S. Kesten, K. S. Öğüt: A New Passenger-Oriented Performance Measurement Framework for Public Rail Transportation Systems

Table 4 - IPI100 and MPI100 values of railway systems

Table 5 - %Shifted POPIX scores for Metro Line