AUTOMATIC PAVEMENT CRACK RECOGNITION BASED ON BP NEURAL NETWORK



INTRODUCTION
China is experiencing rapid growth in highway mileage. Highway safety has become an increasingly urgent concern for pavement management agencies. Cracking is the most common surface distress of asphalt pavements and can be classified into four categories: transversal, longitudinal, blocking and alligator cracks. It is very important to detect cracks in time, in a fast but accurate way, to reduce pavement maintenance cost and extend pavement service life. At present, manual visual inspection and semi-automatic inspection based on video or image acquisition technologies are still the most popular means of pavement surface detection and evaluation. Recently, some studies [1][2][3][4] have tried to develop automatic crack detection models. However, all these models require a large amount of suitable data to ensure the visibility of the crack, which is commonly determined by the crack intensity or the gray scales of the crack pixels. As the crack intensity may not differ appreciably from the image background because of image noise, an effective image processing system is often needed in practice.
The development of fully automatic pavement detection has become an urgent need for pavement management. Over the past two decades, a great many studies have examined pavement image processing algorithms and sought to improve image acquisition conditions. However, fully automatic detection remains a great challenge because of the unsatisfactory accuracy and reliability of test results. Such issues are mainly caused by the complex pavement environment and the lack of a universal image processing methodology. Therefore, a complete image processing framework is proposed in this paper to make automatic crack detection possible. This framework includes two parts: image enhancement and image segmentation. The first part is the foundation of the entire procedure; it strongly influences the complexity of the image segmentation algorithm and also affects the accuracy of crack detection.
For asphalt pavement images, image enhancement usually means denoising and intensity correction. Speckle noise and random noise make up the major part of the image noise. The former comes from irregular pavement stains (e.g. oil spots, dirt and wheel marks) and fragmented pavement distress, while the latter is mainly caused by the pavement surface texture. Although these noises have specific sources, conventional denoising methods can still be used, such as median filtering, Gaussian filtering and mathematical morphology denoising [5][6][7]. Image intensity correction is more complicated. Non-uniform background intensity is a universal phenomenon in pavement images because of artificial lighting, which is commonly used to eliminate the shadows of roadside objects and of the detection vehicle itself, or to support night work. Besides, as a result of line-scan imaging, image combination causes local discontinuities in the image background. These problems weaken the crack information in the image and therefore have to be solved by a specific correction algorithm. Cheng [8] held that a pavement image is made up of three kinds of signals (background, noise and distress) and that the intensity of the background should fluctuate slowly and continuously. Thus, a benchmark background gray scale was determined and the gray scales of all background pixels were corrected to the benchmark scale. The result of Cheng's method seemed good, but how to determine the benchmark scale was left ambiguous. Similarly, Gao [9] proposed a bilinear interpolation method to correct the image background. According to this method, a set of background pixels was constructed by sampling the whole image background. Then the corrected background was obtained by bilinear interpolation and surface fitting based on this set. This method was logically feasible, but how to obtain a typical background set was still a matter for argument. Unlike Cheng's and Gao's methods, Koutsopoulos [10] used multiple images to achieve the background correction, but all images had to be obtained in the same or a very similar environment. Images both with and without distress were required. The average of all images without distress was taken as the background, and the targeted distress was extracted by subtracting the background from the image with distress. This method was easy to understand and fast to compute, but the precondition was relatively hard to meet and the calculation could cause gray scale overflow [11].
After enhancement, the image is segmented to extract the crack information. Two categories of methods can be used here: thresholding and edge detection. Thresholding methods, such as the Otsu method [12], the regression method [13] and the relaxation method [14], look for a threshold that separates the distress pixels from the background pixels. The key to these methods is to find a proper feature parameter to describe the image characteristics. Many statistical parameters can be used, such as the histogram, the mean gray scale and the standard deviation. Besides, maximum entropy and the Fisher criterion can also be used in image segmentation [15,16]. Edge detection methods are based on various operators, such as Roberts, Sobel, Kirsch, Laplacian of Gaussian and Canny. Different operators have different preconditions, so their effects cannot be compared directly. Generally speaking, Sobel and Canny may do better in most cases, especially the 8-Sobel [17,18].
After enhancement and segmentation, the crack information in the image has been extracted from the background as far as possible, and the image is ready for crack recognition. Various feature parameters can be used to classify the images, including crack width or length, cracking area, cracking rate, average curvature of the cracking edge, area ratio of the external rectangle and histogram peak difference [1-5, 13, 19-21]. These parameters can be applied singly or in combination according to the image quality and the desired recognition accuracy. Heuristic methods are a good choice for recognizing pavement images based on feature parameters [22][23][24][25][26]. Among them, the artificial neural network (ANN) is one of the most promising because it is a powerful empirical modelling tool for complex non-linear problems [27][28][29]. The back-propagation neural network (BPNN) is the most widely used network owing to its simplicity and its power to extract useful information from samples [30]. Zhang [31] used a BPNN to forecast the box office revenue of movies. The results showed that the multi-layer BPNN had great potential as a prediction tool. Wong [32] attempted to evaluate the stitching defects of garments using the wavelet transform and a BPNN. The results showed that the BPNN was quite flexible in combination with other methods. Thus, this study also attempts to use a BPNN to classify pavement images with linear and alligator cracks.
The paper is organized as follows: Section 2 provides the framework of the whole image processing procedure; Section 3 introduces the preprocessing methods; Section 4 describes the extraction of the crack information, which includes image screening and segmentation; Section 5 proposes the automatic image recognition based on BPNN; and Section 6 presents the conclusions. As the characteristics of pavement images influence the effect of image processing methods, the discussion in this study is made specific by fixing the image resolution at 832 by 576 pixels, with cracks resolved to an accuracy of 1.5 mm.

Li, Sun, Ning, Tan: Automatic Pavement Crack Recognition Based on BP Neural Network

IMAGE PROCESSING FRAMEWORK FOR CRACK DETECTION
A complete image processing procedure is closely related to the image characteristics, so it should be designed accordingly. An image processing framework for crack detection is proposed in this study, as shown in Figure 1. For pavement crack images, useful information exists only in the region containing cracks, but this region is usually quite small compared with the whole image, especially when the pavement is in good condition. This means that most of the image belongs to the useless background, which should be excluded from processing to improve efficiency. However, interference from the image shooting environment introduces so much noise that it is difficult to distinguish the useful image region from the useless background. Therefore, it is necessary to develop an image preprocessing operation to eliminate environmental interference as much as possible.
After preprocessing, every image is partitioned into a number of working units in the image screening step, and these units are classified according to whether or not they contain cracks. Only the units with cracks go to the next processing stage, image segmentation, which separates the useful crack information from the redundant background.
The noise in the image has been eliminated as far as possible by the above operations. Thus, the image can be recognized in the next step to translate the graphical information into pavement crack data that can be read and understood directly. At the same time, the units without cracks are marked as image background and output as intact pavement data. Finally, the crack data and the intact data are combined to describe the corresponding pavement condition. To make the discussion clear and specific, it is assumed that the gray scale of the cracks in the image is always lower than that of the background.

IMAGE PREPROCESSING
Image preprocessing is generally not a single treatment but a scheme of multiple treatments. The design and sequence of these treatments are closely related to the pavement image features, which can be summarized as follows.
-Feature 1: the image background illumination is not uniform. There are three reasons for this phenomenon. Firstly, the environmental light intensity is uneven, especially under artificial lighting. Secondly, the sensitive units of the image forming component in an image capturing device usually differ in performance. Thirdly, line-scan cameras are now commonly used to acquire pavement cracking images, but the corresponding image matching operation cannot combine images as smoothly and seamlessly as expected, so local discontinuities in the image background still exist.
-Feature 2: the cracking region (the image area containing cracks) accounts for only a small portion of the image, which makes the useful information vulnerable to various interferences.
-Feature 3: the type and direction of the cracks are often irregular and even random.
Comparatively, Feature 1 has the greatest impact because the influence of non-uniform illumination is systematic and inevitable. Feature 2 is universal, especially when the pavement is in good condition. Feature 3 is actually a constraint: the preprocessing treatments must not be specific to crack type or direction. Based on these considerations, the image preprocessing method proposed in this study is composed of three parts, as shown in Figure 2. The background correction addresses Feature 1. The image smoothing and gray scale histogram transformation mainly address Feature 2. None of these treatments depends on crack type or direction, so the requirement of Feature 3 is satisfied. Besides, the whole preprocessing procedure does not depend on the form of the pavement distress, so it is suitable for all kinds of distress.

Image background correction
Based on Cheng's method [8], a simplified algorithm is adopted in this study to correct the image background illumination, as shown in Figure 3. An image is divided into multiple submatrices, and the mean gray scale within each submatrix is taken as its representative gray scale. The size of the submatrix depends on the image quality and the desired processing efficiency as well as the size of the original image. In this study, the submatrix is 30 by 30 pixels. The submatrices are searched by rows and columns. If the gray scale increases or decreases beyond a given threshold, the representative gray scale is recalculated and the average gray scale of the adjacent submatrices becomes the new value. In this study, the threshold is exactly the average gray scale of the adjacent submatrices. Experiments show that the searching sequence has little effect on the processing result. If the image background illumination is uniform, the gray scales of all submatrices should be roughly equal. Thus the background illumination can be corrected by adjusting the gray scales of all submatrices to the same level. This level is the benchmark gray scale of the image background. Here $d_i$ is the difference between the gray scale of submatrix $i$ and the benchmark gray scale, and $i$ is the serial number of a submatrix, determined by the relative position of the submatrix in the image. The median of the gray scales of all submatrices is taken as the benchmark gray scale in this study. Consequently, the background correction can be done simply by calculating $d_i$ for each submatrix and then adjusting the gray scale of every pixel according to the corresponding $d_i$. The correction effect for an example crack image is shown in Figure 4.
The original image in Figure 4 has non-uniform background illumination and obvious stains, as well as seams caused by image matching and combination. The background intensity becomes uniform after correction, while little cracking information is lost. The change of gray scales can be seen in Figure 4(c) and Figure 4(d). A row of pixels in the image is selected randomly to illustrate the difference before and after the background correction. The trend of the gray scale curve reflects the intensity variation in the selected row, and a steep fall indicates that a crack may exist. The overall trend of the curve becomes stable after correction, and the depth of the steep fall does not change significantly. This means that the background correction does not cause much loss of cracking information. Besides, with current image capture technology, most of the cracks visible in pavement images are of medium or severe level, so the slight information loss caused by the background correction can be neglected in practice.
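The block-wise correction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is a plain list of rows of 8-bit gray scales, $d_i$ is taken as the submatrix mean minus the benchmark (the sign convention is assumed), the step that recalculates outlier submatrices from their neighbours is omitted, and the function name `correct_background` is hypothetical.

```python
from statistics import mean, median


def correct_background(image, block=30):
    """Shift every submatrix so that its mean gray scale matches the benchmark."""
    rows, cols = len(image), len(image[0])
    # Top-left corners of the submatrices; blocks at the edges may be smaller.
    corners = [(r, c) for r in range(0, rows, block) for c in range(0, cols, block)]

    def block_pixels(r, c):
        return [image[y][x]
                for y in range(r, min(r + block, rows))
                for x in range(c, min(c + block, cols))]

    # Representative gray scale of each submatrix (its mean).
    means = {rc: mean(block_pixels(*rc)) for rc in corners}
    # Benchmark gray scale: median of the representative gray scales.
    benchmark = median(means.values())

    corrected = [row[:] for row in image]
    for (r, c), m in means.items():
        d = m - benchmark  # d_i; assumed sign: submatrix mean minus benchmark
        for y in range(r, min(r + block, rows)):
            for x in range(c, min(c + block, cols)):
                # Clip to the 8-bit range to avoid gray scale overflow.
                corrected[y][x] = max(0, min(255, round(image[y][x] - d)))
    return corrected


# Toy image: a darker left half and a brighter right half.
toy = [[100, 100, 140, 140] for _ in range(4)]
flat = correct_background(toy, block=2)
```

Because every pixel in a submatrix is shifted by the same $d_i$, local contrast inside a submatrix (and hence the depth of a crack relative to its surroundings) is preserved, which is consistent with the observation that the steep falls in Figure 4 survive the correction.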

Image smoothing
The background correction eliminates most of the noise caused by non-uniform illumination and other interference, but some random noise remains, which may affect the result of image segmentation. Thus, Gaussian smoothing is used in this study for further denoising. Taking the corrected image in Figure 4(b) as an example, the effect of Gaussian smoothing can be examined in Figure 5. The image histogram curve after background correction is usually unimodal. After smoothing, the peak value of the curve increases slightly while the tail of the curve is almost unchanged. This means that the gray distribution in the image is more concentrated while the low gray range containing cracks is retained, which favours image segmentation.

Histogram transformation
Histogram transformation is the last step of image preprocessing and is used to enhance the cracking information and weaken the useless background. The proportion of crack pixels in an image is usually less than 50%, even in images of pavement patching. As the crack pixels usually have low gray scales, the effective gray range for pavement distress analysis is [0, P), where P is the gray scale at which the histogram curve peaks, as shown in Figure 6. Accordingly, [P, 255] is the invalid gray range and pixels in this range can be neglected.
In the transformation, I is the gray scale before transformation; P is the gray scale at which the histogram curve peaks before transformation; and Inew is the gray scale after transformation.
The cracking information of the pavement image has been enhanced as far as possible after preprocessing. However, as the proportion of the cracking area in an image is quite small, many pixels still represent only background and noise, and these should be involved in the calculation as little as possible. Therefore, image pixels can be classified into two categories, cracked and uncracked. This operation is based on image partition, which divides an image into several processing units and has the following benefits:
-Partly separating the cracked regions from the uncracked ones, so that fewer pixels are involved in image segmentation and the processing efficiency is improved.
-Decreasing the probability that disconnected cracks appear in the same region, so that identical algorithms can be used to process different image units.
-Narrowing the processing area to reduce the influence of noise.
The size of the unit depends on both the size of the original image and the processing accuracy to be achieved. It can be determined through pilot calculation or experience, considering the image resolution and quality.

Units classification
Only units containing cracks need to be processed further, whereas the other units are marked and classified as image background. Thus, the issue is how to determine whether a unit contains cracks. The light intensity of the units has become approximately even after preprocessing. However, a unit with cracks still contains some pixels with low gray scales, as shown in Figure 8, while the intensity of an uncracked unit is more uniform. The range of gray scales differs between these two types of units, so the extreme difference can be used to sort them. The extreme difference is the difference between the maximum and the minimum gray scales of an image unit. A threshold on the extreme difference is used to judge whether a unit is cracked, and this threshold can be determined by pilot calculation. In this study, the threshold is set to 130, as shown in Figure 8. Although determining the threshold involves manual intervention, the method is still quite efficient in practice. As images after preprocessing have a uniform background and similar gray scales, the threshold does not fluctuate much as the pavement environment changes.
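The screening step can be illustrated as follows. This is a sketch under assumptions rather than the procedure actually coded in the study: the unit size is left as a free parameter because the text does not state it, a unit is taken as cracked when its extreme difference is strictly greater than the threshold (the comparison direction at equality is assumed), and the helper names are hypothetical.

```python
def split_into_units(image, unit):
    """Yield (row, col, pixels) for each unit-by-unit tile of the image."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows, unit):
        for c in range(0, cols, unit):
            pixels = [image[y][x]
                      for y in range(r, min(r + unit, rows))
                      for x in range(c, min(c + unit, cols))]
            yield r, c, pixels


def classify_units(image, unit, threshold=130):
    """Map each unit's top-left corner to True (cracked) or False (background)."""
    return {(r, c): (max(p) - min(p)) > threshold  # extreme difference test
            for r, c, p in split_into_units(image, unit)}


# Toy image with two units: the right one contains a single dark pixel.
toy = [[200, 200, 200, 200],
       [200, 200, 200, 40]]
labels = classify_units(toy, unit=2)
```

Only the units labelled `True` would be passed on to segmentation; the others are reported as intact pavement.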

Image segmentation
For images after preprocessing and screening, the Otsu method can obtain good segmentation results. It is assumed that a threshold $t$ divides an image into foreground and background. The proportion of pixels belonging to the foreground is $w(t)$, while that of the background is $1 - w(t)$. The corresponding average gray scales are $u_1(t)$ and $u_2(t)$, and the overall average gray scale $u$ of the whole image is given by Equation (2).

$$u = w(t)\,u_1(t) + [1 - w(t)]\,u_2(t) \qquad (2)$$

The variance between the foreground and the background is $g(t)$. The threshold corresponding to the maximum $g(t)$ is the final threshold $t^*$, as shown in Equation (3) and Equation (4), which separates the foreground from the background to the greatest extent.

$$g(t) = w(t)\,[u_1(t) - u]^2 + [1 - w(t)]\,[u_2(t) - u]^2 = w(t)\,[1 - w(t)]\,[u_1(t) - u_2(t)]^2 \qquad (3)$$

$$t^* = \arg\max_{t}\, g(t) \qquad (4)$$
The segmentation result of the Otsu method is shown in Figure 9, using a linear and an alligator crack image as examples. The result for the alligator crack image seems better, because it has a higher proportion of crack pixels and is thus better able to suppress noise interference.
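For readers who want to trace the threshold search numerically, a compact sketch of the Otsu criterion is given here. It evaluates the between-class variance in its product form, $w(t)[1 - w(t)][u_1(t) - u_2(t)]^2$, for every candidate threshold. The assumptions are that gray scales are integers in 0 to 255, that pixels at or below the threshold are the dark (crack) class, and that crack pixels are written as 1 in the binary output; the function names are illustrative only.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t* that maximises the between-class variance g(t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * n for level, n in enumerate(hist))

    best_t, best_g = 0, -1.0
    count_low, sum_low = 0, 0
    for t in range(levels - 1):
        # Pixels with gray scale <= t form the dark class.
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, so g(t) is undefined
        w = count_low / total
        u1 = sum_low / count_low
        u2 = (total_sum - sum_low) / count_high
        g = w * (1 - w) * (u1 - u2) ** 2
        if g > best_g:  # strict comparison keeps the first maximiser
            best_t, best_g = t, g
    return best_t


def segment(image):
    """Binarise an image: 1 for dark (crack) pixels, 0 for background."""
    t = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= t else 0 for p in row] for row in image]


# Toy unit: a dark crack row on a bright background.
toy = [[200, 200, 200], [20, 20, 20], [200, 200, 200]]
binary = segment(toy)
```

Running the search over a cumulative histogram keeps the cost linear in the number of gray levels, which matters when the same routine is applied to many image units.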

AUTOMATIC IMAGE RECOGNITION BASED ON BP NEURAL NETWORK

Procedure of image recognition
Image recognition translates the segmented image into pavement data that can be understood and calculated directly. The key to image recognition is the selection of feature parameters, which should describe the morphological characteristics of pavement cracks reasonably and exclusively. The feature parameters are relevant not only to the characteristics of pavement cracks but also to the sequence of crack detection. The image recognition procedure proposed in this study is shown in Figure 10.
Asphalt pavement cracks can generally be divided into four categories: transversal, longitudinal, blocking and alligator cracks. A blocking crack usually extends over a large area. Because one pavement image has limited coverage, however, a blocking crack often appears as a transversal or longitudinal crack. Thus, the pavement cracks considered for image recognition do not include the blocking crack. To reduce the complexity of the feature parameters, a binary tree is used to classify the different kinds of cracks, as shown in Figure 10. The first step distinguishes linear cracks from alligator cracks, and the next divides linear cracks into transversal and longitudinal cracks. Thus, there are only two sets of feature parameters and each set classifies only two kinds of cracks.

Feature extraction for BPNN inputs
Although the image processing above has eliminated interference as far as possible, it is still rather difficult to completely distinguish linear and alligator cracks in an image. Thus a BPNN is used for its great classification capacity. In this study, two feature parameters are chosen as BPNN inputs, and both take values in [0, 1].
(1) Minimum external rectangle (MER) of the crack's major part
The coverage shapes of linear and alligator cracks on the pavement are significantly different, so the MER of the entire cracking area can be used to distinguish them. For linear or alligator cracks, the ratio of the edge lengths (short edge to long edge) of the MER should be close to 0 or 1, respectively. However, it is hard to determine the MER accurately because there are always many fractures along the crack curve as well as some noise. Although a crack connection operation can simplify this problem, the connection process is quite time-consuming because the calculation involves pixel-by-pixel operation and crack direction determination. A simplified method is therefore proposed in this study to find the MER. It is assumed that the global feature of a crack can be represented by its major part, so the MER of the entire cracking area can be represented by the MER of its major part. Specifically, the major part of a crack is defined as the maximum connected curve or cracking area, that is, the one with the most pixels among all the connected curves or cracking areas. The major part of a crack is a set of pixels that are all adjacent to each other, so it can be considered as a whole. Besides, it includes most of the neighbouring pixels of a crack, so it has the greatest possibility of representing the original crack.
(2) Cracking rate (CR) of the image
The MER reflects the morphological feature of the cracks. A numerical feature can also be used to describe the difference between these two types of cracks, namely the CR of the whole image. In the strict sense, CR is the proportion of the pixels representing cracks in the image. However, there is still some random noise in the segmented image, which cannot be precisely distinguished from the cracks at the pixel level. In spite of this, CR can still be used here because the noise is of the same order of magnitude in all segmented images. A small amount of random noise can therefore be considered not to affect the classification result in this work.
The feature parameters of linear and alligator cracks can thus be constructed as Equation (5), the pair $\{S_i, C_i\}$, where $i$ is the serial number of the image; $S_i$ is the ratio of the edge lengths (short edge to long edge) of the MER based on the major part of the crack; and $C_i$ is the CR of the image.
$S_i$ can be calculated in four steps. Firstly, label the different parts of the crack with serial numbers; each part is a connected curve or cracking area. Secondly, search for the major part, which has the most pixels. Thirdly, find the MER of the major part. Lastly, determine the short and long edges of the MER and calculate $S_i$. $C_i$ can be obtained by simply calculating the proportion of the pixels with gray scale equal to 1.
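The four steps for $S_i$ and the pixel count for $C_i$ can be sketched as below. Several points are assumptions made only for illustration: crack pixels are those equal to 1 in the segmented image, connectivity is taken as 8-neighbour, and the MER is approximated by the axis-aligned bounding rectangle of the major part (the text does not say whether the rectangle may be rotated). The function names are hypothetical.

```python
from collections import deque


def connected_parts(binary):
    """Group crack pixels (value 1) into 8-connected parts of (row, col) pairs."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    parts = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, part = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                part.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            parts.append(part)
    return parts


def crack_features(binary):
    """Return (S_i, C_i) for one segmented image."""
    parts = connected_parts(binary)
    if not parts:
        return 0.0, 0.0
    major = max(parts, key=len)  # the part with the most pixels
    ys = [y for y, _ in major]
    xs = [x for _, x in major]
    # Axis-aligned bounding rectangle, used here as a stand-in for the MER.
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    s = min(height, width) / max(height, width)
    # Cracking rate: share of pixels equal to 1, noise pixels included.
    c = sum(len(p) for p in parts) / (len(binary) * len(binary[0]))
    return s, c


# Toy image: an 8-pixel horizontal crack plus one isolated noise pixel.
toy = [[0] * 10 for _ in range(5)]
toy[0][0] = 1
for x in range(1, 9):
    toy[2][x] = 1
s_i, c_i = crack_features(toy)
```

An axis-aligned rectangle would overestimate $S_i$ for a diagonal linear crack, so a rotated minimum-area rectangle may be closer to what the study intends; that variant is not shown here.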

Structure of BPNN
The design of the BPNN structure is difficult and complex, especially the choice of the number of hidden layers and the number of nodes in each. Lippmann [33] and Cybenko [34] claimed that any form of classification problem could be solved by two hidden layers in the BPNN. Karsoliya [35] suggested that two or three hidden layers should be used for problems involving an arbitrary decision boundary to arbitrary accuracy with rational activation functions. Boger [36] proposed that the number of hidden layer nodes be 2/3 (or 70% to 90%) of the size of the input layer, while Berry [37] estimated that the number of hidden layer nodes should be less than twice the number of nodes in the input layer. Although several methods have been used so far, no exact formula has been provided for calculating the number of hidden layers or the number of nodes. Many factors contribute to the size of the hidden layer, such as the input and output layers, the complexity of the activation function, the neural network architecture, the training algorithm and, most importantly, the training sample database.
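To make the structure adopted in this study concrete (two input nodes, ten hidden nodes and two output nodes), a small back-propagation network is sketched here. It is not the authors' code. The sigmoid activation, squared-error loss, learning rate, weight initialisation, one-hot target coding and number of epochs are all assumptions, and of the four training samples only the first two feature pairs are values reported in the paper; the other two are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNN:
    """Three-layer feed-forward network trained by back-propagation."""

    def __init__(self, n_in=2, n_hidden=10, n_out=2, seed=0):
        rng = random.Random(seed)
        # One row of weights per node; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, o

    def train_step(self, x, target, lr=0.5):
        h, o = self.forward(x)
        # Output deltas for squared error with sigmoid activation.
        d_out = [(o_k - t_k) * o_k * (1 - o_k) for o_k, t_k in zip(o, target)]
        # Hidden deltas: output deltas propagated back through w2.
        d_hid = [h_j * (1 - h_j) * sum(d_out[k] * self.w2[k][j] for k in range(len(o)))
                 for j, h_j in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v

    def loss(self, samples):
        return sum((o_k - t_k) ** 2
                   for x, t in samples
                   for o_k, t_k in zip(self.forward(x)[1], t)) / len(samples)

    def classify(self, x):
        o = self.forward(x)[1]
        return "linear" if o[0] >= o[1] else "alligator"


# Inputs are [S_i, C_i]; targets are one-hot: [1, 0] linear, [0, 1] alligator.
SAMPLES = [
    ([0.1090, 0.0048], [1.0, 0.0]),  # linear crack, values reported in the paper
    ([0.7883, 0.0632], [0.0, 1.0]),  # alligator crack, values reported in the paper
    ([0.15, 0.01], [1.0, 0.0]),      # hypothetical linear sample
    ([0.85, 0.08], [0.0, 1.0]),      # hypothetical alligator sample
]

net = BPNN(seed=0)
loss_before = net.loss(SAMPLES)
for _ in range(2000):
    for features, target in SAMPLES:
        net.train_step(features, target)
loss_after = net.loss(SAMPLES)
```

The `classify` method plays the role of the classifier that translates the two network outputs into a crack type; in practice the training would use the 60/20/20 split of real images rather than a handful of toy points.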
In this study, a BPNN with three layers is used to implement crack recognition based on the theories above. The input and output layers both have two nodes, and the hidden layer has ten nodes, chosen according to experiments and previous studies. The basic structure of the BPNN system is shown in Figure 12. The output of the BPNN is sent to a classifier, which translates it into the classification data of each input.

Effects of image recognition based on BPNN
A collection of 400 pavement images obtained by the Automatic Road Analyzer (ARAN) in Northern China was used as the BPNN data set. Half of the images showed linear cracks and half alligator cracks. Among these images, 60% were chosen as the training set, 20% as the validation set and the remaining 20% as the test set. The three sets were independent and determined randomly. The classification result for the test set is listed in Table 1. The alligator crack image has a higher rate of correct classification because it is less sensitive to noise than the linear crack image. The total classification accuracy reached 92.5%, which means that the feature parameter in Equation (5) is effective and the BPNN structure in Figure 12 is reasonable.
The linear crack can be further divided into two categories: transversal crack and longitudinal crack. These two types of cracks differ mainly in the direction of the crack curve, so the direction angle can be used as the feature parameter. As shown in Figure 13, the angle between the axis of the crack curve and the horizontal direction is defined as the direction angle in this paper. If the direction angle falls in the marked region in Figure 13, the crack is transversal; otherwise it is longitudinal.
The first step in calculating the direction angle is to locate the axis of the crack curve. However, the crack curve is usually irregular and even fractured, and the crack width is often non-uniform as well. Thus, it is difficult to determine the axis accurately. To solve this problem, the assumption made for the MER calculation can also be used here, namely that the major part of a crack can represent the entire crack. Accordingly, the direction angle of the major part is taken as the final feature parameter. For the image in Figure 11(a), the direction angle is 172°, as shown in Figure 14, so it is a transversal crack.

Classification effect of the direction angle
As there is a clear difference in range between the direction angles of the different types of linear cracks, the value of the direction angle can be used directly to classify linear cracks. Twenty-five transversal crack images and twenty-five longitudinal crack images obtained by ARAN in Northern China were selected randomly to test this classification method. The result is listed in Table 2. The transversal crack seems to have a higher rate of correct classification, perhaps because the transversal cracks in the pavement images used in this study are more severe.
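One way to obtain a direction angle from the major part of a crack is to take the principal axis of its pixel coordinates, as sketched here. This is an assumption-laden illustration: the paper does not state how the axis is located, so the use of second-order central moments is a substitute technique, and the boundaries of the marked region in Figure 13 are not given in the text, so the rule "within 45° of the horizontal means transversal" is assumed. It is at least consistent with the 172° example being transversal.

```python
import math


def direction_angle(pixels):
    """Angle in degrees, in [0, 180), between the principal axis and the horizontal.

    `pixels` is a list of (row, col) pairs belonging to the major part.
    """
    n = len(pixels)
    mean_y = sum(y for y, _ in pixels) / n
    mean_x = sum(x for _, x in pixels) / n
    # Second-order central moments. The row offset is negated so that angles
    # are measured counter-clockwise although image rows grow downwards.
    mu20 = sum((x - mean_x) ** 2 for _, x in pixels)
    mu02 = sum((y - mean_y) ** 2 for y, _ in pixels)
    mu11 = sum((x - mean_x) * (mean_y - y) for y, x in pixels)
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return math.degrees(theta) % 180.0


def classify_linear_crack(angle, half_width=45.0):
    """Assumed rule: near-horizontal cracks are transversal, the rest longitudinal."""
    if angle <= half_width or angle >= 180.0 - half_width:
        return "transversal"
    return "longitudinal"


horizontal = [(5, x) for x in range(10)]  # crack running along one image row
vertical = [(y, 3) for y in range(10)]    # crack running along one image column
kind_h = classify_linear_crack(direction_angle(horizontal))
kind_v = classify_linear_crack(direction_angle(vertical))
```

Because the moments are computed over all pixels of the major part, the estimate tolerates a non-uniform crack width and small irregularities better than fitting a line through two end points would.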

Comparison with other crack recognition methods
The rates of correct classification for alligator, transversal and longitudinal cracks are 97.5%, 100% and 88.0%, respectively. Compared with some crack recognition methods used in previous studies [1,13,19,21,38,39], the method proposed in this paper is effective for all three kinds of cracks and the results are quite acceptable. The methods in previous studies have drawbacks of different degrees, as shown in Table 3, which can be summarized as follows:
-The method is specific and works only for certain kinds of crack.
-The method is too complex to run fast and requires high computing power.
-The classification result is too simple to use in practice, or the number of image samples is very small.
The methods of Fukuhara [19] and Li [39] could only judge whether a crack existed, and the results were not good enough. Apart from the PNN and the Logit model, the other methods worked well only for particular types of cracks. The PNN seemed to give acceptable classification results, but its hidden layer had 60 nodes, which made it too complex to use in practice. The problem with the Logit model is that the number of image samples was too small for the result to be credible.
Comparatively speaking, the crack recognition methods proposed in this paper have balanced classification results and the calculation is simple enough for practical engineering application.

CONCLUSION
To detect cracks automatically from pavement images based on BPNN, a complete image processing framework is proposed in this study to eliminate the influence of environmental interference on pavement images. The framework includes image preprocessing and crack information extraction. Image preprocessing corrects the non-uniform background and reduces noise as far as possible. Crack information extraction is achieved in two steps: image screening and image segmentation. The former divides an image into several processing units to improve working efficiency and simplify the image processing algorithms, while the latter separates the crack from the useless image background. For images after preprocessing, both the extreme difference method based on pilot calculation and the Otsu method perform well. Besides, although the preprocessing operation is applied only to cracking information in this study, it is applicable to all kinds of pavement distress because it does not depend on the distress form.
For processed crack images, a BPNN with three layers is proposed to distinguish linear crack images from alligator crack images. According to previous studies and actual experiments, the number of nodes in the hidden layer is fixed at ten. The input layer and the output layer both have two nodes. A data set of 400 images obtained by ARAN in Northern China is used to evaluate the BPNN model, and the result shows that the BPNN in this study can classify linear and alligator crack images acceptably and reliably. Furthermore, linear crack images can be divided into transversal and longitudinal cracks according to the direction angle. Fifty images are used to test this method, and the result shows that the direction angle performs well in classifying linear cracks. The rates of correct classification for alligator, transversal and longitudinal cracks are 97.5%, 100% and 88.0%, respectively. Compared with some previous studies, the method proposed in this paper is effective for all three kinds of cracks and the results are quite acceptable for engineering application.

Figure 1 - Crack detection framework based on image processing

Figure 5 - Histogram curve before and after image smoothing

Taking Figure 4(b) as an example, the result after Gaussian smoothing and histogram transformation is shown in Figure 7. Now the preprocessing is finished and the image is ready for segmentation.

Figure 7 - The image after preprocessing

Figure 9 panels: (a) Original linear crack image; (b) Linear crack image after segmentation; (c) Original alligator crack image; (d) Alligator crack image after segmentation

Still taking the images in Figure 9(b) and Figure 9(d) as examples, their major parts and corresponding MERs are shown in Figure 11, and their feature parameters are {0.1090, 0.0048} and {0.7883, 0.0632}, respectively.

Figure 11 - MER of the major part of the cracks: (a) Major part of the linear crack, ratio of edge lengths 0.1090 (close to 0); (b) Major part of the alligator crack, ratio of edge lengths 0.7883 (close to 1)

Figure 12 - BPNN structure

Figure 13 - Definition of the direction angle

Table 1 - Classification result by BPNN

Table 2 - Classification result by direction angle

Table 3 - Some crack recognition methods used in previous studies. Abbreviations: INN, image-based neural network; HNN, histogram-based neural network; PNN, proximity-based neural network; NIS, number of image samples; SVM, support vector machine