0 INTRODUCTION
During the lifting operation of port cranes, it is necessary to position
the grab boom at the designated position, accurately control the angle
and direction of the boom, and enable it to complete the task of
grasping and transporting objects. If the position or angle deviation of
the crane’s grab boom is too large, it can lead to low efficiency,
inaccurate operation, and major safety accidents caused by operational
errors. Therefore, by using image processing technology to detect the
corner coordinates of the crane’s grab boom, the position and direction
of the crane’s grab boom can be monitored in real-time, and whether the
boom is at the correct angle and position can be determined, thereby
ensuring efficient and accurate alignment between the crane’s grab boom
and the segment beam body lifting hole. At the same time, the detection
of the corner coordinates of the crane’s grab boom also lays the
foundation for the autonomous control and intelligent application of the
crane[1-2]. By detecting and analyzing the coordinates of corner
points, the crane can achieve automated control and intelligent
decision-making, thereby further improving the operational efficiency
and safety of the crane.
Corner detection algorithms aim to detect all possible planar coordinate
information of corners in an image. When useful information is
extracted from an image through feature extraction, edge
detection, and other methods, some useful information is inevitably
lost or unnecessary interference is introduced. Corner detection
algorithms usually take a global perspective and detect all
possible corner coordinates in the input image. Therefore,
this series of corner detection algorithms cannot directly provide
corner coordinates for certain specific positions. Domestic and foreign
scholars have conducted extensive research on corner detection,
including Harris corner detection [3], Shi-Tomasi corner detection [4]
(which builds on Harris corner detection), FAST corner detection [5-6],
and the SIFT feature detection algorithm [7-9].
These corner detection algorithms are suitable for matching in massive
feature databases. However, the above literature mainly focuses on
detecting all corners in the image, so these methods cannot be used
directly for corner localization at certain specified positions.
Semantic segmentation [10-14] based on deep learning can segment
images at the pixel level, achieving precise and accurate segmentation
of images, and distinguishing different objects, object boundaries, and
backgrounds in the image. Semantic segmentation can automatically
segment the crane’s grab boom, but these methods often struggle to
accurately recognize and segment different objects when dealing with
complex scenes and small targets. At
the same time, semantic segmentation methods based on deep learning
must process large amounts of data and need substantial
computational resources, which increases costs. Before such
methods can be applied, it is usually necessary to annotate a large
number of high-quality images and train on them. The time cost
of collecting images and annotating data is extremely high. If the
quality of annotation and image acquisition is not good, it will lead to
low accuracy of the trained model and even complete segmentation
failure. Compared with semantic segmentation, the Otsu algorithm is suitable
for most images and can quickly and accurately classify image pixels into
foreground and background categories. It does not require prior
information and has good robustness to noise. Ashish et al. [15]
introduced an optimal multi-level 3D Otsu image thresholding technique
and proposed a 1-D-Otsu thresholding method based on the cuttlefish
algorithm (CFA) to reduce noise and weak edge effects, optimizing the
traditional Otsu algorithm for color image segmentation. Jiqing Chen
et al. [16] proposed a navigation path extraction method for greenhouse
cucumber harvesting robots using a predicted-point Hough transform. A new
grayscale factor was used for image segmentation, and the
predicted-point Hough transform was then used to fit the navigation path. The
calculation time of this method was 35.20 ms shorter than that of the
traditional Hough transform, but the grayscale factor in this method is
prone to image oversegmentation. Ziwen Chen et al. [17] proposed a
vegetable crop extraction method based on an automatic Hough transform
accumulation threshold. The image was segmented using a component
of the Lab color space that is independent of illumination, and the
feature points of crop rows were extracted using a dual-threshold
segmentation and vertical projection method. Finally, the points in the
accumulator were clustered with the k-means method into as many classes
as there are crop rows. This method provides a certain
basis for solving the robustness and adaptability problems of algorithms
under multiple environmental variables, but the accuracy of its line
fitting needs to be improved.
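The Otsu thresholding discussed above can be illustrated with a minimal sketch. It assumes 8-bit grayscale values supplied as a flat list of integers; the function name `otsu_threshold` and its parameters are illustrative and are not taken from the cited works. The threshold is chosen to maximize the between-class variance of the two resulting pixel classes.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    gray_values is assumed to be a flat list of ints in range(levels).
    """
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1

    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the class at or below t
    sum0 = 0  # intensity sum of that class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

When the histogram is clearly bimodal, the returned threshold falls between the two modes, which is why a bimodal foreground and background histogram suits this method.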
The application scenarios and problems addressed in the above literature
differ from the precise calibration and alignment of the crane’s grab
boom and segmental beam body lifting holes. This article provides a new
approach to corner positioning: the intersection
coordinates of fitted straight lines are calculated to locate three corner
points on two sides of the crane’s grab boom. These three
corner coordinates are then used to fit a plane and determine the position and
direction of the crane’s grab boom in space.
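The two geometric steps named above can be sketched as follows. This is only an illustration under stated assumptions: fitted lines are assumed to be given in Hough normal form (rho, theta), and the three corners are assumed to be already available as 3D coordinates, which this section does not specify how to obtain. The function names `intersect_lines` and `plane_from_points` are hypothetical.

```python
import math


def intersect_lines(line_a, line_b, eps=1e-9):
    """Intersect two lines given in normal form (rho, theta).

    Each line satisfies x*cos(theta) + y*sin(theta) = rho.
    Returns (x, y), or None when the lines are (nearly) parallel.
    """
    rho1, th1 = line_a
    rho2, th2 = line_b
    a1, b1 = math.cos(th1), math.sin(th1)
    a2, b2 = math.cos(th2), math.sin(th2)
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    # Cramer's rule for the 2x2 linear system.
    x = (rho1 * b2 - rho2 * b1) / det
    y = (a1 * rho2 - a2 * rho1) / det
    return (x, y)


def plane_from_points(p0, p1, p2, eps=1e-12):
    """Plane through three 3D points as (unit_normal, d), with n.p + d = 0.

    Returns None when the points are (nearly) collinear.
    """
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    # The cross product of two in-plane vectors is normal to the plane.
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm < eps:
        return None
    n = tuple(c / norm for c in n)
    d = -sum(n[i] * p0[i] for i in range(3))
    return n, d
```

Intersecting the fitted edge lines pairwise gives the corner candidates, and the unit normal of the resulting plane describes the boom’s orientation, while `d` fixes its offset from the origin.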
Firstly, in the image preprocessing stage, a grayscale difference map is
constructed from the R and G channels of the RGB color space. The
resulting difference grayscale map avoids
oversegmentation and undersegmentation of the target object and makes the
grayscale histogram of the foreground and background bimodal,
which is conducive to Otsu threshold segmentation. Morphological opening
and closing operations are then used to remove small impurities from the
Canny edge detection results. Secondly, in the edge line detection and fitting
process for the crane’s grab boom, this paper proposes an optimal
adaptive threshold determination method that screens the clustering
results by number of votes to eliminate interfering straight lines, and then
improves the clustering centroid calculation by using weight
formulas based on each line’s proportion of votes; the weighted centroid
replaces the original clustering centroid as the basis for line fitting. Finally,
in the corner detection and plane fitting process, the coordinates of
the three corner points of the crane’s grab boom are calculated, and the
plane information is determined using the three corner point
coordinates. The research results of this article provide a
methodological basis for solving the algorithm accuracy and robustness
problems of port cranes under multiple environmental variables.
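As a rough illustration of the first two stages, the sketch below builds an R and G difference map and computes a vote-weighted centroid for one cluster of Hough lines. The exact difference formula, the adaptive threshold rule, and the weight formula are not given in this section, so the absolute difference |R - G|, the fixed cutoff `min_votes`, and weights proportional to vote counts are all assumptions made here for illustration; the function names are hypothetical.

```python
def rg_difference(pixels):
    """Map (R, G, B) pixels to a difference gray level.

    Assumption: the difference map is |R - G|, which stays in 0..255
    for 8-bit channels.
    """
    return [abs(r - g) for r, g, _b in pixels]


def weighted_line(cluster, min_votes):
    """Vote-weighted centroid of one cluster of Hough lines.

    cluster is a list of (rho, theta, votes) tuples. Lines with fewer
    than min_votes votes are treated as interference and dropped
    (a stand-in for the adaptive threshold, which is not specified here).
    Returns (rho, theta), or None if no line survives the screening.
    Note: theta is averaged linearly, so clusters that straddle the
    angular wrap-around are not handled.
    """
    kept = [(rho, theta, votes) for rho, theta, votes in cluster
            if votes >= min_votes]
    total = sum(votes for _, _, votes in kept)
    if total == 0:
        return None
    rho_c = sum(rho * votes for rho, _, votes in kept) / total
    theta_c = sum(theta * votes for _, theta, votes in kept) / total
    return (rho_c, theta_c)
```

Compared with a plain mean, weighting by votes pulls the fitted line toward the candidates with the strongest edge support, so a weak outlier in the cluster has little effect on the result.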