【Beginner's Deep-Dive Tutorial 1.30】Step-by-Step Open3D (13): Object Detection and Distance Measurement with Stereo Vision (YOLOv8, Python Code)
1. Abstract
In this article, we use stereo vision to perform per-pixel depth estimation, and combine it with YOLOv8 to detect objects and measure their distance. Using the block matching (BM) and semi-global block matching (SGBM) algorithms, we show how stereo computer-vision techniques can accurately recover depth information. The block-matching algorithm efficiently establishes correspondences between the stereo camera images, while SGBM refines the disparity-map estimation.
The algorithm pipeline is:
- Read each pair of left and right images
- Compute the disparity map
- Build the depth map
- Run YOLOv8 object detection
- Display each object's distance and bounding box
def pipeline(left_image, right_image, object_class):
    """
    Performs a pipeline of operations on stereo images to obtain a colored disparity map, RGB frame, and colored depth map.
    Input:
        - left_image: Left stereo image (RGB format)
        - right_image: Right stereo image (RGB format)
        - object_class: List of object classes of interest for bounding box retrieval
    Output:
        - disparity_map_colored: Colored disparity map (RGB format)
        - frame_rgb: RGB frame
        - depth_map_colored: Colored depth map (RGB format)
    """
    global focal_length
    # Calculate the disparity map
    disparity_map = compute_disparity(left_image, right_image, num_disparities=90, block_size=5, window_size=5,
                                      matcher="stereo_sgbm", show_disparity=False)
    # Calculate the depth map
    depth_map = calculate_depth_map(disparity_map, baseline, focal_length, show_depth_map=False)
    # Get bounding box coordinates for specified object classes
    bbox_coordinates = get_bounding_box_center_frame(left_image, model, names, object_class, show_output=False)
    # Calculate colored disparity map, RGB frame, and colored depth map
    disparity_map_colored, frame_rgb, depth_map_colored = calculate_distance(bbox_coordinates, left_image, depth_map, disparity_map, show_output=False)
    return disparity_map_colored, frame_rgb, depth_map_colored
2. Dataset
This project uses the KITTI dataset, specifically its stereo camera images. The dataset provides a comprehensive 3D object detection benchmark with 7,481 training images and 7,518 test images. The images come with corresponding point clouds, for a total of 80,256 labeled objects available for analysis. In addition, the inference stage of the project uses KITTI's synchronized and rectified raw data, which further improves the accuracy and reliability of the results. This rich dataset enables a thorough exploration and evaluation of the proposed method and ensures the robustness of the project's results.
4. Applying Simple Stereo Vision
Now let's try to estimate object distances using the KITTI dataset. To do this, we first need to understand the camera configuration on the autonomous vehicle. Fortunately, KITTI provides a very detailed description of the camera positions and baselines, as shown below.
When downloading the 3D object detection dataset, we get the following files:
- Left images
- Right images
- Calibration data
- Labels
4.1 Extracting the Projection Matrices
Below is an example of the calibration data. Since we are working with the left and right color images, we need to extract P2 and P3 to obtain the projection matrices.
P0: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 0.000000000000e+00 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 4.485728000000e+01 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.163791000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.745884000000e-03
P3: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.395242000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.199936000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.729905000000e-03
R0_rect: 9.999239000000e-01 9.837760000000e-03 -7.445048000000e-03 -9.869795000000e-03 9.999421000000e-01 -4.278459000000e-03 7.402527000000e-03 4.351614000000e-03 9.999631000000e-01
Tr_velo_to_cam: 7.533745000000e-03 -9.999714000000e-01 -6.166020000000e-04 -4.069766000000e-03 1.480249000000e-02 7.280733000000e-04 -9.998902000000e-01 -7.631618000000e-02 9.998621000000e-01 7.523790000000e-03 1.480755000000e-02 -2.717806000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01
Note that we obtain the left and right projection matrices:
Left projection matrix
[[ 721.54 0 609.56 44.857]
[ 0 721.54 172.85 0.21638]
[ 0 0 1 0.0027459]]
Right projection matrix
[[ 721.54 0 609.56 -339.52]
[ 0 721.54 172.85 2.1999]
[ 0 0 1 0.0027299]]
R0 rectification matrix
[[ 0.99992 0.0098378 -0.007445]
[ -0.0098698 0.99994 -0.0042785]
[ 0.0074025 0.0043516 0.99996]]
LiDAR-to-camera matrix
[[ 0.0075337 -0.99997 -0.0006166 -0.0040698]
[ 0.014802 0.00072807 -0.99989 -0.076316]
[ 0.99986 0.0075238 0.014808 -0.27178]]
IMU-to-LiDAR matrix
[[ 1 0.00075531 -0.0020358 -0.80868]
[ -0.0007854 0.99989 -0.014823 0.31956]
[ 0.0020244 0.014825 0.99989 -0.79972]]
Next, we need to extract the intrinsic and extrinsic matrices from our projection matrices. Recall that a projection matrix factors as P = K [R | t], where K is the intrinsic (camera) matrix, R the rotation matrix, and t the translation vector.
We will use OpenCV's decomposeProjectionMatrix function:
def decompose_projection_matrix(projection_matrix):
    """
    Decompose a projection matrix into a camera matrix, rotation matrix, and translation vector.
    Args:
        projection_matrix (numpy.ndarray): 3x4 projection matrix.
    Returns:
        tuple: The decomposed components:
            - camera_matrix (numpy.ndarray): Camera matrix [ fx 0 cx; 0 fy cy; 0 0 1 ].
            - rotation_matrix (numpy.ndarray): Rotation matrix.
            - translation_vector (numpy.ndarray): Translation vector.
    """
    # Decompose the projection matrix
    camera_matrix, rotation_matrix, translation_vector, _, _, _, _ = cv2.decomposeProjectionMatrix(projection_matrix)
    # The translation is returned homogeneous (4x1); divide by the last component
    translation_vector = translation_vector / translation_vector[3]
    # Return the decomposed components
    return camera_matrix, rotation_matrix, translation_vector
We can observe that the rotation matrix is an identity matrix. We extract the focal length from the intrinsic matrix and compute the baseline from the translation vectors, as follows:
# Extract the focal lengths and the baseline
focal_length_x = camera_matrix_right[0, 0]
focal_length_y = camera_matrix_right[1, 1]
baseline = abs(translation_vector_left[0] - translation_vector_right[0])
Output:
Focal length (x): 721.5377
Focal length (y): 721.5377
Baseline: [ 0.53271]
Note that the baseline is 0.53271 m, matching the camera configuration on the vehicle.
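As a sanity check, the baseline can also be read directly off the raw projection matrices: in KITTI's rectified setup, entry P[0, 3] encodes -fx * tx, so the separation of the two color cameras is the difference of those offsets divided by fx. A minimal sketch using the P2/P3 values shown above:

```python
# Row 0, column 3 of P2 and P3 from the calibration file above
fx = 721.5377        # focal length in pixels (shared by both color cameras)
p2_03 = 44.85728     # P2[0, 3]
p3_03 = -339.5242    # P3[0, 3]

# P[0, 3] encodes -fx * tx, so the baseline between the two color
# cameras is the difference of the offsets divided by fx.
baseline = abs(p2_03 - p3_03) / fx
print(f"baseline = {baseline:.5f} m")  # ~0.53 m, matching the decomposition above
```

This agrees with the value obtained from decomposeProjectionMatrix to within rounding.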
4.2 Extracting the Labels
We get one label file per image pair. Below is an example label.txt; each object has 15 values.
Car 0.00 0 -1.56 564.62 174.59 616.43 224.74 1.61 1.66 3.20 -0.69 1.69 25.01 -1.59
Car 0.00 0 1.71 481.59 180.09 512.55 202.42 1.40 1.51 3.70 -7.43 1.88 47.55 1.55
Car 0.00 0 1.64 542.05 175.55 565.27 193.79 1.46 1.66 4.05 -4.71 1.71 60.52 1.56
Cyclist 0.00 0 1.89 330.60 176.09 355.61 213.60 1.72 0.50 1.95 -12.63 1.88 34.09 1.54
DontCare -1 -1 -10 753.33 164.32 798.00 186.74 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 738.50 171.32 753.27 184.42 -1 -1 -1 -1000 -1000 -1000 -10
#Values Name Description
----------------------------------------------------------------------------
1 type Describes the type of object: 'Car', 'Van', 'Truck',
'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
'Misc' or 'DontCare'
1 truncated Float from 0 (non-truncated) to 1 (truncated), where
truncated refers to the object leaving image boundaries
1 occluded Integer (0,1,2,3) indicating occlusion state:
0 = fully visible, 1 = partly occluded
2 = largely occluded, 3 = unknown
1 alpha Observation angle of object, ranging [-pi..pi]
4 bbox 2D bounding box of object in the image (0-based index):
contains left, top, right, bottom pixel coordinates
3 dimensions 3D object dimensions: height, width, length (in meters)
3 location 3D object location x,y,z in camera coordinates (in meters)
1 rotation_y Rotation ry around Y-axis in camera coordinates [-pi..pi]
1 score Only for results: Float, indicating confidence in
detection, needed for p/r curves, higher is better.
Note that the DontCare label marks regions where objects were not labeled, for example because they are too far from the laser scanner. We need to extract the bounding-box coordinates and the distance of each object from the labels, so that we can compare against the ground truth.
def ground_truth_bbox(labels_file, object_class):
    lines = labels_file.split('\n')  # Split the input string into lines
    bounding_boxes = []
    for line in lines:
        line = line.strip()  # Remove leading/trailing whitespace
        if line:
            data = line.split()
            # Extract the relevant values for each object
            obj_type = data[0]
            truncated = float(data[1])
            occluded = int(data[2])
            alpha = float(data[3])
            bbox = [float(val) for val in data[4:8]]
            dimensions = [float(val) for val in data[8:11]]
            location = [float(val) for val in data[11:14]]
            distance = float(data[13])  # z component of the location (depth in the camera frame)
            rotation_y = float(data[14])
            if obj_type in object_class:  # Check if obj_type is one of the desired classes
                # Append the bounding box and its distance to the list
                bounding_boxes.append((bbox, distance))
                # Print the 3D bounding box information
                print("Object Type:", obj_type)
                print("Truncated:", truncated)
                print("Occluded:", occluded)
                print("Alpha:", alpha)
                print("Bounding Box:", bbox)
                print("Dimensions:", dimensions)
                print("Location:", location)
                print("True Distance:", distance)
                print("Rotation Y:", rotation_y)
                print("------------------------")
    return bounding_boxes
Output:
Object Type: Car
Truncated: 0.0
Occluded: 0
Alpha: 1.71
Bounding Box: [481.59, 180.09, 512.55, 202.42]
Dimensions: [1.4, 1.51, 3.7]
Location: [-7.43, 1.88, 47.55]
True Distance: 47.55
Rotation Y: 1.55
4.3 Displaying the Left and Right Images
We first display a left/right pair from the KITTI dataset. Note that the 54 cm baseline is large compared with a compact stereo camera such as the OAK-D, so we get larger disparities and therefore a more accurate depth map.
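To see why a wider baseline helps, recall that disparity = fx * baseline / depth. A small sketch comparing KITTI's ~0.54 m baseline with a short-baseline device (7.5 cm is assumed here for an OAK-D-class camera) at the same focal length and depth:

```python
def disparity_px(fx, baseline_m, depth_m):
    # Ideal rectified pinhole stereo: disparity (pixels) = fx * baseline / depth
    return fx * baseline_m / depth_m

fx = 721.5377  # KITTI focal length in pixels
for b in (0.54, 0.075):  # KITTI vs. an assumed 7.5 cm short baseline
    print(f"baseline {b:.3f} m -> {disparity_px(fx, b, 20.0):.1f} px at 20 m")
```

A larger disparity means sub-pixel matching errors translate into smaller depth errors at the same range.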
4.4 Computing the Disparity Map
Next, we compute the disparity using OpenCV's block matching (StereoBM_create) and semi-global block matching (StereoSGBM_create) functions. Three parameters need tuning: num_disparities, block_size, and window_size.
def compute_disparity(left_img, right_img, num_disparities=6 * 16, block_size=11, window_size=6, matcher="stereo_sgbm", show_disparity=True):
    """
    Compute the disparity map for a given stereo image pair.
    Args:
        left_img (numpy.ndarray): Left stereo image.
        right_img (numpy.ndarray): Right stereo image.
        num_disparities (int): Maximum disparity minus minimum disparity.
        block_size (int): Size of the matching block window; must be odd.
        window_size (int): Size of the disparity smoothness window.
        matcher (str): Matching algorithm to use ("stereo_bm" or "stereo_sgbm").
        show_disparity (bool): Whether to display the disparity map with matplotlib.
    Returns:
        numpy.ndarray: The computed disparity map.
    """
    if matcher == "stereo_bm":
        # Create a stereo BM matcher
        matcher = cv2.StereoBM_create(numDisparities=num_disparities, blockSize=block_size)
    elif matcher == "stereo_sgbm":
        # Create a stereo SGBM matcher
        matcher = cv2.StereoSGBM_create(
            minDisparity=0, numDisparities=num_disparities, blockSize=block_size,
            P1=8 * 3 * window_size ** 2, P2=32 * 3 * window_size ** 2,
            mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY
        )
    # Convert the images to grayscale
    left_gray = cv2.cvtColor(left_img, cv2.COLOR_BGR2GRAY)
    right_gray = cv2.cvtColor(right_img, cv2.COLOR_BGR2GRAY)
    # Compute the disparity map (OpenCV returns fixed-point values scaled by 16)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16
    if show_disparity:
        # Display the disparity map with matplotlib
        plt.imshow(disparity, cmap="CMRmap_r")  # CMRmap_r or cividis
        plt.title('Disparity map with SGBM', fontsize=12)
        plt.show()
    return disparity
Observe that the block-matching disparity map is noisier than the SGBM one, so from here on we use the SGBM algorithm.
Also note the band of invalid values on the left edge of the image: there, the left image has no corresponding content in the right image to match against. The width of this band is governed mainly by num_disparities, and it also varies with block_size.
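A common, simple workaround (not used in the code above) is to crop the first num_disparities columns before using the map, since valid matches can only start there. A sketch with a fake disparity map standing in for the matcher output:

```python
import numpy as np

num_disparities = 90
# A fake disparity map standing in for matcher.compute() output:
# the first num_disparities columns are invalid (-1), the rest are valid.
disparity = np.full((370, 1224), -1.0, dtype=np.float32)
disparity[:, num_disparities:] = 30.0

# Drop the invalid left band before any depth computation
valid = disparity[:, num_disparities:]
print(valid.shape, valid.min())  # no -1 values remain in the cropped map
```

Whether to crop or to mask depends on whether downstream code assumes the depth map and the RGB frame share pixel coordinates.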
4.5 Computing the Depth Map
Next, we turn the disparity map into a depth map using the formula depth = (focal_length × baseline) / disparity:
def calculate_depth_map(disparity, baseline, focal_length, show_depth_map=True):
    """
    Calculate the depth map from a given disparity map, baseline, and focal length.
    Args:
        disparity (numpy.ndarray): Disparity map.
        baseline (float): Baseline between the cameras (in meters).
        focal_length (float): Focal length of the camera (in pixels).
    Returns:
        numpy.ndarray: Depth map.
    """
    # Replace all 0 and -1 disparities with a small minimum value (to avoid division by zero or negatives)
    disparity[disparity == 0] = 0.1
    disparity[disparity == -1] = 0.1
    # Initialize the depth map to match the size of the disparity map
    depth_map = np.ones(disparity.shape, np.single)
    # Compute the depths
    depth_map[:] = focal_length * baseline / disparity[:]
    if show_depth_map:
        # Display the depth map with matplotlib
        plt.imshow(depth_map, cmap="cividis")  # CMRmap_r or cividis
        plt.show()
    return depth_map
The output is a depth map giving the depth at every pixel.
The depth map is a per-pixel depth estimate. However, we are interested in the depth (distance) of specific obstacles such as cars or pedestrians. We therefore use an object detection algorithm to segment the depth map and read off the distance of those objects only.
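To make the formula concrete, here is a tiny worked example with the calibration values from above (fx = 721.5377 px, baseline = 0.53271 m): a pixel with a measured disparity of 10 px sits roughly 38 m away.

```python
fx = 721.5377       # focal length in pixels
baseline = 0.53271  # baseline in meters

def depth_m(disparity_px):
    # depth = fx * baseline / disparity
    return fx * baseline / disparity_px

for d in (40.0, 20.0, 10.0):
    print(f"disparity {d:>4.0f} px -> depth {depth_m(d):6.2f} m")  # ~9.6, ~19.2, ~38.4 m
```

Note how halving the disparity doubles the estimated depth, which is why far-away objects have the largest depth error.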
4.6 Object Detection
The next step is to detect the objects in the frame with YOLOv8. We need to retrieve the bounding-box coordinates.
def get_bounding_box_center_frame(frame, model, names, object_class, show_output=True):
    bbox_coordinates = []
    frame_copy = frame.copy()
    # Run object detection on the input frame with the given model
    results = model(frame)
    # Iterate over the detection results
    for result in results:
        # Iterate over each bounding box detected in the result
        for r in result.boxes.data.tolist():
            # Extract the coordinates, score, and class ID from the bounding box
            x1, y1, x2, y2, score, class_id = r
            x1 = int(x1)
            x2 = int(x2)
            y1 = int(y1)
            y2 = int(y2)
            # Look up the class name from the class ID
            class_name = names.get(class_id)
            # Keep the box if the class is requested and the score is above the threshold
            if class_name in object_class and score > 0.5:
                bbox_coordinates.append([x1, y1, x2, y2])
                # Draw the bounding box on the frame
                cv2.rectangle(frame_copy, (x1, y1), (x2, y2), (0, 255, 0), 2)
    if show_output:
        # Show the output frame with bounding boxes (cv2.imshow expects BGR, so show the frame as-is)
        cv2.imshow("Output", frame_copy)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    # Return the list of bounding-box coordinates
    return bbox_coordinates
We chose to detect only cars, people, and bicycles.
4.7 Computing Distance with Object Detection
Since we now have the bounding-box coordinates for each frame as well as the depth map, we can combine the two to extract the depth of each obstacle.
We obtain an obstacle's distance by indexing the depth map at the center of its bounding box.
def calculate_distance(bbox_coordinates, frame, depth_map, disparity_map, show_output=True):
    frame_copy = frame.copy()
    # Normalize the disparity map to [0, 255]
    disparity_map_normalized = cv2.normalize(disparity_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
    # Apply a colorful colormap to the disparity map
    colormap = cv2.COLORMAP_JET
    disparity_map_colored = cv2.applyColorMap(disparity_map_normalized, colormap)
    # Normalize the depth map to [0, 255]
    depth_map_normalized = cv2.normalize(depth_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
    # Apply a colormap to the depth map
    colormap = cv2.COLORMAP_BONE
    depth_map_colored = cv2.applyColorMap(depth_map_normalized, colormap)
    for bbox_coor in bbox_coordinates:
        x1, y1, x2, y2 = bbox_coor
        center_x = (x1 + x2) // 2
        center_y = (y1 + y2) // 2
        # Read the depth at the center of the bounding box
        distance = depth_map[center_y][center_x]
        print("Calculated distance:", distance)
        # Convert the distance to a string
        distance_str = f"{distance:.2f} m"
        # Draw the bounding box on the frame, the disparity map, and the depth map
        cv2.rectangle(frame_copy, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.rectangle(disparity_map_colored, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.rectangle(depth_map_colored, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Calculate the text size
        text_size, _ = cv2.getTextSize(distance_str, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 2)
        # Calculate the position for placing the text
        text_x = center_x - text_size[0] // 2
        text_y = y1 - 10  # Place the text slightly above the bounding box
        # Calculate the background rectangle coordinates
        rect_x1 = text_x - 5
        rect_y1 = text_y - text_size[1] - 5
        rect_x2 = text_x + text_size[0] + 5
        rect_y2 = text_y + 5
        # Draw a white rectangle behind the text and put the distance label on the frame
        cv2.rectangle(frame_copy, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(frame_copy, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
        # Same label on the disparity map
        cv2.rectangle(disparity_map_colored, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(disparity_map_colored, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
        # Same label on the depth map
        cv2.rectangle(depth_map_colored, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(depth_map_colored, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
    if show_output:
        # Show the output frame, disparity map, and depth map with bounding boxes
        cv2.imshow("Output disparity map", disparity_map_colored)
        cv2.imshow("Output frame", frame_copy)
        cv2.imshow("Output depth map", depth_map_colored)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    return disparity_map_colored, frame_copy, depth_map_colored
We draw the bounding boxes, with each object's distance from the camera displayed in meters above the box.
Full code:
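Indexing a single center pixel makes the estimate sensitive to disparity noise (and to the center landing on the background, e.g. seen through a car window). A hedged alternative, not used in the code above, is to take the median depth over a small window around the box center:

```python
import numpy as np

def robust_bbox_distance(depth_map, bbox, half_win=2):
    """Median depth in a (2*half_win+1)^2 window around the bbox center."""
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    h, w = depth_map.shape
    # Clamp the window to the image borders
    patch = depth_map[max(cy - half_win, 0):min(cy + half_win + 1, h),
                      max(cx - half_win, 0):min(cx + half_win + 1, w)]
    return float(np.median(patch))

# Toy example: a 10x10 depth map at 25 m with one spurious pixel at the center
depth = np.full((10, 10), 25.0, dtype=np.float32)
depth[5, 5] = 300.0  # outlier, e.g. a match through a window
print(robust_bbox_distance(depth, (2, 2, 8, 8)))  # 25.0, not 300.0
```

The median ignores isolated outliers, while a plain mean would be dragged toward them.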
import numpy as np
import cv2
import matplotlib.pyplot as plt
from ultralytics import YOLO
import os


def display_image_pair(index, left_image_folder, right_image_folder, show_picture=True):
    """
    Load and display a pair of left and right images from the specified folders.
    Args:
        index (int): Index of the image pair to display.
        left_image_folder (str): Path to the folder containing the left images.
        right_image_folder (str): Path to the folder containing the right images.
        show_picture (bool): Whether to display the image pair using matplotlib.
    """
    # Load the left image
    left_image_path = os.path.join(left_image_folder, os.listdir(left_image_folder)[index])
    left_image = cv2.cvtColor(cv2.imread(left_image_path), cv2.COLOR_BGR2RGB)
    # Load the right image
    right_image_path = os.path.join(right_image_folder, os.listdir(right_image_folder)[index])
    right_image = cv2.cvtColor(cv2.imread(right_image_path), cv2.COLOR_BGR2RGB)
    if show_picture:
        # Create subplots and display the image pair
        f, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
        ax1.imshow(left_image)
        ax1.set_title('Image Left', fontsize=15)
        ax2.imshow(right_image)
        ax2.set_title('Image Right', fontsize=15)
        # Show the plot
        plt.show()
    # Return the left and right images
    return left_image, right_image


#--- Function to compute and display disparity
def compute_disparity(left_img, right_img, num_disparities=6 * 16, block_size=11, window_size=6, matcher="stereo_sgbm", show_disparity=True):
    """
    Compute the disparity map for a given stereo image pair.
    Args:
        left_img (numpy.ndarray): Left image of the stereo pair.
        right_img (numpy.ndarray): Right image of the stereo pair.
        num_disparities (int): Maximum disparity minus minimum disparity.
        block_size (int): Size of the block window. It must be an odd number.
        window_size (int): Size of the disparity smoothness window.
        matcher (str): Matcher algorithm to use ("stereo_bm" or "stereo_sgbm").
        show_disparity (bool): Whether to display the disparity map using matplotlib.
    Returns:
        numpy.ndarray: The computed disparity map.
    """
    if matcher == "stereo_bm":
        # Create a Stereo BM matcher
        matcher = cv2.StereoBM_create(numDisparities=num_disparities, blockSize=block_size)
    elif matcher == "stereo_sgbm":
        # Create a Stereo SGBM matcher
        matcher = cv2.StereoSGBM_create(
            minDisparity=0, numDisparities=num_disparities, blockSize=block_size,
            P1=8 * 3 * window_size ** 2, P2=32 * 3 * window_size ** 2,
            mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY
        )
    # Convert the images to grayscale
    left_gray = cv2.cvtColor(left_img, cv2.COLOR_BGR2GRAY)
    right_gray = cv2.cvtColor(right_img, cv2.COLOR_BGR2GRAY)
    # Compute the disparity map (OpenCV returns fixed-point values scaled by 16)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16
    if show_disparity:
        # Display the disparity map using matplotlib
        plt.imshow(disparity, cmap="CMRmap_r")  # CMRmap_r or cividis
        plt.title('Disparity map with SGBM', fontsize=12)
        plt.show()
    return disparity
def display_text_file(index, folder_path):
    """
    Display the contents of a text file based on the specified index.
    Args:
        index (int): Index of the text file to display.
        folder_path (str): Path to the folder containing the text files.
    Returns:
        str: Contents of the text file.
    """
    # Get the list of text files in the folder
    txt_files = [f for f in os.listdir(folder_path) if f.endswith('.txt')]
    if index >= 0 and index < len(txt_files):
        # Get the file path of the text file based on the index
        file_path = os.path.join(folder_path, txt_files[index])
        # Open the file and read its contents
        with open(file_path, 'r') as file:
            contents = file.read()
        # Display the contents of the text file
        print(f"Text file at index {index}:\n{contents}")
        # Return the contents of the text file
        return contents
    else:
        print("Invalid index.")
        return None


#--- Function to retrieve calibration parameters
def get_calibration_parameters(file_contents):
    """
    Retrieve calibration parameters from the contents of a calibration file.
    Args:
        file_contents (str): Contents of the calibration file.
    Returns:
        tuple: Tuple containing the calibration parameters:
            - p_left (numpy.ndarray): Projection matrix for the left camera.
            - p_right (numpy.ndarray): Projection matrix for the right camera.
            - p_ro_rect (numpy.ndarray): Rectification matrix.
            - p_velo_to_cam (numpy.ndarray): Transformation matrix from velodyne to camera coordinates.
            - p_imu_to_velo (numpy.ndarray): Transformation matrix from IMU to velodyne coordinates.
    """
    fin = file_contents.split('\n')
    for line in fin:
        if line[:2] == 'P2':
            p_left = np.array(line[4:].strip().split(" ")).astype('float32').reshape(3, -1)
        elif line[:2] == 'P3':
            p_right = np.array(line[4:].strip().split(" ")).astype('float32').reshape(3, -1)
        elif line[:7] == 'R0_rect':
            p_ro_rect = np.array(line[9:].strip().split(" ")).astype('float32').reshape(3, -1)
        elif line[:14] == 'Tr_velo_to_cam':
            p_velo_to_cam = np.array(line[16:].strip().split(" ")).astype('float32').reshape(3, -1)
        elif line[:14] == 'Tr_imu_to_velo':
            p_imu_to_velo = np.array(line[16:].strip().split(" ")).astype('float32').reshape(3, -1)
    return p_left, p_right, p_ro_rect, p_velo_to_cam, p_imu_to_velo
def decompose_projection_matrix(projection_matrix):
    """
    Decompose a projection matrix into camera matrix, rotation matrix, and translation vector.
    Args:
        projection_matrix (numpy.ndarray): 3x4 projection matrix.
    Returns:
        tuple: Tuple containing the decomposed components:
            - camera_matrix (numpy.ndarray): Camera matrix [ fx 0 cx; 0 fy cy; 0 0 1 ].
            - rotation_matrix (numpy.ndarray): Rotation matrix.
            - translation_vector (numpy.ndarray): Translation vector.
    """
    # Decompose the projection matrix
    camera_matrix, rotation_matrix, translation_vector, _, _, _, _ = cv2.decomposeProjectionMatrix(projection_matrix)
    # The translation is returned homogeneous (4x1); divide by the last component
    translation_vector = translation_vector / translation_vector[3]
    # Return the decomposed components
    return camera_matrix, rotation_matrix, translation_vector


def calculate_depth_map(disparity, baseline, focal_length, show_depth_map=True):
    """
    Calculates the depth map from a given disparity map, baseline, and focal length.
    Args:
        disparity (numpy.ndarray): Disparity map.
        baseline (float): Baseline between the cameras.
        focal_length (float): Focal length of the camera.
    Returns:
        numpy.ndarray: Depth map.
    """
    # Replace all instances of 0 and -1 disparity with a small minimum value (to avoid div by 0 or negatives)
    disparity[disparity == 0] = 0.1
    disparity[disparity == -1] = 0.1
    # Initialize the depth map to match the size of the disparity map
    depth_map = np.ones(disparity.shape, np.single)
    # Calculate the depths
    depth_map[:] = focal_length * baseline / disparity[:]
    if show_depth_map:
        # Display the depth map using matplotlib
        plt.imshow(depth_map, cmap="cividis")  # CMRmap_r or cividis
        plt.title('Depth map', fontsize=12)
        plt.show()
    return depth_map
def save_disparity_maps(left_image_folder, right_image_folder, output_folder):
    # Get the list of image files in the left image folder
    left_image_files = os.listdir(left_image_folder)
    # Iterate over the image files
    for image_file in left_image_files:
        # Construct the paths for the left and right images
        left_image_path = os.path.join(left_image_folder, image_file)
        right_image_path = os.path.join(right_image_folder, image_file)
        # Read the left and right images
        left_image = cv2.imread(left_image_path)
        right_image = cv2.imread(right_image_path)
        # Calculate the disparity map
        disparity_map = compute_disparity(left_image, right_image, num_disparities=90, block_size=5, window_size=5,
                                          matcher="stereo_sgbm", show_disparity=False)
        # Normalize the disparity map to [0, 255]
        disparity_map_normalized = cv2.normalize(disparity_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
        # Apply a colorful colormap to the disparity map
        colormap = cv2.COLORMAP_JET
        disparity_map_colored = cv2.applyColorMap(disparity_map_normalized, colormap)
        # Construct the output file path
        output_file = os.path.join(output_folder, image_file)
        # Save the disparity map as an image
        cv2.imwrite(output_file, disparity_map_colored)
        print(f"Disparity map saved: {output_file}")


def save_depth_maps(left_image_folder, right_image_folder, output_folder, baseline, focal_length):
    # Get the list of image files in the left image folder
    left_image_files = os.listdir(left_image_folder)
    # Iterate over the image files
    for image_file in left_image_files:
        # Construct the paths for the left and right images
        left_image_path = os.path.join(left_image_folder, image_file)
        right_image_path = os.path.join(right_image_folder, image_file)
        # Read the left and right images
        left_image = cv2.imread(left_image_path)
        right_image = cv2.imread(right_image_path)
        # Calculate the disparity map
        disparity_map = compute_disparity(left_image, right_image, num_disparities=90, block_size=5, window_size=5,
                                          matcher="stereo_sgbm", show_disparity=False)
        # Calculate the depth map
        depth_map = calculate_depth_map(disparity_map, baseline, focal_length, show_depth_map=False)
        # # Normalize the depth map to [0, 255]
        # depth_map_normalized = cv2.normalize(depth_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
        # # Apply a colorful colormap to the depth map
        # colormap = cv2.COLORMAP_BONE
        # depth_map_colored = cv2.applyColorMap(depth_map_normalized, colormap)
        # Construct the output file path
        output_file = os.path.join(output_folder, image_file)
        # Save the depth map as an image
        cv2.imwrite(output_file, depth_map)
        print(f"Depth map saved: {output_file}")
def get_bounding_box_center_frame(frame, model, names, object_class, show_output=True):
    bbox_coordinates = []
    frame_copy = frame.copy()
    # Perform object detection on the input frame using the specified model
    results = model(frame)
    # Iterate over the results of object detection
    for result in results:
        # Iterate over each bounding box detected in the result
        for r in result.boxes.data.tolist():
            # Extract the coordinates, score, and class ID from the bounding box
            x1, y1, x2, y2, score, class_id = r
            x1 = int(x1)
            x2 = int(x2)
            y1 = int(y1)
            y2 = int(y2)
            # Get the class name based on the class ID
            class_name = names.get(class_id)
            # Check if the class name matches the specified object_class and the detection score is above a threshold
            if class_name in object_class and score > 0.5:
                bbox_coordinates.append([x1, y1, x2, y2])
                # Draw bounding box on the frame
                cv2.rectangle(frame_copy, (x1, y1), (x2, y2), (0, 255, 0), 2)
    if show_output:
        # Show the output frame with bounding boxes (cv2.imshow expects BGR, so show the frame as-is)
        cv2.imshow("Output", frame_copy)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    # Return the list of bounding-box coordinates
    return bbox_coordinates
def calculate_distance(bbox_coordinates, frame, depth_map, disparity_map, show_output=True):
    frame_copy = frame.copy()
    # Normalize the disparity map to [0, 255]
    disparity_map_normalized = cv2.normalize(disparity_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
    # Apply a colorful colormap to the disparity map
    colormap = cv2.COLORMAP_JET
    disparity_map_colored = cv2.applyColorMap(disparity_map_normalized, colormap)
    # Normalize the depth map to [0, 255]
    depth_map_normalized = cv2.normalize(depth_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
    # Apply a colormap to the depth map
    colormap = cv2.COLORMAP_BONE
    depth_map_colored = cv2.applyColorMap(depth_map_normalized, colormap)
    for bbox_coor in bbox_coordinates:
        x1, y1, x2, y2 = bbox_coor
        center_x = (x1 + x2) // 2
        center_y = (y1 + y2) // 2
        # Read the depth at the center of the bounding box
        distance = depth_map[center_y][center_x]
        print("Calculated distance:", distance)
        # Convert distance to string
        distance_str = f"{distance:.2f} m"
        # Draw the bounding box on the frame, disparity map, and depth map
        cv2.rectangle(frame_copy, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.rectangle(disparity_map_colored, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.rectangle(depth_map_colored, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Calculate the text size
        text_size, _ = cv2.getTextSize(distance_str, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 2)
        # Calculate the position for placing the text
        text_x = center_x - text_size[0] // 2
        text_y = y1 - 10  # Place the text slightly above the bounding box
        # Calculate the background rectangle coordinates
        rect_x1 = text_x - 5
        rect_y1 = text_y - text_size[1] - 5
        rect_x2 = text_x + text_size[0] + 5
        rect_y2 = text_y + 5
        # Draw a white rectangle behind the text and put the distance label on the frame
        cv2.rectangle(frame_copy, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(frame_copy, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
        # Same label on the disparity map
        cv2.rectangle(disparity_map_colored, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(disparity_map_colored, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
        # Same label on the depth map
        cv2.rectangle(depth_map_colored, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        cv2.putText(depth_map_colored, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 2, cv2.LINE_AA)
    if show_output:
        # Show the output frame, disparity map, and depth map with bounding boxes
        cv2.imshow("Output disparity map", disparity_map_colored)
        cv2.imshow("Output frame", frame_copy)
        cv2.imshow("Output depth map", depth_map_colored)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    return disparity_map_colored, frame_copy, depth_map_colored
def ground_truth_bbox(labels_file, object_class):
    lines = labels_file.split('\n')  # Split the input string into lines
    bounding_boxes = []
    for line in lines:
        line = line.strip()  # Remove leading/trailing whitespace
        if line:
            data = line.split()
            # Extract relevant values for each object
            obj_type = data[0]
            #print(obj_type)
            truncated = float(data[1])
            occluded = int(data[2])
            alpha = float(data[3])
            bbox = [float(val) for val in data[4:8]]
            dimensions = [float(val) for val in data[8:11]]
            location = [float(val) for val in data[11:14]]
            distance = float(data[13])
            rotation_y = float(data[14])
            if obj_type in object_class:  # Check if obj_type is in the desired classes
                # Append bounding box dimensions and distance to the list
                bounding_boxes.append((bbox, distance))
                # Print the 3D bounding box information
                print("Object Type:", obj_type)
                print("Truncated:", truncated)
                print("Occluded:", occluded)
                print("Alpha:", alpha)
                print("Bounding Box:", bbox)
                print("Dimensions:", dimensions)
                print("Location:", location)
                print("True Distance:", distance)
                print("Rotation Y:", rotation_y)
                print("------------------------")
    return bounding_boxes
def display_ground_truth(frame, bounding_boxes, show_output=True):
    frame_copy = frame.copy()
    for bbox, distance in bounding_boxes:
        # Extract bounding box coordinates
        x1, y1, x2, y2 = map(int, bbox)
        center_x = (x1 + x2) // 2
        # Convert distance to string
        distance_str = f"{distance:.2f} m"
        # Draw bounding box on the frame
        cv2.rectangle(frame_copy, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Calculate the text size
        text_size, _ = cv2.getTextSize(distance_str, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 2)
        # Calculate the position for placing the text
        text_x = center_x - text_size[0] // 2
        text_y = y1 - 10  # Place the text slightly above the bounding box
        # Calculate the rectangle coordinates
        rect_x1 = text_x - 5
        rect_y1 = text_y - text_size[1] - 5
        rect_x2 = text_x + text_size[0] + 5
        rect_y2 = text_y + 5
        # Draw white rectangle behind the text
        cv2.rectangle(frame_copy, (rect_x1, rect_y1), (rect_x2, rect_y2), (255, 255, 255), cv2.FILLED)
        # Put the label above the bounding box
        cv2.putText(frame_copy, distance_str, (text_x, text_y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2, cv2.LINE_AA)
    if show_output:
        # Display the image with bounding boxes (cv2.imshow expects BGR, so show the frame as-is)
        cv2.imshow("Bounding Boxes", frame_copy)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
def pipeline(left_image, right_image, object_class):
"""
Performs a pipeline of operations on stereo images to obtain a colored disparity map, RGB frame, and colored depth map.
Input:
- left_image: Left stereo image (RGB format)
- right_image: Right stereo image (RGB format)
- object_class: List of object classes of interest for bounding box retrieval
Output:
- disparity_map_colored: Colored disparity map (RGB format)
- frame_rgb: RGB frame
- depth_map_colored: Colored depth map (RGB format)
"""
global focal_length
# Calculate the disparity map
disparity_map = compute_disparity(left_image, right_image, num_disparities=90, block_size=5, window_size=5,
matcher="stereo_sgbm", show_disparity=False)
# Calculate the depth map
depth_map = calculate_depth_map(disparity_map, baseline, focal_length, show_depth_map=False)
# Get bounding box coordinates for specified object classes
bbox_coordinates = get_bounding_box_center_frame(left_image, model, names, object_class, show_output=False)
# Calculate colored disparity map, RGB frame, and colored depth map
disparity_map_colored, frame_rgb, depth_map_colored = calculate_distance(bbox_coordinates, left_image, depth_map, disparity_map, show_output=False)
return disparity_map_colored, frame_rgb, depth_map_colored
def process_pipeline_images(left_image_folder, right_image_folder, output_folder_distance, object_class=['car', 'bicycle']):
"""
Runs the detection-and-distance pipeline on every stereo pair in the input folders and saves the annotated disparity maps.
"""
# Get the sorted list of image files so frames are processed in sequence order
left_image_files = sorted(os.listdir(left_image_folder))
# Iterate over the image files
for image_file in left_image_files:
# Construct the paths for the left and right images
left_image_path = os.path.join(left_image_folder, image_file)
right_image_path = os.path.join(right_image_folder, image_file)
# Read the left and right images
left_image = cv2.imread(left_image_path)
right_image = cv2.imread(right_image_path)
disparity_map_colored, frame_rgb, depth_map_colored = pipeline(left_image, right_image, object_class)
# Construct the output file path
output_file = os.path.join(output_folder_distance, image_file)
# Save the disparity map as an image
cv2.imwrite(output_file, disparity_map_colored)
print(f"Disparity map saved: {output_file}")
def frames_to_video(frame_folder, output_folder, output_filename):
"""
Converts a sequence of frames in a folder into an MP4 video and saves it in the specified output folder.
Args:
frame_folder (str): Path to the folder containing the frames.
output_folder (str): Path to the output folder where the video will be saved.
output_filename (str): Name of the output video file (including the .mp4 extension).
"""
# Get the list of frame filenames in the folder
frame_filenames = sorted(os.listdir(frame_folder))
# Read the first frame to get frame size information
first_frame_path = os.path.join(frame_folder, frame_filenames[0])
first_frame = cv2.imread(first_frame_path)
height, width, channels = first_frame.shape
# Define the video codec and create a VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
output_path = os.path.join(output_folder, output_filename)
video_writer = cv2.VideoWriter(output_path, fourcc, 15, (width, height))
# Iterate over the frame filenames and write each frame to the video
for frame_filename in frame_filenames:
frame_path = os.path.join(frame_folder, frame_filename)
frame = cv2.imread(frame_path)
video_writer.write(frame)
# Release the VideoWriter and print completion message
video_writer.release()
print("Video saved successfully at:", output_path)
def concatenate_videos_vertical(video1_path, video2_path, output_path):
"""
Stacks two videos vertically, frame by frame, and saves the result as a new video.
"""
# Read the input videos
video1 = cv2.VideoCapture(video1_path)
video2 = cv2.VideoCapture(video2_path)
# Get video properties
width = int(video1.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(video1.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = video1.get(cv2.CAP_PROP_FPS)
# Create a VideoWriter object to save the concatenated video
output_video = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height * 2))
while True:
# Read frames from both videos
ret1, frame1 = video1.read()
ret2, frame2 = video2.read()
if not ret1 or not ret2:
# Either video1 or video2 has ended
break
# Resize the frames to match the width of the output video
frame1_resized = cv2.resize(frame1, (width, height))
frame2_resized = cv2.resize(frame2, (width, height))
# Concatenate the frames vertically
concat_frame = cv2.vconcat([frame1_resized, frame2_resized])
# Write the concatenated frame to the output video
output_video.write(concat_frame)
# Release resources
video1.release()
video2.release()
output_video.release()
print("Concatenation complete. Output video saved:", output_path)
if __name__ == '__main__':
# Get the current directory
current_directory = os.getcwd()
# Go back to the parent directory
parent_directory = os.path.dirname(current_directory)
# Set input directory
left_image_folder = os.path.join(parent_directory, 'Data', 'Left', 'image_2')
right_image_folder = os.path.join(parent_directory, 'Data', 'Right', 'image_3')
labels_folder = os.path.join(parent_directory, 'Data', 'Labels', 'training')
calibration_folder = os.path.join(parent_directory, 'Data', 'Callibration', 'training', 'calib')
output_disparity_folder = os.path.join(parent_directory, 'Data', 'Output_Disparity_1')
output_depth_folder = os.path.join(parent_directory, 'Data', 'Output_Depth_1')
# Choose index of image
index = 13
# Display the image pair and get the left and right images
left_image, right_image = display_image_pair(index=index, left_image_folder=left_image_folder, right_image_folder=right_image_folder, show_picture=True)
print("\nImage shape: ", left_image.shape)
# Compute disparity map
disparity_map = compute_disparity(left_image, right_image, num_disparities=90, block_size=5, window_size=5, matcher="stereo_sgbm", show_disparity=True) #stereo_bm #stereo_sgbm
print(disparity_map)
print("\nDisparity map shape: ", disparity_map.shape)
# Display the text file contents
calibration_file = display_text_file(index, calibration_folder)
label_file = display_text_file(index, labels_folder)
# Extract calibration parameters
p_left, p_right, p_ro_rect, p_velo_to_cam, p_imu_to_velo = get_calibration_parameters(calibration_file)
print("\nLeft P Matrix")
print(p_left)
print("\nRight P Matrix")
print(p_right)
print("\nRO to Rect Matrix")
print(p_ro_rect)
print("\nVelodyne to Camera Matrix")
print(p_velo_to_cam)
print("\nIMU to Velodyne Matrix")
print(p_imu_to_velo)
# Decompose the projection matrix
camera_matrix_left, rotation_matrix_left, translation_vector_left = decompose_projection_matrix(p_left)
camera_matrix_right, rotation_matrix_right, translation_vector_right = decompose_projection_matrix(p_right)
# Print the decomposed components
print("\nCamera Matrix Left:")
print(camera_matrix_left)
print("\nRotation Matrix Left:")
print(rotation_matrix_left)
print("\nTranslation Vector Left:")
print(translation_vector_left)
print("\nCamera Matrix Right:")
print(camera_matrix_right)
print("\nRotation Matrix Right:")
print(rotation_matrix_right)
print("\nTranslation Vector Right:")
print(translation_vector_right)
# Extract the focal length and baseline
focal_length_x = camera_matrix_right[0, 0]
focal_length_y = camera_matrix_right[1, 1]
baseline = abs(translation_vector_left[0] - translation_vector_right[0])
# Print the extracted values
print("\nFocal Length (x-direction):", focal_length_x)
print("\nFocal Length (y-direction):", focal_length_y)
print("\nBaseline:", baseline, "\n")
depth_map = calculate_depth_map(disparity=disparity_map, baseline=baseline, focal_length=focal_length_x, show_depth_map=True)
print("Depth Map: ", depth_map)
print("\nDepth map shape: ", depth_map.shape)
print("\nDepth map distance: ", depth_map[200,600], " m")
#save_disparity_maps(left_image_folder, right_image_folder, output_disparity_folder)
#save_depth_maps(left_image_folder, right_image_folder, output_depth_folder, baseline, focal_length_x)
# Instantiate model
weights_path = os.path.join(parent_directory, 'Weights', 'yolov8m.pt')
model = YOLO(weights_path)
names = model.names
print(names)
# Process the frame to get bounding box centers
bbox_coordinates = get_bounding_box_center_frame(left_image, model, names, object_class=['car', 'bicycle'], show_output=True)
# Calculate distance from camera to objects
disparity_map_distance, frame_distance, depth_map_distance = calculate_distance(bbox_coordinates, left_image, depth_map, disparity_map, show_output=True)
###-------COMPARE WITH GROUND TRUTH
# Get data from labels.txt
bounding_boxes_ground_truth = ground_truth_bbox(label_file, object_class=['Car', 'Cyclist', 'Pedestrian'])
# Display ground truth image with distance
display_ground_truth(left_image, bounding_boxes_ground_truth, show_output=True)
###-------------PIPELINE ON SINGLE IMAGE
focal_length = focal_length_x
index = 7
# Load the left image
left_image_path = os.path.join(left_image_folder, os.listdir(left_image_folder)[index])
left_image = cv2.cvtColor(cv2.imread(left_image_path), cv2.COLOR_BGR2RGB)
print(left_image)
# Load the right image
right_image_path = os.path.join(right_image_folder, os.listdir(right_image_folder)[index])
right_image = cv2.cvtColor(cv2.imread(right_image_path), cv2.COLOR_BGR2RGB)
disparity_map_colored, frame_rgb, depth_map_colored = pipeline(left_image, right_image, object_class=['car', 'bicycle'])
# # Show the output frame with bounding boxes
# frame_rgb = cv2.cvtColor(frame_rgb, cv2.COLOR_BGR2RGB)
# cv2.imshow("Output disparity map", disparity_map_colored)
# cv2.imshow("Output frame", frame_rgb)
# cv2.imshow("Output depth map", depth_map_colored)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
### ------------ PIPELINE ON FOLDER ----------------------------- ###
# Set input and output directory
left_image_folder = os.path.join(parent_directory, 'Data', 'Left_4')
right_image_folder = os.path.join(parent_directory, 'Data', 'Right_4')
output_folder_distance = os.path.join(parent_directory, 'Data', 'Output_Disparity_Distance_4')
output_video_folder = os.path.join(parent_directory, 'Data', 'Output_Video')
# ##--- Process all images in folder and save disparity map
# process_pipeline_images(left_image_folder, right_image_folder, output_folder_distance, object_class=['car', 'bicycle', 'person'])
#
# ### ---- Convert frames to video
# frames_to_video(output_folder_distance, output_video_folder, 'output_disparity_4.mp4')
# frames_to_video(left_image_folder, output_video_folder, 'output_4.mp4')
### - Concat videos
# video_1 = os.path.join(parent_directory, 'Data', 'Output_Video', 'output_1.mp4')
# video_2 = os.path.join(parent_directory, 'Data', 'Output_Video', 'output_disparity_1.mp4')
# output_video_path = os.path.join(parent_directory, 'Data', 'Output_Video', 'concat_1.mp4')
# concatenate_videos_vertical(video_1, video_2, output_video_path)
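The depth maps produced by `calculate_depth_map` rest on the standard stereo relation Z = f·B / d: depth equals focal length times baseline divided by disparity. A minimal numeric sketch with illustrative values (not the actual KITTI calibration read from the P matrices above):

```python
import numpy as np

focal_length = 721.5  # focal length in pixels (illustrative)
baseline = 0.54       # stereo baseline in meters (illustrative)

disparity = np.array([[90.0, 45.0],
                      [30.0, 15.0]])  # disparities in pixels
# Mask non-positive disparities before dividing to avoid infinities
disparity = np.where(disparity > 0, disparity, 0.1)
depth = focal_length * baseline / disparity  # depth in meters
print(depth)  # depth[0, 0] ≈ 4.33 m for a 90-px disparity
```

Note the inverse relation: doubling the disparity halves the depth, so nearby objects (large disparity) get much finer depth resolution than distant ones.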