Chapter 26 | Video and Image Analysis
Introduction
Virtual CRASH 6 features powerful video analysis tools that make it easy to position objects in both space and time, helping you better understand time-distance relationships and improve your analysis. Read on to learn more about these amazing features.
Simple Video Orthorectification
A longstanding feature in the Virtual CRASH toolkit, the video rectifier tool is a versatile and user-friendly solution for various applications. In the video below, we demonstrate the workflow for orthorectifying a perspective video to use as an underlay for your analysis. This same workflow can be applied for simple images as well. Note, as of the Winter 2024 Software Update, you can now drag and drop your video directly into your scene without the need for a “dummy” image.
Calibrating and Positioning Cameras
Users can leverage scene data, such as point clouds, orthomosaic images, or even Google Maps imagery, to correlate reference points in space with those visible in a subject video or image. This allows Virtual CRASH to calculate the camera's position, orientation, and lens specifications. The process is demonstrated below in various examples (1, 2).
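For readers curious about what happens under the hood, the sketch below illustrates, with hypothetical reference points, how a set of 3D scene points and their matching image pixels can be used to solve for camera intrinsics and extrinsics using OpenCV, the library the calibration is built on (see note 1). It is not Virtual CRASH source code; the point values, initial guesses, and flags are assumptions chosen for the example.

```python
# Illustrative sketch only (not Virtual CRASH source code): solving for a
# camera's intrinsics and extrinsics from 3D reference points and the pixels
# where they appear, using OpenCV (the library cited in note 1).
import numpy as np
import cv2

# Hypothetical correspondences: scene reference points and matching pixels.
object_points = np.array([
    [0, 0, 0], [10, 0, 0], [20, 0, 0], [20, 15, 0],
    [10, 15, 0], [0, 15, 0], [5, 7, 3], [15, 7, 3]], dtype=np.float32)
image_points = np.array([
    [310, 720], [640, 715], [965, 712], [1030, 540],
    [655, 545], [285, 548], [470, 600], [830, 598]], dtype=np.float32)

image_size = (1920, 1080)                       # width, height in pixels
K_guess = np.array([[1200.0, 0.0, 960.0],       # rough focal length and center
                    [0.0, 1200.0, 540.0],
                    [0.0, 0.0, 1.0]])
dist_guess = np.zeros(5)                        # k1, k2, p1, p2, k3 start at 0

# Hold k2, k3, p1, and p2 at 0, mirroring the conservative recommendation
# given later in this chapter.
flags = (cv2.CALIB_USE_INTRINSIC_GUESS | cv2.CALIB_ZERO_TANGENT_DIST |
         cv2.CALIB_FIX_K2 | cv2.CALIB_FIX_K3)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    [object_points], [image_points], image_size, K_guess, dist_guess, flags=flags)

R, _ = cv2.Rodrigues(rvecs[0])                  # camera orientation (rotation matrix)
camera_position = (-R.T @ tvecs[0]).ravel()     # camera position in scene coordinates
print("reprojection RMS (px):", rms)
print("camera position:", camera_position)
```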
Simple practice
It is best to practice with a simple test case which you can easily generate within Virtual CRASH (download here). Here we start with a simple scene with various points placed on the xy plane using the point array tool.
Below is the output generated by our subject camera. We will import this into our scene as though it were captured by a real-world security camera. Our goal is to test whether we can accurately match the new test camera's position, orientation, and calibration parameters to those of the original subject camera.
Next, we create our test camera. Its color has been changed to red in the "misc" menu to make it easier to distinguish from the subject camera.
Load “background” image
Open the camera’s “background” menu and left-click “file” to load the image. Note that the workflow remains the same whether you’re loading a single image, such as an exported video frame, or an entire video.
Once an image or video is loaded into the camera object’s background, new input fields will appear in the camera’s “background” and “calibration” menus.
The background > “distance” setting controls the distance from the camera at which the image plane is projected into the scene. The background > “opacity” setting adjusts the opacity of the image plane as it is projected into the scene. Lowering the opacity can help you compare the image or video projection with evidence in the scene or other reference points more easily.
Objects with “receive projection” enabled (such as the example box in the scene shown below) will display the projection of the camera’s background image or video. You can disable this projection process by toggling off “use projection” in the camera’s background menu. The “bring to front” option ensures that the camera’s background image has projection priority over other elements such as CAD objects, orthomosaics, and albedo channels.
Match reference points
To solve for the test camera’s radial distortion coefficients (k1, k2, k3) and the tangential distortion coefficients (p1, p2), we’ll left-click on “add point” in the calibration menu.
When using “add point”, you’ll want to break out of your test camera view and switch to the perspective interactive camera (NumPad[5]). Note: after left-clicking “add point”, your mouse cursor is locked to this function, so you cannot hold+left-click to drag the interactive camera across the scene. To navigate instead, hover your cursor over the side of the scene toward which you want to move, zoom out, then reposition the cursor over the area of interest and zoom in.
Next, left-click on the reference point in your scene.
Virtual CRASH will then switch to a view mode for the camera’s background image or video. Next, left-click the pixel within the image or video that corresponds to the point in space you clicked on in the previous step. In this example, we select the pixel matching point 1.
To assist with this process, you can use the magnifier tool by scrolling with your mouse’s scroll wheel. The magnifier’s lens properties can be adjusted by using:
Scroll wheel: changes magnification
Ctrl + scroll wheel: changes size of magnifier
Shift + scroll wheel: changes magnification across full magnifier circle
After left-clicking, “add point” mode remains active, and so the next point can be selected.
If you select the wrong point or have finished the point selection process, you can exit “add point” mode by right-clicking.
After right-clicking to exit “add point” mode, you can make adjustments by switching to “Select, Move and Manipulate” [F3] cursor control. Hover over the point that needs adjustment. To delete the point, click the “x” icon. To adjust the point position in 3D space, hold+left-click and drag the point marker itself. To adjust the point’s pixel location in the background image or video, hold+left-click and drag the double translation arrows.
Continue holding the left-mouse button down as you adjust the pixel location for the point, then release.
To use the classic translation control grips to modify a point’s location in 3D space, switch to “Vertices” selection type [Shift+V] and press [0] at the top of the keyboard. Remember to switch back to “Object” [Shift+O] mode afterward.
Once you are satisfied with your point and pixel selections, left-click on “calibrate”.
The calibration process will complete quickly, and the camera’s position, orientation, fov, k1, k2, k3, p1, and p2 values will automatically update. In the image below, we see that the test and subject camera positions align extremely closely. For this solution, the distance between the cameras is 4.5 inches, with yaw and roll differences of 0.1 degrees and a pitch difference of 0.3 degrees. The terrain projected image is superimposed on the test scene with excellent agreement.
To test the camera placement in a 3D view (rather than projected on the terrain), switch to the camera’s view and use the opacity option in the background menu, or toggle the camera’s visibility.
Often, finding the right calibration for the camera is an iterative process. New points can be added, and camera calibration constants can be manually adjusted as needed or fixed (held constant) while other parameters are solved.
When performing calibration, it is recommended to use distortion coefficients conservatively. If possible, leave k2, k3, p1, and p2 at their default values of 0, using the “fix” option to hold them constant while other parameters are solved; disable “fix” for these parameters only if needed. In particular, p1 and p2 should remain at 0 unless absolutely necessary.
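For reference, k1, k2, and k3 are the coefficients of the radial terms and p1 and p2 are the coefficients of the tangential (decentering) terms in the standard distortion model used by OpenCV (see notes 1 and 2). The snippet below is a minimal sketch of that model applied to normalized image coordinates; it is illustrative only, not Virtual CRASH code.

```python
# Minimal sketch of the standard radial/tangential distortion model that
# k1, k2, k3, p1, p2 parameterize (the OpenCV model; see notes 1 and 2).
# (x, y) are normalized image coordinates; this is illustrative only.
def distort(x, y, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d

# With every coefficient left at its default of 0, points pass through unchanged:
print(distort(0.25, -0.10))             # -> (0.25, -0.1)
# A nonzero k1 rescales points radially relative to the image center:
print(distort(0.25, -0.10, k1=-0.2))
```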
The simple practice illustrated above is shown in this video:
Single photograph of subject scene with Scene Data
Returning to the example first illustrated for the image/video rectifier tool (learn more), we now explore a use case involving a perspective photograph with reference points and witness marks. Revisiting the crash scene a year later, we use our survey data—containing still-visible reference points—to calibrate the camera to the single image and determine its position and orientation within the scene at the time the photograph was taken. This process also accurately places the witness marks at their correct positions within the scene for analysis.
The advantage of the workflow demonstrated below, compared to the image/video rectifier tool, is that the terrain does not need to be flat. Additionally, any points, even those above the terrain surface, can be used as reference points for camera calibration.
Here we have a photograph taken after crash test #3 at the 2018 IPTM Symposium on Traffic Safety.
In the photograph above we see a mark created by the passenger side rear tire (the mark was made by a combination of the tire and a chain that was used to lock the rear wheels during the crash test). We’ll estimate the position of this tire mark in space using the camera calibration workflow described above. Below is a video of the crash.
Let’s suppose we conduct a scene inspection a couple of years after the subject accident. We expect the witness marks no longer exist, but other features in the environment may remain, such as the road striping. In our test case, a year after the 2018 crash test, we returned to the scene for the 2019 IPTM Symposium, where other tests were done in the same parking lot. This gave us an opportunity to create a new point cloud and orthomosaic of the scene without the evidence from 2018 test #3.
During the 2019 Symposium, a DJI Mavic 2 Pro was flown over the scene and Pix4D was used to process the drone photographs, resulting in an orthomosaic, point cloud, and tfw file. Using the Smart Alignment Tool, we can load this data and ground control points into Virtual CRASH. The 2019 scene data is imported into the scene automatically aligned, and from this we can obtain our reference measurements. The faint witness mark from the prior year is still visible.
Using the same workflow shown above, the “add point” button was used to match the ends and corners of the parking lot stripes between the 3D environment and those visible in the photograph. The “calibrate” function was then used to solve for the camera settings and to position and orient the camera.
Below, we illustrate the effect of varying the distance of the image plane through the scene. Notice how the striping pattern in the 2018 photograph aligns exceptionally well with the 2019 survey data.
Switching to the camera’s view, we use the “show” toggle to demonstrate the alignment of the witness mark from the perspective view.
Google Maps with Background from Google Street View
Using the workflow shown above, one can superimpose Google Maps imagery (learn more) with Google Street View images screen-captured from a web browser. Note, the inpaint and clone tools can even be used to remove vehicles from camera background images, such as the one shown below from Google Street View.
Below we show a simulated vehicle collision within this environment.
Photogrammetry/Scanner Data with Background from Google Street View
This same process, of course, can be used to align Google Street View imagery (or any perspective photograph) with survey data, including photogrammetry and/or scanner data, as long as reference points are consistent. This is illustrated below.
Below is an example video sequence using this Street View image as a background. Notice that because our calibrated camera is projecting the image back into the scene, any object with “receive projection” enabled—such as the sign to the left—will display the background image as a texture. This is why the vehicles appear to go behind the sign: there are two 3D box objects set up in the foreground, which are being textured by the camera’s background image.
Video Analysis
The exact same point matching procedure can also be applied to video files. In the example below, reference points in the post-crash orthomosaic are easily correlated with the video. Note, when a video is used as a camera’s background, the time-min, time-max, time-scale (playback speed), and time-offset attributes become available. In this case, we fix the camera parameters (k1, k2, k3, p1, p2) to 0 and calibrate only for fov, position, and orientation. Oftentimes, it is easier to find a good camera calibration solution by restricting these values or a subset of them.
Below, we show an animated motorcycle motion based on a manual analysis of the video, created by overlaying a 3D motorcycle object at the locations indicated by the projected video (automatic tracking is discussed below). The video is being projected back into the scene and is texture mapping the terrain mesh. The camera view is that of the calibration solution.
Example 2
In the example below, we use the WREX 2023 Crash Test 3 orthomosaic and align it with one of the videos using clear reference points. Additionally, we have the scanner-based point cloud of the 1956 Ford Victoria. By combining the Ford point cloud with the video background of our calibrated video, we can determine the Ford's positions as a function of the frame number. This is demonstrated in the video below.
Using the axis tool, measuring the position of the point cloud in each frame is straightforward. The implied position versus time can be graphed, providing an estimate of the vehicle's speed over time. In this case, the estimated pre-impact speed, determined by projecting the video into the scene, was within 1 mph of the actual pre-impact speed. Automated speed analysis is also possible using point tracking. Continue reading below to learn how to implement point tracking.
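As a hedged illustration of the time-distance arithmetic involved (with made-up numbers, not the WREX test data), the sketch below converts a measured per-frame displacement of the point cloud into a speed estimate using the video frame rate (see note 3).

```python
# Hedged illustration of the time-distance arithmetic (hypothetical numbers,
# not the WREX test data): converting the point cloud's measured displacement
# between two frames into a speed estimate using the video frame rate (note 3).
fps = 30.0                         # frame rate read from the video file
frame_a, frame_b = 100, 110        # frames at which positions were measured
pos_a = (12.0, 4.0)                # measured x, y position in feet
pos_b = (26.0, 4.5)

dt = (frame_b - frame_a) / fps                     # elapsed time, seconds
dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
distance_ft = (dx * dx + dy * dy) ** 0.5
speed_mph = (distance_ft / dt) * 3600.0 / 5280.0   # ft/s converted to mph
print(f"{speed_mph:.1f} mph between frames {frame_a} and {frame_b}")
```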
Manual Point Tracking
The manual point tracking procedure can be applied to both imported videos and videos projected from cameras. In the example below, the process is applied to an imported (overhead) video that was scaled to the scene orthomosaic using the “Select, Move and Align” tool.
To start, left-click to select the imported video. Next, left-click tools > object tracking > “manual track point”.
Your workspace window will switch to a special viewer mode. Use your scroll wheel to zoom into the region of interest. Left-click on the point you would like to track.
Next, use Ctrl+scroll wheel to advance forward and backward in time. Left-click once again on the point of interest. Right-click to terminate the point selection procedure and to exit.
After exiting, a graph showing speed versus time will appear. For each track generated from the given video, a separate graph will be available for viewing. Speed is calculated by dividing the displacement vector between two consecutive track points by the time interval between them (3).
min. delta time
To reduce volatility caused by jitter when tracking a point of interest in the video, the time interval can be adjusted using the “min delta time” setting. Increasing this interval modifies which past and future track points (relative to the current time-step) are used to calculate the displacement vector, resulting in a smoother speed graph. At a given time-step, the initial position for the displacement vector will be at the track point nearest in time to (current time-step - 0.5 * min delta time) - dt, and the final position will be at the track point nearest in time to (current time-step + 0.5 * min delta time), where dt is the time increment for one frame. The time interval used for the average speed calculation will be the difference in time between the final and initial track points.
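The sketch below is one way to interpret this windowing in code; it is an illustration of the description above, not Virtual CRASH source code, and the track data is hypothetical.

```python
# Sketch of the "min delta time" smoothing described above (an interpretation
# of the text, not Virtual CRASH source code). Track points are (time, x, y).
def smoothed_speed(track, t, min_delta_time, dt):
    """Average speed at time-step t using the window described in the text."""
    def nearest(target):
        return min(track, key=lambda p: abs(p[0] - target))
    p0 = nearest(t - 0.5 * min_delta_time - dt)   # initial track point
    p1 = nearest(t + 0.5 * min_delta_time)        # final track point
    elapsed = p1[0] - p0[0]
    if elapsed <= 0:
        return None
    dist = ((p1[1] - p0[1]) ** 2 + (p1[2] - p0[2]) ** 2) ** 0.5
    return dist / elapsed

# Hypothetical 30 fps track with jitter; a larger min_delta_time smooths the graph.
dt = 1.0 / 30.0
track = [(i * dt, 44.0 * i * dt + (0.2 if i % 2 else -0.2), 0.0) for i in range(30)]
print(smoothed_speed(track, 0.5, min_delta_time=0.0, dt=dt))   # noisy estimate
print(smoothed_speed(track, 0.5, min_delta_time=0.2, dt=dt))   # smoother estimate
```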
Additional time-averaging can also be performed using the linear and Butterworth filters (enable the “use” toggle in the “filter” menu).
Automatic Point Tracking
The automatic point tracking procedure can be applied to both imported videos and videos projected from cameras. In the example below, the process is applied to an imported (overhead) video that was scaled to the scene orthomosaic using the “Select, Move and Align” tool.
To start, left-click to select the imported video. Next, left-click tools > object tracking > “auto track point”.
Your workspace window will switch to a special viewer mode. Use your scroll wheel to zoom into the region of interest. Hold+left-click and drag your cursor, starting on the point you would like to track.
Continue moving the cursor to define a bounding box around the point of interest. Release the left mouse button once the point is enclosed.
After exiting, a graph showing speed versus time will appear. For each track generated from the given video, a separate graph will be available for viewing. As discussed above, speed is calculated by dividing the displacement vector between two track points by the time interval between them. To reduce volatility caused by jitter when tracking a point of interest in the video, the time interval can be adjusted using the “min delta time” setting. Increasing this interval modifies which future track point is used to calculate the displacement vector, resulting in a smoother speed graph. Additional time-averaging can also be performed using the linear and Butterworth filters.
Automatic Object Tracking
The automatic object tracking procedure is typically applied to background videos of calibrated cameras. To run the automatic motion tracking process, left-click to select the camera. Next, left-click tools > object tracking > “analyze”. The video will be analyzed for all trackable objects.
There will be a corresponding entry in the graph list for each trackable object.
Whether tracked automatically, through manual point tracking, or via auto point tracking, all tracks will be available in the track list for each camera or video where tracking is applied.
Automatic object tracking is demonstrated in the following video. Additionally, the video highlights the advantage of using the "pick path" feature, as described below.
Reducing projection-related deviations
Constraining position variability orthogonal to user-defined path
Point tracking using perspective cameras, as discussed above, may introduce unwanted position deviations related to how a given point is projected into space from the camera. In the example shown below, a ray is projected outward from the camera, and its intersection with the terrain mesh defines the selected point’s position vector. Because the selected point is on the helmet, which is elevated above the ground, the projection onto the terrain lands approximately 4.5 feet away from the true motorcycle trajectory. As the z-position of the camera moves closer to that of the tracked point, these deviations can become more significant.
To address this issue, the “pick path” feature can be used as a countermeasure. By clicking “pick path” below the track list, you can manually select a more accurate path for your tracked point.
Next, left-click on the unfrozen path, which can be a line, polyline, curve, or similar. Once the path is selected, each tracked point will be positioned in space where its projection ray passes directly above the chosen path. The selected path can be determined using reasonable estimates, witness marks, or simply based on logical boundary conditions.
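Geometrically, this amounts to placing the tracked point on its camera ray at the location where the ray passes directly above the chosen path, rather than where the ray strikes the terrain. The sketch below illustrates that idea for a straight path segment; it is an interpretation with hypothetical values, not Virtual CRASH code.

```python
# Geometric sketch of a "pick path" style constraint: instead of intersecting
# the camera ray with the terrain, place the tracked point where the ray
# passes directly above a chosen straight path segment. Illustrative only.
import numpy as np

def point_above_path(cam_pos, ray_dir, path_a, path_b):
    """Point on the camera ray whose x-y location lies on the line path_a->path_b."""
    c, d = np.asarray(cam_pos, float), np.asarray(ray_dir, float)
    a, b = np.asarray(path_a, float), np.asarray(path_b, float)
    u = b - a                                   # 2D path direction (x, y)
    denom = d[0] * u[1] - d[1] * u[0]           # 2D cross product of ray and path
    if abs(denom) < 1e-12:
        return None                             # ray runs parallel to the path
    t = ((a[0] - c[0]) * u[1] - (a[1] - c[1]) * u[0]) / denom
    return c + t * d                            # 3D point directly above the path

# Hypothetical camera, ray toward a helmet, and a path drawn along the lane:
cam = (0.0, 0.0, 12.0)
ray = (50.0, 10.0, -9.0)                        # direction toward the tracked pixel
print(point_above_path(cam, ray, path_a=(20.0, 6.0), path_b=(120.0, 6.0)))
```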
Using aligned point clouds to determine path
In the section above, Video Analysis Example 2, we reviewed the process of using a point cloud to position objects in space by fitting them to their apparent positions within a video. Building on that example, we can draw a line through the center of the point cloud representations of our subject vehicle and then move it laterally. To do this, switch to “Axis Local” mode and press [2] at the top of the keyboard to position the line underneath the passenger side headlight.
Using the “auto point track” feature, we select the passenger side headlight. Because the camera and this point are close in z, the terrain projected positions in space are far from the true positions.
By selecting the “pick path” option for the headlight track, Virtual CRASH calculates the point projection intersection along the specified path. As shown below, the red trajectory line intersects with the point cloud representation of the subject vehicle’s front headlight as expected. With the displacement vectors versus time now aligned along the correct trajectory, the vehicle speed versus time graph behaves as expected and aligns well with the known test data.
Testing boundary conditions
Note that in cases where the exact x-y projected point path is uncertain, boundary condition tests can still provide valuable insights. In the example below, we test the extreme cases for the passenger-side headlight trajectories, assuming the subject vehicle is in either the far-left position (red line) or the far-right position (yellow line) within a typical lane. Selecting “pick path” for each scenario updates the speed graph accordingly. The resulting speed graphs are superimposed using a simple photo editor and displayed below.
This video demonstrates how to use the "auto track point" feature with the "pick path" option as shown above.
Constraining position variability orthogonal to surface geometry
In cases where the object path is unknown or cannot be inferred, users can restrict track points to a surface. This surface aligns with the underlying terrain mesh but includes a z-offset specified by the “height” input value. In the example below, the track points are constrained to a plane parallel to the x-y plane, thereby reducing variability in the z direction. This option can be particularly useful when the track point feature has a known height, such as a vehicle’s headlight or a pedestrian’s head height.
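In effect, for the flat-terrain case shown here, the camera ray is intersected with a horizontal plane at the specified height instead of with the terrain itself. The sketch below illustrates the idea with hypothetical values; it is not Virtual CRASH code.

```python
# Sketch of constraining a tracked point to a plane parallel to the x-y plane
# at a known height (for example, headlight height) by intersecting the camera
# ray with z = height. Illustrative only; the values are hypothetical.
import numpy as np

def intersect_height_plane(cam_pos, ray_dir, height):
    c, d = np.asarray(cam_pos, float), np.asarray(ray_dir, float)
    if abs(d[2]) < 1e-12:
        return None                   # ray never reaches the plane
    t = (height - c[2]) / d[2]
    return c + t * d if t > 0 else None

# Hypothetical camera 12 ft up, looking down-range; constrain the point to a
# 2.5 ft headlight height rather than to the terrain at z = 0:
print(intersect_height_plane((0, 0, 12.0), (60.0, 5.0, -9.5), height=2.5))
print(intersect_height_plane((0, 0, 12.0), (60.0, 5.0, -9.5), height=0.0))
```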
Exporting points
Note, track point position versus time data can be accessed in the dynamics report.
Dashcam videos
Camera tracking
In the example below, a dashcam’s video frames were extracted and used as camera background images. The camera calibration constants (k1, k2, k3, p1, p2, fov) were based on the first frame (ideally, the frame with the most identifiable features) and then applied to subsequent frames. Additional cameras were placed within the scene using frames corresponding to every 2 seconds of video. Since each camera can project its frame back into the scene, it is easy to either position and orient the camera manually by comparing a semi-transparent projection to the scene data or recalibrate each additional camera for position and orientation while keeping the calibration constants fixed using the “remove points” feature.
A simple time-distance analysis can be performed using the known camera positions over time. The corresponding implied speed-versus-time data can then be utilized for simulation via ADS (learn more) or for path animation (learn more).
Speed Analysis
Vehicle ground speed can be estimated using dashcam video by utilizing both manual and automatic track points. In the example below, we analyzed dashcam footage from WREX 2023 Crash Test 3. By specifying “time-min” and “time-max”, we isolated a frame containing a sufficient number of identifiable reference points centered about the time-window of interest. These reference points, combined with site survey data, were used to calibrate the camera for distortion coefficients, position, and orientation.
In the video below, we use the calibrated camera along with both manual and automatic point tracking features to estimate speed. In this case, the estimated speeds are within 10% of the actual ground speed.
When using point tracking features for dashcam speed analysis, it is recommended to calibrate the camera at a frame centered on the video time-window of interest. This is because the pixel positions of selected features depend not only on the camera's relative speed but also on its orientation, which is dynamic in a moving vehicle. Additionally, it is advisable to select features that are well-distributed across the video at terrain height if possible. This approach provides a better understanding of the possible range of speeds.
Volatility reduction
Even for a camera with a fixed orientation relative to the reference frame containing tracked points, there may still be volatility in object positions determined via “analyze” and “auto track point” due to underlying frame-to-frame pixel uncertainty of the given object’s location. This volatility can affect speed calculations, as they are derived from displacement vectors between points in 3D space which are themselves derived from pixel locations. To reduce uncertainties in position and corresponding speed, several countermeasures can be applied:
min. delta time: This directly modifies the displacement vector calculation. At a given time-step, the initial position for the displacement vector is set to the track point nearest in time to (current time-step - 0.5 * min delta time) - dt, and the final position is set to the track point nearest in time to (current time-step + 0.5 * min delta time), where dt is the time increment for one frame. The time interval used for the average speed calculation is the difference in time between the final and initial track points.
Linear average and Butterworth filters: These filters directly modify the speed graph. Enable the “use” toggle in the “filter” menu to apply them. The linear average filter smooths the speed graphs by averaging values within a specified window, with the “radius” input determining the window size. For instance, if radius = 1, the speed value at the current frame time is calculated as the average of the speed values at the current frame and at the frames immediately before and after it (see the sketch after this list).
Screen data: This directly modifies track point positions in 3D space. It averages the tracked pixel position over the current frame and the neighboring frames within the “radius.” The averaged pixel location at the current frame time is then used to find the track point in 3D space.
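The sketch below illustrates the two radius-based averages described above with hypothetical values; it is an interpretation of the description, not Virtual CRASH source code.

```python
# Sketch of the two radius-based averages described above (an interpretation,
# not Virtual CRASH source code): a moving average applied to the speed graph,
# and a pixel-position average applied before projecting into 3D space.
def linear_average(values, radius):
    """Moving average of a per-frame series using a +/- radius frame window."""
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out

speeds = [30.0, 34.0, 29.0, 35.0, 31.0, 33.0]        # hypothetical mph per frame
print(linear_average(speeds, radius=1))              # smoothed speed graph

pixels = [(410, 222), (414, 219), (411, 224)]        # tracked pixel over 3 frames
avg = tuple(sum(c) / len(pixels) for c in zip(*pixels))
print(avg)                                           # averaged pixel location
```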
These options are illustrated in the video below.
Notes:
(1) Camera calibration, object tracking, and automatic point tracking are based on OpenCV, an open-source computer vision library (opencv.org).
(2) The camera calibration procedure uses a classic pinhole camera model to solve for intrinsic parameters (such as focal length and distortion coefficients for radial and decentering distortions) and extrinsic parameters (position and orientation of the camera). For more details about this type of approach, see: J. Weng, P. Cohen, and M. Herniou, “Camera Calibration with Distortion Models and Accuracy Evaluation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 10, pp. 965–980, Oct. 1992.
(3) The time associated with each frame is determined based on the video frame rate, which is obtained by examining the video file.