OpenCV is an open-source computer vision library with an extensive collection of algorithms. Since a recent merge, OpenCV also includes an easy-to-use interface for Super Resolution (SR) based on deep learning methods. The interface supports pre-trained models that can be used for inference easily and efficiently. So far, it works with C++ and Python.
OpenCV is often used together with frameworks such as Caffe, PyTorch, and TensorFlow in Artificial Intelligence, and in Computer Vision in particular. In this post, I will show how to build a simple Python program that uses OpenCV to read, display, and save video frames.
A video is a sequence of images displayed in quick succession. The obvious question is how quickly the images change. That rate is measured in frames per second (FPS). When a video has an FPS of 40, 40 images are displayed every second. Equivalently, a new frame is displayed every 25 milliseconds.
The script below shows how to read frames from a video source, display them in a window, and save them to an output video file.
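The relation between FPS and the time each frame stays on screen is simple arithmetic. Here is a minimal sketch of it; the helper name `frame_interval_ms` is made up for this illustration and is not part of OpenCV.

```python
def frame_interval_ms(fps: float) -> float:
    """Return how long each frame is shown, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    for fps in (10, 25, 40):
        print(f"{fps} FPS -> one frame every {frame_interval_ms(fps):.1f} ms")
```

The same number, rounded to an integer, is a reasonable delay to pass to cv2.waitKey() when you want to play a file back at roughly its native speed.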
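    import cv2

    # Input source: pass 0 for the default camera, or a path for a video file
    # cap = cv2.VideoCapture(0)            # camera
    cap = cv2.VideoCapture('./input.mp4')  # video file

    # Request a frame size from the source (property 3 is width, 4 is height)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
    frame_width = 640
    frame_height = 480

    # Codec (encoder) for the output file
    # fourcc = cv2.VideoWriter_fourcc('M', 'J', 'P', 'G')  # Motion JPEG
    fourcc = cv2.VideoWriter_fourcc(*'XVID')  # MPEG-4 (Xvid)

    # Frame rate is set to 10.0
    out = cv2.VideoWriter('./out.avi', fourcc, 10.0, (frame_width, frame_height))

    while True:
        ret, frame = cap.read()
        if not ret:  # no more frames, or the read failed
            break
        # The request above may be ignored (e.g. for files), so resize explicitly
        frame = cv2.resize(frame, (frame_width, frame_height))
        # Display the frame
        cv2.imshow('Cap', frame)
        # Move the window to the top-left corner of the screen
        cv2.moveWindow('Cap', 0, 0)
        # Write the frame to the output file
        out.write(frame)
        # Press 'q' to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Release the capture and the writer, then close all windows
    cap.release()
    out.release()
    cv2.destroyAllWindows()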
Dig into the code
In OpenCV, a video can be read either from a camera connected to the computer or from a video file. The first step is to create a VideoCapture object. Its argument is either the device index or the name of the video file to be read.
In most cases, only one camera is connected to the system. So, all we do is pass 0 and OpenCV uses the only camera attached to the computer. When more than one camera is connected, we can select the second camera by passing 1, the third camera by passing 2, and so on.
    # Create a VideoCapture object and read from the input file.
    # To read from the camera instead, pass 0 in place of the file name.
    cap = cv2.VideoCapture('test.mp4')
    # cap = cv2.VideoCapture(0)  # webcam
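When the source is given on the command line, it arrives as a string, but VideoCapture treats an integer as a device index and a string as a file name. One way to handle this is sketched below with a hypothetical helper, `parse_source`, which is not an OpenCV function.

```python
from typing import Union


def parse_source(source: str) -> Union[int, str]:
    """Turn '0', '1', ... into a camera index; leave anything else as a path."""
    text = source.strip()
    if text.isdecimal():
        return int(text)
    return text


if __name__ == "__main__":
    # The result is what would be passed on to cv2.VideoCapture(...)
    print(parse_source("0"))         # camera index
    print(parse_source("test.mp4"))  # file name
```

After opening the source, it is worth checking cap.isOpened() before entering the read loop, since a wrong index or path does not raise an error by itself.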
After opening the video, we can display it frame by frame. A frame of a video is simply an image, so we display each frame the same way we display images: with the function imshow().
    # Read until the video is completed
    while cap.isOpened():
        # Capture frame by frame
        ret, frame = cap.read()
        if ret:
            # Display the resulting frame
            cv2.imshow('Frame', frame)
            # The window is only refreshed inside waitKey; press 'q' to stop early
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break
        else:
            # No frame was returned, so break the loop
            break

    # When everything is done, release the video capture object
    cap.release()
    # Close all the windows
    cv2.destroyAllWindows()
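The scripts in this post stop when cv2.waitKey() reports that 'q' was pressed. waitKey returns -1 when no key is pressed during the wait, and on some platforms the returned code can carry extra high bits, so a common idiom is to mask it with 0xFF before comparing. A small sketch of that check follows; `is_quit_key` is a name invented here for illustration.

```python
def is_quit_key(key_code: int, quit_char: str = "q") -> bool:
    """Return True if a cv2.waitKey() result corresponds to quit_char."""
    if key_code == -1:  # no key was pressed during the wait
        return False
    # Keep only the low byte so platform-specific high bits are ignored
    return (key_code & 0xFF) == ord(quit_char)


if __name__ == "__main__":
    print(is_quit_key(-1))                   # False: nothing pressed
    print(is_quit_key(ord("q")))             # True
    print(is_quit_key(0x100000 | ord("q")))  # True: high bits masked away
```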
Once we have captured and processed the video frame by frame, the next step is to save it.
For images, this is straightforward: we just use cv2.imwrite(). Videos take a bit more work. We need to create a VideoWriter object. First, we specify the output file name with its format (e.g., output.avi). Then, we specify the FourCC code and the number of frames per second (FPS). Lastly, we pass the frame size.
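    # Use the capture's own frame size (properties 3 and 4 are width and height)
    frame_width = int(cap.get(3))
    frame_height = int(cap.get(4))
    # Define the codec and create the VideoWriter object. The output is stored in 'out.avi'.
    # The FPS is set to 10, and the frame size is passed as (width, height).
    out = cv2.VideoWriter('out.avi', cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), 10, (frame_width, frame_height))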
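One detail that is easy to get wrong: VideoWriter takes the frame size as (width, height), while a frame's NumPy shape is (height, width, channels). As far as I know, frames whose size does not match the writer are typically not written correctly, often without any error. Here is a quick check, using a hypothetical helper `fits_writer` that works on the shape tuple alone.

```python
from typing import Sequence, Tuple


def fits_writer(frame_shape: Sequence[int], writer_size: Tuple[int, int]) -> bool:
    """Check a frame shape (height, width[, channels]) against (width, height)."""
    height, width = frame_shape[0], frame_shape[1]
    return (width, height) == tuple(writer_size)


if __name__ == "__main__":
    writer_size = (640, 480)  # (width, height), as passed to VideoWriter
    print(fits_writer((480, 640, 3), writer_size))  # True
    print(fits_writer((640, 480, 3), writer_size))  # False: axes swapped
```

In the write loop, a call such as fits_writer(frame.shape, (frame_width, frame_height)) could guard out.write(frame), with a cv2.resize() applied when it returns False.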
FourCC is a 4-byte code used to specify the video codec. The list of available codes can be found at fourcc.org. Many FourCC codes are available; the snippet above uses MJPG, while the full script at the top of this post uses XVID.
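To make the "4-byte code" concrete: the four characters are packed into one 32-bit integer, with the first character in the lowest byte. The sketch below reproduces that packing in plain Python. The names `fourcc_to_int` and `int_to_fourcc` are invented for this example, and the byte order reflects my understanding of what cv2.VideoWriter_fourcc does.

```python
def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into a 32-bit integer, first char lowest."""
    if len(code) != 4:
        raise ValueError("a FourCC code has exactly four characters")
    value = 0
    for position, char in enumerate(code):
        value |= (ord(char) & 0xFF) << (8 * position)
    return value


def int_to_fourcc(value: int) -> str:
    """Unpack a 32-bit integer back into its four characters."""
    return "".join(chr((value >> (8 * position)) & 0xFF) for position in range(4))


if __name__ == "__main__":
    packed = fourcc_to_int("MJPG")
    print(packed)                 # 1196444237
    print(int_to_fourcc(packed))  # MJPG
```

The reverse direction is handy for inspecting an opened file: int(cap.get(cv2.CAP_PROP_FOURCC)) should give an integer that int_to_fourcc can turn back into a readable code.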
This post is just a quick demo of the basic usage of OpenCV, a powerful tool for Computer Vision in AI. I will post more articles on more advanced uses of OpenCV, such as Object Detection and Object Tracking, in the future. Stay tuned!