This is a group project I was involved in for the Image Processing and Computer Graphics course at UCSC. The main intention of this project is to come up with an algorithm to reduce the size of a video. We developed a method that, in the best case, produced an output only 2% of the original size.
What is a CoDec
The job of a video enCOder and DECoder pair is to make video data playable and recordable on a device using fewer resources than raw video data, with bearable quality loss or no loss at all. A video is encoded into a file with a specific structure using the encoder, and the corresponding decoder is needed to read that data back correctly.
Examples are MPG, AVI, MP4, etc.
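As a small illustration of these two roles (this is not part of our codec), the sketch below uses OpenCV to decode a source video and re-encode it with an existing codec. The file names and the MJPG FourCC are arbitrary placeholders.

```cpp
// Minimal sketch: decode a source video and re-encode it with an existing codec.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap("input.avi");          // the decoder side: read source frames
    if (!cap.isOpened()) return 1;

    cv::Mat frame;
    cap >> frame;                               // first frame defines the output size
    if (frame.empty()) return 1;

    // The encoder side: compress frames with the codec selected by the FourCC.
    cv::VideoWriter writer("encoded.avi",
                           cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                           30.0, frame.size());

    while (!frame.empty()) {
        writer << frame;                        // encoder compresses each frame
        cap >> frame;
    }
    return 0;
}
```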
Current Approaches
- Accumulative Images
Detecting the differences between frames and adding them back during playback (see the sketch after this list).
- Motion Detection
Detecting motion in the video and using morphing and moving techniques to rebuild it.
- Image compression
Similar to JPEG compression, where the quality of the frames is reduced.
- Key-framing
Adding full frame data where the difference percentage is very high.
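To make the accumulative-images idea concrete, here is a minimal sketch (assuming 8-bit grayscale frames and OpenCV; it is not any existing codec's code) that stores only the per-frame difference and adds it back to rebuild each frame:

```cpp
// Sketch of frame differencing: the "encoded" data is the signed difference,
// and the decoder rebuilds the current frame as previous frame + difference.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap("input.avi");
    cv::Mat prev, curr, gray;

    cap >> prev;
    if (prev.empty()) return 1;
    cv::cvtColor(prev, prev, cv::COLOR_BGR2GRAY);

    while (cap.read(curr)) {
        cv::cvtColor(curr, gray, cv::COLOR_BGR2GRAY);

        // Signed difference so both brighter and darker changes survive.
        cv::Mat diff;
        cv::subtract(gray, prev, diff, cv::noArray(), CV_16S);

        // "Decode": previous frame + difference reconstructs the current frame.
        cv::Mat rebuilt;
        cv::add(prev, diff, rebuilt, cv::noArray(), CV_8U);

        prev = gray.clone();
    }
    return 0;
}
```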
What we’ve done
- Block based difference structure
- A single frame is divided into blocks (with fixed dimensions).
- Storage and processing are done on a per-block basis
- Use accumulative differences between frames
- Each block of the previous frame is matched with the corresponding block of the next frame using a threshold value (see the block-difference sketch after this list)
- If it is nearly the same, no data is written for that block
- Key-framing
- The whole frame will be stored if the number of replaced blocks is more than a given threshold.
- Motion detection
- Phase Correlation
- Measure how much a block has moved from its previous position in the next frame (see the phase-correlation sketch after this list)
- If the image data at the new location differs only slightly, we can get by with just moving the block, without storing any image data for it
- The fewer edges a block contains, the more safely it can be left as it is, since movement is detected through the edges of an image
- Image Processing Techniques
- Artifacts left over by prior processing need to be removed
- Blurring is applied, but only in places where no edges are nearby, so the image stays sharp
- JPEG-like compression using the discrete cosine transform (DCT), sketched after this list
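The block-difference and key-framing steps could look roughly like the sketch below. The 16x16 block size, the thresholds, and the `BlockChange` record are illustrative assumptions, not the prototype's actual values or data structures.

```cpp
// Sketch of the block-based difference pass with key-framing.
#include <opencv2/opencv.hpp>
#include <vector>

struct BlockChange { int x, y; cv::Mat pixels; };   // hypothetical per-block record

// Returns the blocks that actually changed; keyFrame is set when too many did.
std::vector<BlockChange> diffFrame(const cv::Mat& prev, const cv::Mat& curr,
                                   bool& keyFrame,
                                   int blockSize = 16,
                                   double blockThreshold = 12.0,  // mean abs diff per pixel
                                   double keyFrameRatio = 0.6)    // replaced-block fraction
{
    std::vector<BlockChange> changed;
    int total = 0;

    for (int y = 0; y + blockSize <= curr.rows; y += blockSize) {
        for (int x = 0; x + blockSize <= curr.cols; x += blockSize) {
            cv::Rect roi(x, y, blockSize, blockSize);
            ++total;

            // Mean absolute difference between the co-located blocks.
            double mad = cv::norm(prev(roi), curr(roi), cv::NORM_L1)
                         / (roi.area() * curr.channels());

            if (mad > blockThreshold)                  // block really changed
                changed.push_back({x, y, curr(roi).clone()});
            // else: nearly the same, nothing is written for this block
        }
    }

    // Key-framing: too many replaced blocks -> store the whole frame instead.
    keyFrame = static_cast<double>(changed.size()) > keyFrameRatio * total;
    return changed;
}
```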
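The per-block motion check can be sketched with OpenCV's `cv::phaseCorrelate` as follows. The residual threshold is an assumed value, and the prototype's exact comparison logic may differ.

```cpp
// Sketch of per-block motion detection via phase correlation.
#include <opencv2/opencv.hpp>

// Decide whether a block can simply be *moved* instead of re-stored.
bool tryMoveBlock(const cv::Mat& prevGray, const cv::Mat& currGray,
                  const cv::Rect& block, cv::Point2d& shift,
                  double maxResidual = 8.0)
{
    cv::Mat a, b;
    prevGray(block).convertTo(a, CV_32F);       // phaseCorrelate needs float input
    currGray(block).convertTo(b, CV_32F);

    // Estimated translation of the block's content between the two frames.
    shift = cv::phaseCorrelate(a, b);

    // Compare the previous block with the new frame at the shifted location;
    // if the residual is minor, only the shift needs to be stored.
    cv::Rect moved(block.x + cvRound(shift.x), block.y + cvRound(shift.y),
                   block.width, block.height);
    if ((moved & cv::Rect(0, 0, currGray.cols, currGray.rows)) != moved)
        return false;                           // shifted block falls outside the frame

    double residual = cv::norm(prevGray(block), currGray(moved), cv::NORM_L1)
                      / block.area();
    return residual < maxResidual;
}
```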
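And the JPEG-like DCT step, shown here on a single 8x8 single-channel block with an assumed uniform quantisation step (real JPEG uses per-coefficient quantisation tables):

```cpp
// Sketch of DCT-based lossy compression of one block: DCT, coarse
// quantisation, then the decoder's de-quantisation and inverse DCT.
#include <opencv2/opencv.hpp>

cv::Mat compressBlockDCT(const cv::Mat& block8x8, float quantStep = 16.0f)
{
    CV_Assert(block8x8.rows == 8 && block8x8.cols == 8 && block8x8.channels() == 1);

    cv::Mat f, coeffs;
    block8x8.convertTo(f, CV_32F);                 // cv::dct needs floating-point input
    cv::dct(f, coeffs);

    // Quantise: dividing and rounding collapses small coefficients to zero,
    // which is what makes the data compressible (and where the loss appears).
    cv::Mat quantised;
    coeffs.convertTo(quantised, CV_16S, 1.0 / quantStep);

    // Decoder side: de-quantise and apply the inverse DCT.
    cv::Mat dequantised, rebuilt, out;
    quantised.convertTo(dequantised, CV_32F, quantStep);
    cv::idct(dequantised, rebuilt);
    rebuilt.convertTo(out, block8x8.type());
    return out;
}
```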
Implementation
We used C/C++ to implement the base algorithm, and the OpenCV library to read the source video data and to compute the phase correlation between frames.
- Phase Correlation
- The prototype produces its output image using only the data that is meant to be written to the output file. It achieves a compression rate of over 80% for most video files, and around 98% at its best.
- Canny Algorithm
- We used the built-in Canny algorithm in OpenCV to measure edge complexity (see the sketch after this list).
- JPEG compression
- Full frames and blocks are compressed with the JPEG compression algorithm.
- It gives an enormous improvement in the compression percentage.
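For reference, the two OpenCV building blocks mentioned above can be used roughly as in this sketch; the Canny thresholds and the JPEG quality value are illustrative, not the prototype's settings.

```cpp
// Sketch: edge complexity via Canny, and in-memory JPEG encoding of a block.
#include <opencv2/opencv.hpp>
#include <vector>

// Fraction of edge pixels in a grayscale block: a rough measure of edge complexity.
double edgeComplexity(const cv::Mat& grayBlock)
{
    cv::Mat edges;
    cv::Canny(grayBlock, edges, 50, 150);
    return static_cast<double>(cv::countNonZero(edges)) / edges.total();
}

// JPEG-compress a block in memory; the returned bytes are what gets stored.
std::vector<uchar> jpegEncodeBlock(const cv::Mat& block, int quality = 80)
{
    std::vector<uchar> buffer;
    std::vector<int> params = { cv::IMWRITE_JPEG_QUALITY, quality };
    cv::imencode(".jpg", block, buffer, params);
    return buffer;
}
```

The byte vector returned by `cv::imencode` is what would be written to the output file; `cv::imdecode` recovers the block on the decoder side.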
File structure
Screenshots
Limitations
- No Audio
We couldn't encode the audio track into the file, so any audio data will be lost during the encoding process.
- Gives a better compression ratio for videos with fewer variations
E.g. interviews, since most parts of the frame do not change.
- Works well only for videos with a static frame rate
There is no mechanism to get the frame rate of the source video without using another codec, so we assume a frame rate of 30 fps; it can be changed at compile time.
- JPEG
We tried our method with and without JPEG and obtained an enormously better compression ratio with it. The problem is that JPEG is commercial, so we can't use it for a product.