Compression is one of the data reduction methods that have been around the longest. In fact, the old PKZIP utility that was so popular back in the 1980s was based on data compression technology.
Data compression remains in use today because it is both simple and effective. It works by scanning the bits that make up a file to locate long bit strings that occur repeatedly. When such strings are found, the file is rewritten with each repeated string replaced by a much shorter reference. The result is that, once compressed, the file consumes less storage space.
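The effect described above is easy to see with Python's standard zlib module (a DEFLATE implementation, used here only as a convenient illustration -- the article does not name a specific tool). A file full of recurring byte strings shrinks dramatically, and decompression restores it exactly:

```python
import zlib

# A hypothetical file whose contents repeat the same short pattern.
original = b"quarterly sales report row;" * 2000  # ~54 KB of recurring strings

# The compressor replaces each repeated string with a short back-reference.
compressed = zlib.compress(original)

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")

# Decompression rebuilds the file bit-for-bit; no data is lost.
assert zlib.decompress(compressed) == original
```

Because the input is almost entirely redundant, the compressed form is a small fraction of the original size.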
For all its effectiveness, compression has a few potential disadvantages. First, data can only be compressed if it contains recurring bit strings; it is those recurring strings, after all, that are eliminated to reduce the size of the file.
The problem is that many modern data types are already compressed. You might, for example, have heard the term "compressed media" used to refer to digital video or JPEG images. Such files contain very little -- if any -- redundancy, and therefore cannot be compressed much further.
Another potential disadvantage of data compression technology is that it can be a CPU-intensive process, although there are offloading techniques that can be used for network compression. CPU cycles are consumed by the process of parsing files in search of redundant bit patterns.
In most cases, the CPU overhead isn't problematic, but it is something to consider if processing is occurring on a system that is already CPU bound. This is especially true if the data is nonredundant and CPU cycles are being wasted trying to compress data that is already compressed.
Also, watch out for the risk of data loss. Data compression effectively removes redundant data from a file and replaces it with a marker. In most cases, this is not a problem. However, the removal of redundant bit strings opens the door to extreme corruption: because every part of the file depends on those references being intact, a minor disk error that might cause only a small problem in an uncompressed file can render a compressed file completely unreadable.
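The corruption-amplification risk can be demonstrated directly. In the sketch below (again using zlib purely as an example format), flipping a single bit in the compressed stream is enough to make decompression fail for the entire file, whereas the same one-bit error in an uncompressed text file would garble a single character:

```python
import zlib

original = b"important business records, row after row. " * 500
packed = bytearray(zlib.compress(original))

# Simulate a minor disk error: flip one bit in the middle of the stream.
packed[len(packed) // 2] ^= 0x01

try:
    zlib.decompress(bytes(packed))
    print("data recovered despite the error")
except zlib.error as err:
    # The back-references and checksum no longer line up, so the
    # decompressor rejects the whole file, not just one damaged byte.
    print(f"entire file unreadable: {err}")
```

This is why compressed data, in particular, benefits from checksummed storage and verified backups.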