Sure, I'll hit that first and then talk about the other aspect of dedupe: inline versus post-process. Basically, reverse referencing means that your newest backup, as it's written into a dedupe environment, is made up within that environment of pointers that really go back to the older backups, maybe even back to the original full if you have a full-and-incremental-forever approach.
Your newest backup, as a result, is really going to have a lot of pointers back to older data. So when you "redupe" the data back to its original state, you're going to be going back and chasing lots and lots of pointers, because you have a reverse-referencing methodology.
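To make the pointer-chasing concrete, here's a minimal sketch of reverse referencing in Python. It assumes fixed-size chunking and a hash-indexed chunk store; the class and method names are illustrative, not any vendor's actual implementation.

```python
import hashlib

CHUNK_SIZE = 4  # toy chunk size; real systems use KB-scale chunks

class ReverseRefStore:
    def __init__(self):
        self.chunks = {}   # hash -> chunk bytes (older backups own the data)
        self.backups = {}  # backup name -> list of chunk hashes (pointers)

    def backup(self, name, data):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            h = hashlib.sha256(chunk).hexdigest()
            # A newer backup stores only a pointer when the chunk already
            # exists from an older backup -- hence "reverse" referencing.
            if h not in self.chunks:
                self.chunks[h] = chunk
            refs.append(h)
        self.backups[name] = refs

    def restore(self, name):
        # "Reduping": reassemble by following every pointer back to data
        # originally laid down by older backups.
        return b"".join(self.chunks[h] for h in self.backups[name])

store = ReverseRefStore()
store.backup("full_monday", b"AAAABBBBCCCC")
store.backup("incr_tuesday", b"AAAABBBBDDDD")  # first two chunks dedupe
```

Note that restoring `incr_tuesday` touches chunks physically written by `full_monday`, which is exactly why restores of recent backups can be slow under this scheme.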
Forward referencing is essentially the opposite. There is only one vendor out there that does this, and that's Sepaton. With forward referencing, the newest backup is maintained in its entirety, so your older backups reference the newest data via pointers.
As a result, when you do a restore with forward referencing, the newest backup is readily accessible and you don't have to do a lot of reconstitution or rehydration of that data. But when you have older backups that you need to restore and retrieve, you will have to do some reconstitution, so that will take longer.
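A rough Python sketch of the forward-referencing idea follows. This is a toy model, not Sepaton's actual design: it keeps only the newest backup whole, re-expresses the previous generation as pointers into it, and restores older data by resolving those forward pointers against the newest copy.

```python
import hashlib

CHUNK_SIZE = 4  # toy chunk size

def chunk_hashes(data):
    return [hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
            for i in range(0, len(data), CHUNK_SIZE)]

class ForwardRefStore:
    def __init__(self):
        self.latest = None      # name of the newest backup, kept whole
        self.full_copies = {}   # name -> raw bytes (only the newest)
        self.deduped = {}       # name -> list of ("ptr", hash) / ("raw", bytes)

    def backup(self, name, data):
        if self.latest is not None:
            # Re-express the previously "whole" backup as pointers that
            # point FORWARD into the new backup wherever chunks match.
            new_hashes = set(chunk_hashes(data))
            old = self.full_copies.pop(self.latest)
            entries = []
            for i in range(0, len(old), CHUNK_SIZE):
                chunk = old[i:i + CHUNK_SIZE]
                h = hashlib.sha256(chunk).hexdigest()
                entries.append(("ptr", h) if h in new_hashes
                               else ("raw", chunk))
            self.deduped[self.latest] = entries
        self.full_copies[name] = data
        self.latest = name

    def restore(self, name):
        if name in self.full_copies:
            return self.full_copies[name]  # newest: no reconstitution
        # Older backup: reconstitute by resolving forward pointers
        # against the intact newest copy.
        newest = self.full_copies[self.latest]
        index = {h: newest[i:i + CHUNK_SIZE]
                 for i, h in zip(range(0, len(newest), CHUNK_SIZE),
                                 chunk_hashes(newest))}
        return b"".join(index[v] if kind == "ptr" else v
                        for kind, v in self.deduped[name])

store = ForwardRefStore()
store.backup("full_monday", b"AAAABBBBCCCC")
store.backup("incr_tuesday", b"AAAABBBBDDDD")
```

In this sketch restoring `incr_tuesday` is a straight read, while restoring `full_monday` requires the pointer-resolution step, which is the extra work the answer describes for older backups. (A real system would maintain the pointer chains across many generations; this toy handles only adjacent ones.)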
When you look at a restore histogram for your average backup production environment, there's always a big spike in the first day or two; activity stays high and then drops off around day seven, with only little blips on the radar after that. So most of your restores are usually happening within a week of the time the data was backed up.
Forward referencing from a technology point of view actually fits that histogram very well. If you think about the deduplication process, deduplication itself is definitely a CPU-intensive process and scale and performance are the big limiters. How much data can you crank through these devices, keeping in mind that backup is the most I/O-intensive application in the data center? You are definitely creating some CPU-intensive workloads when you're opening up the fire hose into a VTL device, so to speak.
You have the initial read, the analysis, comparisons, indexes, pointers and the writing of the data itself -- or maybe just writing the pointer -- but all of that has to happen in a rapid fashion. So, how do the vendors do that? There are a couple of approaches. One is inline deduplication, where your data, as it comes into the device, is deduplicated in real time as it's written to usable storage. Post-process is a different methodology, where your data is first written into a non-deduplicated disk cache and is later, as a post-process, deduplicated and written out from that initial cache to a deduped area -- kind of like a de-staging strategy.
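The two write paths can be contrasted in a short sketch. This is an illustrative model only: the function names and the dictionary-based cache and chunk store are invented for the example.

```python
import hashlib

CHUNK_SIZE = 4  # toy chunk size

def dedupe_chunks(data, chunk_store):
    """Split data into chunks, store unique ones, return pointer list."""
    refs = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        h = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(h, chunk)
        refs.append(h)
    return refs

def inline_write(data, chunk_store, catalog, name):
    # Inline: dedupe happens on the write path itself, so the CPU cost
    # is paid while the backup "fire hose" is open.
    catalog[name] = dedupe_chunks(data, chunk_store)

def postprocess_write(data, cache, name):
    # Post-process: land the raw stream in a disk cache first --
    # a fast, non-deduplicated write.
    cache[name] = data

def destage(cache, chunk_store, catalog):
    # Later de-staging pass: dedupe the cached data into the chunk
    # store and reclaim the cache space.
    for name, data in list(cache.items()):
        catalog[name] = dedupe_chunks(data, chunk_store)
        del cache[name]

chunk_store, catalog, cache = {}, {}, {}
inline_write(b"AAAABBBB", chunk_store, catalog, "job1")
postprocess_write(b"AAAACCCC", cache, "job2")  # job2 sits raw in cache
destage(cache, chunk_store, catalog)           # until this pass runs
```

The tradeoff shows up in the timing: `inline_write` pays the hashing cost immediately, while `postprocess_write` defers it to `destage`, at the price of needing landing-zone capacity for the raw data in the meantime.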
A few vendors, notably NetApp and Quantum, are looking at this from a very interesting perspective of variable processing -- being able to toggle or configure, respectively, between inline and post-process depending on CPU and workload.
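The decision logic for that kind of variable processing might look something like the following. To be clear, the threshold and policy here are invented for the sketch; they are not NetApp's or Quantum's actual logic.

```python
def choose_dedupe_mode(cpu_utilization, threshold=0.75):
    """Pick a dedupe mode from current CPU load.

    Illustrative policy only: when the CPU is busy, defer the dedupe
    work (post-process); when there's headroom, dedupe inline.
    The 0.75 threshold is an invented example value.
    """
    return "post-process" if cpu_utilization > threshold else "inline"
```

For example, `choose_dedupe_mode(0.9)` would defer dedupe to a later de-staging pass, while `choose_dedupe_mode(0.3)` would dedupe on the write path.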
That's some of the more innovative technology being created. The jury is definitely out on which is better, and there are tradeoffs to both.