There are two approaches to this. The post-processing architecture accepts all the incoming data and writes it to disk first, leaving deduplication for later. Then there is the more common in-line architecture, which deduplicates the data as it arrives, before it is stored.
From a tactical perspective, I like the post-processing approach. As long as you keep buying disk, you can keep doing backups. It may not be as elegant or as nicely designed as the in-line approach, but you can always complete the backup and deduplicate the data later.
However, my opinion in this area is in a state of flux.
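To make the difference between the two architectures concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the class names `InlineStore` and `PostProcessStore` are hypothetical, and the fixed-size chunking with SHA-256 fingerprints is an assumption chosen for brevity.

```python
import hashlib

CHUNK_SIZE = 4  # tiny illustrative chunk size (assumption); real systems use far larger chunks


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


class InlineStore:
    """In-line (hypothetical sketch): deduplicate each chunk before storing it."""

    def __init__(self):
        self.unique = {}   # digest -> chunk bytes actually kept on disk
        self.recipe = []   # ordered digests needed to rebuild the stream

    def ingest(self, data: bytes) -> None:
        for chunk in chunks(data):
            digest = fingerprint(chunk)
            self.unique.setdefault(digest, chunk)  # store only unseen chunks
            self.recipe.append(digest)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.unique.values())

    def restore(self) -> bytes:
        return b"".join(self.unique[d] for d in self.recipe)


class PostProcessStore:
    """Post-processing (hypothetical sketch): store everything, deduplicate later."""

    def __init__(self):
        self.staging = []  # raw chunks landed as they arrive
        self.unique = {}
        self.recipe = []

    def ingest(self, data: bytes) -> None:
        self.staging.extend(chunks(data))  # no hashing on the ingest path

    def deduplicate(self) -> None:
        """Run after the backup finishes; reclaims space held by duplicates."""
        for chunk in self.staging:
            digest = fingerprint(chunk)
            self.unique.setdefault(digest, chunk)
            self.recipe.append(digest)
        self.staging.clear()

    def stored_bytes(self) -> int:
        staged = sum(len(c) for c in self.staging)
        return staged + sum(len(c) for c in self.unique.values())

    def restore(self) -> bytes:
        deduped = b"".join(self.unique[d] for d in self.recipe)
        return deduped + b"".join(self.staging)


backup = b"ABCDABCDABCDWXYZ"  # 16 bytes, three of the four chunks are identical

inline = InlineStore()
inline.ingest(backup)

post = PostProcessStore()
post.ingest(backup)
before = post.stored_bytes()  # full size: nothing has been deduplicated yet
post.deduplicate()
after = post.stored_bytes()   # same footprint the in-line store reached at ingest time
```

In this sketch the post-processing store temporarily needs disk for the full backup, which is the "keep buying disk" trade-off: ingest never waits on deduplication, and the space is reclaimed afterwards.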